### Bayesian statistics

(statistics method assigning probability to your belief)

Bayesian statistics is a statistical approach that deals with Bayesian probabilities, probabilities representing your degree of belief that something is fact. This approach to statistics is not new, but through much of the 20th century was not the typical method in use. One reason for the revival is the world's ever-growing number-crunching capacity, as Bayesian methods can require a lot of computation. With its renaissance, the term frequentist statistics has been used to refer to the more common 20th-century approach.

An example of the frequentist approach is to find out how many of a city's voters plan to vote for candidate A by choosing a random sample of voters and asking their plans, then adopting the result as an estimate for the city's entire voter population, along with an indication of your confidence based upon your sample size. This confidence indication typically states how likely the actual number is to be within some interval encompassing your calculated estimate, e.g., "99% chance that it is within a percent of the actual number."
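As a rough illustration of the polling estimate just described, the sketch below computes a sample proportion and a 99% interval around it. The poll figures (540 of 1,000) are made up for the example, and the normal-approximation (Wald) interval is one common textbook choice, not the only one.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.99):
    """Sample proportion with a normal-approximation (Wald) confidence interval."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical poll: 540 of 1,000 sampled voters plan to vote for candidate A.
estimate, low, high = proportion_confidence_interval(540, 1000)
print(f"estimate {estimate:.3f}, 99% interval ({low:.3f}, {high:.3f})")
```

With a larger sample the margin shrinks in proportion to the square root of the sample size, which is why the confidence statement depends on how many voters were asked.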

The Bayesian approach applies the probability rule, Bayes' theorem (aka Bayes' law or Bayes' rule) to a Bayesian probability. Bayes' theorem is a general mathematical probability theorem not confined to Bayesian statistics, valid in any statistics method, but in the case of Bayesian statistics, it is exactly what is needed to update a Bayesian probability with information provided by sample data.
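For reference, the theorem itself can be written as follows. The symbols are labels chosen here: H stands for a hypothesis (for instance, "candidate A will receive a majority") and E for the observed evidence (the sample data).

$$
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}
$$

Here P(H) is the probability held before seeing the evidence, P(E | H) is how likely the evidence would be if the hypothesis were true, P(E) is the overall probability of the evidence, and P(H | E) is the updated probability.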

Using the Bayesian approach, if you have an idea of how many plan to vote for candidate A, and check with some voters, the theorem can be used to factor the new evidence into your opinion. For example, if you earlier believed there is a 90% chance candidate A would receive a majority and polling three random people reveals all three will vote for candidate B, this naturally affects your opinion, and the theorem provides the means to calculate an exact, rational adjustment. Three is a very small sample, yet applying Bayes' theorem to the Bayesian probability yields a sound result. This adjustment-by-calculation is termed Bayesian inference. In many straightforward cases with sufficient data, this can yield a conclusion virtually matching that of the frequentist approach, but it can also be applied in situations where the frequentist method cannot, and it produces appropriate conclusions even when based on very small amounts of data. It also has the benefit of drawing correct conclusions from some kinds of disparate evidence, mixing apples and oranges, so to speak. Like any statistics method, its conclusions depend upon careful regard for what aspects of the data are truly independent.
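The three-voter example can be sketched numerically, though only after adding assumptions the text does not supply. Here the two possibilities are simplified to "A has a majority" meaning 55% of voters favor A, and "A lacks a majority" meaning 45% do; those shares, and the function name `bayes_update`, are hypothetical choices made for illustration.

```python
def bayes_update(prior, likelihoods):
    """Apply Bayes' theorem to a set of mutually exclusive hypotheses."""
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    evidence = sum(unnormalized.values())  # overall probability of the data
    return {h: value / evidence for h, value in unnormalized.items()}


# Assumed share of voters favoring A under each hypothesis (illustrative only).
share_for_a = {"A has majority": 0.55, "A lacks majority": 0.45}

# Prior belief from the text: 90% chance that A receives a majority.
prior = {"A has majority": 0.90, "A lacks majority": 0.10}

# Evidence: three randomly polled voters all favor B.
likelihoods = {h: (1 - share) ** 3 for h, share in share_for_a.items()}

posterior = bayes_update(prior, likelihoods)
for hypothesis, probability in posterior.items():
    print(f"{hypothesis}: {probability:.3f}")
```

Under these assumptions the belief in a majority for A drops from 0.90 to roughly 0.83: the small sample moves the opinion, but only modestly. Different assumed shares would give a different size of adjustment.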

Some reluctance to use the Bayesian approach stems from its unfamiliarity (statistics textbooks for many years didn't even mention it) and from the non-intuitive conclusions the theorem at times produces (though correct ones, assuming correct interpretation of the input data). Another objection is that it begins with an opinion (the initial Bayesian probability), which clearly affects the result and would seem to preclude objectivity in the conclusions. Finally, the calculations of "probabilities of probabilities" (either of which could be a sizable probability distribution) often lead to equations that cannot be solved analytically, and very often not even through the usual numerical methods. Its revival, in addition to stemming from the world's increasing raw computational capacity, is due both to the additional conclusions that Bayesian statistics can draw (e.g., given the evidence, how likely some particular distribution is to be correct) and to the availability of new computational techniques that can handle the difficult math problems, in particular Markov chain Monte Carlo (MCMC).
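To give a feel for MCMC, here is a minimal sketch of one of its simplest variants, a random-walk Metropolis sampler, applied to a toy problem: the share of voters favoring A, given a hypothetical poll of 7 out of 20 and a uniform prior. The data, step size, and function names are assumptions for the example. This particular problem has an exact answer, which makes it useful for checking the sampler; real applications use MCMC precisely where no exact answer is available.

```python
import math
import random


def log_posterior(p, successes, n):
    """Unnormalized log posterior of a proportion p under a uniform prior."""
    if not 0.0 < p < 1.0:
        return -math.inf  # outside the prior's support
    return successes * math.log(p) + (n - successes) * math.log(1.0 - p)


def metropolis(successes, n, steps=20000, step_size=0.2, seed=0):
    """Random-walk Metropolis sampler; returns samples after a burn-in period."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, successes, n)
    samples = []
    for _ in range(steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_posterior(proposal, successes, n)
        # Accept with probability min(1, posterior ratio).
        accept_probability = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_probability:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples[steps // 10:]  # discard the first 10% as burn-in


# Hypothetical poll: 7 of 20 voters favor candidate A.
samples = metropolis(successes=7, n=20)
mcmc_mean = sum(samples) / len(samples)
exact_mean = (7 + 1) / (20 + 2)  # mean of the exact Beta(8, 14) posterior
print(f"MCMC mean {mcmc_mean:.3f}, exact mean {exact_mean:.3f}")
```

The sampler only ever needs ratios of posterior values, so the troublesome normalizing integral never has to be computed. That is the property that makes the technique usable on problems with no analytic solution.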

Bayesian statistics has its own jargon: the Bayesian probabilities of a given application are spoken of as prior or posterior. The prior represents your opinion before your "study", and the posterior represents your opinion after factoring in the study's data. Thus, phrases like:

• prior probability, prior distribution, prior density, prior mass
• posterior probability, posterior distribution, posterior density, posterior mass

correspond to the probability, the probability distribution, the probability density function (PDF), and/or the probability mass function (PMF) that are the input to and output of the calculation (the Bayesian inference). (The terms prior and posterior may be seen "alone", referring to a distribution or a probability mass, according to the context.) A Bayes factor compares two competing models (perhaps one being the null hypothesis): it is the ratio of the probabilities that each model assigns to the observed data. If the models involve distributions over their parameters, it is the ratio of their respective integrals (the likelihood of the data averaged over each model's prior).
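A small sketch of a Bayes factor calculation follows, under assumptions chosen for the example: a hypothetical poll of 14 out of 20 voters favoring A, a null model M0 that fixes the share at exactly 0.5, and an alternative model M1 that places a uniform prior on the share. The integral for M1 is approximated with a simple midpoint rule.

```python
import math


def binomial_likelihood(p, successes, n):
    """Probability of the observed count if the true share is p."""
    return math.comb(n, successes) * p**successes * (1 - p) ** (n - successes)


def marginal_likelihood_uniform_prior(successes, n, grid=10000):
    """Integrate the likelihood over a uniform prior on [0, 1] (midpoint rule)."""
    width = 1.0 / grid
    total = sum(
        binomial_likelihood((i + 0.5) * width, successes, n) for i in range(grid)
    )
    return total * width


# Hypothetical poll: 14 of 20 voters favor candidate A.
successes, n = 14, 20

evidence_m0 = binomial_likelihood(0.5, successes, n)           # M0: share is 0.5
evidence_m1 = marginal_likelihood_uniform_prior(successes, n)  # M1: share unknown

bayes_factor_10 = evidence_m1 / evidence_m0
print(f"Bayes factor (M1 over M0): {bayes_factor_10:.2f}")
```

A value above 1 favors M1 and a value below 1 favors M0. In this made-up case the factor is only slightly above 1, so the data barely discriminate between the two models.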

When embarking upon analysis, the necessity for a prior presents a challenge: you need to figure out what to use, and the choice can influence the result. You may have to devise the prior, perhaps characterizing the answer to the question "what do we know now?" I'm guessing priors are often the results of a preliminary or past frequentist study. To establish that a Bayesian result is robust, multiple priors are often tried specifically to see whether the result depends upon the selection from among reasonable priors.
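The robustness check described above can be sketched as follows. It tries several priors on the same data and compares the resulting posterior means. The Beta priors, their labels, and the poll counts are assumptions picked for illustration; Beta priors are used only because they make the update for yes/no polling data a one-line formula.

```python
def beta_posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a proportion with a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + successes + failures)


# Candidate priors for the share of voters favoring A (illustrative choices).
priors = {
    "uniform Beta(1, 1)": (1, 1),
    "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
    "optimistic Beta(9, 1)": (9, 1),
}


def posterior_means(successes, failures):
    """Posterior mean under each candidate prior for the same data."""
    return {
        name: beta_posterior_mean(a, b, successes, failures)
        for name, (a, b) in priors.items()
    }


def spread(means):
    """Gap between the largest and smallest posterior mean."""
    return max(means.values()) - min(means.values())


small_sample = posterior_means(successes=0, failures=3)      # 0 of 3 favor A
large_sample = posterior_means(successes=450, failures=550)  # 450 of 1,000 favor A

print(f"spread with 3 voters:     {spread(small_sample):.3f}")
print(f"spread with 1,000 voters: {spread(large_sample):.3f}")
```

With three voters the conclusions differ widely from prior to prior, so the result is driven mostly by the prior. With a thousand voters they nearly coincide, which is the kind of agreement that suggests a robust result.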

(mathematics,statistics,probability)