Introduction
This topic considers the problem of comparing two population proportions. Suppose we are interested in comparing two methods of teaching statistics: a traditional approach using lectures and assigned homework problems, and an experimental approach in which the students work in small groups in the classroom on directed activities. A multiple choice test on basic statistics concepts is constructed; 10 out of 20 students from the traditional class pass the exam, while 15 out of 20 students from the experimental class pass. Is there sufficient evidence in these data that students taught using the experimental method generally perform better on this test than students taught using the traditional method?
A Model for Two Proportions
In this problem, we consider two hypothetical populations: the population of students who will take the traditional class (including those who are currently taking the class and those who will take the class in the future), and the population of students who will take the experimental class. We measure the success rates of the two populations by pT and pE, the respective proportions of students from the traditional and experimental groups that would pass the test. We will focus our inference on the difference in proportions, d = pE - pT. We wonder if there is sufficient evidence from the data to conclude that d > 0 (the experimental group performs better). If we can say that d > 0, we are interested in constructing a 90% probability interval for the difference in proportions.
Discrete Set of Models
In this setting, a model consists of values for the two proportions pE and pT. We can represent a collection of models by a two-way table shown below, where each cell of the table corresponds to a single model. Here we let each proportion take on the nine equally spaced values .1, .2, ..., .9, and so there are 9 x 9 = 81 possible models, or pairs of proportion values.
                                  pT
            .1    .2    .3    .4    .5    .6    .7    .8    .9
  pE   .1
       .2
       .3
       .4
       .5
       .6
       .7
       .8
       .9
Uniform Prior
To construct a prior, a probability needs to be assigned to each pair of proportions (pE, pT) that reflects your belief about the likelihood of this pair of proportion values. This is difficult to do, so it is convenient to assign a uniform prior where all models are assumed equally likely. This prior reflects the common situation where you do not have strong beliefs before sampling about the location of either population proportion. In our example, there are 81 possible models, and so the uniform prior would assign a probability of 1/81 to each possible pair of proportions, as shown in the table below.
                                       pT
             .1    .2    .3    .4    .5    .6    .7    .8    .9
  pE   .1  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81
       .2  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81
       .3  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81
       .4  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81
       .5  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81
       .6  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81
       .7  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81
       .8  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81
       .9  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81  1/81
Likelihood
Two random samples are taken from the two populations of interest. Suppose we observe sE successes and fE failures in the sample taken from the population with proportion value pE, and sT successes and fT failures in the sample taken from the second population. The likelihood of the model pair (pE, pT) is the probability of this sample result assuming these proportion values:

LIKELIHOOD = pE^sE (1 - pE)^fE x pT^sT (1 - pT)^fT.
Inference
We compute the posterior model probabilities using our basic Bayes' rule recipe:
for each model (pair of proportions), we multiply the prior probability by the corresponding likelihood
we sum all of the products
we divide each product by the sum to get the corresponding posterior probability
This calculation is best left to a computer program such as Minitab. In our example, using the data values (sE, fE, sT, fT) = (15, 5, 10, 10), we obtain the following table of posterior probabilities:
                                       pT
             .1    .2    .3    .4    .5    .6    .7    .8    .9
  pE   .1  .000  .000  .000  .000  .000  .000  .000  .000  .000
       .2  .000  .000  .000  .000  .000  .000  .000  .000  .000
       .3  .000  .000  .000  .000  .000  .000  .000  .000  .000
       .4  .000  .000  .000  .001  .001  .001  .000  .000  .000
       .5  .000  .000  .002  .008  .011  .008  .002  .000  .000
       .6  .000  .001  .010  .039  .058  .039  .010  .001  .000
       .7  .000  .002  .024  .092  .139  .092  .024  .002  .000
       .8  .000  .002  .024  .090  .136  .090  .024  .002  .000
       .9  .000  .000  .004  .016  .025  .016  .004  .000  .000
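The text uses Minitab for this computation; as an illustration, here is a minimal Python sketch of the same Bayes' rule recipe over the 81-model grid (the variable names are my own):

```python
# Bayes' rule over the 81-model grid, using the example data
# (sE, fE, sT, fT) = (15, 5, 10, 10).
from itertools import product
from math import comb

sE, fE, sT, fT = 15, 5, 10, 10
grid = [i / 10 for i in range(1, 10)]          # .1, .2, ..., .9

def likelihood(pE, pT):
    # probability of the observed sample result for this model pair
    return (comb(sE + fE, sE) * pE**sE * (1 - pE)**fE *
            comb(sT + fT, sT) * pT**sT * (1 - pT)**fT)

# multiply each prior probability (1/81) by the likelihood, then normalize
products = {(pE, pT): (1 / 81) * likelihood(pE, pT)
            for pE, pT in product(grid, grid)}
total = sum(products.values())
posterior = {m: v / total for m, v in products.items()}

print(round(posterior[(0.7, 0.5)], 3))   # the most probable model pair
```

With a uniform prior the 1/81 factor cancels in the normalization, but it is left in to make the three steps of the recipe explicit.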
From this posterior distribution, we can compute posterior probabilities for the difference in proportions d = pE - pT. For example, the posterior probability that d = 0 is the sum of the diagonal elements of the table, where the two proportions are equal:

Prob(d = 0) = Prob(pT = pE) = .001 + .011 + .039 + .024 + .002 = .077.
Similarly, we can find the probabilities for each possible value of d -- the posterior distribution is given in the table below.
  d = pE - pT   PROBABILITY
  -----------   -----------
     -0.8         0.0000
     -0.7         0.0000
     -0.6         0.0000
     -0.5         0.0000
     -0.4         0.0000
     -0.3         0.0003
     -0.2         0.0034
     -0.1         0.0204
      0.0         0.0766
      0.1         0.1823
      0.2         0.2741
      0.3         0.2548
      0.4         0.1400
      0.5         0.0418
      0.6         0.0059
      0.7         0.0003
      0.8         0.0000
Recall that we were interested in the probability that the experimental group is superior to the traditional group on the test -- that is, pE > pT, or d > 0. Using this table,
Prob(d > 0) = P(d = .1, .2, .3, .4, .5, .6, .7) = .8992.
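These summaries of d can also be computed programmatically; here is a self-contained Python sketch (assuming the same grid and data as in the example) that collapses the joint posterior into a distribution for the difference:

```python
# Collapse the joint posterior over (pE, pT) to a distribution for d = pE - pT.
from itertools import product
from collections import defaultdict

sE, fE, sT, fT = 15, 5, 10, 10
grid = [i / 10 for i in range(1, 10)]

def likelihood(pE, pT):
    # binomial coefficients cancel in the normalization, so they are omitted
    return pE**sE * (1 - pE)**fE * pT**sT * (1 - pT)**fT

total = sum(likelihood(pE, pT) for pE, pT in product(grid, grid))

# accumulate posterior probability by the value of d = pE - pT
d_dist = defaultdict(float)
for pE, pT in product(grid, grid):
    d_dist[round(pE - pT, 1)] += likelihood(pE, pT) / total

print(round(d_dist[0.0], 4))                                  # Prob(d = 0)
print(round(sum(p for d, p in d_dist.items() if d > 0), 4))   # Prob(d > 0)
```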
Using Simulation to Summarize a Beta Curve
Before we introduce continuous models for two proportions, we describe a simulation method for summarizing a beta probability curve. Suppose we are interested in learning about the proportion p of students at your school who favor a new dress policy. If we represent our prior beliefs about p by means of a flat prior, and we observe 10 students in favor of the policy and 5 against, then the posterior curve for the proportion is given by

POSTERIOR = p^10 (1 - p)^5, 0 < p < 1,

which we recognize as being a beta curve with numbers 11 and 6.
One convenient way of summarizing this beta curve uses simulation. Using Minitab, we simulate a random sample of size 1000 from this beta probability curve -- the simulated values are graphed using the dotplot below.
[Dotplot of the 1000 simulated values of p, plotted on the interval from 0.00 to 1.00; the values form a mound concentrated near .6 to .7]
We can summarize this probability distribution by computing appropriate summaries from this simulated dataset. For example, the mean of this probability curve is approximated by the sample mean of these simulated values:
Mean = .646
The probability that p is larger than .5 can be approximated by the proportion of simulated values that exceed .5:

Prob(p > .5) = .889.
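The same simulation summary can be carried out outside Minitab; here is a minimal Python sketch (the seed is an arbitrary choice of mine, so the estimates will differ slightly from the values above):

```python
# Simulate 1000 draws from the beta(11, 6) posterior curve and summarize them.
import random

random.seed(1)
draws = [random.betavariate(11, 6) for _ in range(1000)]

mean = sum(draws) / len(draws)                   # approximates 11/17 = .647
prob = sum(d > 0.5 for d in draws) / len(draws)  # approximates Prob(p > .5)
print(round(mean, 3), round(prob, 3))
```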
Continuous Models for Two Proportions
As in the single proportion case, it is a bit unrealistic to assume that each proportion can take on only particular discrete values. It is more realistic to assume that each proportion is continuous-valued on the interval (0, 1). In our example, we assume that each proportion is continuous. Then our models are represented by all points in the unit square where 0 < pE < 1 and 0 < pT < 1.
Inference Using a Uniform Prior and Beta Curves
In the situation where little prior information exists about either proportion, it is convenient to assign a uniform distribution on the unit square
PRIOR = 1, 0 < pE < 1, 0 < pT < 1.
Then, using the likelihood given above, the posterior curve has the form

POSTERIOR = pE^sE (1 - pE)^fE x pT^sT (1 - pT)^fT, 0 < pE < 1, 0 < pT < 1.
Looking at this expression, we see that the proportions pE and pT have independent posterior curves, with pE distributed beta(sE+1, fE+1) and pT distributed beta(sT+1, fT+1). In our example, with sE = 15, fE = 5, sT = 10, fT = 10, pE would be beta(16, 6) and pT would be beta(11, 11).
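Because the two posterior curves are independent, each proportion can also be summarized directly; for instance, a beta(a, b) curve has mean a / (a + b), so the posterior means in the example are:

```python
# Posterior means of the independent beta curves in the example:
# pE ~ beta(16, 6) and pT ~ beta(11, 11).
mean_pE = 16 / (16 + 6)    # about .727
mean_pT = 11 / (11 + 11)   # exactly .5
print(round(mean_pE, 3), mean_pT)
```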
Using Simulation to Perform Inference
When we use continuous models for two proportions, simulation gives a convenient way of summarizing the posterior distribution and of finding probabilities for the difference in the two proportions. In our example, we simulate one value from the posterior distribution of (pE, pT) by
simulating pE from beta(16, 6)
simulating pT from beta(11, 11)
On Minitab, I simulate pE = .6175 and pT = .3916. The difference in simulated proportions, d = .6175 - .3916 = .2259, represents one simulated value from the posterior curve of d = pE - pT.
The graph below plots 1000 simulated values of the two proportions on a scatterplot.
By computing summaries from this set of simulated values, we can learn about the locations of the two proportions. The line pE = pT is drawn on the graph. The proportion of simulated pairs above the line is an estimate of Prob(pE > pT) = Prob(d > 0).
We can also summarize the simulated sample of values of d = pE - pT to compare the two proportions. Below we give a grouped count table of the 1000 values of d. To interpret this table, note that 234 values of d fall between .1 and .2, so Prob(.1 < d < .2) = 234/1000 = .234. To see if d > 0, we compute the probability
P(d > 0) = .938.
This probability is approximately equal to the posterior probability that we found using discrete models for the proportions.
d=pE-pT
    LO     HI   COUNT
  -------------------
  -1.0   -0.9      0
  -0.9   -0.8      0
  -0.8   -0.7      0
  -0.7   -0.6      0
  -0.6   -0.5      0
  -0.5   -0.4      0
  -0.4   -0.3      0
  -0.3   -0.2      1
  -0.2   -0.1      8
  -0.1    0.0     53
   0.0    0.1    139
   0.1    0.2    234
   0.2    0.3    282
   0.3    0.4    205
   0.4    0.5     57
   0.5    0.6     21
   0.6    0.7      0
   0.7    0.8      0
   0.8    0.9      0
   0.9    1.0      0
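The simulation comparison above can be sketched in Python as well (the seed is an arbitrary choice, so the estimate will differ slightly from the .938 reported above):

```python
# Simulate (pE, pT) pairs from the independent beta posteriors and
# summarize the differences d = pE - pT.
import random

random.seed(2)
n = 1000
pE = [random.betavariate(16, 6) for _ in range(n)]
pT = [random.betavariate(11, 11) for _ in range(n)]
d = [a - b for a, b in zip(pE, pT)]

prob_positive = sum(x > 0 for x in d) / n   # approximates Prob(d > 0)
print(round(prob_positive, 3))
```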
Page Author: Jim Albert (c)
albert@math.bgsu.edu