A Bayesian Primer

Topic 21:  Learning About Two Proportions

Introduction

This topic considers the problem of comparing two population proportions.  Suppose we are interested in comparing two methods of teaching statistics:  a traditional approach using lectures and assigned homework problems, and an experimental approach where the students work in small groups in the classroom on directed activities.  Suppose that a multiple choice test on basic statistics concepts is constructed.  Suppose that 10 out of 20 students from the traditional class pass the exam and 15 out of 20 students from the experimental class pass the exam.  Is there sufficient evidence from these data that students taught using the experimental method generally perform better than the students taught using the traditional method on this test? 

A Model for Two Proportions

In this problem, we consider two hypothetical populations:  the population of students who will take the traditional class (including those who are currently taking the class and those who will take the class in the future), and the population of students who will take the experimental class.  We measure the success rates of the two populations by pT and pE, the respective proportions of students from the traditional and experimental groups that would pass the test.  We will focus our inference on the difference in proportions, d = pE - pT.  We wonder if there is sufficient evidence from the data to conclude that d > 0 (the experimental group performs better).  If we can say that d > 0, we are interested in constructing a 90% probability interval for the difference in proportions.

Discrete Set of Models

In this setting, a model consists of values for the two proportions pE and  pT.  We can represent a collection of models by a two-way table shown below, where each cell of the table corresponds to a single model.  Here we let each proportion take on the nine equally spaced values .1, .2, ..., .9, and so there are 9 x 9 = 81 possible models, or pairs of proportion values.  

   

                          pT

 pE     .1   .2   .3   .4   .5   .6   .7   .8   .9
 .1
 .2
 .3
 .4
 .5
 .6
 .7
 .8
 .9

Uniform Prior

To construct a prior, a probability needs to be assigned to each pair of proportions (pE,  pT) that reflects your belief about the likelihood of this pair of proportion values.  This is difficult to do, so it is convenient to assign a uniform prior where all models are assumed equally likely.  This prior reflects the common situation where you do not have strong beliefs before sampling about the location of either population proportion.  In our example, there are 81 possible models, and so the uniform prior would assign a probability of 1/81 to each possible pair of proportions, as shown in the table below.

   

pT

    .1 .2 .3 .4 .5 .6 .7 .8 .9
pE .1 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81
.2 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81
.3 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81
.4 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81
.5 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81
.6 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81
.7 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81
.8 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81
.9 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81 1/81
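
As a minimal sketch of this setup, the grid of models and the uniform prior can be built in a few lines of Python.  Python itself and the names `values` and `prior` are assumptions made for illustration; the computations in this topic were originally done in Minitab.

```python
from fractions import Fraction

# The nine equally spaced values .1, .2, ..., .9 for each proportion.
values = [Fraction(k, 10) for k in range(1, 10)]

# Uniform prior: every (pE, pT) pair receives the same probability, 1/81.
prior = {(p_e, p_t): Fraction(1, len(values) ** 2)
         for p_e in values for p_t in values}

print(len(prior))                                # number of models
print(prior[(Fraction(7, 10), Fraction(1, 2))])  # prior for pE = .7, pT = .5
print(sum(prior.values()))                       # total prior probability
```

Exact fractions are used here only so that the prior probabilities sum to exactly one; ordinary floating-point numbers would serve equally well.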

Likelihood

Two random samples are taken from the two populations of interest.  Suppose we observe sE successes and fE failures in the sample taken from the population with proportion value pE, and sT successes and fT failures in the sample taken from the population with proportion value pT.  The likelihood of the model pair (pE, pT) is the probability of this sample result assuming these proportion values.  Since the two samples are independent, the likelihood (ignoring constants that do not depend on the proportions) is the product

LIKELIHOOD = pE^sE (1 - pE)^fE pT^sT (1 - pT)^fT.
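
The likelihood can be sketched as a small Python function.  This assumes the likelihood is the product of the two binomial kernels, pE^sE (1 - pE)^fE and pT^sT (1 - pT)^fT, with constant factors dropped; the function name `likelihood` and its argument names are hypothetical.

```python
def likelihood(p_e, p_t, s_e, f_e, s_t, f_t):
    """Likelihood of the model pair (pE, pT), up to a constant factor.

    Assumes two independent samples, so the likelihood is the product
    of one binomial kernel for each population.
    """
    return p_e**s_e * (1 - p_e)**f_e * p_t**s_t * (1 - p_t)**f_t


# Compare two models using the data (sE, fE, sT, fT) = (15, 5, 10, 10).
like_a = likelihood(0.7, 0.5, 15, 5, 10, 10)  # model pE = .7, pT = .5
like_b = likelihood(0.5, 0.5, 15, 5, 10, 10)  # model pE = .5, pT = .5
ratio = like_a / like_b
print(ratio)  # how many times better model A explains the data than model B
```

Only ratios of likelihoods matter for the posterior probabilities, which is why the constant factors can be dropped.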

Inference

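We compute the posterior model probabilities using our basic Bayes' rule recipe:  multiply the prior probability of each model by its likelihood, and then divide each product by the sum of all of the products so that the posterior probabilities add to one.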

This calculation is best left to a computer program such as Minitab.  In our example, using the data values (sE, fE, sT, fT) = (15, 5, 10, 10), we obtain the following table of posterior probabilities:

   

                              pT

 pE     .1    .2    .3    .4    .5    .6    .7    .8    .9
 .1   .000  .000  .000  .000  .000  .000  .000  .000  .000
 .2   .000  .000  .000  .000  .000  .000  .000  .000  .000
 .3   .000  .000  .000  .000  .000  .000  .000  .000  .000
 .4   .000  .000  .000  .001  .001  .001  .000  .000  .000
 .5   .000  .000  .002  .008  .011  .008  .002  .000  .000
 .6   .000  .001  .010  .039  .058  .039  .010  .001  .000
 .7   .000  .002  .024  .092  .139  .092  .024  .002  .000
 .8   .000  .002  .024  .090  .136  .090  .024  .002  .000
 .9   .000  .000  .004  .016  .025  .016  .004  .000  .000
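
The posterior table can be reproduced with a short Python sketch instead of Minitab.  It assumes the likelihood pE^sE (1 - pE)^fE pT^sT (1 - pT)^fT and the uniform prior of 1/81 on each model; the variable names are illustrative, and each proportion is indexed by its number of tenths to avoid floating-point dictionary keys.

```python
# Data from the example: (sE, fE, sT, fT) = (15, 5, 10, 10).
s_e, f_e, s_t, f_t = 15, 5, 10, 10
tenths = range(1, 10)  # k stands for the proportion value k/10

# Prior times likelihood for each of the 81 models.
products = {}
for i in tenths:          # i indexes pE
    for j in tenths:      # j indexes pT
        p_e, p_t = i / 10, j / 10
        like = p_e**s_e * (1 - p_e)**f_e * p_t**s_t * (1 - p_t)**f_t
        products[(i, j)] = (1 / 81) * like

# Normalize so that the posterior probabilities sum to one.
total = sum(products.values())
posterior = {model: value / total for model, value in products.items()}

# Print rows for pE and columns for pT, rounded to three decimals.
print("pE\\pT " + " ".join(f"{j / 10:5.1f}" for j in tenths))
for i in tenths:
    print(f"{i / 10:5.1f} "
          + " ".join(f"{posterior[(i, j)]:5.3f}" for j in tenths))
```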

From this posterior distribution, we can compute posterior probabilities for the difference in proportions d = pE - pT.  For example, the posterior probability that d = 0 is the sum of the diagonal elements of the table, where the two proportions are equal:

Prob(d = 0) = Prob(pT = pE) = .001 + .011 + .039 + .024 + .002 = .077.

Similarly, we can find the probabilities for each possible value of d -- the posterior distribution is given in the table below.

   d=pE-pT  PROBABILITY
   -------------------
   -0.8000    0.0000
   -0.7000    0.0000
   -0.6000    0.0000
   -0.5000    0.0000
   -0.4000    0.0000
   -0.3000    0.0003
   -0.2000    0.0034
   -0.1000    0.0204
         0    0.0766
    0.1000    0.1823
    0.2000    0.2741
    0.3000    0.2548
    0.4000    0.1400
    0.5000    0.0418
    0.6000    0.0059
    0.7000    0.0003
    0.8000    0.0000
    

Recall that we were interested in the probability that the experimental group is superior to the traditional group on the test -- that is, pE > pT or d > 0.  Using this table,

Prob(d > 0) = Prob(d = .1, .2, .3, .4, .5, .6, .7) = .8992.
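
The collapse from the 81 model probabilities to the distribution of d can be sketched in Python as follows.  This is an illustration under the same assumptions as the example (uniform prior, data 15, 5, 10, 10); the helper `kernel` and the other names are hypothetical.  Differences are tracked in whole tenths so that models with equal d are grouped exactly.

```python
from collections import defaultdict

s_e, f_e, s_t, f_t = 15, 5, 10, 10
tenths = range(1, 10)  # k stands for the proportion value k/10


def kernel(p, s, f):
    """Binomial kernel p^s (1 - p)^f for one population."""
    return p**s * (1 - p)**f


# With a uniform prior, the posterior is the normalized likelihood.
weights = {(i, j): kernel(i / 10, s_e, f_e) * kernel(j / 10, s_t, f_t)
           for i in tenths for j in tenths}
total = sum(weights.values())

# Add up the posterior probabilities of all models sharing the same d.
d_probs = defaultdict(float)
for (i, j), weight in weights.items():
    d_probs[i - j] += weight / total  # key is d measured in tenths

for k in sorted(d_probs):
    print(f"{k / 10:7.1f} {d_probs[k]:8.4f}")

prob_d_zero = d_probs[0]
prob_d_positive = sum(prob for k, prob in d_probs.items() if k > 0)
print(prob_d_zero, prob_d_positive)
```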

Using Simulation to Summarize a Beta Curve

Before we introduce continuous models for two proportions, we describe a simulation method of summarizing a beta probability curve.  Suppose we are interested in learning about the proportion of students p who favor a new dress policy at your school.  If we represent our prior beliefs about p by means of a flat prior, and we observe 10 students in favor of the policy and 5 against, then the posterior curve for the proportion is proportional to

p^10 (1 - p)^5, 0 < p < 1,

which we recognize as being a beta curve with numbers 11 and 6.

One convenient way of summarizing this beta curve uses simulation.  Using Minitab, we simulate a random sample of size 1000 from this beta probability curve -- the simulated values are graphed using the dotplot below.

                                          ::
                                        .:::.:
                                       :::::::
                                    :: :::::::::
                                   .:::::::::::::
                                .:.::::::::::::::::.
                              ..::::::::::::::::::::.
                      ..  ..::::::::::::::::::::::::::...
         +---------+---------+---------+---------+---------+-p      
      0.00      0.20      0.40      0.60      0.80      1.00
 

We can summarize this probability distribution by computing appropriate summaries from this simulated dataset.  For example, the mean of this probability curve is approximated by the sample mean of these simulated values:

Mean = .646

The probability that p is larger than .5 can be approximated by the proportion of simulated values that exceed .5:

Prob(p > .5) = .889.
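
The same kind of simulation can be sketched in Python with `random.betavariate` from the standard library.  The seed is an arbitrary choice made for reproducibility, so the summaries will be close to, but not identical to, the Minitab values reported above.

```python
import random

random.seed(0)  # arbitrary seed, chosen only so the run is repeatable

# Simulate 1000 values from the beta curve with numbers 11 and 6.
draws = [random.betavariate(11, 6) for _ in range(1000)]

# The sample mean approximates the mean of the beta curve.
mean = sum(draws) / len(draws)

# The proportion of draws above .5 approximates Prob(p > .5).
prob_gt_half = sum(p > 0.5 for p in draws) / len(draws)

print(round(mean, 3), round(prob_gt_half, 3))
```

For comparison, the exact mean of a beta curve with numbers 11 and 6 is 11/17, or about .647.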

Continuous Models for Two Proportions

As in the single proportion case, it is a bit unrealistic to assume that each proportion can take on only particular discrete values.  It is more realistic to assume that each proportion is continuous-valued on the interval (0, 1).  In our example, we assume that each proportion is continuous.  Then our models are represented by all points in the unit square where 0 < pE < 1 and 0 < pT < 1.

Inference Using a Uniform Prior and Beta Curves

In the situation where little prior information exists about either proportion, it is convenient to assign a uniform distribution on the unit square

PRIOR = 1, 0 < pE < 1, 0 < pT < 1.

Then, using the likelihood given above, the posterior curve is proportional to

pE^sE (1 - pE)^fE pT^sT (1 - pT)^fT, 0 < pE < 1, 0 < pT < 1.

Looking at this expression, we see that the proportions pE and pT have independent posterior curves, with pE distributed beta(sE+1, fE+1) and pT distributed beta(sT+1, fT+1).  In our example, with sE = 15, fE = 5, sT = 10, fT = 10,  pE would be beta(16, 6) and pT would be beta(11, 11).

Using Simulation to Perform Inference

When we use continuous models for two proportions, simulation gives a convenient way of summarizing the posterior distribution and of finding probabilities for the difference in two proportions.  In our example, we simulate one value from the posterior distribution of (pE, pT) by simulating a value of pE from a beta(16, 6) curve and, independently, a value of pT from a beta(11, 11) curve.

Using Minitab, we simulate pE = .6175, pT = .3916.  The difference in simulated proportions, d = .6175 - .3916 = .2259, represents a simulated value from the posterior curve of d = pE - pT.

The graph below plots 1000 simulated values of the two proportions on a scatterplot.

By computing summaries from this set of simulated values, we can learn about the locations of the two proportions.  The line pE = pT is drawn on the graph.  The proportion of simulated points above the line is an estimate of Prob(pE > pT) = Prob(d > 0).

We can also summarize the simulated sample of values of d = pE - pT to compare the two proportions.  Below we give a grouped count table of the 1000 values of d.  To interpret this table, note that 234 values of d fall between .1 and .2.  So Prob(.1 < d < .2) = 234/1000 = .234.  To see if d > 0, we compute the probability

Prob(d > 0) = (139 + 234 + 282 + 205 + 57 + 21)/1000 = .938.

This probability is close to the posterior probability Prob(d > 0) = .8992 that we found using discrete models for the proportions.

        d=pE-pT
     LO     HI    count
   --------------------
   -1.0   -0.9       0
   -0.9   -0.8       0
   -0.8   -0.7       0
   -0.7   -0.6       0
   -0.6   -0.5       0
   -0.5   -0.4       0
   -0.4   -0.3       0
   -0.3   -0.2       1
   -0.2   -0.1       8
   -0.1    0.0      53
    0.0    0.1     139
    0.1    0.2     234
    0.2    0.3     282
    0.3    0.4     205
    0.4    0.5      57
    0.5    0.6      21
    0.6    0.7       0
    0.7    0.8       0
    0.8    0.9       0
    0.9    1.0       0
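
The whole simulation can be sketched in Python, again using `random.betavariate` in place of Minitab.  The seed and variable names are arbitrary assumptions, so the counts will differ somewhat from the table above.  The last step estimates the 90% probability interval for d mentioned at the start of this topic by taking the 5th and 95th percentiles of the simulated differences, which is one common convention rather than the only possible choice.

```python
import math
import random

random.seed(1)  # arbitrary seed, chosen only so the run is repeatable
n = 1000

# Simulate (pE, pT) from the independent beta posteriors and record d.
d_values = []
for _ in range(n):
    p_e = random.betavariate(16, 6)    # posterior of pE: beta(16, 6)
    p_t = random.betavariate(11, 11)   # posterior of pT: beta(11, 11)
    d_values.append(p_e - p_t)

# The proportion of positive differences estimates Prob(d > 0).
prob_d_positive = sum(d > 0 for d in d_values) / n

# Grouped counts in 20 bins of width .1 running from -1 to 1.
counts = [0] * 20
for d in d_values:
    index = min(int(math.floor((d + 1) * 10)), 19)
    counts[index] += 1
for k, count in enumerate(counts):
    lo = -1 + k / 10
    print(f"{lo:6.1f} {lo + 0.1:6.1f} {count:6d}")

# Approximate 90% interval: 5th and 95th percentiles of simulated d.
sorted_d = sorted(d_values)
interval_90 = (sorted_d[int(0.05 * n)], sorted_d[int(0.95 * n) - 1])
print(prob_d_positive, interval_90)
```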

Page Author: Jim Albert (c) 
albert@math.bgsu.edu