# A Probability/Activity Approach for Teaching Introductory Statistics

Jim Albert, Department of Mathematics and Statistics, Bowling Green State University

*Workshop Statistics: Discovery with Data, A Bayesian Approach* (Key College Press, ISBN 1930190123), coauthored with Allan J. Rossman of Dickinson College, is a collection of classroom and homework activities designed to introduce the student to concepts in data analysis, probability, and statistical inference. Students work toward learning these concepts through the analysis of genuine data and hands-on probability experiments, and through interaction with one another, with their instructor, and with technology. Providing a one-semester introduction to fundamental ideas in statistics for college and advanced high school students, this text is designed for courses that employ an interactive learning environment by replacing lectures with hands-on activities.

This text is distinctive in two respects: its emphasis on active learning and its use of the Bayesian viewpoint to introduce the basic notions of statistical inference.

#### Active Learning

This text is written for use with the workshop pedagogical approach, which fosters active learning by minimizing lectures and eliminating the conventional distinction between laboratory and lecture sessions.  The book's activities require students to collect data and perform random experiments, make predictions, read about studies, analyze data, discuss findings, and write explanations.  The instructor's responsibilities in this setting are to check students' progress, ask and answer questions, lead class discussions, and deliver "mini-lectures" where appropriate.  The essential point is that every student is actively engaged in learning the material through reading, thinking, discussing, computing, interpreting, writing, and reflecting.  In this manner students construct their own knowledge of probability and statistical ideas as they work through the activities.

#### The Traditional Approach to Teaching Statistical Inference

Statistical inference is traditionally taught using the frequentist approach.  To illustrate this approach, suppose one is interested in learning about the proportion of all undergraduate students at a particular college who drink coffee regularly.  One learns about this unknown proportion by taking a random sample of 100 students, asking each student if they drink coffee regularly, and computing the proportion of students who drink coffee in the sample.  Suppose that this sample proportion is 21/100 = .21.  Based on this data, what have we learned about the proportion of all undergraduates who drink coffee?

The frequentist approach to inference is based on the concept of a sampling distribution.  Suppose that one is able to take samples of size 100 repeatedly from the population of undergraduates and compute the sample proportion for each sample selected.  The collection of proportions from all of the samples is called the sampling distribution of the proportion.  Knowledge of the shape, mean, and standard deviation of this sampling distribution is used to construct confidence intervals for the proportion of interest, and to make decisions about the location of the proportion.
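The repeated-sampling idea can be sketched in a short simulation (Python is used here purely for illustration; it is not the technology assumed by the text). The "true" proportion of 0.25 below is an assumed value for the demonstration, since in practice it is unknown:

```python
import random

# Simulate the sampling distribution of a sample proportion.
# TRUE_P is an assumed "true" proportion, for illustration only.
TRUE_P = 0.25
N = 100        # sample size
REPS = 10_000  # number of repeated samples

random.seed(1)
props = []
for _ in range(REPS):
    # Each of the N students drinks coffee with probability TRUE_P
    drinkers = sum(1 for _ in range(N) if random.random() < TRUE_P)
    props.append(drinkers / N)

mean_p = sum(props) / REPS
sd_p = (sum((p - mean_p) ** 2 for p in props) / REPS) ** 0.5
print(f"mean ≈ {mean_p:.3f}")  # close to TRUE_P
print(f"sd   ≈ {sd_p:.3f}")    # close to sqrt(TRUE_P*(1-TRUE_P)/N)
```

The simulated mean and standard deviation match the theoretical values for the sampling distribution of a proportion, which is exactly the knowledge the frequentist interval estimate relies on.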

To correctly interpret traditional inferential procedures, students need to understand the notion of a sampling distribution.  Students will analyze only one sample in their data analysis, but they must consider what could happen if a large number of random samples (like the one just selected) were taken from the population.  Indeed, a statement such as "95% confidence" for an interval estimate refers to the behavior of the statistical procedure when samples are repeatedly taken from the population.
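The repeated-sampling meaning of "95% confidence" can itself be illustrated by simulation: draw many samples, compute the usual interval from each, and record how often the intervals cover the truth. The true proportion below is again an assumed value for the demonstration:

```python
import math
import random

# Illustrate the repeated-sampling interpretation of "95% confidence":
# build the standard (Wald) interval from many samples and count how
# often it covers the assumed true proportion.
TRUE_P = 0.25
N, REPS = 100, 5000
random.seed(2)

covered = 0
for _ in range(REPS):
    x = sum(1 for _ in range(N) if random.random() < TRUE_P)
    phat = x / N
    se = math.sqrt(phat * (1 - phat) / N)        # estimated standard error
    lo, hi = phat - 1.96 * se, phat + 1.96 * se  # Wald 95% interval
    if lo <= TRUE_P <= hi:
        covered += 1

print(f"coverage ≈ {covered / REPS:.3f}")  # near (often slightly below) .95
```

The "95%" thus describes the long-run behavior of the procedure across repeated samples, not a probability statement about any single computed interval.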

#### The Bayesian Viewpoint

The Bayesian viewpoint toward inference is based on the subjective interpretation of probability.  In our example, the proportion of undergraduates drinking coffee is an unknown quantity, and a probability distribution is used to represent a person's belief about the location of this proportion.  This probability distribution, called the prior, reflects a person's knowledge about the proportion before any data is collected.  After the sample survey is taken and data are observed, then one's opinions about the proportion will change.  Bayes' rule is the recipe for computing the new probability distribution for the proportion, called the posterior, based on knowledge of the prior probability distribution and the sample survey data.  All inferences about the proportion  are made by computing appropriate summaries of the posterior probability distribution.
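In the coffee example, Bayes' rule can be carried out with a simple discrete prior. The five candidate proportions and flat prior weights below are illustrative assumptions, not values taken from the text:

```python
from math import comb

# Bayes' rule with a discrete prior on the proportion p.
# The candidate values and flat prior are illustrative assumptions.
p_values = [0.1, 0.2, 0.3, 0.4, 0.5]
prior = [0.2] * 5          # equal prior weight on each candidate value

x, n = 21, 100             # observed: 21 coffee drinkers out of 100

# Likelihood of the data under each candidate proportion (binomial)
like = [comb(n, x) * p**x * (1 - p)**(n - x) for p in p_values]

# Bayes' rule: posterior is proportional to prior times likelihood
unnorm = [pr * lk for pr, lk in zip(prior, like)]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

for p, post in zip(p_values, posterior):
    print(f"p = {p:.1f}: posterior = {post:.4f}")
```

With a sample proportion of .21, the posterior concentrates nearly all of its mass on p = 0.2, and all inferences follow by summarizing this distribution.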

To use Bayes' rule in an introductory statistics class, the student needs to learn some basic probability concepts.  Topics 11, 12, and 13 of the text discuss the interpretation of probabilities and methods of computing and interpreting probability tables.  Conditional probability is introduced in Topic 14 by means of a two-way probability table, and this two-way table is used in Topic 15 to introduce Bayes' rule.  The basic methodology of Topic 15 is extended to problems of inference for one proportion, one mean, and two proportions in Topics 16, 17, 18, 19, and 21.
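The two-way-table version of Bayes' rule can be sketched as follows; the two candidate proportions and the prior weights are hypothetical numbers, not examples from the text:

```python
# Bayes' rule via a two-way probability table.
# Rows: two candidate values for the proportion p (hypothetical).
# Columns: a single student's answer ("drinks coffee" observed here).
prior = {"p=0.2": 0.5, "p=0.3": 0.5}       # equal prior weight on each row
like_yes = {"p=0.2": 0.2, "p=0.3": 0.3}    # P(drinks coffee | row)

# Joint probabilities fill the "yes" column of the two-way table
joint = {m: prior[m] * like_yes[m] for m in prior}

# Condition on observing "yes": divide each entry by the column total
col_total = sum(joint.values())
posterior = {m: joint[m] / col_total for m in joint}
print(posterior)   # posterior weights: roughly 0.4 and 0.6
```

Observing a coffee drinker shifts belief toward the larger candidate proportion, which is exactly the table-based conditioning argument of Topics 14 and 15.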

#### Why Not Teach the Traditional Approach?

Although the traditional approach to teaching statistical inference leads to the familiar methods used in practice, it can be difficult to learn in a first course.  The notion of a sampling distribution is perhaps the most difficult concept, since the student is asked to think about the variation in samples other than the one that he or she observed.  If the student does not fully understand the repeated sampling idea that is inherent in a sampling distribution, then he or she will not be able to correctly interpret traditional inferential conclusions like "I am 95% confident that the proportion is contained in my interval."  Since the traditional inferential concepts are hard to learn, the instructor may focus on teaching the mechanics of statistical inference instead of the concepts.  These mechanics include the use of a variety of statistical recipes and the correct programming of these recipes using a statistics computer package.  This type of "cookbook" class is counter to the modern movement in statistics instruction, which encourages more thinking in a first class and fewer recipes.

#### Why Teach Bayes?

- **Conditional inference.**  All Bayesian inferential conclusions are made conditional on the observed data.  Unlike the traditional approach, one need not be concerned with datasets other than the one that is observed; there is no need to discuss sampling distributions in the Bayesian approach.
- **Inferential conclusions are understandable.**  From a Bayesian viewpoint, it is legitimate to talk about the probability that the proportion falls in a specific interval, say (.1, .3), or the probability that a hypothesis is true.  These are generally the conclusions that seem intuitive to students.  In contrast, traditional inferential conclusions are frequently misstated.  For example, if a computed 90% confidence interval is (.1, .3), it is common for the student to incorrectly state that the proportion falls in the interval (.1, .3) with probability .90.  Students forget that the probability (in the traditional viewpoint) refers to the behavior of the interval estimate under repeated sampling.
- **One recipe.**  Bayes' rule is really the only inferential method that needs to be taught.  Once Bayes' rule is used to compute the posterior distribution, the student just needs to summarize this probability distribution to make inferences.
- **Illustrates the use of the scientific method.**  One goal of an introductory statistics class is to show how statisticians use the scientific method to answer questions.  In the scientific method, one begins with a hypothesis or theory about some phenomenon, collects data relevant to the problem, and then revises the theory according to the results.  The Bayesian viewpoint provides a convenient paradigm for implementing the scientific method: the prior probability distribution states initial beliefs about the population of interest, relevant sample data are collected, and the posterior probability distribution reflects one's new beliefs about the population in light of the data that were collected.
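As an illustration of how directly such posterior statements can be computed, consider the probability that the proportion lies in (.1, .3) for the coffee example. Assuming a uniform prior (an assumption made here for the sketch), the posterior for 21 successes in 100 trials is a Beta(22, 80) distribution, approximated below on a grid to avoid any extra libraries:

```python
# Probability that the proportion p falls in (.1, .3), given the data.
# Assumes a uniform prior, so the posterior is Beta(22, 80); the density
# is evaluated on a fine grid rather than with a stats library.
GRID = 100_000

def beta_pdf_unnorm(p, a=22, b=80):
    """Unnormalized Beta(a, b) density."""
    return p ** (a - 1) * (1 - p) ** (b - 1)

step = 1 / GRID
ps = [(i + 0.5) * step for i in range(GRID)]   # grid midpoints in (0, 1)
w = [beta_pdf_unnorm(p) for p in ps]
total = sum(w)

prob = sum(wi for p, wi in zip(ps, w) if 0.1 < p < 0.3) / total
print(f"P(0.1 < p < 0.3 | data) ≈ {prob:.3f}")
```

The result is a direct probability statement about the proportion itself, which is precisely the kind of conclusion the traditional confidence interval does not license.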