A Probability/Activity Approach for Teaching Introductory Statistics

Jim Albert
Workshop
Statistics: Discovery with Data, A
Bayesian Approach, Key College Press; ISBN: 1930190123
(coauthored with Allan J. Rossman of Dickinson College)
is a collection of classroom and homework activities designed
to introduce the student to concepts in data analysis, probability,
and statistical inference.
Students work toward learning these concepts through the
analysis of genuine data and hands-on probability experiments
and through interaction with one another, with their instructor,
and with technology. Providing
a one-semester introduction to fundamental ideas in statistics
for college and advanced high school students, this text is designed
for courses that employ an interactive learning environment by
replacing lectures with hands-on activities.
The complete book can be downloaded from https://bayesball.github.io/nsf_web/workshop.bayes.pdf. This text is distinctive in two respects: its emphasis on active learning and its use of the Bayesian viewpoint to introduce the basic notions of statistical inference.
PREFACE AND RATIONALE
This
text is written for use with the workshop pedagogical approach,
which fosters active learning by minimizing lectures and eliminating
the conventional distinction between laboratory and lecture sessions. The book's activities require students to collect
data and perform random experiments, make predictions, read about
studies, analyze data, discuss findings, and write explanations.
The instructor's responsibilities in this setting are to
check students' progress, ask and answer questions, lead class
discussions, and deliver "mini-lectures" where appropriate.
The essential point is that every student is actively engaged
with learning the material through reading, thinking, discussing,
computing, interpreting, writing, and reflecting.
In this manner students construct their own knowledge of
probability and statistical ideas as they work through the activities.
Statistical
inference is traditionally taught using the frequentist approach.
To illustrate this approach, suppose one is interested
in learning about the proportion of all undergraduate students
at a particular college who drink coffee regularly.
One learns about this unknown proportion by taking a random
sample of 100 students, asking each student if they drink coffee
regularly, and computing the proportion of students who drink
coffee in the sample. Suppose that 21 of the 100 sampled students drink coffee regularly, so the sample proportion is 21/100 = .21. Based on these data, what have we learned about the proportion of all undergraduates who drink coffee regularly?
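As a concrete illustration of this sampling step, the following Python sketch simulates one such survey. The population proportion used in the simulation (0.25) is an assumed value chosen only for illustration; in practice it is the unknown quantity being studied.

```python
import random

# Hypothetical illustration: survey a random sample of 100 students when
# the population proportion of coffee drinkers is assumed to be 0.25
# (an illustrative value, not one taken from the text).
random.seed(1)
true_p = 0.25
n = 100
sample = [random.random() < true_p for _ in range(n)]
p_hat = sum(sample) / n   # sample proportion, analogous to 21/100 = .21
print(p_hat)
```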
The
frequentist approach to inference is based on the concept of a
sampling distribution. Suppose
that one is able to take samples of size 100 repeatedly from the
population of undergraduates and compute the sample proportion
for each sample selected.
The collection of proportions from all of the samples is
called the sampling distribution of the proportion.
Knowledge of the shape, mean, and standard deviation of this sampling distribution is used to construct confidence intervals for the proportion of interest and to make decisions about its location.
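A short simulation can make this repeated-sampling idea concrete. The Python sketch below draws many samples of size 100 from a population with an assumed proportion of 0.25 (again an illustrative value) and summarizes the resulting collection of sample proportions.

```python
import random
import statistics

# Sketch of the repeated-sampling idea behind a sampling distribution:
# draw many samples of size 100 from a population with an assumed
# proportion (0.25 here, purely illustrative) and summarize the
# resulting sample proportions.
random.seed(2)
true_p, n, reps = 0.25, 100, 10_000
props = [sum(random.random() < true_p for _ in range(n)) / n
         for _ in range(reps)]

print(statistics.mean(props))              # close to the true proportion
print(statistics.stdev(props))             # close to sqrt(p(1 - p)/n)
print((true_p * (1 - true_p) / n) ** 0.5)  # theoretical standard error
```

The standard deviation of the simulated proportions agrees closely with the familiar formula sqrt(p(1 - p)/n), which is what confidence interval procedures for a proportion rely on.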
To
correctly interpret traditional inferential procedures, students
need to understand the notion of a sampling distribution. A student analyzes only one sample in the data analysis, but he or she has to think about what could happen if a large number of random samples (like the one just selected) were taken from the population. Indeed, a statement such as "95% confidence" for an interval estimate refers to the behavior of the statistical procedure when samples are repeatedly taken from the population.
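The Python sketch below illustrates this repeated-sampling interpretation by checking how often an interval procedure covers the true proportion over many samples. The standard large-sample interval p_hat plus or minus 1.96 standard errors and the assumed true proportion of 0.25 are illustrative choices, not procedures prescribed by the text.

```python
import random

# Sketch of what "95% confidence" refers to: the long-run behavior of the
# interval procedure over repeated samples. The true proportion (0.25) is
# an assumed value used only for the simulation.
random.seed(3)
true_p, n, reps = 0.25, 100, 10_000
covered = 0
for _ in range(reps):
    p_hat = sum(random.random() < true_p for _ in range(n)) / n
    se = (p_hat * (1 - p_hat) / n) ** 0.5
    covered += (p_hat - 1.96 * se <= true_p <= p_hat + 1.96 * se)

print(covered / reps)   # roughly 0.95 across many repeated samples
```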
The
Bayesian viewpoint toward inference is based on the subjective
interpretation of probability.
In our example, the proportion of undergraduates drinking
coffee is an unknown quantity, and a probability distribution
is used to represent a person's belief about the location of this
proportion. This
probability distribution, called the prior, reflects a person's
knowledge about the proportion before any data is collected.
After the sample survey is taken and the data are observed, one's opinion about the proportion will change.
Bayes' rule is the recipe for computing the new probability
distribution for the proportion, called the posterior, based on
knowledge of the prior probability distribution and the sample
survey data. All inferences about the proportion are made by computing appropriate summaries
of the posterior probability distribution.
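As a rough sketch of this computation, the Python code below applies Bayes' rule to the coffee example using a small discrete set of candidate proportion values and a uniform prior. Both the grid of values and the uniform prior are illustrative modeling assumptions, not the text's own example.

```python
from math import comb

# Minimal sketch of Bayes' rule for a proportion with a discrete prior.
# The grid of candidate proportion values and the uniform prior are
# illustrative choices.
p_values = [0.1, 0.2, 0.3, 0.4, 0.5]
prior = [1 / len(p_values)] * len(p_values)

# Observed survey data: 21 coffee drinkers out of 100 students.
successes, n = 21, 100
likelihood = [comb(n, successes) * p**successes * (1 - p)**(n - successes)
              for p in p_values]

# The posterior is proportional to prior times likelihood, renormalized
# so the probabilities sum to one.
unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

for p, post in zip(p_values, posterior):
    print(f"p = {p:.1f}: posterior probability {post:.4f}")
```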
To
use Bayes' rule in an introductory statistics class, the student
needs to learn some basic probability concepts.
Topics 11, 12, and 13 of the text discuss the interpretation
of probabilities and methods of computing and interpreting probability
tables. Conditional
probability is introduced in Topic 14 by means of a two-way probability
table, and this two-way table is used in Topic 15 to introduce
Bayes' rule. The
basic methodology used in Topic 15 is extended to problems of
inference for one proportion, one mean, and two proportions
in Topics 16, 17, 18, 19, and 21.
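The Python sketch below shows the kind of two-way table calculation this approach builds on: a conditional probability obtained by dividing a joint probability by a marginal probability. The table entries are invented for illustration and do not come from the text.

```python
# Sketch of a conditional probability computed from a two-way probability
# table, in the spirit of Topics 14 and 15. The joint probabilities below
# are invented for illustration only.
joint = {
    ("drinks coffee", "short sleep"): 0.15,
    ("drinks coffee", "enough sleep"): 0.10,
    ("no coffee",     "short sleep"): 0.20,
    ("no coffee",     "enough sleep"): 0.55,
}

# Marginal probability of short sleep (add down that column of the table).
p_short = sum(p for (coffee, sleep), p in joint.items()
              if sleep == "short sleep")

# Conditional probability of drinking coffee given short sleep:
# the joint probability divided by the marginal probability.
p_coffee_given_short = joint[("drinks coffee", "short sleep")] / p_short
print(p_coffee_given_short)   # 0.15 / 0.35, about 0.43
```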
Although
the traditional approach to teaching statistical inference leads
to the familiar methods used in practice, it can be difficult
to learn in a first course.
The notion of a sampling distribution is perhaps the most
difficult concept, since the student is asked to think about the
variation in samples other than the one that he or she observed.
If the student does not fully understand the repeated sampling
idea that is inherent in a sampling distribution, then he or she
will not be able to correctly interpret traditional inferential
conclusions like "I am 95% confident that the proportion is contained in my interval."