Chapter 8 Multilevel Modeling of Means

8.1 Packages for example

library(ProbBayes)
library(tidyverse)
library(brms)

8.2 Movie Ratings Study

Table 10.1 gives summaries of the ratings for eight different animation movies. The table includes the number of ratings, the mean and the standard deviation of the ratings. The data is contained in the data frame animation_ratings in the ProbBayes package.

8.3 The Multilevel Model

Sampling

Let \(y_{ij}\) denote the rating of the \(i\)th individual for the \(j\)th movie.

We assume that \(y_{ij} \sim N(\mu_j, \sigma)\).

Prior

The parameters \(\mu_1, ..., \mu_8\) represent the mean ratings for the eight movies. Write \[ \mu_j = \beta + \gamma_j \]

The intercept parameter \(\beta\) has a student t distribution with mean 4, scale parameter 2.5, and 3 degrees of freedom.
We assume the effect parameters \(\gamma_1, ..., \gamma_8\) have a normal distribution with mean 0 and standard deviation \(\tau\).
There are two standard deviations, the sampling standard deviation \(\sigma\) and the between-means standard deviation \(\tau\). Each of these standard deviations are given weakly informative student t distributions with mean 0, scale 2.5 and 3 degrees of freedom.

8.4 Bayesian Fitting

The model is fit by use of the brm() function. By default, this function assumes a Gaussian (normal) sampling distribution. The “(1 | movieID)” argument indicates that the \(\mu_1, ..., \mu_8\) have a random distribution.

fit <- brm(rating ~ (1 | movieId),
           data = animation_ratings,
           refresh = 0)

## Compiling Stan program...

## Start sampling

## Warning: There were 6 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup

## Warning: Examine the pairs() plot to diagnose sampling problems

One can check the default priors by use of the prior_summary() function.

prior_summary(fit)

##                  prior     class      coef   group resp dpar nlpar bound
## 1 student_t(3, 4, 2.5) Intercept                                        
## 2 student_t(3, 0, 2.5)        sd                                        
## 3                             sd           movieId                      
## 4                             sd Intercept movieId                      
## 5 student_t(3, 0, 2.5)     sigma

The posterior matrix of simulated draws is available by use of the posterior_samples() function. Below I construct density estimates of the two standard deviation parameters \(\sigma\) (blue) and \(\tau\) (red).

ggplot(posterior_samples(fit),
       aes(sigma)) +
  geom_density(color = "blue") +
  geom_density(aes(sd_movieId__Intercept),
               color = "red")

To show the posterior distributions of the means, I reshape the matrix of simulated draws by use of the pivot_longer() function.

posterior_samples(fit) %>% 
  pivot_longer('r_movieId[76093,Intercept]':'r_movieId[81847,Intercept]',
               names_to = "Movie",
               values_to = "Effect") -> post

Remember that we represented the movie ratings mean as \(\mu_j = \beta + \gamma_j\). Below are parallel boxplots of the posterior distributions of \(\mu_1, ..., \mu_8\).

ggplot(post, aes(Movie, Effect + b_Intercept)) +
  geom_boxplot() +
  coord_flip()