#### Introduction

`ggplot2` is an R package for graphing data based on the “Grammar of Graphics” framework introduced by Leland Wilkinson. This package is used to construct all of the graphs for the book Visualizing Baseball. The purpose of this document is to introduce `ggplot2` using a familiar baseball dataset. I introduce the basic framework and illustrate the use of `ggplot2` to construct graphs for different types of variables.

#### Some Baseball Data

I collect hitting data for all teams in the 2015 baseball season. For each team, I compute its slugging percentage `SLG` and its on-base percentage `OBP`.

```r
library(dplyr)
library(Lahman)

# Keep only the 2015 season
teams2015 <- filter(Teams, yearID == 2015)

# Name the doubles and triples columns (positions 18 and 19 in Teams)
names(teams2015)[18:19] <- c("X2B", "X3B")

# Coerce SF and HBP to numeric before doing arithmetic with them
teams2015$SF <- as.numeric(teams2015$SF)
teams2015$HBP <- as.numeric(teams2015$HBP)

teams2015 <- mutate(teams2015,
  X1B = H - X2B - X3B - HR,                # singles
  TB = X1B + 2 * X2B + 3 * X3B + 4 * HR,   # total bases
  SLG = TB / AB,
  OBP = (H + BB + HBP) /
    (AB + BB + HBP + SF))
```
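To make the two formulas concrete, here is a small worked example for a single team. The totals in `team` are made-up numbers chosen for illustration, not statistics for any real team. The column names simply mirror those used above.

```r
# Hypothetical season totals for one team (illustrative values only).
team <- data.frame(AB = 5500, H = 1400, X2B = 280, X3B = 30, HR = 170,
                   BB = 480, HBP = 50, SF = 40)

# Singles are hits that are not doubles, triples, or home runs.
X1B <- team$H - team$X2B - team$X3B - team$HR

# Total bases weight each hit by the number of bases it is worth.
TB <- X1B + 2 * team$X2B + 3 * team$X3B + 4 * team$HR

SLG <- TB / team$AB
OBP <- (team$H + team$BB + team$HBP) /
  (team$AB + team$BB + team$HBP + team$SF)

round(c(SLG = SLG, OBP = OBP), 3)
```

With these inputs the team has 920 singles and 2250 total bases, which gives an `SLG` of about 0.409 and an `OBP` of about 0.318.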

#### Three Basic Components of a ggplot2 Graph

To construct a graph using `ggplot2`, one needs …

1. A data frame that contains the data that you want to graph.

2. Aesthetics or roles assigned to particular variables in the data frame.

3. A geometric object (or geom for short), which is the kind of mark you are plotting, such as points, lines, or bars.

For example, suppose we wish to construct a scatterplot of on-base percentage and slugging percentage for all teams in the 2015 season.

1. The data frame `teams2015` contains the data and `OBP` and `SLG` are the variables of interest.

2. To construct a scatterplot, you need a variable on the horizontal axis (`x`) and a variable on the vertical axis (`y`). If I want `OBP` to be the horizontal axis variable and `SLG` the vertical axis variable, I would assign `OBP` to the `x` aesthetic and `SLG` to the `y` aesthetic.

Steps 1 and 2 are communicated by the command:

```r
library(ggplot2)
ggplot(data = teams2015, aes(x = OBP, y = SLG))
```
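On its own, this command records the data frame and the aesthetic mappings but draws no points, since no geom has been added yet. A minimal sketch for checking this is below. It assumes a small made-up data frame `toy` (hypothetical values, not real team statistics) so that it runs without the Lahman data, and it inspects the plot object through its `mapping` and `layers` components.

```r
library(ggplot2)

# Hypothetical values for illustration only; not real team statistics.
toy <- data.frame(OBP = c(0.310, 0.325, 0.340),
                  SLG = c(0.390, 0.410, 0.450))

# Steps 1 and 2: a data frame plus aesthetic mappings, with no geom yet.
p <- ggplot(data = toy, aes(x = OBP, y = SLG))

names(p$mapping)   # the aesthetics that were assigned
length(p$layers)   # the number of geoms added so far
```

Printing `p` at this stage should show only an empty panel with `OBP` and `SLG` on the axes. The third component, the geom, is still missing.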