Load in a few helpful packages.
library(readr)
library(ggplot2)
library(dplyr)
Using the Retrosheet play-by-play data for the 2015 season, I found the expected runs in the remainder of the inning for plate appearances that pass through each possible count. I store these expected runs values in the csv file “count2015a.csv”.
I read this file into R as the data frame d and show the first few lines.
d <- read_csv("https://bayesball.github.io/VB/data/count2015a.csv")
head(d)
## # A tibble: 6 x 6
##   count strikes balls N.Pitches    Type         Runs
##   <chr>   <int> <int>     <int>   <chr>        <dbl>
## 1   0-0       0     0         0 Neutral -0.000798051
## 2   1-0       0     1         1  Batter  0.033855787
## 3   0-1       1     0         1 Pitcher -0.038708814
## 4   2-0       0     2         2  Batter  0.094016898
## 5   1-1       1     1         2 Pitcher -0.015252581
## 6   0-2       2     0         2 Pitcher -0.089381123
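As a quick check on how the Type labels line up with the sign of Runs, here is a minimal sketch that averages the runs values by Type. It uses only the six rows printed above, retyped by hand into a hypothetical data frame d6, so the numbers it produces are illustrative and not a summary of the full file.

```r
library(dplyr)

# d6 is a hypothetical stand-in: the six rows shown by head(d), typed in by hand
d6 <- tibble(
  count     = c("0-0", "1-0", "0-1", "2-0", "1-1", "0-2"),
  strikes   = c(0L, 0L, 1L, 0L, 1L, 2L),
  balls     = c(0L, 1L, 0L, 2L, 1L, 0L),
  N.Pitches = c(0L, 1L, 1L, 2L, 2L, 2L),
  Type      = c("Neutral", "Batter", "Pitcher", "Batter", "Pitcher", "Pitcher"),
  Runs      = c(-0.000798051, 0.033855787, -0.038708814,
                0.094016898, -0.015252581, -0.089381123)
)

# Number of counts and average runs value within each Type
type_summary <- d6 %>%
  group_by(Type) %>%
  summarize(n_counts = n(), mean_runs = mean(Runs))
type_summary
```

With the full data frame, replacing d6 by d should give the same kind of summary over all of the counts.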
In this graph, the Runs Value (variable Runs) is plotted against the Pitch Number (variable N.Pitches), using the Count (variable count) as the plotting label. The blue lines connect counts that share the same number of strikes or the same number of balls.
ggplot(d, aes(N.Pitches, Runs, label=count)) +
  geom_point() +
  geom_path(data=filter(d, strikes==0),
            aes(N.Pitches, Runs), color="blue") +
  geom_path(data=filter(d, strikes==1),
            aes(N.Pitches, Runs), color="blue") +
  geom_path(data=filter(d, strikes==2),
            aes(N.Pitches, Runs), color="blue") +
  geom_path(data=filter(d, balls==0),
            aes(N.Pitches, Runs), color="blue") +
  geom_path(data=filter(d, balls==1),
            aes(N.Pitches, Runs), color="blue") +
  geom_path(data=filter(d, balls==2),
            aes(N.Pitches, Runs), color="blue") +
  geom_path(data=filter(d, balls==3),
            aes(N.Pitches, Runs), color="blue") +
  xlab("Pitch Number") +
  ylab("Runs Value") +
  ggtitle("") +
  geom_hline(yintercept=0, color="black") +
  geom_label()
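The seven geom_path() layers above can probably be collapsed into two by using the group aesthetic, since geom_path() draws one path per group in the order the rows appear. The sketch below is an alternative formulation, not the code used to make the graph. It runs on a hypothetical data frame d6 holding only the six rows shown by head(d), so groups with a single count draw no line and ggplot2 may print a message about them.

```r
library(ggplot2)

# d6 is a hypothetical stand-in: the six rows shown by head(d), typed in by hand
d6 <- data.frame(
  count     = c("0-0", "1-0", "0-1", "2-0", "1-1", "0-2"),
  strikes   = c(0L, 0L, 1L, 0L, 1L, 2L),
  balls     = c(0L, 1L, 0L, 2L, 1L, 0L),
  N.Pitches = c(0L, 1L, 1L, 2L, 2L, 2L),
  Runs      = c(-0.000798051, 0.033855787, -0.038708814,
                0.094016898, -0.015252581, -0.089381123)
)

p <- ggplot(d6, aes(N.Pitches, Runs, label = count)) +
  geom_point() +
  # one path per strike total (connects counts as balls are added)
  geom_path(aes(group = strikes), color = "blue") +
  # one path per ball total (connects counts as strikes are added)
  geom_path(aes(group = balls), color = "blue") +
  geom_hline(yintercept = 0, color = "black") +
  geom_label() +
  xlab("Pitch Number") +
  ylab("Runs Value")
# print(p) renders the plot
```

Swapping d6 for the full data frame d should reproduce the same lines as the longer version, provided the rows of d are ordered as they are in the file.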
Above we considered the runs value of plate appearances that pass through each possible count. Here we consider the runs values of balls put in play on each possible count. These runs values are found using 2016 Retrosheet play-by-play data. The data is saved in the csv file “count2015b.csv”. We read in this data and save it in the variable S.
S <- read_csv("https://bayesball.github.io/VB/data/count2015b.csv")
head(S)
## # A tibble: 6 x 6
##   count       Runs strikes balls N.Pitches     N
##   <chr>      <dbl>   <int> <int>     <int> <int>
## 1   0-0 0.04020994       0     0         0 20668
## 2   0-1 0.01629339       1     0         1 16560
## 3   0-2 0.01622481       2     0         2  8374
## 4   1-0 0.05057780       0     1         1 12366
## 5   1-1 0.03688532       1     1         2 15601
## 6   1-2 0.01841516       2     1         3 14508
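In this graph, the Runs Value (variable Runs) is again plotted against the Pitch Number (variable N.Pitches), using the Count (variable count) as the plotting label. The size of each label is mapped to the variable N, so counts with larger values of N appear with larger labels.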
ggplot(S, aes(N.Pitches, Runs, label=count, size=N)) +
  xlab("Number of Pitch") +
  ylab("Runs Value") +
  geom_hline(yintercept=0, color="black") +
  geom_label()
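Since the label size is mapped to N, the counts do not all carry the same weight. One way to put that in numbers is an average of Runs weighted by N. The sketch below assumes N is the number of balls put in play on each count, which the column name suggests but the text does not state. It uses only the six rows shown by head(S), retyped into a hypothetical data frame S6, so the result is illustrative only.

```r
# S6 is a hypothetical stand-in: the six rows shown by head(S), typed in by hand
S6 <- data.frame(
  count = c("0-0", "0-1", "0-2", "1-0", "1-1", "1-2"),
  Runs  = c(0.04020994, 0.01629339, 0.01622481,
            0.05057780, 0.03688532, 0.01841516),
  N     = c(20668L, 16560L, 8374L, 12366L, 15601L, 14508L)
)

# Average runs value, weighting each count by N
# (assumption: N counts the balls put in play on that count)
overall <- weighted.mean(S6$Runs, w = S6$N)
overall
```

Running the same call on the full data frame S would weight every count, not only these six.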