# Developing wOBA for the Kernel Collegiate League

By Owen Crandall

**What is wOBA?**

Weighted On-Base Average (wOBA) was created by statistician and baseball researcher Tom Tango and is described in his book, *The Book: Playing The Percentages In Baseball*. wOBA was designed to measure offensive performance by combining the ideas behind on-base percentage (OBP) and slugging percentage (SLG) without their limitations. OBP tells you how often a player reaches base, but not how far they advanced. SLG does tell you how far they advanced, but it only considers hits and not other ways of reaching base. OPS (on-base plus slugging) was created to address these issues by combining the two stats. It is very helpful for player evaluation, but, according to Tango, it is more effective to build a statistic from scratch than to use one built from flawed metrics.

The basis of wOBA is the run value of offensive events. This is the idea that each offensive event (whether it be a single, home run, hit by pitch, etc.) produces an expected number of runs. Some events are more valuable than others, e.g. a home run usually produces more runs than a walk. These values are taken from run expectancy matrices, which give the expected number of runs scored from different inning states (i.e. how many runners are on base and how many outs there are). The process of deriving the weights will be discussed in detail in the next section. In a typical Major League run environment, league average wOBA tends to be roughly .320, with the best hitters at .370 and above and the worst at .300 and below, according to FanGraphs.

Given the high-scoring nature of the KCL, I thought it would be interesting to develop wOBA for it, as it will hopefully provide a more insightful look into the run-producing value of each player.

**The Process**

In addition to the original formula for wOBA published in his book, Tom Tango published a SQL script that allows you to adjust the weights in the formula based on the run environment of the league you are examining. As there are only 4 teams in the KCL compared to 30 Major League teams, it made more sense to download the files containing player statistics and access them using R, rather than setting up a database and accessing it using SQL. Therefore, the SQL script published by Tango had to be adapted for R. The R script containing the processes described in the rest of this article is attached below.

After loading and formatting the data, the first step was creating the run value of each event. The base number used in creating run values is runs per out (R/Outs), which describes the run environment of a given league and season. For example, a league with more runs per out has a richer run environment than a league with fewer. This was calculated by converting innings pitched to outs and dividing total runs scored by that number. At the time of this writing, the R/Outs in the KCL is 0.286, telling us that an average of 0.286 runs is scored for every out made.
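As a rough illustration of this step, the Python sketch below (not the author's R script) converts innings pitched to outs and divides runs by that total. It assumes innings pitched are recorded in the usual box-score notation, where the digit after the decimal point counts outs rather than tenths of an inning. The function names and sample numbers are hypothetical and are not KCL data.

```python
def ip_to_outs(ip: str) -> int:
    """Convert box-score innings pitched to outs.

    Assumes the common notation where "12.1" means 12 full innings plus 1 out.
    """
    whole, _, partial = ip.partition(".")
    return int(whole) * 3 + int(partial or 0)


def runs_per_out(total_runs: int, innings_pitched: list[str]) -> float:
    """League runs scored divided by league outs recorded."""
    total_outs = sum(ip_to_outs(ip) for ip in innings_pitched)
    return total_runs / total_outs


# Hypothetical pitching lines, not KCL data.
sample_ip = ["7.0", "6.2", "5.1"]
sample_outs = sum(ip_to_outs(ip) for ip in sample_ip)  # 21 + 20 + 16 outs
sample_rate = runs_per_out(19, sample_ip)
print(sample_outs, round(sample_rate, 3))
```

Parsing the innings as text rather than as a decimal number avoids treating ".1" as a tenth of an inning, which would understate the outs recorded.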

Now using the R/Outs, we can build the run values. Using the formula provided by Tango, the values build on each other, starting with walks being worth 0.14 runs greater than R/Outs. The formulation for the rest of this step can be seen in the table below.

Offensive Event | Formula |
---|---|
Walks | R/Outs + 0.14 |
Hit by Pitches | BB RV* + 0.025 |
Singles | BB RV + 0.155 |
Doubles | 1B RV + 0.3 |
Triples | 2B RV + 0.27 |
Home Runs | 1.4 |
Stolen Bases | 0.2 |
Caught Stealing | 2 × R/Outs + 0.075 |

*BB RV = Run Value for Walks
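The chain of formulas in the table can be sketched in a few lines of Python. This is only an illustration of the table, not the author's R script; the function name `run_values` and the dictionary keys are made up for the example, and the 0.286 input is the KCL R/Outs figure quoted earlier.

```python
def run_values(r_per_out: float) -> dict[str, float]:
    """Build event run values from runs per out, following the table above."""
    bb = r_per_out + 0.14      # walks
    single = bb + 0.155        # singles build on walks
    double = single + 0.3      # doubles build on singles
    triple = double + 0.27     # triples build on doubles
    return {
        "BB": bb,
        "HBP": bb + 0.025,
        "1B": single,
        "2B": double,
        "3B": triple,
        "HR": 1.4,             # fixed value
        "SB": 0.2,             # fixed value
        "CS": 2 * r_per_out + 0.075,
    }


kcl_run_values = run_values(0.286)
for event, value in kcl_run_values.items():
    print(f"{event}: {value:.3f}")
```

At an R/Outs of 0.286, this works out to roughly 0.426 runs for a walk, 0.581 for a single, and 0.647 for a caught stealing.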

Now that we have our run values, we can scale them to create our wOBA formula. It would be possible to create a version of wOBA with the unmodified run values, but scaling them puts wOBA on a similar scale to OBP, which makes it easier to interpret. To scale the run values, we need two more building blocks, Run Plus and Run Minus. According to Tango, Run Plus gives us the “average run value of safe batting events” (meaning walks, hit by pitches, and hits). Run Minus gives us the value of events not included in the weights, such as sacrifice flies and errors. Both metrics are ratios with the same numerator, denoted as pRunMult (Player Run Multiple) in the accompanying R script. This number was calculated for each player by multiplying their stats by the league run values and summing the results. After calculating the Run Plus and Run Minus denominators for each player, the league Run Plus and Run Minus were calculated by summing pRunMult across the league and dividing it by the sums of the corresponding denominators.
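A minimal Python sketch of the two ratios follows. The article does not spell out the denominators, so the ones used here are assumptions based on my reading of Tango's published script: safe batting events (walks, hit by pitches, and hits) for Run Plus, and batting outs (AB − H + SF) for Run Minus. The function name, the league totals, and the treatment of all walks as unintentional are hypothetical simplifications.

```python
RUN_VALUE_EVENTS = ("BB", "HBP", "1B", "2B", "3B", "HR")


def run_plus_minus(totals: dict[str, int], run_values: dict[str, float]) -> tuple[float, float]:
    """Return (Run Plus, Run Minus) from league totals and event run values."""
    # Shared numerator: each event count times its run value (pRunMult, summed).
    run_mult = sum(totals[e] * run_values[e] for e in RUN_VALUE_EVENTS)
    hits = totals["1B"] + totals["2B"] + totals["3B"] + totals["HR"]
    # ASSUMPTION: Run Plus denominator = safe batting events.
    safe_events = totals["BB"] + totals["HBP"] + hits
    # ASSUMPTION: Run Minus denominator = batting outs (AB - H + SF).
    batting_outs = totals["AB"] - hits + totals["SF"]
    return run_mult / safe_events, run_mult / batting_outs


# Run values at R/Outs = 0.286, rounded to three decimals.
rv = {"BB": 0.426, "HBP": 0.451, "1B": 0.581, "2B": 0.881, "3B": 1.151, "HR": 1.4}
# Hypothetical league totals, not KCL data.
league = {"AB": 1000, "BB": 150, "HBP": 30, "1B": 200, "2B": 50, "3B": 10, "HR": 20, "SF": 10}
run_plus, run_minus = run_plus_minus(league, rv)
print(round(run_plus, 3), round(run_minus, 3))
```

Summing league totals first, as done here, gives the same result as summing each player's numerator and denominators, since both are simple sums.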

We now need one final piece to create the wOBA weights, the wOBA scale. This combines the pieces discussed above and gives us a multiplier to scale each run value by. Its formula is 1/(Run Plus + Run Minus). Now we can calculate the wOBA weights. For each event except for stolen bases and caught stealing (which are not included in the wOBA calculation but still have their weights computed), the run value is added to the Run Minus value and multiplied by the wOBA scale. We now have our final weights and can compute weighted on-base average.
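The scaling step can be sketched as below, again in Python rather than the author's R. The Run Plus and Run Minus inputs are hypothetical round numbers chosen only to make the arithmetic easy to follow. The article does not say how the stolen base and caught stealing weights are computed, so multiplying their run values by the scale without adding Run Minus is an assumption, though it appears consistent with the published weights.

```python
def woba_weights(
    run_values: dict[str, float], run_plus: float, run_minus: float
) -> tuple[float, dict[str, float]]:
    """Return (wOBA scale, weights) from run values, Run Plus and Run Minus."""
    scale = 1 / (run_plus + run_minus)
    weights = {}
    for event, value in run_values.items():
        if event in ("SB", "CS"):
            # ASSUMPTION: baserunning events are scaled without adding Run Minus.
            weights[event] = value * scale
        else:
            weights[event] = (value + run_minus) * scale
    return scale, weights


# Run values at R/Outs = 0.286, rounded to three decimals.
rv = {"BB": 0.426, "HBP": 0.451, "1B": 0.581, "2B": 0.881,
      "3B": 1.151, "HR": 1.4, "SB": 0.2, "CS": 0.647}
# Hypothetical Run Plus / Run Minus values, not the KCL figures.
scale, weights = woba_weights(rv, run_plus=0.5, run_minus=0.3)
print(round(scale, 3), {e: round(w, 3) for e, w in weights.items()})
```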

**The Results**

The wOBA weights for the KCL can be viewed in the table below. Note that these weights are dependent on the current league numbers, so they will change as the season goes along.

BB | HBP | 1B | 2B | 3B | HR | SB | CS |
---|---|---|---|---|---|---|---|
0.89 | 0.92 | 1.07 | 1.43 | 1.75 | 2.05 | 0.24 | 0.77 |

Using the weights, we can finally compute wOBA for each hitter by multiplying each weight by the hitter’s corresponding stat, summing those values, and dividing by plate appearances. Before we can calculate the league average wOBA that will allow us to make observations about the individual players, we must establish a qualification. In general, baseball players are “qualified”, meaning they have enough plate appearances to represent a suitable sample size, if they have at least 3.1 plate appearances per team game. KCL teams only play 7-inning games, so this number is reduced to 2.4. However, given the developmental nature of the KCL, very few players play enough to reach a qualifying number of plate appearances. Therefore, I believe it is fair to reduce the requirement by 10 plate appearances. At this point, all four teams have played at least 16 games, so 2.4 × 16 = 38.4 plate appearances, less 10, gives us a plate appearance qualification of 28. With this number in hand, we can calculate league average wOBA and display the qualified hitters.
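The qualification cutoff and the wOBA calculation itself can be sketched in Python as follows. The weights are the KCL values from the table above; the function names and the sample stat line are hypothetical, and rounding the cutoff down to a whole number is an assumption that happens to reproduce the 28 quoted in the text.

```python
import math

# KCL wOBA weights from the table above (SB and CS are not used in wOBA).
KCL_WEIGHTS = {"BB": 0.89, "HBP": 0.92, "1B": 1.07, "2B": 1.43, "3B": 1.75, "HR": 2.05}


def pa_threshold(team_games: int, pa_per_game: float = 2.4, reduction: int = 10) -> int:
    """Plate appearances needed to qualify (ASSUMPTION: rounded down)."""
    return math.floor(pa_per_game * team_games - reduction)


def woba(stats: dict[str, int], weights: dict[str, float] = KCL_WEIGHTS) -> float:
    """Weighted sum of offensive events divided by plate appearances."""
    weighted_events = sum(weights[event] * stats[event] for event in weights)
    return weighted_events / stats["PA"]


threshold = pa_threshold(16)
# Hypothetical stat line, not a real KCL player.
hitter = {"PA": 40, "BB": 6, "HBP": 1, "1B": 8, "2B": 2, "3B": 0, "HR": 1}
hitter_woba = woba(hitter)
is_qualified = hitter["PA"] >= threshold
print(threshold, round(hitter_woba, 3), is_qualified)
```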

Rank | First | Last | Team | PA | AVG | OBP | SLG | OPS | wOBA | wOBAdiff |
---|---|---|---|---|---|---|---|---|---|---|
1 | Mitch | Murphy | Bobcats | 36 | 0.32 | 0.528 | 0.52 | 1.048 | 0.558 | 0.139 |
2 | Mateo | Casillas | Ground Sloths | 52 | 0.405 | 0.519 | 0.524 | 1.043 | 0.555 | 0.136 |
3 | Shea | Zbrozek | Ground Sloths | 53 | 0.449 | 0.491 | 0.571 | 1.062 | 0.551 | 0.132 |
4 | Clay | Gadbois | Bobcats | 38 | 0.355 | 0.474 | 0.516 | 0.99 | 0.52 | 0.101 |
5 | Gage | Wolfe | Blue Caps | 37 | 0.357 | 0.486 | 0.5 | 0.986 | 0.519 | 0.1 |
6 | Kody | Morton | Merchants | 60 | 0.333 | 0.45 | 0.578 | 1.028 | 0.509 | 0.09 |
7 | Alec | McGinnis | Bobcats | 39 | 0.345 | 0.487 | 0.379 | 0.866 | 0.49 | 0.071 |
8 | Jimmy | Anderson | Ground Sloths | 48 | 0.25 | 0.438 | 0.5 | 0.938 | 0.485 | 0.066 |
9 | Bennett | Summers | Bobcats | 29 | 0.36 | 0.448 | 0.44 | 0.888 | 0.482 | 0.063 |
10 | Mason | Nauman | Merchants | 41 | 0.3 | 0.488 | 0.3 | 0.788 | 0.474 | 0.055 |
11 | Jake | Stewart | Merchants | 41 | 0.361 | 0.439 | 0.417 | 0.856 | 0.467 | 0.048 |
12 | Jake | Morrill | Blue Caps | 32 | 0.261 | 0.469 | 0.304 | 0.773 | 0.464 | 0.045 |
13 | Daniel | Bastidas | Merchants | 45 | 0.257 | 0.422 | 0.4 | 0.822 | 0.45 | 0.031 |
14 | Trey | Blanchette | Blue Caps | 36 | 0.313 | 0.389 | 0.469 | 0.858 | 0.443 | 0.024 |
15 | Clay | Conn | Merchants | 66 | 0.229 | 0.424 | 0.354 | 0.778 | 0.439 | 0.02 |
16 | Colin | Karr | Bobcats | 32 | 0.25 | 0.438 | 0.292 | 0.729 | 0.436 | 0.017 |
17 | Dan | Mosele | Bobcats | 34 | 0.345 | 0.412 | 0.379 | 0.791 | 0.431 | 0.012 |
18 | AJ | Weller | Blue Caps | 36 | 0.323 | 0.389 | 0.419 | 0.808 | 0.425 | 0.006 |
19 | John | Shuey | Blue Caps | 47 | 0.341 | 0.383 | 0.415 | 0.798 | 0.417 | -0.002 |
20 | Hayden | Stork | Bobcats | 43 | 0.27 | 0.372 | 0.405 | 0.777 | 0.416 | -0.003 |
21 | Ashton | Horchem | Merchants | 62 | 0.283 | 0.387 | 0.358 | 0.746 | 0.411 | -0.008 |
22 | Carter | Selk | Ground Sloths | 49 | 0.244 | 0.327 | 0.537 | 0.863 | 0.411 | -0.008 |
23 | Luca | Morelli | Merchants | 45 | 0.289 | 0.356 | 0.421 | 0.777 | 0.399 | -0.02 |
24 | Jackson | Smith | Bobcats | 30 | 0.28 | 0.4 | 0.28 | 0.68 | 0.398 | -0.021 |
25 | Nate | Reed | Merchants | 33 | 0.13 | 0.394 | 0.217 | 0.611 | 0.388 | -0.031 |
26 | Dalton | Hobick | Ground Sloths | 44 | 0.243 | 0.364 | 0.324 | 0.688 | 0.386 | -0.033 |
27 | Chase | Wiese | Bobcats | 40 | 0.219 | 0.375 | 0.281 | 0.656 | 0.382 | -0.037 |
28 | Nate | Cooley | Ground Sloths | 38 | 0.226 | 0.368 | 0.29 | 0.659 | 0.38 | -0.039 |
29 | Kannon | Kleine | Blue Caps | 39 | 0.233 | 0.359 | 0.333 | 0.692 | 0.379 | -0.04 |
30 | Blake | Crancer | Ground Sloths | 52 | 0.205 | 0.327 | 0.273 | 0.6 | 0.342 | -0.077 |
31 | Mark | Wagner | Bobcats | 31 | 0.25 | 0.323 | 0.286 | 0.608 | 0.34 | -0.079 |
32 | Nolan | Bowles | Blue Caps | 32 | 0.267 | 0.281 | 0.4 | 0.681 | 0.337 | -0.082 |
33 | Will | Applegate | Ground Sloths | 31 | 0.276 | 0.29 | 0.276 | 0.566 | 0.306 | -0.113 |
34 | Chris | Casey | Merchants | 34 | 0.2 | 0.294 | 0.233 | 0.527 | 0.305 | -0.114 |
35 | Jacob | Castlemen | Blue Caps | 39 | 0.222 | 0.256 | 0.25 | 0.506 | 0.274 | -0.145 |
36 | Nick | Mardis | Bobcats | 30 | 0.185 | 0.267 | 0.185 | 0.452 | 0.269 | -0.15 |
37 | Riley | Hendren | Blue Caps | 40 | 0.194 | 0.25 | 0.222 | 0.472 | 0.264 | -0.155 |

As you can see in the table, Mitch Murphy of the Bobcats and Mateo Casillas and Shea Zbrozek of the Ground Sloths lead the league with respective wOBAs of 0.558, 0.555, and 0.551. wOBA also tracks OPS fairly well, but the two do not line up exactly, which points to the issues with OPS discussed earlier.

This qualified group gives us a league average wOBA of 0.419, and as you can see, most of the hitters have much higher wOBAs than what is typical of MLB hitters. This is not necessarily a positive evaluation of their performance, as MLB is a much tougher environment to score runs in, given the stronger pitching and defense. This lack of context shows a weakness of wOBA not present in normalized statistics such as wRC+ or OPS+. In an attempt to correct this, or at least to provide another data point, the difference from the league average wOBA was calculated for each player and can be seen in the last column of the table above. This number is especially helpful for evaluating players such as AJ Weller (0.425 wOBA) and Hayden Stork (0.416 wOBA), who are both having great offensive seasons by MLB standards, but according to wOBA are both roughly league average in the KCL.
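The wOBAdiff column is a simple subtraction, sketched here in Python with the league average and the two wOBA values quoted above. The function name `woba_diff` is made up for the example, and rounding to three decimals is an assumption made to match how the table displays the values.

```python
LEAGUE_AVG_WOBA = 0.419  # league average of the qualified group, from the article


def woba_diff(player_woba: float, league_woba: float = LEAGUE_AVG_WOBA) -> float:
    """Difference between a hitter's wOBA and the league average."""
    # ASSUMPTION: rounded to three decimals, as displayed in the table.
    return round(player_woba - league_woba, 3)


diffs = {
    "AJ Weller": woba_diff(0.425),
    "Hayden Stork": woba_diff(0.416),
}
print(diffs)
```

Both hitters land within a few thousandths of zero, which is what marks them as roughly league average despite wOBAs that would look excellent on the MLB scale.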

wOBA is not perfect, but then again, no statistic really is. I believe it succeeds in its goal of providing an overall offensive statistic for run contribution without the inherent flaws of OBP, SLG, and others. I also believe it is an excellent statistic for evaluating a league such as the KCL.

**Future Work**

I’ll likely recompute the league wOBA later in the season and after playoffs conclude. Hopefully by the end of the season, there will be enough hitters that qualify under the regular rules for qualification that more conclusive results can be taken from the data. I also think it would be interesting to find potential differences between teams and develop a similar advanced statistic for KCL pitchers.

Thank you for reading and please reach out with any comments or questions you may have.