baseball analytics and sabermetrics: February 2016

Friday, February 26, 2016

Does the current WAR system significantly overvalue SS's and undervalue Pitching & Catching? Using WAR to project Wins by Team and by Team position.

When I think of WAR, I tend to think of it truly in terms of "Wins". So when I see that a player is rated an 8 WAR player, I'm literally thinking this guy will get my team approximately 8 additional wins. Otherwise we should really just rename it the "best player metric". Not that anything is wrong with a best player metric, but let's not try to "connect" it to wins if it's not really connecting to wins, right? So I wanted to see how accurate this really is. I downloaded the team WAR data from fangraphs for 1985-2013, both hitting and pitching, summed the hitting and pitching WAR for each team season, and plotted that total against the team's wins that year, hoping for a strong correlation.

Hitting Stats Pitching Stats

You can see from the chart above that an R2 of 0.7525 was recorded. Great! The intercept also shows that a replacement level team is about a 46.5 win team. Not unreasonable. Things make sense.
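For anyone who wants to repeat this kind of fit, here is a minimal sketch of a one-variable least squares regression of wins on total team WAR. The numbers are made up for illustration and are not the fangraphs data, and `fit_line` is just a name I picked. The intercept is the predicted win total for a team with 0 WAR, which is how the replacement level figure is read off.

```python
# Hypothetical (team WAR, wins) pairs for illustration only; not the fangraphs data.
team_war = [18.0, 25.0, 33.5, 41.0, 48.5, 52.0]
wins = [65, 72, 79, 88, 95, 100]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


slope, intercept, r_squared = fit_line(team_war, wins)
# The intercept is the predicted wins for a 0 WAR (replacement level) team.
print(f"wins = {intercept:.1f} + {slope:.2f} * WAR, R^2 = {r_squared:.3f}")
```

With real data you would pass one (WAR, wins) pair per team season.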

So then I figured, maybe we could try the same drill, but instead of using complete team calculations, what if we used individual position components? Would that give a more accurate result? It's possible, since the sum of a team's individual player WAR values is not necessarily the same as the team WAR calculation. So what would this look like? I went to fangraphs again and downloaded the same dataset, except by position this time instead of by team. For example, I've linked the catcher data below.

Catcher data

I went through and built a comprehensive list, tagging each player's position. For pitchers the fangraphs link covered everyone, so I determined the SP and RP tags myself: anybody who started more than 75% of his games was tagged as an SP, and all others as RPs. In some cases players showed up in multiple categories (e.g. Mike Napoli was listed as a C and 1B in 2011). In those cases, I simply split their total seasonal WAR evenly across however many positions they were listed at. So if a 6 WAR player showed up as a C, 1B and DH in a single season, each position was credited with 2 WAR. This prevented double or triple counting of players. So how did this work out?
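The tagging rules above can be sketched in a few lines. This is only an illustration of the two rules as described (the 75% starter cutoff and the equal split across listed positions). The function names, the tuple layout and the sample rows are my own assumptions, not the actual spreadsheet.

```python
from collections import defaultdict


def pitcher_role(games, games_started):
    """Tag a pitcher SP if more than 75% of his games were starts, else RP."""
    if games == 0:
        return "RP"
    return "SP" if games_started / games > 0.75 else "RP"


def split_war(player_seasons):
    """Credit each listed position with an equal share of a player's season WAR.

    player_seasons: iterable of (team, war, positions) tuples.
    Returns {team: {position: summed WAR}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for team, war, positions in player_seasons:
        share = war / len(positions)
        for pos in positions:
            totals[team][pos] += share
    return totals


# Hypothetical rows; team code and WAR values are made up for illustration.
rows = [
    ("AAA", 6.0, ["C", "1B", "DH"]),
    ("AAA", 3.0, ["SS"]),
]
totals = split_war(rows)
print(dict(totals["AAA"]))
print(pitcher_role(32, 32), pitcher_role(60, 2))
```

Because every player's WAR is divided by the number of positions he is listed at, the team total stays the same no matter how many categories a player appears in.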

This actually projected slightly better, and I do mean slightly: an R2 of 0.7559 versus the 0.7525 when viewed as just team hitting and pitching. It also predicted basically the same replacement level team, a 46 win one. So you could probably make the argument that it's slightly more accurate to use the sum of the individual players' WARs on the team instead of just a team calculation. But it is so close that it's probably not worth the extra effort for most exercises.

This then led me to think, why not try to tie wins in as a multi-variable regression using all the positions individually, instead of just a linear one where we compare Wins to a single WAR total?

Since I already had the data, I gave it a shot.

You can see here that we actually arrive at an R2 of a bit above 76%. So this is ever so slightly more predictive again. Again you also see that the intercept ends up very close to other methods, at 45.4 Wins for a replacement level team. But bottom line, it's basically as accurate as the other approaches. However, what I do find interesting in this approach is that it actually appears to value RP highest and the SS position the lowest. And those values are substantial. Very substantial.
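Here is a rough sketch of what a multi-variable fit like this involves, using only the Python standard library. It solves the least squares normal equations directly. The three WAR columns and the coefficients used to generate the wins are made up so the example can check itself; they are not the regression output discussed here, and a real run would use one column per position.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]  # assumes the columns are not collinear
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_multi(rows, y):
    """Least squares for y = b0 + b1*x1 + ... + bk*xk via the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic team seasons: (SS WAR, RP WAR, all other WAR). Made-up numbers.
rows = [(3.0, 2.0, 30.0), (1.0, 5.0, 25.0), (4.5, 1.0, 35.0),
        (2.0, 3.5, 20.0), (0.5, 0.5, 28.0), (5.0, 4.0, 22.0)]
# Wins generated from known, invented coefficients so the fit can be checked.
wins = [45.0 + 0.8 * ss + 1.5 * rp + 1.0 * rest for ss, rp, rest in rows]

coefs = fit_multi(rows, wins)
print("intercept and per-position coefficients:", [round(c, 3) for c in coefs])
```

A coefficient below 1 means one WAR at that position converts to less than one win in the fit, and a coefficient above 1 means it converts to more.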

You could probably make the argument then that SS's are being overvalued by the present system. This could possibly mean the defensive position adjustment for SS is too high. Reasons aside, this seems like a very legit finding: with a coefficient of 0.789 wins per SS WAR, the "WAR" metric appears to overstate SS value by 26.7% (1/0.789 = 1.267). So for example, a typical fangraphs contract analysis can use a standard $/WAR value for projections into the future. Yet from this perspective, spending that $/WAR on a SS will have you significantly overweighting the benefit you'll get from that SS. To a lesser extent that would also apply to 2B, CF and RF.
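To make the arithmetic explicit, here is a small sketch. The 0.789 is the SS figure quoted above. The $/WAR number is purely hypothetical and only there to show how the adjustment would flow through, and `effective_cost_per_win` is a name I made up.

```python
ss_coefficient = 0.789  # wins per SS WAR, the figure quoted in the text

# If 1 SS WAR is worth only 0.789 wins, nominal WAR overstates wins by 1/0.789 - 1.
overstatement = 1 / ss_coefficient - 1
print(f"SS WAR overstates wins by {overstatement:.1%}")


def effective_cost_per_win(dollars_per_war, coefficient):
    """Cost of one actual win when a position's WAR converts to wins at `coefficient`."""
    return dollars_per_war / coefficient


# Hypothetical market rate, for illustration only.
print(f"${effective_cost_per_win(8_000_000, ss_coefficient):,.0f} per actual win")
```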

Conversely, RP, SP and Catcher are actually quite undervalued. This would certainly lend some credence to the approaches of "smaller" and "rebuilding" teams to date (think Royals and Astros, even last year's Yankees) who have focused, among other things, on their RP groups.

Based on this data, it would seem that focusing on pitching, specifically RP, and getting an excellent catcher, would be the best ways to focus on turning around a team.

While this wasn't what I went into this analysis looking for, it was a fairly surprising result, yet one that seems to be in line with the approach many teams are currently taking.

I do understand this could be refined even further by re-weighting each player's WAR based on his actual number of games at each position, instead of the approach I took, which was just to distribute those values equally. Given the size of that specific sample and the type of change we'd be talking about, I would find it unlikely that this would move the needle substantially here. But I think it's an interesting finding.

Tuesday, February 16, 2016

Rookie Pitchers and the Strike Zone (part 2)

My earlier post detailed an analysis I performed to attempt to identify if rookie pitchers got treated differently than veteran pitchers regarding called strikes.

http://saber-fighters.blogspot.com/2016/02/rookie-pitchers-and-strikezone.html

While the result wasn't overwhelming, it was consistently shown over the 3 years of data I analyzed (2013-15) that veterans were given a bit more leeway than rookie pitchers when it came to called strikes. One of the reasons I theorized was that this could be due to rookies and pitch counts. For example, maybe rookies were more likely to be in 3-0 counts, where the next pitch is likely to be a high percentage strike, probably in the 95%+ range. If the frequency of this situation was significantly greater for rookies than for veterans, it could shift the total "strike likelihood percentage" I calculated for all rookies higher. So I set out to see if this was in fact true.

I took the same data, all the Pitch F/X data from 2013-2015, and this time also incorporated the ball and strike counts. Below I'll show you the tables with the frequency of each count for rookies and veterans for each year.
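A frequency table like the ones in this post can be built with a simple tally. This is a minimal sketch that assumes each pitch is available as a (balls, strikes) pair. The sample pitch log is invented and `count_frequencies` is a hypothetical helper name.

```python
from collections import Counter


def count_frequencies(pitches):
    """Share of pitches thrown in each (balls, strikes) count.

    pitches: iterable of (balls, strikes) tuples, one per pitch.
    """
    counts = Counter(pitches)
    total = sum(counts.values())
    return {count: n / total for count, n in counts.items()}


# Hypothetical pitch log, for illustration only.
rookie_pitches = [(0, 0), (0, 1), (1, 1), (0, 0), (1, 0), (2, 0), (3, 0), (0, 0)]
freq = count_frequencies(rookie_pitches)

# One row per ball count, one column per strike count, plus a row total.
for balls in range(4):
    row = [freq.get((balls, strikes), 0.0) for strikes in range(3)]
    print(balls, " ".join(f"{p:6.1%}" for p in row), f"{sum(row):6.1%}")
```

Running the same function separately on the rookie and veteran pitches for each season gives the pairs of tables to compare.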

Before I get to the data, I'll give you the conclusion. It's pretty amazing how consistent this ended up being across all three years and across both groups. I guess that's really part of the beauty of baseball and statistics: you get large enough sample sizes that things work out. So the bottom line is that I don't believe the count, meaning some inordinate share of pitches thrown in either ball-heavy or strike-heavy counts, is a driver here at all. The frequency with which rookies and veterans find themselves in any given count is remarkably similar.

This finding, or lack of a finding, would seem to indicate to me that it's much more likely that there does exist a pure, albeit small, bias against rookie pitchers.

2013 Rookies
	Strikes
Balls	0	1	2	Total
0	48.4%	8.4%	1.5%	58.3%
1	14.6%	7.1%	2.1%	23.7%
2	5.3%	3.4%	2.0%	10.7%
3	3.9%	2.1%	1.2%	7.2%
Total	72.2%	21.0%	6.8%	100.0%

2013 Veterans
	Strikes
Balls	0	1	2	Total
0	49.4%	8.4%	1.4%	59.2%
1	14.0%	6.7%	2.2%	22.9%
2	5.4%	3.6%	2.2%	11.2%
3	3.4%	2.0%	1.3%	6.7%
Total	72.3%	20.7%	7.0%	100.0%

2014 Rookies
	Strikes
Balls	0	1	2	Total
0	49.2%	8.2%	1.4%	58.9%
1	14.2%	6.7%	2.2%	23.1%
2	5.3%	3.6%	2.0%	10.9%
3	3.7%	2.2%	1.3%	7.2%
Total	72.4%	20.8%	6.9%	100.0%

2014 Veterans
	Strikes
Balls	0	1	2	Total
0	49.3%	8.5%	1.5%	59.3%
1	14.0%	6.7%	2.4%	23.2%
2	5.1%	3.5%	2.3%	10.9%
3	3.3%	2.0%	1.3%	6.7%
Total	71.8%	20.7%	7.5%	100.0%

2015 Rookies
	Strikes
Balls	0	1	2	Total
0	48.7%	8.3%	1.4%	58.4%
1	14.4%	6.6%	2.2%	23.1%
2	5.4%	3.5%	2.2%	11.1%
3	3.9%	2.3%	1.3%	7.4%
Total	72.3%	20.7%	7.0%	100.0%

2015 Veterans
	Strikes
Balls	0	1	2	Total
0	49.6%	8.4%	1.5%	59.5%
1	13.8%	6.7%	2.3%	22.8%
2	5.1%	3.5%	2.3%	10.9%
3	3.5%	2.0%	1.3%	6.8%
Total	72.0%	20.6%	7.4%	100.0%

Saturday, February 13, 2016

Rookie pitchers and the strikezone

So my question is: do rookie pitchers get similar treatment from umpires on called strikes as veteran pitchers do?

In order to evaluate this question, I first had to develop a strikezone to evaluate against. Using the Pitch F/X data from 2013, 2014, and 2015, I created a model of the strikezone broken down into 1/10 of a foot increments, and for each cell calculated the probability of a strike being called when a pitch was thrown there, using all the Balls and Strikes Looking over those 3 years. I built separate strikezones for LHB and RHB, since umpires should have a slightly different perspective depending on the batter's location.
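For the curious, here is a rough sketch of how a gridded strikezone like this could be built. It assumes each taken pitch comes with the batter's side, a horizontal and vertical location in feet, and whether it was called a strike. The names `stand`, `px` and `pz`, the function names and the sample pitches are my assumptions for illustration, not a description of the actual workbook.

```python
import math
from collections import defaultdict

BINS_PER_FOOT = 10  # 1/10 of a foot cells, as described above


def cell(px, pz):
    """Map a pitch location in feet to its grid cell (rounded to dodge float noise)."""
    return (math.floor(round(px * BINS_PER_FOOT, 9)),
            math.floor(round(pz * BINS_PER_FOOT, 9)))


def build_zone(pitches):
    """Called-strike probability per (batter side, cell).

    pitches: iterable of (stand, px, pz, is_called_strike), covering only
    taken pitches (balls and called strikes).
    """
    tallies = defaultdict(lambda: [0, 0])  # key -> [called strikes, total]
    for stand, px, pz, is_strike in pitches:
        key = (stand, cell(px, pz))
        tallies[key][0] += int(is_strike)
        tallies[key][1] += 1
    return {key: strikes / total for key, (strikes, total) in tallies.items()}


# Hypothetical pitches, for illustration only.
sample = [
    ("R", 0.05, 2.51, True),
    ("R", 0.08, 2.55, False),
    ("L", 0.05, 2.51, True),
]
zone = build_zone(sample)
print(zone)
```

Keeping the batter side in the key is what gives separate RHB and LHB zones from a single pass over the data.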

The strikezones I arrived at are shown here:

RHB:

LHB:

Once the strikezones were determined, I was able to go through the Pitch F/X data and tag every pitch that resulted in either a ball or a called strike with the probability of a pitch in that location being called a strike.

This then allowed me to take any individual pitcher and calculate an average "strike" probability for his called strikes. As an example, here were my 2015 top 10 pitchers in terms of average strike likelihood (minimum pitches of 750 that were either Balls or Called Strikes).

Pitcher	# Called Strikes	Strike Likelihood % (SL%)
Dallas Keuchel	650	73.0%
A.J. Burnett	483	74.1%
Francisco Liriano	495	75.1%
Jon Lester	568	75.7%
Jesse Chavez	475	76.0%
Lance Lynn	445	76.0%
Jeff Locke	495	76.0%
Gio Gonzalez	541	76.3%
John Danks	498	76.4%
Charlie Morton	361	76.4%

The lower the % the better. This means that on average when Dallas Keuchel got a called strike over the course of the entire season, that pitch was only likely to be called a strike 73% of the time. To show the impact this could have, Stephen Strasburg in 2015 had 402 called strikes; however, his Strike Likelihood % was 86.5%.

So if Strasburg threw a pitch into a zone where there was an 80% chance of that pitch being called a strike, he was unlikely to get that call, while if Keuchel or Jon Lester or Gio Gonzalez threw that same pitch they were very likely to get that call.
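The SL% itself is just an average, so it is easy to sketch. This assumes every taken pitch has already been tagged with its zone probability. The tuple layout, the function name and the toy data are assumptions for illustration. The 750 default mirrors the minimum used for the top 10 list.

```python
from collections import defaultdict


def strike_likelihood(tagged_pitches, min_pitches=750):
    """Average zone probability of each pitcher's called strikes (SL%).

    tagged_pitches: iterable of (pitcher, strike_prob, is_called_strike),
    covering balls and called strikes only.
    """
    totals = defaultdict(lambda: [0.0, 0, 0])  # prob sum, called strikes, taken pitches
    for pitcher, prob, is_strike in tagged_pitches:
        t = totals[pitcher]
        t[2] += 1
        if is_strike:
            t[0] += prob
            t[1] += 1
    return {p: s / k for p, (s, k, n) in totals.items()
            if n >= min_pitches and k > 0}


# Hypothetical tagged pitches, for illustration only.
tagged = [
    ("Pitcher A", 0.60, True), ("Pitcher A", 0.90, True), ("Pitcher A", 0.20, False),
    ("Pitcher B", 0.95, True), ("Pitcher B", 0.10, False),
]
print(strike_likelihood(tagged, min_pitches=3))
```

Only called strikes enter the average; the balls count toward the minimum but not toward the SL% itself.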

Strasburg is particularly interesting because he and Gio Gonzalez, teammates in 2015, sit on opposite sides of the spectrum. That matters because the first thing that would jump out at you as part of the delta is catcher framing. Looking at the top 10 list from 2015, for example, you notice a lot of Pirates, and of course Francisco Cervelli was loved by the catcher framing metrics this year. BP catcher metrics for '15

But catcher framing shouldn't really be a major issue in the evaluation of rookie versus veteran pitchers. It's unlikely rookies wouldn't be caught by the primary catcher.

My next step was to calculate the Rookie Strike Likelihood % for 2013, 2014 & 2015 and compare it to the Non-Rookie Strike Likelihood % for those same seasons to see if there was any "bias". I set the minimum number of balls + called strikes for a pitcher to the 1st quartile value for that season. Remember, the lower the SL% the better: it means a pitch can be "worse" and still be called a strike.
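A sketch of this comparison is below. Two things here are assumptions on my part: the group SL% is pooled over all called strikes in the group (rather than averaged pitcher by pitcher), and the 1st quartile uses the default method of Python's `statistics.quantiles`. The dict keys and sample pitchers are made up.

```python
from statistics import quantiles


def group_sl(pitchers):
    """Pooled SL% for rookies and non-rookies above a first-quartile pitch minimum.

    pitchers: list of dicts with keys 'rookie' (bool), 'taken' (balls + called
    strikes), 'called' (called strikes) and 'prob_sum' (sum of zone
    probabilities over the called strikes).
    """
    # First quartile of balls + called strikes across all pitchers that season.
    minimum = quantiles([p["taken"] for p in pitchers], n=4)[0]
    result = {}
    for is_rookie in (True, False):
        group = [p for p in pitchers
                 if p["rookie"] == is_rookie and p["taken"] >= minimum]
        called = sum(p["called"] for p in group)
        result[is_rookie] = (sum(p["prob_sum"] for p in group) / called
                             if called else None)
    return minimum, result


# Hypothetical season, for illustration only.
season = [
    {"rookie": True, "taken": 100, "called": 30, "prob_sum": 27.0},
    {"rookie": True, "taken": 200, "called": 50, "prob_sum": 42.0},
    {"rookie": False, "taken": 300, "called": 100, "prob_sum": 80.0},
    {"rookie": False, "taken": 400, "called": 100, "prob_sum": 82.0},
    {"rookie": True, "taken": 500, "called": 150, "prob_sum": 124.0},
]
minimum, sl = group_sl(season)
print(minimum, sl)
```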

2015 (135 minimum)

Non-Rookie SL% - 81.1%

Rookie SL% - 82.1%

2014 (114 minimum)

Non-Rookie SL% - 82.1%

Rookie SL% - 82.4%

2013 (166 minimum)

Non-Rookie SL% - 82.0%

Rookie SL% - 83.1%

So while the gap is not always "huge", there is in each year a delta in the SL% which favors the veteran pitchers.

What does this mean? This could mean nothing. It could be entirely due to rookies just not working the zone in the same way veterans do, or it could be related to pitch selection (FB vs Curve vs Slider) and how those different pitches are typically located in the zone. It could be related to how often rookies are ahead vs behind in the count and what that means for their next pitch location. Then again, it could just mean that there is some bias against rookies, who don't get a sort of "Jordan" effect where your reputation gets you a call that maybe you wouldn't have gotten without it. In all likelihood it is a combination of these. But given this seems to be a real thing, it could also feed into the evaluation of catcher framing metrics. Catchers who catch an abnormally high number of rookies in a season could see their framing "skills" negatively impacted because of who they are catching, and not because of any diminishing skill on their part.

Wednesday, February 3, 2016

I plan to use this blog as a place to do my own sabermetric analysis. Hopefully I will do some fun and interesting things! Fingers crossed.