One thing that ticks me off in sports is when an announcer says one thing, the replay clearly shows something else, and the announcer refuses to back off. You see it all the time. An announcer (often Phil Simms) will explain a quarterback sack away by saying that there was nobody open downfield or praise a defender for covering a receiver without committing pass interference.
Then they will go to the replay and it will show two receivers waving their arms madly to indicate how open they are or it will show a cornerback absolutely mugging a receiver.
And the announcer will continue to explain the sack as no one being open and continue to say that the cornerback played perfect coverage.
I feel a little bit like that on this Porcello-Verlander thing. I have said on multiple occasions that, while it’s close, I would have voted Justin Verlander over Rick Porcello for Cy Young. I still believe the first part; it’s absurdly close. There’s no WRONG answer as to which pitcher deserved the Cy Young Award. None of the following words change that.
But I said I would have given the slight edge to Verlander because it seemed to me — at first glance — that he had the slightly better season as judged by the advanced stats, particularly Baseball Reference WAR.
So, upon further review, I am obliged to say this: I think Baseball Reference WAR in this case doesn’t pass the smell test.
* * *
Let’s start with this fairly obvious statement: Not very long ago, there would have been no debate whatsoever about who should win the 2016 American League Cy Young Award.
Porcello went 22-4.
Verlander went 16-9.
Those won-loss records would have ended all arguments.
Bill James did an interesting study on this that I think he will unveil any time now. He looked into the Cy Young voting to see how it has changed with regard to the emphasis voters put on won-loss records. I won’t give any spoilers here except to say that the data shows that up until 1990 or so it’s pretty clear that won-loss record was EASILY the most important factor in Cy Young voting.
Then, things began to very slowly change. Why? I think it came down to three things:
1. Pitchers stopped going deep into games and they made fewer starts, which naturally brought down win totals. The last 25 years, there have been 81 pitchers who won 20 or more in a season, and no one won 25. In the 25 years before that, 182 pitchers won 20 or more and 14 of them won 25-plus. With fewer wins, voters had to look elsewhere.
2. Some of the greatest pitchers in baseball history pitched in the 1990s — Greg Maddux, Roger Clemens, Randy Johnson, Pedro Martinez, etc. — but their greatness was rarely reflected by wins and losses. Maddux won 20 just twice. Johnson and Martinez both won Cy Young Awards with 17-win seasons, Clemens with an 18-win season.
3. New statistics came along that were better indicators of a pitcher’s skill than his won-loss record. And certain annoying people like yours truly and Brian Kenny began ranting against the pitcher win.
Over the last 10 or so years, the voters have often rebelled hard against the won-loss record. When Felix Hernandez won the Cy Young in 2010 with a 13-12 record (over, among others, C.C. Sabathia, who went 21-7), the wall came a tumblin’ down.
This year’s Cy Young duel between Verlander and Porcello seemed just the latest battle between advanced statistics and pitcher won-loss record. That’s certainly how many people told the story … and I probably fell for that a little bit too.
There was just one problem with that story: Most of the advanced stats you looked at did not favor Verlander. It was tempting to make the 2016 Cy battle a lot like the 1999 battle when Mike Hampton went 22-4 and Randy Johnson went 17-9. But that one was very, very different. Unit dominated EVERY STATISTIC except won-loss record. His ERA was about a half run better. He had 200 more strikeouts while walking fewer batters. He pitched 32 more innings with a much lower WHIP. By Baseball Reference WAR he was two and a half wins better.
Verlander’s advantages are, well, considerably more subtle if they even exist. His ERA advantage (3.04 to 3.15) is negligible at best. Truth is, in context, when you consider ballpark, Porcello has the clear edge. Porcello’s ERA+ of 145 is better than Verlander’s 136.
Verlander did have 64 more strikeouts (and he led the league in Ks) but Porcello countered by having 25 fewer walks. It was Porcello who had the better strikeout-to-walk ratio. Verlander pitched just four more innings than Porcello, and Porcello completed one more game. Verlander had the slightest of edges in WHIP (.008 is hardly an edge) but Porcello gave up fewer home runs.
Verlander certainly played in front of a worse defense, but Porcello actually had the lower FIP which only considers strikeouts, walks and home runs allowed.
Fangraphs WAR, which builds around FIP, had them exactly even at 5.2 wins above replacement.
So, you will ask, why was there a perception that Verlander had the better advanced metrics season?
Answer: Baseball Reference.
Baseball Reference WAR
Now, let me pause here to say: Baseball Reference is a miracle. It is the joy of my life and the joy of most baseball writers’ lives. If forced to give up Baseball Reference or a family member, well, it would depend on which family member. But I am convinced that the main reason Justin Verlander got 14 first-place Cy Young votes to Porcello’s 8 is that fairly sizable gap in Baseball Reference WAR. There might be other factors, but I would wager that this is by far the biggest one.
I say that because Baseball Reference WAR is absolutely the biggest reason I thought that Verlander had the better statistical season.
Hey, I check Baseball Reference WAR every single day of the season. Well, I’m on the site every single day (I imagine many baseball writers are, too) and WAR is in a front-page box, updated constantly. That Verlander lead in Baseball Reference WAR absolutely played in my mind all season long. Everything else about the two pitchers was so close that, for me, it came down to Porcello’s won-loss record or Verlander’s 1.6-win edge on Baseball Reference.
Of course I chose Baseball Reference. I don’t judge pitchers by wins and losses.
But here’s the thing: I had NO IDEA WHY Verlander had such an edge in Baseball Reference WAR. And at some point it occurred to me: I should know why. So I stretched my mathematical understanding to its breaking point and looked more closely at it. And, um, I say this with love: I think the Baseball Reference WAR formula got it very wrong.
* * *
Let me give this final caveat out of respect to Sean and all the good folks at Baseball Reference: I might have messed up on my math here. It’s no secret that I am mathematically challenged. I did run my numbers by a couple of much smarter people, and they seemed to agree with what I’m saying. But if the basic takeaway from the numbers is wrong, I will certainly correct the error.
OK, let’s break down Baseball Reference WAR for Porcello and Verlander.
First: Baseball Reference calculates its WAR based on runs allowed and innings pitched. This is in contrast with Fangraphs which, as mentioned, builds its formula around strikeouts, walks and home runs allowed. Baseball Reference takes how many runs a pitcher has allowed (unearned AND earned runs) and then, after making a few adjustments, compares those runs to league average. The adjustments can be a bit complicated but the idea of comparing runs allowed to league average is simple.
OK, we start with runs allowed — and this includes unearned runs.
Porcello gave up 85 total runs in 223 innings. That’s 3.43 runs per nine innings.
Verlander gave up 81 runs in 227 2/3 innings. That’s 3.20 runs per nine innings.
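If you want to check those two figures yourself, here is a minimal sketch in Python. The function name `runs_per_nine` is just something I made up for illustration; the run and inning totals are the ones quoted above.

```python
def runs_per_nine(runs, innings):
    """Runs allowed per nine innings, counting earned AND unearned runs."""
    return runs * 9 / innings

# Totals quoted in the text; 227 2/3 innings is written as 227 + 2/3.
porcello_ra9 = runs_per_nine(85, 223)
verlander_ra9 = runs_per_nine(81, 227 + 2 / 3)

print(f"Porcello:  {porcello_ra9:.2f} runs per nine")
print(f"Verlander: {verlander_ra9:.2f} runs per nine")
```

This is only the raw starting point, before any comparison to league average or any adjustment.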
Porcello gave up three more unearned runs than Verlander, which is why there’s a bigger gap here than between their ERAs. Next, we compare those runs allowed to the league average and here is what we get.
Porcello is 26 runs better than average.
Verlander is 33 runs better than average.
OK, perfect. Verlander is a little bit better. Next, there’s a small adjustment made based on whether the pitcher is a starter or reliever. You can read all the reasoning for this adjustment and all the others over at Baseball Reference. Porcello and Verlander are obviously both starters, so you add 4.5 runs to their total.
Porcello is 30.5 runs better than average.
Verlander is 37.5 runs better than average.
Easy enough. Next comes ballpark adjustment. Fenway Park was tough on pitchers, so Porcello gains 5.7 runs. Comerica Park, meanwhile, leaned slightly toward the pitcher and so Verlander has 1.2 runs knocked off his total. You can agree or disagree with these adjustments; Bill James, for one, believes the adjustments are too small.
Porcello is 36.2 runs better than average.
Verlander is 36.3 runs better than average.
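The bookkeeping so far is simple enough to sketch in a few lines of Python. This is only my running tally using the numbers quoted above, not Baseball Reference's actual code, and the variable names are hypothetical.

```python
# Runs better than league average, before adjustments (from the text).
raw_runs_above_avg = {"Porcello": 26.0, "Verlander": 33.0}

# Both pitchers are starters, so each gets the same 4.5-run bump.
STARTER_ADJUSTMENT = 4.5

# Ballpark adjustments from the text: Fenway helps Porcello's total,
# Comerica trims Verlander's.
park_adjustment = {"Porcello": 5.7, "Verlander": -1.2}

after_park = {
    name: round(raw_runs_above_avg[name] + STARTER_ADJUSTMENT + park_adjustment[name], 1)
    for name in raw_runs_above_avg
}
gap_after_park = round(after_park["Verlander"] - after_park["Porcello"], 1)

for name, runs in after_park.items():
    print(f"{name}: {runs} runs better than average")
print(f"Verlander's lead at this point: {gap_after_park} runs")
```

The seven-run raw gap is almost entirely eaten up by the ballpark adjustment, which is how the two pitchers end up a tenth of a run apart.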
Now let’s stop right here and marvel at how close the two pitchers are. This FEELS right to me. They had almost identical seasons when you consider all factors, and here you have the two pitchers within a tenth of a run of each other. If the formula stopped here, they would basically have the exact same Baseball Reference WAR. And if that was the case, I think Porcello would have won the Cy Young Award more convincingly.
But it doesn’t stop here. You are probably wondering what adjustment could come along that would separate the two pitchers by almost two full wins.
Yep. Defense. Even though Porcello gave up more unearned runs than Verlander, and even though Porcello’s batting average on balls in play was considerably higher (.269 to .256) and even though the Red Sox committed five more errors behind Porcello and threw out significantly fewer base stealers, the Baseball Info Solutions stats say that Boston was a much, much, much better defensive team than Detroit.
I should say here that overall I do believe wholeheartedly that Boston WAS a much, much, much better defensive team than Detroit. I just don’t know how that specifically affected these two pitchers.
The Baseball Reference WAR formula concludes it affected them a lot. I mean, seriously, A LOT. By my admittedly shaky calculations, Baseball Reference takes away NINE RUNS ABOVE AVERAGE from Porcello’s total and ADDS FOUR RUNS ABOVE AVERAGE to Verlander’s total.
And so, in the end, this is the final scoreboard:
Porcello is 27 runs above average
Verlander is 40 runs above average
Wow: That’s some gap now. And it is that 13-run difference that gives Verlander his 6.6 to 5.0 WAR edge. All 13 runs come from defensive adjustment.
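Here is the same last step as a rough Python sketch, so you can see how much of the final gap the defensive adjustment accounts for. The minus-nine and plus-four figures are my own admittedly shaky estimates from above, not official Baseball Reference values, and the variable names are hypothetical.

```python
# Runs better than average after the starter and ballpark adjustments.
before_defense = {"Porcello": 36.2, "Verlander": 36.3}

# My estimated defensive adjustments (assumptions, not official figures):
# Porcello is docked for Boston's good defense, Verlander is credited
# for Detroit's poor one.
defense_adjustment = {"Porcello": -9.0, "Verlander": 4.0}

after_defense = {
    name: round(before_defense[name] + defense_adjustment[name], 1)
    for name in before_defense
}

gap_before = round(before_defense["Verlander"] - before_defense["Porcello"], 1)
gap_after = round(after_defense["Verlander"] - after_defense["Porcello"], 1)
defense_swing = defense_adjustment["Verlander"] - defense_adjustment["Porcello"]

print(f"Final totals: {after_defense}")
print(f"Gap before defense: {gap_before} runs")
print(f"Gap after defense:  {gap_after} runs")
print(f"Swing from defense alone: {defense_swing} runs")
```

Rounded to whole runs, those totals are the 27 and 40 on the scoreboard, and the swing from the defensive adjustment alone is the 13 runs.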
Now, like I say, maybe I’m doing the math all wrong. Maybe defense doesn’t represent all 13 runs. But it unquestionably is the bulk of that 13-run difference. And, well, I’m just not buying it at all. Yes, I’m all for trying to isolate a pitcher’s contribution away from the defense’s. And I’m a big fan of Baseball Info Solutions. But this sort of massive defensive adjustment makes no sense to me.
For one thing, I think it’s quite likely that Detroit played EXCELLENT defense behind Verlander, even if they were shaky behind everyone else. I’m not sure how you can expect a defense to allow lower than a .256 batting average on balls in play (the second-lowest of Verlander’s career and second-lowest in the American League in 2016) or allow just three runners to reach on error all year (the lowest total of Verlander’s career).
For another, the biggest difference in the two defenses was in right and centerfield. The Red Sox centerfielder and rightfielder saved 44 runs, because Jackie Bradley and Mookie Betts are awesome. The Tigers centerfielder and rightfielder cost 49 runs because Cameron Maybin, J.D. Martinez and a cast of thousands are not awesome.
But the Tigers outfield certainly didn’t cost Verlander. He allowed 216 fly balls in play, and only 16 were hits. Heck, the .568 average he allowed on line drives was the lowest in the American League. I find it almost impossible to believe that the Boston outfield would have done better than that.
What I’m saying here is that while the defensive adjustments seem shaky and unpersuasive, the stark final WAR number, 6.6 to 5.0, is there in your face. I don’t know how many people voted for Verlander because of Baseball Reference WAR numbers, but I suspect at least a handful did.
And I wonder how many of them realized they were voting for a defensive adjustment. I love the concept of WAR, and I appreciate the efforts to make it better all the time. And I know the Baseball Reference people do not claim that it is the perfect statistic or that anyone should base their entire award ballot on it. But WAR does have real sway in the baseball community. And in this case, I think it was pretty misleading.