
SB Nation Neyer's Wire

The Limits Of War or Ben Zobrist Isn't Really A Superstar?

Sep 6, 2011 - Over at It's About the Money Stupid, someone named Hippeaux leads this piece with the story of Tony Batista, who collected 110 RBI in 2004, while also posting a .272 on-base percentage. Batista is portrayed as the poster child for the meaninglessness of RBI; apparently we all learned a lesson from this ... but maybe not the right lesson?

Give me OBP, give me OPS, give me IPO, give me WPA, give me K/BB; just don’t give me RBI!  If you’re going to give me RBI, Mr. McCarver, I’d rather you gave me nothing.

And then came WAR.

The concept was ratified by the sabremetric Godfather, Bill James, who’d created Win Shares according to a similar ideology in 2002.  It was a neoclassical economist’s wet dream, like baseball GDP: an elegant equation which accounted for all the sport’s diverse variables and yielded a single number roughly reducible to the oldest and most hallowed statistic of them all, the win.  Hallelujah.

Wins Above Replacement is a beautiful idea.  Euclidean grace in a quantum world.  A simple answer, not only for age-old baseball conundrums like "Mantle or DiMaggio?", but also a formula for unprecedented comparisons like "Rickey Henderson v. Johnny Bench" and "Roy Halladay v. Alex Rodriguez".

There’s only one problem.  It doesn’t work.

It doesn't work? It doesn't work at all?

Oh, but that's not what Hippeaux is arguing. Because of course you can't argue that. See, he begins the piece with an indictment of RBI. Then launches into an indictment of Wins Above Replacement. But which works better? Who's going to win more baseball games? A lineup composed of the top nine RBI guys in the majors, or the top nine WAR guys? I'll take the WAR guys, and I think Hippeaux would, too.

No, what he means is that it doesn't work as well as we think it does.

At least, not yet.  Not in the fantastically straight-forward way we try to use it.  The idea is so good, so clarifying – like democracy or the rational market – that we really, really want it to work, we’re willing to suspend our disbelief just a little while longer in the hope that it might. Because it’d be so great to know with statistical certainty that Albert Pujols was worth $200 Million, that we really couldn’t win that pennant without Andy Pettitte, that Jacoby Ellsbury is definitely the AL MVP, and that Ben Zobrist is exactly 9.3% better than Adrian Gonzalez. Darn that dream.

The cruel irony, the I-could’ve-had-Sean-Doolittle-and-all-I-got-was-stupid-Barry-Zito irony, is that the problem with WAR is the same as the problem with RBI.  It frequently measures context as much as performance. Especially when used to evaluate single seasons, it doesn’t sufficiently account for the inevitable variations in opportunity and environment.

Of course WAR is sometimes (or frequently) context-driven. Of course it doesn't always "sufficiently account for the inevitable variations in opportunity and environment."

When Hippeaux says WAR doesn't work "in the fantastically straight-forward way we try to use it," what he really means is that it doesn't work in the way you -- that is, you dunder-headed fools who aren't as smart as Hippeaux -- try to use it.

Can you say "straw man"?

Who's trying to use WAR that way, exactly? Who is arguing that Ben Zobrist is EXACTLY 9.3 percent better than Adrian Gonzalez? None of my friends are doing that. Are yours?

As Hippeaux correctly notes, "WAR’s move to the mainstream is deeply tied to the rising popularity of FanGraphs."

When you go to the individual "leaders" pages on FanGraphs, the default ranking is according to Wins Above Replacement. A lot of people -- including a lot of baseball writers -- use those pages, and are informed by them. But very few baseball writers are going to just copy those rankings onto their MVP ballots. Dave Cameron is as responsible as anyone for what happens on FanGraphs ... But is Dave Cameron going around arguing that Ben Zobrist is EXACTLY 9.3 percent better than Adrian Gonzalez? I really really really doubt it.

In fact, Hippeaux does acknowledge Cameron's perspective:

Even WAR’s adherents, like Dave Cameron, generally admit the margin of error is at least 15%.  When we stubbornly suggest that 0.5 WAR means anything, we are grossly exaggerating the statistic’s accuracy, even according to its creators.  It remains true that any reasoned discussion of an individual’s contributions still requires analysis of the various components that go into WAR, as well as several that don’t, and, as such, subjectivity reigns.

There he goes again with the we.

You know what makes me want to poke someone in the eye with a sharp stick? When someone writes we when he means you ... and isn't honest enough or brave enough to identify you. Are there particular writers or broadcasters who have been abusing WAR this season? Is someone out there actually suggesting that Carlos Lee is just as valuable on defense as Troy Tulowitzki?

Because if nobody is doing those things, then Hippeaux is wasting his considerable talents against a straw man. If somebody is doing those things, then Hippeaux should tell us who. Otherwise we can only assume that he's engaging in mere sophistry.

Hippeaux does make some good points, which I have not excerpted because I can't excerpt everything. Single-season UZR's can be terribly misleading, which everyone's known for a long time. FanGraphs does a poor job with catchers' defense, which was excusable for a long time, but maybe isn't anymore, considering the PITCHf/x work that's been done recently. But considering only hitting statistics and positional scarcity, WAR is a great starting point for any discussion of a player's value. And the defensive stuff is generally reliable, too. As long as you don't go nuts and assume that one season of data tells you everything.

My advice to would-be iconoclasts like Hippeaux? Be specific in your criticisms, with examples of actual people who are making the actual mistakes you say we are making.

This might have been a compelling article if Hippeaux had restricted his critique to Ultimate Zone Ratings and the analysis of catchers' defensive value. But with the unsupported critique of what we are supposedly doing, what might have been compelling becomes merely useful. Assuming you can see past the intellectual bankruptcy.

See? THAT is being specific.



Rob Neyer

National Baseball Editor

Rob Neyer began his career with legendary baseball author Bill James, and later worked for STATS, Inc. and ESPN.com, writing more words for that website than anyone else. Rob has written or...


Comments


An excellent rebuttal.

Thanks.

Billy Hayes: His job is better than yours.
@productiveouts | Productive Outs

by delorean on Sep 6, 2011 2:18 PM EDT reply actions  

Only quibble

is with the post’s title. “The Fog of WAR” would have been so much more epic, yes?

by CSFreeman on Sep 6, 2011 2:18 PM EDT reply actions  

In my humble opinion, WAR is definitely a work in progress, and articles like Hippeaux’s that look at it intelligently, and critique specific aspects are exactly what the WAR concept needs to get better. Rather than the Joe Morgan approach of just dismissing the whole thing, Hippeaux is looking at what parts need to be improved.

Also, I think it’s entirely possible that when Hippeaux says “we” he means “we.” As in, he had up until very recently been making the mistakes he’s discussing, and trying to explain WHY he was making those mistakes.

I think Rob is being a little overly sensitive here.

by Dave Pomerantz on Sep 6, 2011 2:19 PM EDT reply actions  

Hippeaux really means WE?

Then (again) he should have been specific. Should have cited a specific example of Hippeaux using WAR incorrectly.

You can read his mind if you like. I’m not smart enough to do that.

by Rob Neyer on Sep 6, 2011 2:23 PM EDT up reply actions  

Seems to me...

You’re reading his mind more than I am. I’m taking what he says at face value, though I agree he could’ve been more specific. You’re the one replacing words :)

The main thrust of Hippeaux’s article is the flaws in UZR to measure the defensive component of WAR. Most of the stuff using the straw man you’re discussing is exposition.

by Dave Pomerantz on Sep 6, 2011 2:29 PM EDT up reply actions  

I agree

I generally like WAR and cite it a lot in discussions, but I think it has its limits. Hippeaux’s article does a great job of pointing them out in my opinion. And yes, he is correct in saying “we” because WAR is used by just about everyone, sometimes in situations where it’s really not the final answer.

Are Pedroia and Ellsbury really almost as valuable as Bautista and Verlander? Are they slightly more valuable than Adrian Gonzalez? Most people would probably say no, but right now that’s what BB-Ref’s AL WAR rankings say. How much does the presence of A-Gone (plus Ortiz and Youkilis) have to do with Pedroia and Ellsbury’s performance this season? WAR doesn’t tell you this.

Over a longer time frame, I think WAR gets more accurate (for example: the clear declaration that Jeff Mathis should not be on a MLB team); but for single-season and cross-position comparisons (SS-1B for example), I don’t think it’s quite as accurate as it appears.

Scioscialist Party of America - Redistributing your defense since 2000.

by Commander_Nate on Sep 6, 2011 2:46 PM EDT up reply actions  

I thought WAR didn't have the teammate problems of RBI.

How does Ortiz and Gonzalez’s performance increase Pedroia’s WAR? Unless you’re talking about lineup protection, which has been debunked.

by TMS71 on Sep 6, 2011 3:05 PM EDT up reply actions   3 recs

And I was under the impression that the whole point of WAR

was making cross-positional comparisons. I’m New SABR, not True SABR, so I may be wrong – but I was pretty sure that was the case.

In a past life, I was called fightoffyourdemons.

I write a bit for The Short Fuse.

Twitter: twach1441

by Thomas Wachtel on Sep 6, 2011 7:33 PM EDT up reply actions  

I disagree.

Quoting:

However, for some reason, Tony Batista became a sabremetric icon, our favorite cause celebre when we rage, rage against the RBI.

Rob’s assertion is fine. If Hippeaux wants to point out specific problems with specific sabermetricians or specific sabermetric articles, he/she is perfectly capable of doing so. The vagueness of using “we” is a straw man.

Between you and Rob, you are doing more word replacing.

by jwiscarson on Sep 6, 2011 3:10 PM EDT up reply actions  

This’ll do.

Game's the same, just got more fierce.

by Sam Miller OCR on Sep 6, 2011 2:20 PM EDT reply actions  

OT

nominating you for the “best sig on SBN” award

Have you seen my son's velocity?

Pithy.

by Lies and Perfidy on Sep 6, 2011 2:25 PM EDT up reply actions  

Single-season UZR's can be terribly misleading, which everyone's known for a long time.

How does that square with your statement at the end of the paragraph:

And the defensive stuff is generally reliable, too. As long as you don’t go nuts and assume that one season of data tells you everything.

Isn’t that what the FanGraphs WAR implementation does, though? Uses single-season UZR as its quantification of fielding performance? At least according to the FanGraphs glossary, it does:

WAR Part Two – Fielding

In the Win Values calculations here, Fielding is fairly straight forward – it’s simply a player’s total UZR at all positions for the given year.

by Mike Fast on Sep 6, 2011 2:25 PM EDT reply actions  

I was unclear about this too, Rob

“As long as you don’t go nuts [about UZR] and assume that one season of data tells you everything.”

What do you mean by “everything”? If you mean one year of UZR isn’t PREDICTIVE of a player’s true talent level, I’d agree with you. But then again, neither is a .330 batting average or 1.40 WHIP predictive of a player’s true hitting or pitching level.

But are you saying that UZR is DESCRIPTIVELY inaccurate, that it fails to capture even a non-catcher’s defensive contribution? If so, what is your reasoning for doing so?

Not saying I disagree with you, I’d just like to know where we are on critiquing and improving the validity and accuracy of the metric.

by PortlandYankee on Sep 6, 2011 2:31 PM EDT up reply actions  

UZR

takes the number of balls that a fielder got to and what he did with them, which are objective and verifiable facts, and combines that with an estimate of the locations/types of all batted balls and estimate of the expected number of outs for that fielder on similar types/locations of all batted balls.

That estimate of expected outs, which is what UZR purports to bring to the table over and above a metric like adjusted range factor, has some uncertainty (and some people, me included, would say it has a lot of uncertainty) in a descriptive function.

In fact, if those estimates of expected outs are biased in a systematic and repeatable way, which there is good evidence that they are, then UZR may actually function better in predicting future UZR than in accurately describing past fielding performance.
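To make the role of the expected-outs estimate concrete, here is a minimal sketch. It is not the actual UZR calculation (which works in runs and applies further adjustments); the bucket counts, the out rates, and the function names are all hypothetical, chosen only to show how a small systematic shift in the estimate moves the final rating.

```python
# Hypothetical batted-ball buckets for one fielder:
# (balls hit into the bucket, estimated league out rate, outs the fielder made).
# None of these numbers come from real data.
BUCKETS = [
    (40, 0.90, 37),  # routine plays
    (30, 0.50, 18),  # coin-flip plays
    (20, 0.10, 3),   # difficult plays
]


def plays_above_expected(buckets):
    """Outs actually made minus the outs an average fielder was expected to make."""
    expected = sum(balls * out_rate for balls, out_rate, _ in buckets)
    actual = sum(outs_made for _, _, outs_made in buckets)
    return actual - expected


def with_shifted_estimates(buckets, delta):
    """Same observed plays, but every estimated out rate is off by `delta`."""
    return [(balls, out_rate + delta, outs_made) for balls, out_rate, outs_made in buckets]


if __name__ == "__main__":
    print(round(plays_above_expected(BUCKETS), 2))
    print(round(plays_above_expected(with_shifted_estimates(BUCKETS, 0.05)), 2))
```

The observed plays are identical in both calls; only the estimate changes, and in this toy case a five-point shift in the assumed out rates erases most of the fielder's apparent advantage.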

by Mike Fast on Sep 6, 2011 2:38 PM EDT up reply actions   1 recs

And to chime in...

…the author of the article actually points out what is potentially interesting evidence of such a bias, when he looked at the correlation between team OF UZR and team FB%.

by cwyers on Sep 6, 2011 2:39 PM EDT up reply actions  

When I read the "we" ...

I didn’t need a specific. I’ve seen plenty of comments at McCovey Chronicles or elsewhere with WAR being used as a blunt object. Player X has 3.1 WAR, and Player Y has 2.5, so GM Z really screwed up. I don’t think it would add to the discussion to call out LouBrockLesnar69 for his silly usage of WAR.

The use of UZR makes WAR pretty dangerous as a predictive stat, which isn’t what it’s meant to be, but it’s how some people use it. Maybe the article wouldn’t have pushed that button if it used “some people” instead of “we.” I don’t think the overall point would have changed much, though.

by Grant Brisbee on Sep 6, 2011 3:05 PM EDT reply actions   1 recs

"Some people" is just as bad.

If Hippeaux meant commenters on blogs, he should have said so. But if that’s our standard, we can criticize everything, right? Because everything that’s ever been invented is abused every day in a comment, somewhere.

Is WAR not predictive? Don’t most of the players with big WAR last season also have big WAR this season?

I’m really asking. I’ll bet WAR correlates pretty well, season to season.

by Rob Neyer on Sep 6, 2011 3:09 PM EDT up reply actions  

Can I phrase the question a bit differently?

You ask, “Is WAR not predictive?” I ask, “what is WAR predictive of?” Looking at the correlation between WAR in year one and WAR in year two only expresses the validity of WAR in predicting WAR – there is still the question as to whether or not WAR is measuring player value. If WAR is wrong in a consistent fashion, then it will increase the correlation while actually detracting from our understanding of value.

As a for instance, let’s look at the home-road splits of UZR from 2006 through 2009, as published on Fangraphs. As a group, Red Sox center fielders over that time period put up 36.5 fewer runs saved (as measured by UZR) at home than on the road. This is after adjusting for home field advantage in fielding, per MGL (the man responsible for UZR). Is it reasonable to suspect that UZR is having a problem measuring defense in center field at Fenway? Theo Epstein certainly thought so:

I know there is a certain number we don’t use that is accessible to people online that had him as one of the worst defensive center fielders in baseball last year. I don’t think it’s worth anything. I don’t think that number is legitimate. We do our own stuff and it showed that he is above average.

Is it possible that this is due to random chance? Potentially. I should say that I didn’t pick the worst possible example; Cubs left fielders were 62.4 runs better at home than on the road, for instance. Sox CFers are only the ninth-most extreme home/road split in UZR of the time period being measured. But it’s also possible that these sorts of issues are caused by park-based biases, is it not? I’ve done extensive research into bias in batted ball scoring and other stringer-collected data, and it seems that there are park biases caused by things such as the position of the press box.

And these will be persistent errors – if there is something that is causing UZR to systemically overrate Cubs LFers, it won’t make Soriano’s WAR any less persistent, it will just make it less accurate.
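A toy simulation can illustrate the point about persistent errors. This is only a sketch under invented assumptions: every player has a "true value", each season adds independent noise, and a per-player bias (standing in for something like a park effect) repeats every year. The function names and all parameters are hypothetical, and nothing here uses real WAR or UZR data.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(bias_sd, n_players=2000, seed=0):
    """Return (year-to-year correlation, correlation with true value)."""
    rng = random.Random(seed)
    true_value = [rng.gauss(0, 1) for _ in range(n_players)]
    # Persistent measurement error: the same for a given player every season.
    bias = [rng.gauss(0, bias_sd) for _ in range(n_players)]
    # Each season's measured value = truth + persistent bias + fresh noise.
    year1 = [t + b + rng.gauss(0, 1) for t, b in zip(true_value, bias)]
    year2 = [t + b + rng.gauss(0, 1) for t, b in zip(true_value, bias)]
    return pearson(year1, year2), pearson(year1, true_value)


if __name__ == "__main__":
    for sd in (0.0, 1.0):
        repeat, accuracy = simulate(bias_sd=sd)
        print(f"bias_sd={sd}: year-to-year r={repeat:.2f}, r with true value={accuracy:.2f}")
```

In this setup, adding the persistent bias raises the year-to-year correlation of the measured stat while lowering its correlation with the true value, which is the distinction between persistence and accuracy drawn above.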

by cwyers on Sep 6, 2011 3:31 PM EDT up reply actions   1 recs

Didn't Theo and the Red Sox

move Ellsbury to LF after that season? Sort of conveniently?

by UZR Illusion on Sep 7, 2011 8:15 AM EDT up reply actions  

If Hippeaux meant commenters on blogs, he should have said so. But if that’s our standard, we can criticize everything, right? Because everything that’s ever been invented is abused every day in a comment, somewhere.

“There are those who seem like intelligent people, who have at least a rudimentary understanding of sabermetrics and the empirical side of baseball, but who get bogged down in treating the computation of WAR as an exact science. Here are three examples I’ve found (here, here, and here), and though three examples aren’t enough to make this anything but anecdotal, trust me, I’ve read it several times in several places.”

I don’t think that would have made for a better article.

by Grant Brisbee on Sep 6, 2011 4:26 PM EDT up reply actions  

I said this below, as well.

Who’s this article targeting? Is Hippeaux complaining about sabermetricians, journalists, bloggers, commenters, or the average fan? Which of these (or some other category) is the target of this article?

I think you can level specific criticisms about the way that sabermetricians use WAR, and about how good of a job various sites do in explaining what these statistics mean, without pointing the finger at a specific fan or commenter on the use of WAR.

by jwiscarson on Sep 6, 2011 3:28 PM EDT up reply actions  

Who is advocating calling out random commenters on blogs? I, for one, will always prefer specificity and detail in articles I read over vagueness and generalities. It’s not a bad thing to hold our writers to high standards.

by ldd233 on Sep 7, 2011 11:47 AM EDT up reply actions  

Straw Man
Who is arguing that Ben Zobrist is EXACTLY 9.3 percent better than Adrian Gonzalez? None of my friends are doing that. Are yours?


Not specifically, but my friends have conversations that are similar to the Ben Zobrist/Adrian Gonzalez comparison. Hippeaux doesn’t know me or know that my friends engage in these types of discussions. However, they do happen. Does that turn his straw man into a real man? Does it matter?

by Callum Hughson on Sep 6, 2011 3:13 PM EDT reply actions  

Yeah, those are two separate issues.

If you say, “sabermetricians need to do a better job explaining how WAR should be used, because I don’t think the average person understands this.”, that is totally different from saying, “sabermetricians do not understand how WAR should be used.”

Given that he lumps himself in with sabermetricians in the opening few paragraphs, and then continually uses “we” throughout the article, it presents a problem.

by jwiscarson on Sep 6, 2011 3:23 PM EDT up reply actions   1 recs

I don’t think it’s “sabermetricians do not understand how WAR should be used” so much as “sabermetricians sometimes overstate both the accuracy and precision of WAR”

Sharlon Schoop - honkbalspeler extraordinaire.
Trolls are like cockroach Nazis. Sure, you CAN try to reason with them, but they won't listen, and if you respond to them, they invade your Sudetenland.
Or something.
That metaphor got away from me.

by Viliphied on Sep 8, 2011 3:21 PM EDT up reply actions  

WAR reverse discriminates against slow, fat, sluggers

Also, it overrates Carlos Lee and underrates Troy Tulowitzki.

That is all.

by dnc on Sep 6, 2011 3:25 PM EDT reply actions  

even more so

WAR adores Dustin Pedroia’s and Troy Tulowitzki’s defense while abhorring Michael Young’s.

Matt Kemp shows up below 0 and Jhonny Peralta as above.

Starlin Castro below 0 and Adrian Gonzalez above.

Why don't you have a nice big cup of shut the fuck up? - Lisa W 3/4/2011

by iblum on Sep 6, 2011 4:11 PM EDT up reply actions  

I think both of you guys missed my point

Hippeaux’s piece made both of the contradictory claims I made above.

by dnc on Sep 6, 2011 7:37 PM EDT up reply actions  

those are not contradictory

think of it this way “WAR underrates slow fat sluggers” and “WAR weighs UZR too highly when it’s largely unknown how accurate UZR really is” are not contradictory statements.

Sharlon Schoop - honkbalspeler extraordinaire.
Trolls are like cockroach Nazis. Sure, you CAN try to reason with them, but they won't listen, and if you respond to them, they invade your Sudetenland.
Or something.
That metaphor got away from me.

by Viliphied on Sep 8, 2011 3:25 PM EDT up reply actions  

Hippeaux's implication is wrong.

If he wanted to write a post about the shortcomings of WAR he should have just done so. If he wanted to write about the misuse of WAR he could have done that. Instead he framed it as a takedown of WAR. Really misplaced emphasis I think.

by TMS71 on Sep 6, 2011 3:30 PM EDT reply actions   2 recs

Problems with UZR

Yes, they absolutely exist. Just as they exist with every other metric out there. But UZR not working for Red Sox centerfielders is a reason to disregard WAR?

I don’t think so. Especially because — as Tango points out over on his site — WAR is merely a framework. It’s okay to criticize the framework and it’s okay to criticize the actual inputs. But those are separate arguments, and Hippeaux has conflated them in a way that makes his piece less than compelling.

In my so-humble opinion, of course.

by Rob Neyer on Sep 6, 2011 4:03 PM EDT reply actions  

Is the point that sabermetricians should be drawing from his article how well-written, well-argued, or compelling it is?

Or should they be examining some of the claims he made on the evidence, particularly the one about OF UZR and team FB%?

It’s easy to bash his argument and identify and pick apart flaws. Great. But if we want people to do some research into fielding metrics, and I would argue that we need far more work like Hippeaux’s in that field than we’ve had to date, we ought to treat his work in that field seriously, even if we don’t like the package it came in.

by Mike Fast on Sep 6, 2011 4:10 PM EDT up reply actions  

I'm a bit baffled by this line of (counter-)argument.

You’re right to point out the merits of Hippeaux’s approach to UZR (as is Colin below), but I’m not sure how or why that justifies the WAR "wrapper" of the analysis. That’s especially true since, as Tango and others have noted, there are multiple implementations of the WAR framework, and Hippeaux is focusing almost exclusively on fWAR largely because of UZR.

I get that you (and Colin) are encouraging people to take a critical stance on UZR (and WAR), rather than just accepting it as gospel, straight from The (good) Book. I think you’re more than justified in wanting that. From where I sit, there are very real concerns about fWAR’s components and their calculations. You and Colin, among others, have done a pretty damn good job drawing attention to those concerns, and suggesting ways we (pace Rob) might need to rethink our (sorry again, Rob) approach to implementing the WAR framework.

But that’s not really what Hippeaux’s piece emphasized, is it? Wouldn’t it have been much more effective by focusing on the original findings about UZR, batted balls, and outfield defense, rather than trying to prosecute a broader war on WAR (hat tip, Wilco) and producing more heat than light as a result?

Dark Luck Dragon of the Sith

by Darth Snark on Sep 6, 2011 5:00 PM EDT up reply actions  

To quote from what Colin said on Twitter

Vast swaths of Hippeaux’s article were dead damn wrong. But I guess that doesn’t bug me very much. People are wrong on the Internet all the time. I don’t have to argue with all of them.

Is the primary need for truth in baseball analysis that WAR as it exists must be defended? From the reaction around the sabersphere, it sure would seem so. But I don’t think so. If WAR is an accurate way to look at things, eventually it will defend itself.

I see the primary need for truth in baseball as improving WAR. (Well, not the primary need, but the primary need as relates to the topic of Hippeaux’s article and the subsequent discussion.) And Hippeaux had at least one good specific idea about that, if not more.

I do think it’s worth noting to him that he misunderstood how the positional adjustment figures into defensive evaluation. But that’s a, “Hey, you misunderstood this” aside, not all the screeds that have popped up on the Book blog and elsewhere attacking him.

(Another parenthetical—I have problems with BP and B-Ref’s implementations of WAR, too.)

The best way to deflect criticism of metrics is to fix their glaring flaws. That won’t eliminate criticism, and that won’t ever be a finished task, but it will sure work a lot better than jumping all over critics for not having perfect saber understanding when they attack flawed but sacred saber cows.

by Mike Fast on Sep 6, 2011 5:25 PM EDT up reply actions  

I agree with you on some of this.

Hippeaux’s good, specific idea, for example, as I mentioned above.

It seems to me, though, there’s a herd of cows, and Hippeaux’s article is aiming to tip one really big one when it really seems better-suited to tipping another only-slightly smaller one. That shouldn’t mean he gets drawn-and-quartered by the saber-shepherds*, though given the tone of the original piece, I think in some ways they could be seen as replying in kind.

I’m all for critiques of WAR of whatever initial lower-case letter. Misunderstandings will happen, though I’m no longer so sure there was a misunderstanding here, as opposed to intentional exaggeration with the intention of drawing eyeballs. That doesn’t seem like the sort of analytical approach you’d back, which is part of why I was mildly surprised by your comments.

I’m also not sure what you mean about WAR defending itself — wouldn’t it also attack itself, then? Haven’t many stats lasted a very long time despite not being accurate ways to look at things, and having come under repeated attack from intelligent, eloquent commenters?

*I know, that should be “cowherds,” but then that risks alluding to Colin (bad) and I lose the alliteration (worse). Who’s scruffy-looking?

Dark Luck Dragon of the Sith

by Darth Snark on Sep 6, 2011 5:37 PM EDT up reply actions  

Re WAR defending itself

I’m not asserting that all bad stats will eventually be completely eradicated. (But the best ideas do gain headway over time.)

I’m saying that implementations of WAR might do well to take a lesson from the demise of Win Shares, which got a lot of things right, but ultimately had flaws that took it down and prevented it from being more widely accepted.

If we went back in time 7-8 years and Hippeaux had written a piece of similar quality about Win Shares, with a few nuggets accurately pointing out problems, would the most productive response for Bill James have been to point out all the flaws in his argument and his lack of persuasive writing skills?

by Mike Fast on Sep 6, 2011 5:53 PM EDT up reply actions  

I completely agree with your second paragraph.

In answer to your Bill James hypothetical, I’d answer “yes” on the flaws in his argument, and “no” on his lack of persuasive writing skills.

I do think that Hippeaux’s mistakes/misstatements about fWAR undercut his overall argument, and I don’t see any reason for people to ignore them because his point about UZR is worth examination. I think it’s a shame that the UZR argument is getting buried under the back-and-forth over WAR, but if he’d started with the former and used it to build to the latter, that seems like it would’ve been most productive. As it stands — or at least, as I read it — the way in which the overall argument was advanced opened itself up to the sort of critique it’s receiving. I think there’s a larger issue in saberism at play here…

Dark Luck Dragon of the Sith

by Darth Snark on Sep 6, 2011 6:12 PM EDT up reply actions  

Yes

I think I agree with everything you wrote there, except that I would say the reason for people to ignore his misstatements is for their own benefit. But of course there is no requirement for people to do that.

by Mike Fast on Sep 6, 2011 6:36 PM EDT up reply actions  

We could ask that question about anything, of course.
Problems with pitcher win-loss records? Yes, they absolutely exist. Just as they exist with every other metric out there. But W-L not working for Mariners starting pitchers is a reason to disregard W-L?
Problems with RBIs? Yes, they absolutely exist. Just as they exist with every other metric out there. But RBIs not working for Jim Rice is a reason to disregard RBIs?

I don’t see how this advances our understanding. Isn’t it better to ask what the problems are with UZR and how we can improve upon UZR to mitigate those problems?

by cwyers on Sep 6, 2011 4:28 PM EDT up reply actions  

I don’t love Hippeaux’s post

Rather than provide a long exposition as to why, I’ll focus on one specific piece that bothered me, as I’ve seen many (usually blog commenters) make similar statements.

Hippeaux writes:

But while it isn’t much of a stretch to believe that Zobrist’s glove was worth a couple wins to the Rays in 2009, try selling this: According to WAR, in 2011, Carlos Lee has had as much defensive value as Troy Tulowitzki.

Yes, Lee and Tulo have near identical UZRs. But they also play positions at opposite ends of the defensive spectrum. Lee’s UZR (9.2) + positional adjustment (-7.8) = +1.4. Meanwhile, Tulo’s UZR (9.1) + positional adjustment (+6.0) = +15.1. The two most definitely do not have the same defensive value, as measured by the statistic at least. This concept of adjusting for position is crucial to the WAR framework and the writer has ignored it, both here and in the “WAR Hates Sluggers” paragraph.
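The arithmetic above can be written out as a short sketch. The UZR and positional-adjustment figures are the ones quoted in this comment; the function and variable names are illustrative only and do not come from FanGraphs.

```python
# Figures as quoted in the comment above, in runs.
PLAYERS = {
    "Carlos Lee": {"uzr": 9.2, "positional_adjustment": -7.8},
    "Troy Tulowitzki": {"uzr": 9.1, "positional_adjustment": 6.0},
}


def defensive_value(uzr, positional_adjustment):
    """Fielding runs relative to position plus the adjustment for that position."""
    return round(uzr + positional_adjustment, 1)


if __name__ == "__main__":
    for name, figures in PLAYERS.items():
        value = defensive_value(figures["uzr"], figures["positional_adjustment"])
        print(f"{name}: {value:+.1f} runs")
```

Near-identical UZRs end up almost 14 runs apart once position is included, which is the gap the quoted passage leaves out.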

by James Kannengieser on Sep 6, 2011 4:14 PM EDT reply actions   3 recs

this cannot be understated

maybe the issue is that most people look at the dashboard and say,

“Votto, 6.5 runs positive, good fielder, Granderson, 5.9 runs negative, bad fielder”

The truth is that fielding is based on position, and if you want to compare two players’ fielding, you have to include position. so Brett Gardner is a fantastic fielder, compared to other left fielders. but he pays for it because he’s a left fielder.

Why don't you have a nice big cup of shut the fuck up? - Lisa W 3/4/2011

by iblum on Sep 6, 2011 4:21 PM EDT up reply actions  

WAR is a work in progress.

Anyone who says otherwise, either hasn’t spent enough time trying to understand the stat, or as Rob points out, is trying to construct a man from straw.

Unfortunately, at the moment, not just Hippeaux, but many big names with major media outlets fall into both of these categories.

by untilthebombs on Sep 6, 2011 4:37 PM EDT reply actions  

Did anyone bother to actually look...

…at the top 9 rbi guys vs the top 9 WAR guys? I did.

I think the top 9 RBIs actually would win most of the games.

by Dale Sams on Sep 6, 2011 5:54 PM EDT reply actions  

Really? Let's see the lists!

’cause I love being wrong. Keeps me humble(ish).

by Rob Neyer on Sep 6, 2011 6:08 PM EDT up reply actions  

For perspective

All-RBI team (there were some positional scarcity issues, so I had to ignore some 1B & dig for a C; I went with the top producer at each position, not who I liked better, and I moved around players with recent experience at another position):

C V-Mart
1B Fielder
2B Cano
3B Bautista (RF who can play 3B)
SS Tulo
RF Kemp (CF who can play RF)
CF Granderson
LF Braun
DH Howard

All-WAR Team:
C Avila
1B Votto
2B Pedroia
3B Bautista
SS Tulo
RF J. Upton
CF Granderson
LF Ellsbury (CF who can play LF)
DH Kemp

Okay, so there’s a little bit of agreement (Bautista, Tulo, Granderson & Kemp make both teams).

by PortlandYankee on Sep 6, 2011 6:09 PM EDT up reply actions  

He has 82...next closest is Avila with 68.

At least by Fangraphs, both Martinez & Napoli are still considered “catchers”.

by PortlandYankee on Sep 6, 2011 6:18 PM EDT up reply actions  

Napoli has started as catcher in a little under half of his 95 games.

Martinez has 26 starts out of 124 games. It does avoid putting Montero (73 RBI; 116 starts at C out of 122 games) on the first list — did you miss him, or am I missing something?

Dark Luck Dragon of the Sith

by Darth Snark on Sep 6, 2011 6:22 PM EDT up reply actions  

Nope, I missed him

FWIW, Montero is not an offensive upgrade over V-Mart, and C defense is pretty clustered.

What is your main criticism of the choice? Sure, V-Mart is not the everyday C, just like Bautista is not the everyday 3B. Are you saying I’m overselling the RBI team, or underselling it?

by PortlandYankee on Sep 6, 2011 6:25 PM EDT up reply actions  

I don't know that it's a criticism, as such.

I’m not sure which way it would go, since as you point out, Martinez tends to bring rather more value to the plate than Montero, but Montero is more valuable behind it. I’m not sure I agree with you about catcher defense, since I’ve yet to convince myself any of the stats have a handle on it, and I’m no scout.

I just found the Martinez choice interesting, since I think of him as a DH, especially since I think he’s caught something like three games since the All-Star break.

by Darth Snark on Sep 6, 2011 6:30 PM EDT up reply actions  

For what it's worth, if you look at the numbers the WAR team is better

The reasons for this are obvious: WAR does a better job of capturing a player’s overall value (contact & power, accuracy & range in the field, baserunning) than RBI, which is heavily dependent on context & teammates and only emphasizes certain aspects of hitting (particularly power).

If you take the teams above:
-Baserunning: RBI team -5.2 cumulative vs. 5.4 for All-WAR. Howard, V-Mart & Fielder all cost their team on the basepaths (by failing to steal, or to advance on hits).
-Fielding (eliminating the DH’s values): Least surprising is that the RBI team has a -21.8, while the All-WAR has a 48.3. Even if you claim that UZR is flawed (which it may be), Votto, Ellsbury & Upton are far better defenders than Braun, Fielder & Kemp.

Most surprising, however:
-Batting: 331.1 batting runs above average for the All-RBI vs. 346.7 for All-WAR. The All-WAR team is actually a better hitting team than the All-RBI team.

If you say “but you adjusted for positions!” the disparity gets even worse…because Bautista is only 11th in RBI this year.

In short, if you are going to say that WAR is a stat that is flawed at the margins, fine, that’s true…that’s true of all stats. But I challenge anyone to find a stat that is better at consistently describing a player’s overall contribution to their team.
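For anyone who wants to add up the component numbers above, here is a rough Python sketch. The run totals are the ones quoted in this comment; the 10-runs-per-win conversion is an assumption (a common rule of thumb, not something stated in this thread), and the variable names are hypothetical. It sums only batting, baserunning and fielding, with no positional or replacement adjustment, so it is not a WAR calculation.

```python
# Component totals in runs above average, as quoted in the comment.
# Fielding excludes the DH, as the comment notes.
teams = {
    "All-RBI": {"batting": 331.1, "baserunning": -5.2, "fielding": -21.8},
    "All-WAR": {"batting": 346.7, "baserunning": 5.4, "fielding": 48.3},
}

# Assumption: roughly 10 runs per win (rule of thumb, not from the thread).
RUNS_PER_WIN = 10.0

# Sum the three components for each team, rounded to the quoted precision.
totals = {
    name: round(sum(components.values()), 1)
    for name, components in teams.items()
}
run_gap = round(totals["All-WAR"] - totals["All-RBI"], 1)
approx_win_gap = run_gap / RUNS_PER_WIN

for name, total in totals.items():
    print(f"{name}: {total:+.1f} runs above average")
print(f"Gap: {run_gap:.1f} runs, or about {approx_win_gap:.1f} wins")
```

Most of the gap comes from fielding, the least certain of the three components, but the All-WAR team leads in all three.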

by PortlandYankee on Sep 6, 2011 6:47 PM EDT up reply actions   2 recs

Thanks Portland...

…for doing all that legwork. But let’s at least admit that the all-RBI team this year is a pretty damn good team…especially if we somehow found a way to fit Teixeira and AGon in.

by Dale Sams on Sep 6, 2011 10:22 PM EDT up reply actions  

Probably too late to get any more conversation here...

But you said:
“In short, if you are going to say that WAR is a stat that is flawed at the margins, fine, that’s true…that’s true of all stats. But I challenge anyone to find a stat that is better at consistently describing a player’s overall contribution to their team.”

My reading of Hippeaux’s article was that he agrees with you completely, and just wants to work on the marginal flaws, to improve the metric. His clarification article today expands upon that.

by Dave Pomerantz on Sep 7, 2011 12:45 PM EDT up reply actions  

I’d argue that you don’t get to pick and choose your guys based on position. That is just going to make it look closer. It isn’t WAR’s fault that RBI don’t do anything to put SS/2B/3B/CF/C on a similar footing with DH/1B/LF/RF.

That said:

WAR
C Granderson
1B Votto
2B Pedroia
SS Tulo
3B Bautista
LF Kemp
RF Upton
CF Ellsbury
DH Braun [or Kinsler or Victorino, tie]

RBI:
C Teixeira
1B Fielder
2B Cano
SS Tulo
3B Cabrera [or Braun, tie]
LF A-Gonzalez
CF Kemp
RF Granderson
DH Howard

If these were the teams, taking their defense [at perhaps sub-optimal positions] into consideration, now which do you like [feel free to move the players around as you see fit and swap in the ties if you’d like]?

by erosen on Sep 8, 2011 2:54 PM EDT up reply actions  

Hi, I was just referred to your article, Rob. I wrote another rebuttal to this guy’s article earlier today:

http://www.pinstripealley.com/2011/9/6/2408743/re-some-perspective-on-war

Just to sum it up in case it’s too long for anyone, Hippeaux’s arguments show a real lack of understanding of the framework behind WAR. He is completely wrong about most of the “problems” he has with WAR.

The Savior has come, and he is glorious. #63

by Wraithpk on Sep 6, 2011 6:48 PM EDT reply actions  

Real Problems with WAR

The biggest fundamental issue with WAR is that it cannot be decomposed into offense and defense. As a result, the far less accurate defensive metrics corrupt the good offense metrics. With Fangraphs adding in baserunning the situation is even more confused.

by BlueEyes_Austin on Sep 7, 2011 12:27 AM EDT reply actions  

What?

There is a batting component to WAR just as there is a defensive component, a replacement component, a positional adjustment component, and a baserunning component. These are all available on FanGraphs. And why you think baserunning shouldn’t be included in a metric that aims to reflect player value, I don’t know.

Save Jenrry Mejia!
Keep Reyes, Trade Wilpon.

by Ogre39666 on Sep 7, 2011 1:02 AM EDT up reply actions  

Yes there are separate components

They do not individually scale to zero, therefore they cannot be decomposed.

by BlueEyes_Austin on Sep 8, 2011 3:38 PM EDT up reply actions  

Also on baserunning

The issue is not whether in a Platonic sense it SHOULD be included but rather whether the accuracy of the measurement merits inclusion in a single index value.

by BlueEyes_Austin on Sep 8, 2011 3:39 PM EDT up reply actions  

ugh…

by Wraithpk on Sep 7, 2011 3:17 AM EDT up reply actions  

In fact

Statcorner.com has offensive-only WAR. Really more people should use that site.

by Turner Wingo on Sep 7, 2011 2:23 PM EDT up reply actions  

Pot, meet kettle
When Hippeaux says WAR doesn’t work “in the fantastically straight-forward way we try to use it,” what he really means is that it doesn’t work in the way you — that is, you dunder-headed fools who aren’t as smart as Hippeaux — try to use it.

Can you say “straw man”?

Yes, I can. Here goes: Rob, that is a ridiculous straw man, because Hippeaux is not coming anywhere close to calling anyone a “dunder-headed fool.”

by tomemos on Sep 7, 2011 12:47 PM EDT reply actions  

Where is the apology?

Wow, did you read that article wrong, Rob. As evidenced by today’s clarifying post at IIATMS and your comment to his post, you completely missed the point. The post wasn’t anti-WAR at all; it just sparked a debate as to its exact value and warned against using it as the ultimate stat. But arguing against the post isn’t why you were wrong. Your comments were harsh and personal against the author (i.e., “intellectually bankrupt”). Maybe the author didn’t want to bring up specific writers or analysts because he didn’t want to single out anyone. I think you owe him an apology for your personal attack.

by 27up-27down on Sep 7, 2011 2:54 PM EDT reply actions  

Ummm

There’s a distinction between saying someone is intellectually bankrupt and saying someone’s argument is intellectually bankrupt.

Acknowledging this distinction is an important marker of intellectual sophistication.

by Rob Neyer on Sep 7, 2011 6:25 PM EDT up reply actions  

is that a no

So i guess that means you didn’t apoligize?

If you want to make the distiction between calling the argument intellectually bankrupt instead of the writer himself, go ahead. But it certainly seemed that you were taking his argument personally when his use of the “Royal We” wasn’t used to mask an attack on individual sabermetricians or even those who champion WAR. Puting words in his mouth like:

“that is, you dunder-headed fools who aren’t as smart as Hippeaux”

doesn’t really add to the debate does it? This would have been a great opportunity for a grand debate over WAR’s real value and evolution. Instead the debate is somewhat scewed by defensive reaction after defensive reaction. You are both very intelligent writers, why not argue the stat instead of gramtical distictions?

by 27up-27down on Sep 8, 2011 3:22 PM EDT reply actions  

scewed? gramtical? distictions?

Arguments are more compelling when they’re well-reasoned and well-spelled.

For next time.

by Rob Neyer on Sep 9, 2011 4:32 AM EDT up reply actions  

Gee, thanks!

Thank you Rob, for pointing out my spelling errors. That certainly goes a long way to furthur this debate.

by 27up-27down on Sep 9, 2011 1:02 PM EDT up reply actions  

An idea

Hey Rob…ok, ok i concede. There’s a [slight] difference…between describing someone as “intellectually bankrupt” and describing their argument as intellectually bankrupt. And i’m sorry for the rambling mis-spelled (sp) arguments, however…

With all this attention and hullabaloo around this issue this week, why don’t you and Hippeaux (or others) get together and write a grand point-counterpoint treatise on the “State of WAR”? Break down its evolution, the difference between the three major formulas, how to improve it. Something large, definitive, and beyond this week’s back and forth. Many of us would love to see that.

by 27up-27down on Sep 9, 2011 1:36 PM EDT reply actions  

Comments For This Post Are Closed
