James Fraser's Note: The following is an article by Voros McCracken. He introduces his new method for evaluating pitchers. I think it's some of the more original and interesting sabermetric work I've seen in a while. Enjoy and Happy Holidays!
By Voros McCracken
Consider the following two pitchers:
Pitcher       W   L   ERA     IP    H  HR  TBB  IB   SO  HP  BFP
Aaron Sele   18   9  4.79  205.0  244  21   70   3  186  12  920
Jose Rosado  10  14  3.85  208.0  197  24   72   1  141   5  882
Alright, who pitched better?
I bet you think you know what's coming, don't you? It's "anti-Won/Loss record argument number 3,178." You know the one. It goes like this:
"The sportswriters and Cy Young award voters place WAY too much emphasis
on Won/Loss records. So they vote for Aaron Sele when it's obvious Jose
Rosado was the better pitcher. Rosado's ERA is almost a full point lower
than Sele's. Rosado gave up 52 fewer baserunners in three more innings.
The sportswriters have yet to understand how meaningless Won/Loss records
are for a single season and that they rely too heavily on run support. Plus
the things they supposedly measure are measured better by ERA. Nevertheless,
Sele will get more Cy Young votes anyway."
I'm sure you've heard it before and I agree with a lot of the above. So if
you ask me who the better pitcher was, I would answer confidently...
...Aaron Sele. That's right, as far as I'm concerned, Sele was not only
better than Rosado but one of the top three pitchers in the American League.
And believe it or not, this assessment gives ZERO weight to his Won/Loss
record.
I reach this conclusion by using something I call Defense Independent
Pitching Stats ("DIPS"). DIPS use the same categories as regular stats, but
adjust them to a neutral context for park, league and defense. What the
method does is use only those elements of the pitcher's record that cannot
be affected by the defense behind him. Those stats would be BFP (as the
playing time standard), HR, HP, TBB, IB and SO (I realize an outfielder can
occasionally save a home run, but the rarity of that event makes it
virtually meaningless for our purposes). That's it, we'll only use those
stats. Stats which will have NO effect on our final analysis will be W,
L, ERA, IP, ER and, most controversially, H. That's right, we will not use the
pitcher's hits allowed total AT ALL.
I suppose I'll pause here in my explanation of DIPS because I've probably got
quite a few disbelievers by now. Since this is essentially a "one-way"
discussion I'll go ahead and make your arguments for you (as I'm pretty sure
I'll hit on at least one of your objections.) One argument is, "I think you
over-estimate the amount of control the defense has over balls in the field
of play. A majority of such plays are routine and therefore should be
credited to the pitcher." There is, "Removing hits gives too much credit to
strikeout pitchers and not enough credit to the guys who get lots of one and
two pitch outs." And also, "Don't you think you should just make a general
adjustment based on the overall team defense?"
I've heard these a lot and have done a lot of work to try and explain why I
don't agree with these arguments. The idea of NOT using a pitcher's hit totals
to evaluate his performance is not well supported in any community, including
the Sabermetric one. One of the more talked about recent stats is Bill James'
Component ERA, which relies heavily on hits allowed totals. But what I'm going
to detail here is why I think it's important to remove Hits Allowed from our
evaluations of pitchers. I'll warn you that this will be relatively lengthy
and involve some statistics, but if you're at all interested in statistical
evaluations of pitchers, I think it's important to understand some things
about the various pitching stats.
What I'm going to do is a comparison of pitching statistics from 1998 to
1999 for the group of pitchers who pitched 162+ innings in both seasons
(there were 60 such pitchers). I'll start off by defining some rate stats.
The only statistics used will be IP, H, HR, BB and SO. The raw figures I
have don't include BFP but for our purposes here we don't really need it. We
can simply estimate BFP by the formula ((IP*3)+BB+H). Later on, when we do
Aaron Sele's and Jose Rosado's DIPS, we'll use a bit more complicated
formulae, which use the actual BFP totals along with Hit Batsmen and
Intentional Walks. The rate stats below are only for the purposes of
establishing the year to year correlation of the basic elements of a
pitcher's record.
$BB=BB/((IP*3)+H+BB); This rate stat essentially measures how often the
pitcher walked a batter per batter faced. Obviously the denominator doesn't
exactly equal the pitcher's actual BFP, but I'm sure it tracks it pretty
closely, i.e. the difference between BFP and the denominator is most likely
pretty constant in the long run from pitcher to pitcher.
$SO=SO/((IP*3)+H); This stat is designed to measure how often he struck guys
out relative to how often he gave up fair batted balls. A measure of how
difficult it is to make contact in other words.
$HR=HR/((IP*3)+H-SO); This stat measures how often a batted fair ball left
the park against the pitcher.
$H=(H-HR)/((IP*3)+H-SO-HR); This stat measures how often a batted ball in the
field of play not leaving the park falls in for a hit. This stat is the
central focus of the discussion and its behavior is the basis for my decision
for leaving hits allowed totals out of the evaluation method.
(these are difficult stats to write out every time, so to understand the
following you might need to refer back to what the $BB, $SO, $HR and $H
abbreviations mean).
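The four definitions above can be sketched in a few lines of Python. This is only an illustration, assuming the 1999 lines for Sele and Rosado from the opening table as input and assuming total walks (intentional ones included) stand in for BB; the function and variable names are my own.

```python
def rate_stats(outs, h, hr, bb, so):
    """Return $BB, $SO, $HR and $H, estimating BFP as outs + hits + walks."""
    not_walked = outs + h            # (IP*3)+H
    fair_balls = not_walked - so     # (IP*3)+H-SO, home runs included
    in_play = fair_balls - hr        # (IP*3)+H-SO-HR, balls that stayed in the park
    return {
        "$BB": bb / (not_walked + bb),
        "$SO": so / not_walked,
        "$HR": hr / fair_balls,
        "$H": (h - hr) / in_play,
    }


# 1999 raw lines from the table at the top of the article (IP converted to outs).
sele = rate_stats(outs=205 * 3, h=244, hr=21, bb=70, so=186)
rosado = rate_stats(outs=208 * 3, h=197, hr=24, bb=72, so=141)

for name, rates in (("Sele", sele), ("Rosado", rosado)):
    print(name, {stat: round(value, 3) for stat, value in rates.items()})
```

On these two lines the sketch puts Sele's $H at about .342 and Rosado's at about .264, which is the gap the rest of the article is concerned with.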
What I did was compare each rate stat for each pitcher in 1998 with what that
same pitcher did in 1999. For example, Andy Benes' $BB in 1998 was .075 and
his $BB in 1999 was .092 (note these numbers are not park or league
adjusted, for simplicity's sake). These figures were used in a linear
regression analysis in which the correlation of each stat from 1998 to 1999
was computed, and the correlations were compared to each other (a measure of
each stat's consistency). The correlations when each stat is compared to its
counterpart (i.e. 1998 $BB to 1999 $BB and so on) are as follows:
$BB=.681
If you know about statistics and you know about baseball, this should have
your attention. The higher the figure the higher the correlation, so in three
of the stats, the correlation ranges from ok ($HR) to very good ($SO). In the
other stat ($H) there really is a very low level of correlation. Considering
many of these pitchers had the same defenses and pitched in the same parks,
one could argue that any correlation there is might be due to those factors
as much as anything.
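For readers who want to run the same kind of check on their own data, here is a minimal Python sketch of a year-to-year correlation. It assumes the figure being reported is the ordinary Pearson correlation coefficient, which the article does not spell out, and the two short lists are made-up placeholder numbers, not real pitcher rates.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# PLACEHOLDER values only: one entry per pitcher, same pitcher order in both lists.
rate_year_1 = [0.10, 0.20, 0.30, 0.40]
rate_year_2 = [0.12, 0.18, 0.33, 0.37]

print(round(pearson(rate_year_1, rate_year_2), 3))
```

With real data, each list would hold one rate stat (say $H) for the same 60 pitchers in the same order, once for 1998 and once for 1999.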
What does this mean? Essentially it means that if a pitcher posts a very low
$H rate one year, you really can't expect him to repeat that with any level
of certainty at all. However if a player posts a very high $SO rate, there is
a level of comfort in thinking he'll have a good one the following year as
well. I cannot stress enough how important I think this is. Think about it
for a second. How much value would you give to a stat, if you KNEW that it
meant virtually nothing towards that player's future stats? In other words,
we don't give Aaron Sele
credit for the seven runs a game his team scored for him because we know that
he had little to do with it. Those runs are very real and very valuable but
there isn't a real reason to give Sele credit for them. In my opinion, Hits
Allowed deserves the same treatment.
To leave the realm of linear regressions and show examples of what I'm
talking about, I'll provide the following:
10 Lowest $H in 1998 and 1999 (of the 60 pitchers)
(In order)
You may have noticed the asterisks next to Maddux's and Millwood's names on
these lists. I want you to remember that they were on these lists.
Notice that only one pitcher, Pete Harnisch, made the top 10 both years. If
this stat really had anything to do with the pitcher's actual ABILITY, one
would expect a few more guys to be on the list both years.
10 Highest $H in 1998 and 1999
(in order)
That's interesting. In 1999 Kevin Millwood had the lowest $H rate of any
pitcher in the majors while in 1998 he had the 8th highest. If $H reflected
pitching abilities, would that make sense? Does Mark McGwire ever finish
among the lowest in the league in HR%? Does Rey Ordonez ever finish among the
highest? And in back-to-back years!? We'll move on.
1999: Aaron Sele, LaTroy Hawkins, Jon Lieber, Greg Maddux*, Pedro Martinez,
Shane Reynolds, Pedro Astacio, Steve Woodard, Livan Hernandez and Charles
Nagy.
And there's Millwood's teammate, Maddux, pulling the opposite trick. After
posting the 5th lowest $H in 1998, Maddux then proceeded to post the 4th
highest in 1999. In other words, there was one instance of a pitcher making
the lowest 10 both years, and two instances of pitchers making the lowest 10
one year and the highest 10 the other. More players appear on both of the
highest lists, though: Aaron Sele (the star of our story), Shane Reynolds and
Pedro Astacio all made both (Coors effect mostly, I assume, in Astacio's
case. Kile, the only other Rockies pitcher who qualified, made the 1998 list
too). However, I bet the name Pedro Martinez jumped out at you (for 1999 no
less!). Pedro was rightfully considered unhittable this year, but in fact
when they did hit Pedro, a large number of balls went for hits. Did you
expect that?
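The overlap counts discussed above are easy to verify mechanically. The Python sketch below simply types in the four top-10 $H lists from this article (asterisks dropped) and intersects them as sets; the variable names are my own.

```python
low_1998 = {"Hideki Irabu", "Pete Harnisch", "Woody Williams", "Kenny Rogers",
            "Greg Maddux", "David Wells", "Dustin Hermanson", "Brian Moehler",
            "Al Leiter", "Tom Glavine"}
low_1999 = {"Kevin Millwood", "Omar Daal", "Masato Yoshii", "Curt Schilling",
            "Pete Harnisch", "Bartolo Colon", "David Cone", "Rick Helling",
            "Eric Milton", "Kevin Brown"}
high_1998 = {"Aaron Sele", "Shane Reynolds", "Brian Meadows", "Scott Erickson",
             "Pedro Astacio", "Randy Johnson", "Mike Sirotka", "Kevin Millwood",
             "Brad Radke", "Darryl Kile"}
high_1999 = {"Aaron Sele", "LaTroy Hawkins", "Jon Lieber", "Greg Maddux",
             "Pedro Martinez", "Shane Reynolds", "Pedro Astacio", "Steve Woodard",
             "Livan Hernandez", "Charles Nagy"}

# Pitchers who repeated on the same list in both seasons.
print("lowest both years: ", sorted(low_1998 & low_1999))
print("highest both years:", sorted(high_1998 & high_1999))

# Pitchers who flipped from one extreme to the other.
print("lowest 1998, highest 1999:", sorted(low_1998 & high_1999))
print("highest 1998, lowest 1999:", sorted(high_1998 & low_1999))
```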
In the other categories things make more sense. In $BB, seven pitchers were
on the lists for lowest $BB both years. Five pitchers were on the lists for
highest $BB both years. No pitchers were on one list one year and the other
list the other.
For $SO, five pitchers were on the lists for highest $SO both years. Six
pitchers were on the lists for lowest $SO both years. Again, no pitcher was
on one list one year and the other list the other.
For $HR, four pitchers were on the lists for lowest $HR both years. Four
pitchers were on the lists for highest $HR both years. There was one pitcher
who was among the lowest in 1998 (10th lowest) and the highest in 1999 (5th
highest). It was the enigmatic Chan Ho Park.
Again, this shows us that for at least 2 of the stats and to a certain extent
the third, the guys that do well one year in a stat tend to do well the next
year too. In the $H stat though, such a conclusion is not possible.
This borders on heresy, really. We thought we were finished with Strikeouts
once it was shown that their direct value (relative to other outs) was very
minimal. But it appears their indirect value for pitchers is large due to the
instability of the hits allowed statistic (i.e. if you strike the guy out,
you control your own destiny as a pitcher). Quite simply, I can't look at
the above and think that a pitcher's Hit total is more important than his
strikeout total. I just can't do it. I've looked at historical patterns
regarding these numbers back to 1946, and the same correlations keep popping
up. High and very high for $BB and $SO respectively. Mid range for $HR and
low for $H. I can pass along exact figures to whoever is interested.
These correlations are the basis of my DIPS work. More accurate forms of $BB,
$SO and $HR are used to compile the numbers (as will be explained in a bit), but I simply didn't see a good
reason to bring $H into this. Yes the hits given up were costly and can lead
to runs, but I've yet to see much information that suggests the pitcher has a
whole lot of control in giving up the hits or preventing the hits, OTHER THAN
PREVENTING BATTERS FROM HITTING THE BASEBALL. Getting hits off Randy Johnson
or Pedro Martinez is tough, not because their pitches are tough to center
(remember both pitchers made highest $H lists) but because they strike you
out so often.
So let's get back to the DIPS methods and Aaron Sele and Jose Rosado. To get
our DIPS we do the following (the rate stats defined here will be similar to
the ones above, but these will be a little more complex and use more data):
We add BFP and HP to the pitcher's records, unchanged. (If we were comparing
pitchers across leagues, we'd make an adjustment so that their leagues would
be "equal." Obviously a pitcher in the AL is at a disadvantage having to
face a DH instead of other pitchers. We would adjust for this normally, but
since both pitchers are American Leaguers, we don't have to.)
Sele: (70-3)/(920-12-3)=.07403
Rosado: (72-1)/(882-5-1)=.08105
Now we take that figure and adjust for the park's influence on walks:
Sele: .07403*.995=.07366
Rosado: .08105*1.007=.08162
Now we can argue over Park Factors from here until doomsday. I'm using Park
Factors provided by Tom Fontaine at http://www.stathead.com, but whatever you
come up with for factors will work here. Anyway the figures above are then
multiplied by (BFP-HP-IBB) (remember we would also adjust for leagues if
necessary) and then multiplied by a league average (TBB/(TBB-IBB))
rate (1.0544):
Sele: .07366*(920-12-3)*1.0544= 70 = TBB
Rosado: .08162*(882-5-1)*1.0544= 75 = TBB
Which so far gives us DIPS of:
Now we move on to strikeouts. You'll notice as we move on, our denominators
will shrink slightly each time as figures we've already computed are siphoned
off. The idea is to keep each rate as close to a real percentage as possible.
It is necessary to do this to maintain the interdependence of all of these
stats (i.e. a pitcher that walks half of the guys he faces won't be able to
post very high numbers in the other areas once adjusted).
For strikeouts we simply take the pitcher's strikeout total and divide it
by (BFP-HP-TBB):
Sele: 186/(920-12-70)=.22196
Rosado: 141/(882-5-72)=.17516
Adjust for park:
Sele: .22196*1.0384=.23048
Rosado: .17516*1.0466=.18332
And now we multiply by the DIPS (BFP-HP-TBB) total (remember interdependency
is the key. If DIPS adjust something to a large extent upward or downward, it
will affect the later adjustments):
Sele: .23048*(920-12-70)= 193 = SO
Rosado: .18332*(882-5-75)= 147 = SO
And our DIPS are now:
Next we move on to Home Runs. We take the pitcher's HR total and divide by
(BFP-HP-TBB-SO):
Sele: 21/(920-12-70-186)=.03221
Rosado: 24/(882-5-72-141)=.03614
The park factors:
Sele: .03221*.9867=.03178
Rosado: .03614*1.0384=.03753
And then we multiply those by our DIPS (BFP-HP-TBB-SO):
Sele: .03178*(920-12-70-193)= 20 = HR
Rosado: .03753*(882-5-75-147)= 25 = HR
And now we have our DIPS:
Okay we are now done with Sele's and Rosado's raw stats. The rest will be
done using a series of "league averages." The stats above represent all the
stats from a pitcher's record that are not affected by defense (Balks aren't
counted but they are rare events, e.g. neither Sele nor Rosado had one called
against him).
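The walk, strikeout and home run steps above chain together, each one feeding its rounded result into the next denominator. Here is a minimal Python sketch of that chain. It assumes the park factors are plain multipliers on each rate and that each total is rounded to a whole number before moving on, as in the worked figures; it carries full precision on the rates rather than five decimals, which gives the same totals for these two pitchers. The function and parameter names are my own.

```python
def dips_core(bfp, hp, tbb, ibb, so, hr, park_bb, park_so, park_hr,
              lg_tbb_ratio=1.0544):
    """Return DIPS (TBB, SO, HR) for one pitcher, rounded to whole numbers."""
    # Walks: unintentional-walk rate, park adjusted, then scaled up to total
    # walks with the league TBB/(TBB-IBB) ratio.
    ubb_chances = bfp - hp - ibb
    bb_rate = (tbb - ibb) / ubb_chances * park_bb
    dips_tbb = round(bb_rate * ubb_chances * lg_tbb_ratio)

    # Strikeouts: rate from the raw denominator, applied to the DIPS one.
    so_rate = so / (bfp - hp - tbb) * park_so
    dips_so = round(so_rate * (bfp - hp - dips_tbb))

    # Home runs: same pattern with strikeouts siphoned off as well.
    hr_rate = hr / (bfp - hp - tbb - so) * park_hr
    dips_hr = round(hr_rate * (bfp - hp - dips_tbb - dips_so))
    return dips_tbb, dips_so, dips_hr


sele_core = dips_core(bfp=920, hp=12, tbb=70, ibb=3, so=186, hr=21,
                      park_bb=0.995, park_so=1.0384, park_hr=0.9867)
rosado_core = dips_core(bfp=882, hp=5, tbb=72, ibb=1, so=141, hr=24,
                        park_bb=1.007, park_so=1.0466, park_hr=1.0384)
print("Sele   TBB, SO, HR:", sele_core)
print("Rosado TBB, SO, HR:", rosado_core)
```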
We now use the same denominator for much of the rest of what we do. It will
be the new DIPS version of (BFP-HP-TBB-SO-HR). These represent all the BFPs
where the defense had some hand in the outcome. Let's recreate the Hits
statistic and we'll see where the system dissents from the Raw stats. We take
the AL average of (H-HR)/(BFP-HP-TBB-SO-HR), which is .3008. We then multiply
that number by the above denominator (at this point both our park and league
factors no longer apply since every pitcher will now be assigned the same
"league average" for each rate):
Sele: (920-12-70-193-20)*.3008=188
Rosado: (882-5-75-147-25)*.3008=190
And then we add our DIPS HR stat to get our much anticipated DIPS H stat:
Sele: 188+20=208
Rosado: 190+25=215
And our DIPS are now:
Okay. Now we'll stop and notice the MAJOR change we have now. Rosado has now
gained 18 hits by this method and Sele has lost 36! What's going on is
simple. Aaron Sele pitched in a hitter's park for a team that played Todd
Zeile at 3B, Mark McLemore at 2B, Tom Goodwin in CF and Juan Gonzalez in RF,
all substandard defensive players, most at relatively important defensive
positions. Rosado, on the other hand, played for a team with three CFs in the
outfield, Rey Sanchez (who is ALL glove) at SS and good defenders at 2B and
3B. This was a team that left Jeremy Giambi in AAA for a month over concerns
about his defense at 1B. The Royals decided the key to their team was defense
and they played very good defense (side note: those convinced that defense
wins championships might want to check out the Royals' 1999 record). Add in
the fact that there's always a lot of random noise involving hits allowed
stats anyway and Rosado's large advantage in the hits allowed department
doesn't seem to have much to do with his pitching abilities to me.
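The hits step is small enough to sketch directly. The Python below assumes the DIPS TBB, SO and HR totals worked out earlier (70/193/20 for Sele, 75/147/25 for Rosado) and the AL rate of .3008 quoted above; the function name is my own.

```python
LG_HITS_ON_BALLS_IN_PLAY = 0.3008  # 1999 AL (H-HR)/(BFP-HP-TBB-SO-HR), as quoted


def dips_hits(bfp, hp, dips_tbb, dips_so, dips_hr):
    """League-average hits on balls in play, plus the pitcher's DIPS home runs."""
    balls_in_play = bfp - hp - dips_tbb - dips_so - dips_hr
    return round(balls_in_play * LG_HITS_ON_BALLS_IN_PLAY) + dips_hr


sele_h = dips_hits(920, 12, 70, 193, 20)
rosado_h = dips_hits(882, 5, 75, 147, 25)

# Compare against the actual hits allowed (244 and 197).
print("Sele:  ", sele_h, "change", sele_h - 244)
print("Rosado:", rosado_h, "change", rosado_h - 197)
```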
Let's move on. Now we'll nail down our IP numbers (remember, now that Sele's
given up fewer hits, he has gotten more outs so his IP will increase). We use
our denominator above again (BFP-HP-TBB-SO-HR). We'll multiply that by the
following AL "league average" (for non SO outs):
((IP*3)-SO)/(BFP-HP-TBB-SO-HR), which yields for 1999 in the AL, .7363. This
brings us:
Sele: (920-12-70-193-20)*.7363=460
Rosado: (882-5-75-147-25)*.7363=464
Which is a non-SO out total for the pitcher. We add the DIPS SO total and
divide by three to get our DIPS IP total.
Sele: (460+193)/3= 217.2 = IP
Rosado: (464+147)/3= 203.2 = IP
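As a rough Python sketch of the innings step, assuming the same DIPS totals as before and the .7363 league rate quoted above (the helper names are mine): non-strikeout outs come from the league rate, strikeouts are added back, and the out total is written in the usual baseball notation where ".2" means two thirds of an inning.

```python
LG_NON_SO_OUT_RATE = 0.7363  # 1999 AL ((IP*3)-SO)/(BFP-HP-TBB-SO-HR), as quoted


def dips_outs(bfp, hp, dips_tbb, dips_so, dips_hr):
    """Total DIPS outs: league-average outs on balls in play plus strikeouts."""
    balls_in_play = bfp - hp - dips_tbb - dips_so - dips_hr
    return round(balls_in_play * LG_NON_SO_OUT_RATE) + dips_so


def baseball_ip(outs):
    """Format an out total as innings pitched, e.g. 653 outs -> '217.2'."""
    innings, thirds = divmod(outs, 3)
    return f"{innings}.{thirds}"


sele_outs = dips_outs(920, 12, 70, 193, 20)
rosado_outs = dips_outs(882, 5, 75, 147, 25)
print("Sele IP:  ", baseball_ip(sele_outs))
print("Rosado IP:", baseball_ip(rosado_outs))
```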
Our DIPS are starting to round out:
(1B*.50)+(2B*.72)+(3B*1.04)+(HR*1.44)+((TBB+HP)*.33)-((BFP-H-TBB-HP)*.098)
Which we get for the 1999 AL of: 11689.73
Now we divide the league's ER total by that figure and we can now multiply
that factor by the XR total we get for the pitcher. That total will be our
DIPS ER figure:
10832/11689.73=.9297
ER=.9297*(Whatever we come up with for the pitcher's XR)
I'm sure you noticed the 2B and 3B totals above for XR. Getting those for our
pitchers will be easy. We'll simply multiply the league averages of:
(2B/(H-HR))=.21743
(3B/(H-HR))=.02225
by our DIPS (H-HR) totals:
Sele: (208-20)*.21743= 41 = 2B
Rosado: (215-25)*.21743= 41 = 2B
And
Sele: (208-20)*.02225= 4 = 3B
Rosado: (215-25)*.02225= 4 = 3B
We now have enough to come up with our XR figures for each pitcher:
Sele: ((208-41-4-20)*.5)+(41*.72)+(4*1.04)+(20*1.44)+((70+12)*.33)
-((920-208-70-12)*.098)= 99.3 = XR
Rosado: ((215-41-4-25)*.5)+(41*.72)+(4*1.04)+(25*1.44)+((75+5)*.33)
-((882-215-75-5)*.098)= 111.054 = XR
And now we get our DIPS ER totals:
Sele: 99.3*.9297 = 92 = ER
Rosado: 111.054*.9297 = 103 = ER
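The XR and ER steps can be sketched in Python as follows. This assumes the XR weights exactly as written above (with SB and CS left off), the league 2B and 3B shares of .21743 and .02225, the .9297 ER-per-XR factor, and the DIPS totals already derived; the function names are my own.

```python
def extrapolated_runs(bfp, hp, tbb, h, hr, doubles, triples):
    """XR with the weights quoted in the article, SB and CS omitted."""
    singles = h - doubles - triples - hr
    return (singles * 0.50 + doubles * 0.72 + triples * 1.04 + hr * 1.44
            + (tbb + hp) * 0.33
            - (bfp - h - tbb - hp) * 0.098)


def dips_xr_and_er(bfp, hp, dips_tbb, dips_h, dips_hr, er_per_xr=0.9297):
    """Return (XR, ER) using league-average doubles and triples rates."""
    hits_in_park = dips_h - dips_hr
    doubles = round(hits_in_park * 0.21743)
    triples = round(hits_in_park * 0.02225)
    xr = extrapolated_runs(bfp, hp, dips_tbb, dips_h, dips_hr, doubles, triples)
    return xr, round(xr * er_per_xr)


sele_xr, sele_er = dips_xr_and_er(920, 12, 70, 208, 20)
rosado_xr, rosado_er = dips_xr_and_er(882, 5, 75, 215, 25)
print("Sele:  ", round(sele_xr, 3), sele_er)
print("Rosado:", round(rosado_xr, 3), rosado_er)
```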
Let's make an ERA!
And while we're at it, let's exhume Pythagoras. We'll use the theory of:
(R^1.83)/((R^1.83)+(OR^1.83))=WIN%
Substituting the league ERA (4.86) for R and the Pitcher's new DIPS ERA for
OR, we get:
Sele: (4.86^1.83)/((4.86^1.83)+(3.80^1.83))= .611 = WIN%
Rosado: (4.86^1.83)/((4.86^1.83)+(4.55^1.83))= .530 = WIN%
In the AL there were 2263 decisions and 20076.2 IP so:
2263/(20076+(2/3))=.11272
So:
Decisions = .11272 * IP
Sele = 217.2*.11272 = 25 = Decisions
Rosado = 203.2*.11272 = 23 = Decisions
And:
Sele = 25 * .611 = 15 = wins and (25-15)=10=losses
Rosado = 23 * .530 = 12 = wins and (23-12)=11=losses
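The last few steps (ERA, Pythagorean winning percentage, decisions, wins and losses) fit in one short Python function. This is a sketch under the figures quoted above: a league ERA of 4.86, an exponent of 1.83, and .11272 decisions per inning. It rounds ERA to two decimals before the Pythagorean step, as the worked example does; the names are my own.

```python
def dips_record(er, outs, lg_era=4.86, exponent=1.83, decisions_per_ip=0.11272):
    """Return (ERA, win%, wins, losses) from DIPS earned runs and outs."""
    ip = outs / 3
    era = round(er * 9 / ip, 2)
    # Pythagorean estimate with league ERA as "runs scored" and DIPS ERA as
    # "runs allowed".
    win_pct = lg_era ** exponent / (lg_era ** exponent + era ** exponent)
    decisions = round(decisions_per_ip * ip)
    wins = round(decisions * win_pct)
    return era, win_pct, wins, decisions - wins


sele_line = dips_record(er=92, outs=653)     # 217.2 IP
rosado_line = dips_record(er=103, outs=611)  # 203.2 IP
for name, (era, win_pct, wins, losses) in (("Sele", sele_line),
                                           ("Rosado", rosado_line)):
    print(f"{name}: {wins}-{losses}, {era:.2f} ERA, {win_pct:.3f} WIN%")
```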
Now we add our final touches and rearrange:
And there we have it. Those are Defense Independent Pitching Stats and their
results say that Aaron Sele was a far better pitcher than Jose Rosado in 1999.
Easy, wasn't it? :)
So what have we done? We've taken individual pitcher stats and we've used
only the ones that are not affected by defense and have a definite
relationship to pitching ability. Hits allowed is not one of these
statistics and so we don't use it. ERA is another and so we don't use it
either (as our new method with a few minor adjustments will correlate with
ERA the following year much better than ERA itself as you'll see in a future
article). Instead we use stats like BB, HR and SO (the most important of the
pitching stats) and league averages for the others. The method can affect
our evaluations of pitchers by a LARGE MARGIN (as you saw above). The method
adjusts for park and league and most importantly, THE QUALITY OF DEFENSE THAT
WAS PLAYED BEHIND HIM IS COMPLETELY REMOVED FROM THE EQUATION.
A listing of all the 1999 DIPS for every pitcher can be found at:
http://www.enteract.com/~mccracke/dips
(note: the DIPS on the page for Rosado and Sele will be slightly different as
the method should maintain its decimals, but for simplicity's sake, I
rounded my numbers in the examples above.)
The possibilities for application of this method are pretty huge. Armed with
DIPS, we might now be able to obtain the long sought after Minor League
Equivalencies for pitchers that we now have for hitters. We could apply them
on a team by team basis to see, for example, what the maximum effects the
team's fielding could have had on run scoring. We can see how DIPS are a
better indicator of future ERA (and as such pitching ability) than ERA and
even Bill James' Component ERA. We'll also see that as pitchers' sample
sizes shrink, DIPS' advantage over the other stats grows to a very large
margin (the last two sentences will be addressed in a future article).
Finally and most importantly, we'll be armed with the knowledge that pitchers
don't have as much control over certain things as we've previously given them
credit for. We'll be able to recognize truly exceptional pitching performances
and other more deceptive performances, just by knowing what's important and
what isn't. In short, we will come to the understanding that a HUGE amount of
what we used to think was "Pitching" is actually "Defense."
Comments to Voros McCracken at voros@daruma.co.jp
$SO=.792
$HR=.505
$H =.153
1998: Hideki Irabu, Pete Harnisch, Woody Williams, Kenny Rogers, Greg Maddux*,
David Wells, Dustin Hermanson, Brian Moehler, Al Leiter, Tom Glavine.
1999: Kevin Millwood*, Omar Daal, Masato Yoshii, Curt Schilling, Pete Harnisch,
Bartolo Colon, David Cone, Rick Helling, Eric Milton, Kevin Brown.
1998: Aaron Sele, Shane Reynolds, Brian Meadows, Scott Erickson, Pedro
Astacio, Randy Johnson, Mike Sirotka, Kevin Millwood*, Brad Radke and Darryl
Kile.
Pitcher BFP HP
Sele 920 12
Rosado 882 5
Now we start off by adjusting the Walk totals. First off we subtract the
Intentional Walk total from his total walks and divide that figure by the
pitcher's (BFP-HP-IBB) total:
Pitcher BFP HP TBB
Sele 920 12 70
Rosado 882 5 75
Pitcher BFP HP TBB SO
Sele 920 12 70 193
Rosado 882 5 75 147
Pitcher BFP HP TBB SO HR
Sele 920 12 70 193 20
Rosado 882 5 75 147 25
Pitcher BFP HP TBB SO HR H
Sele 920 12 70 193 20 208
Rosado 882 5 75 147 25 215
Pitcher BFP HP TBB SO HR H IP
Sele 920 12 70 193 20 208 217.2
Rosado 882 5 75 147 25 215 203.2
We're moving now. Now we need to derive the all important ER stat. I've
decided to use Jim Furtado's Extrapolated Runs formula ("XR") as the basis
for determining this. First thing we'll do is figure out how to
"league average" what we need. First we determine the league's XR (we'll
leave off SB and CS totals here, as they are negligible and one could argue
that they're not really "pitching"). To figure XR we do the following:
Pitcher BFP HP TBB SO HR H IP ER
Sele 920 12 70 193 20 208 217.2 92
Rosado 882 5 75 147 25 215 203.2 103
Pitcher BFP HP TBB SO HR H IP ER ERA
Sele 920 12 70 193 20 208 217.2 92 3.80
Rosado 882 5 75 147 25 215 203.2 103 4.55
Pitcher W L ERA IP H HR TBB SO
Sele 15 10 3.80 217.2 208 20 70 193
Rosado 12 11 4.55 203.2 215 25 75 147
Comments to James Fraser