You attended the gala celebration at its release. You watched people fight over it on Jerry Springer. You watched the two-hour Dateline special discussing its impact. You participated in focus groups that discussed its construct. But you're still thinking, "What does Bill James' new Runs Created (RC) formula actually do?" OK, maybe none of this is true. Maybe you didn't fork out the $79.95 to read
## The new RC: a big improvement?
While the RC method is one of the seminal developments in sabermetric history, it soon became apparent that there was a problem with the original RC construct.
Bill James himself commented on its deficiencies in his work. Beyond the question of how much of an improvement the new RC formula is over the old one, the work would have to be labeled a disappointment to many readers for the simple reason that Bill did little to explain his thought process behind the modifications, or for that matter even to supply enough details for others to easily replicate his work. He provided barely any justification for the changes themselves. Coming from the man who almost single-handedly broke the Elias monopoly on baseball information, it was disturbing to see him pitch his tent awfully close to those in the sabermetric community who fail to divulge the details of their methods or the thinking behind them in the name of propriety. There is a rumor that Bill James will be releasing an update to his work. Rather than just make wild guesses, however, we undertook an involved examination of the new RC methodology, and then spent a lot of time discussing among ourselves the possible reasons behind the changes. Of course, we didn't always agree. That happens when people are put in the position to guess what someone else is thinking.

## How is the new RC calculated?
We'll run through the new RC calculations using Ken Griffey's 1998 statistics as an example. This is appropriate because Bill James used Ken Griffey's 1997 stats for the descriptions in both of the new stats books. The big difference between James' description and this one is that we'll fully explain the parts he glossed over. We'll also comment on the changes.

## Step 1 - Calculate the A, B and C Factors
These are figured just about the same way as in James' classic tech version of the formula. The only difference is that James changes some of the terms depending on the availability of the data and the time period of the season in question. For the current time period (1988 to present), the Historical Data Group (HDG) 24 formula is used. Calculate the A, B, and C terms as follows:

- A = H + BB + HB - GIDP - CS
- B = TB + ((BB + HB - IBB) * .24) + (SB * .62) + ((SH + SF) * .5) - (SO * .03)
- C = AB + BB + HB + SH + SF

For Ken Griffey Jr.:

- A = 180 + 76 + 7 - 14 - 5 = 244
- B = 387 + ((76 + 7 - 11) * .24) + (20 * .62) + ((0 + 4) * .5) - (121 * .03) = 415.05
- C = 633 + 76 + 7 + 0 + 4 = 720

If things were still calculated the classic way, we'd simply calculate Griffey's RC as A*B / C. This would give us what I call Griffey's "base H24". Putting the numbers together would give us 140.66 RC.
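
The Step 1 arithmetic can be sketched in a few lines of Python. This is only an illustration of the formula as written in this article; the function name `abc_factors` and its argument names are our own shorthand, not anything James published.

```python
def abc_factors(ab, h, tb, bb, ibb, hb, so, sb, cs, gidp, sh, sf):
    """Return the (A, B, C) factors of the 1988-to-present formula quoted above.

    Argument names are illustrative abbreviations of the usual batting columns.
    """
    a = h + bb + hb - gidp - cs
    b = tb + (bb + hb - ibb) * 0.24 + sb * 0.62 + (sh + sf) * 0.5 - so * 0.03
    c = ab + bb + hb + sh + sf
    return a, b, c


# Ken Griffey Jr., 1998, using the figures quoted in the text
a, b, c = abc_factors(ab=633, h=180, tb=387, bb=76, ibb=11, hb=7,
                      so=121, sb=20, cs=5, gidp=14, sh=0, sf=4)

# The "classic" result, before any of the new adjustments
classic_rc = a * b / c
print(a, round(b, 2), c, round(classic_rc, 2))  # 244 415.05 720 140.66
```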

## Step 2 - Calculate initial Runs Created (iRC) by inserting A, B, and C factors into theoretical team context and round off the result:

- iRC = [ ((A + (2.4 * C)) * (B + (3 * C))) / (9 * C) ] - (.9 * C)
- Griffey iRC = [ ((244 + (2.4 * 720)) * (415.05 + (3 * 720))) / (9 * 720) ] - (.9 * 720) = 135.64, or 136 iRC

Two additional things happen here: 1) iRC gets rounded to a whole number, and 2) any negative results for individual players get changed to 0.
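
As a rough sketch of Step 2, the function below plugs the A, B and C factors into the theoretical team context, rounds, and floors negative values at zero. The name `initial_rc` is ours, and the sketch assumes C is greater than zero. How exact .5 ties should be rounded is not stated in the source; Python's built-in `round` is used here purely for convenience.

```python
def initial_rc(a, b, c):
    """Initial Runs Created (iRC) from the A, B, C factors, per the formula above.

    Assumes c > 0. Tie-breaking on exact .5 values follows Python's round(),
    which is an assumption; the source does not specify it.
    """
    raw = ((a + 2.4 * c) * (b + 3 * c)) / (9 * c) - 0.9 * c
    # Round to a whole number, then change any negative result to 0
    return max(0, round(raw))


print(initial_rc(244, 415.05, 720))  # 136 for Griffey's 1998 season
```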

## Step 3 - Calculate the adjustment for home runs with runners on base
The first situational adjustment in the new RC method is made for how many home runs Griffey Jr. hits with runners on base. This is another adjustment that Bill James didn't fully explain. We need four bits of information for Griffey to do this adjustment:

- Total AB = 633
- AB with runners on base = 303
- Total HR = 56
- HR with runners on base = 26

Then calculate how many HRs Griffey would be expected to hit with runners on base, proportional to his AB:

- Expected HR = (303 / 633) * 56 = 26.81

Subtract expected HR from actual HR with runners on base and round the result:

- 26 - 26.81 = -.81, or -1

This leaves Griffey with an adjustment for home runs with runners on base (HR-ROB) of -1.

## Step 4 - Calculate the adjustment for batting average with runners in scoring position
The information we need here is Griffey's regular batting average and his batting average with runners in scoring position. It's suggested you carry out this calculation to an extra decimal place.

- Regular batting average = 180 hits / 633 at bats = .2844
- Batting average with runners in scoring position = 57 hits / 184 at bats = .3098

Subtract the regular batting average from the batting average with runners in scoring position, multiply the result by at bats with runners in scoring position, and round the result:

- (.3098 - .2844) * 184 = 4.674, or 5

This gives Griffey an adjustment for batting with runners in scoring position (AvgSP) of +5.
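
The two situational adjustments from Steps 3 and 4 might be coded as follows. The function names are hypothetical, and rounding the two batting averages to four decimal places mirrors the worked example here rather than any documented rule.

```python
def hr_rob_adjustment(ab, ab_rob, hr, hr_rob):
    """Actual minus expected home runs with runners on base, rounded."""
    expected = ab_rob / ab * hr  # expected HR, proportional to at bats
    return round(hr_rob - expected)


def avg_sp_adjustment(ab, h, ab_sp, h_sp):
    """Hits gained or lost from batting average with runners in scoring position."""
    avg = round(h / ab, 4)           # carried to four decimals, as in the example
    avg_sp = round(h_sp / ab_sp, 4)
    return round((avg_sp - avg) * ab_sp)


# Ken Griffey Jr., 1998
print(hr_rob_adjustment(ab=633, ab_rob=303, hr=56, hr_rob=26))  # -1
print(avg_sp_adjustment(ab=633, h=180, ab_sp=184, h_sp=57))     # 5
```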

## Step 5 - Calculate the Preliminary RC (PrelimRC)
Add together Griffey's initial Runs Created total and the situational adjustments to get his preliminary RC:

- PrelimRC = iRC + HR-ROB + AvgSP
- PrelimRC = 136 - 1 + 5 = 140

## Step 6 - Calculate the team reconciliation factor
After calculating PrelimRC for all players, sum them for the team. The 1998 Mariners' PrelimRCs add up to 892. Divide the actual team runs (859) by the team PrelimRC (892) to calculate the reconciliation factor (RF):

- 1998 Mariners RF = 859 / 892 = .963

## Step 7 - Multiply team reconciliation factor times individual player's PrelimRC and round off to get the final RC result:
For Griffey, 140 PrelimRC * .963 = 134.82, or 135 Runs Created for 1998.
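
Steps 5 through 7 reduce to a sum, a ratio, and a rounded product. The sketch below uses a hypothetical `reconcile` helper. Rounding the reconciliation factor to three decimal places before multiplying is an assumption taken from the worked example, not something the source states as a rule.

```python
def reconcile(prelim_rc, team_runs, team_prelim_rc):
    """Scale a player's PrelimRC by the team reconciliation factor (RF)."""
    # RF rounded to three decimals, mirroring the worked example (an assumption)
    rf = round(team_runs / team_prelim_rc, 3)
    return rf, round(prelim_rc * rf)


# Step 5: iRC plus the two situational adjustments
prelim = 136 + (-1) + 5

# Steps 6 and 7: 1998 Mariners scored 859 runs against 892 team PrelimRC
rf, final_rc = reconcile(prelim, team_runs=859, team_prelim_rc=892)
print(prelim, rf, final_rc)  # 140 0.963 135
```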

## Reconciliation or incorporating the error for teams into player values?
## Does new theoretical team context do what it's supposed to?
The team reconciliation process done in Steps 6 and 7 is the part of James' changes that we question the most. Here's what Bill James has to say about this part of the process:
Is Bill correct about the error rate of the formula? Well, the average absolute error for all teams from 1984-1987 is about 20 runs. The range of errors is much broader, however. The biggest error for the formula during that time period is the +74 error for the 1987 Cubs. Since the Cubs scored 720 runs that season and the formula projects 794 runs, the formula overestimates by 10.3%. Who does the team reconciliation affect the most? Andre Dawson. Dawson loses 10 runs to "reconciliation". Instead of being credited with 110 runs, "The Hawk" gets credit for only 100. Is this really fair? Did Andre's play generate fewer than the 110 runs the
formula estimates? We don't know for sure, so it's not fair to penalize him.
What's the point? Does subtracting the ten runs tell us something about Andre
Dawson? No. The fact that the formula overestimates tells us that his
To further illustrate, let's look at the other extreme from the same season. The largest negative error for the 1987 National League belonged to the Cardinals. The Cardinals scored 798 runs en route to the 1987 World Series, while the Cardinals' players are estimated to have 755 RC, an error of -43 runs. To account for this discrepancy, 43 runs must be added to the player totals. Which player benefits the most from this largesse? Jack Clark. Clark gets
credited with 124 runs instead of 117 runs. Again, is this fair? No. The fact
