The following appeared in The 1985 Bill James Baseball Abstract.

PAUL JOHNSON'S ESTIMATED RUNS PRODUCED

— Bill James

Last fall I received a letter from an Illinois man named Paul Johnson who claimed that he had developed a very simple method of evaluating run production that was more accurate than runs created. I receive a couple of those letters a year, and it rarely takes me five minutes to get them off my desk. Peter Palmer in The Hidden Game makes a similar claim for the linear weights method, and Pete is a good friend and an outstanding analyst of the game, but in fact linear weights do not meet any acceptable standard of accuracy in assessing an offense . . . well, I went through all that in the historical book.

I spent a long time developing the runs created formula, and I've spent many hours looking for ways to improve it. My assumption has been that runs created could be improved, but that when improvements came, they would have to come from dredging through the minor offensive stats and extending the decimal points. But Johnson seemed reasonably intelligent, and I thought I owed him the courtesy of at least checking out his system.

And damned if it doesn't. I was astonished. I'm not certain that Paul's "Estimated Runs Produced" method is more accurate than runs created—we'll get into that at the conclusion of the piece—but I am convinced that it is an extraordinarily good method. It's accurate, it's simple, and it measures what we need to measure: runs. It is more accurate than runs created for certain types of players. On that basis, I felt I should ask Mr. Johnson to introduce his method to you.

ESTIMATED RUNS PRODUCED

— Paul Johnson

I've come up with a method for estimating run production which is more accurate than even Bill James' runs created formula. Hard to believe? Well, it's even harder for me to believe that Bill invited me to say so in print and in his book! That speaks volumes about his dedication to furthering the baseball knowledge of his readers. When he finds a new, more accurate system he lets you know about it. Even if it isn't his. With that kind of openness he's going to make baseball statistical analysis a science yet. Nominate that man for the Nobel Prize in Baseball. He'd get my vote. Thanks for the chance to introduce my system, Bill.

The estimated runs produced formula I've developed has the same goal as the runs created formula. Both are designed to calculate the number of runs that individual players produce for their teams. For most average and less-than-average players and teams the two formulas yield very similar results. But when you get into high home run production or outstanding slugging percentages, especially when combined with high on-base percentages, then the two formulas go their own and definitely separate ways.

Checking against the actual statistical records of Major League Baseball shows my estimated runs produced formula to be the more accurate:

1970-1984 MAJOR LEAGUE BASEBALL
	Runs/Team	Estimated Runs Produced	Runs Created
Top 10 Teams in Home Runs (avg. 200 HR)	820	821	832
Top 10 Teams in Slugging Pct. (avg. .448)	836	838	851

Estimated runs produced is consistently more accurate. The '84 Detroit Tigers led the Majors with 187 home runs. They scored 829 runs, had 830 estimated runs produced, but 843 runs created. Boston led the Majors in slugging percentage in '84 at .441. The Red Sox scored 810 runs, had 816 estimated runs produced, but 832 runs created.

The differences may not seem that great, and they really aren't at these levels of performance. But even though these represent the top team performances since 1970, these team statistics are not very far removed from average levels of play. Individual players rise far beyond these levels. An average team might hit 120 home runs. A team of Mike Schmidts would smack 350. An average team might have a slugging percentage of .390 or less. A team of Jim Rices would slug over .530. At the statistical levels produced by Schmidt or Rice the differences between the two formulas become magnified, and they become very significant. Some illustrations:

In all the World Series and League Championship Series games ever played in which a team had a .650 or better slugging percentage, the teams combined for a .381 batting average and slugged .709. They scored 276 runs, had 279 estimated runs produced, but 331 runs created: a 19% difference between the formulas.

The highest post-season slugging percentage in a single game was recorded by the Cubs in Game One of the '84 National League Playoff. They crashed five home runs out of the friendly confines of Wrigley Field and rolled up an .895 slugging percentage. They scored 13 times, had 13 estimated runs produced, but 17 runs created. This time the formulas differ by 30%.

At the highest levels of statistical performance runs created overestimates run scoring; it seems to be biased. Would that carry over from team statistics to an individual player's figures? It can't be directly proven, but here's a stab at demonstrating that it would.

A team of Babe Ruths would hit three home runs per game and could be expected to hit at least .300. I sifted through every World Series game looking for teams which had a game with those characteristics. I found 14. And their statistics tallied up to look remarkably like a Ruthian season: 1929, to be exact.

	AB	H	TB	2B	3B	HR	BB	SB	Avg.	Slg.
14 Series Games	509	180	347	27	1	46	65	3	.354	.682
1929 Babe Ruth	499	172	348	26	6	46	72	5	.345	.697

125 runs were scored in the Series games. There were 129 estimated runs produced and 148 runs created. Babe's 1929 estimated runs produced total was 131, his runs created, 149. Both formula figures were nearly identical for the Series games and for Ruth. The estimated runs produced formula gave more realistic results for the World Series games, games which were a virtual mirror image of Ruth's 1929 stats. That doesn't prove that estimated runs produced gives more realistic results for Ruth. But it is highly suggestive.

So how do the formulas compare in rating some of today's top performances? Well, they both agree that Dwight Evans had the best total production in the American League in '84, 125 estimated runs produced, and 132 runs created. The formulas disagreed in the National League. The estimated runs produced formula gave the nod to Dale Murphy, 120 to 117 over Ryne Sandberg. Sandberg edged Murphy, 126 to 123 in runs created. Why the difference in the NL? Two major factors. The first is that the runs created formula puts more emphasis on being caught stealing and grounding into double plays than does the estimated runs produced formula. Murphy had 6 more of those plays than did Sandberg. The second factor is that Murphy had 20 intentional walks compared with only 3 for Sandberg, and the runs created formula puts a lesser value on intentional walks than on a normal base on balls. The estimated runs produced formula puts equal value on each.

Some will undoubtedly point to that last fact as a flaw in the estimated runs produced equation. Perhaps it is. But I'm not convinced of that. Though it's true that the intentional walk won't advance any base runners it does improve the chances of baserunners advancing on a subsequent walk. I saw the following happen at least three times this year: Runner on second. Batter intentionally passed. Next batter walks, advancing the baserunners one base each, or a total of two bases. So because of the intentional walk, the next walk that occurred advanced men two bases instead of advancing them no bases as would've happened had there been no intentional pass. It seems to tend to even out.

But back to how the formulas rate the players. Again, the major differences in the ratings occur when a team or player combines a high on-base percentage with a high slugging average. And I should say a high net on-base percentage. What I mean by that is the chance of being on base minus the chance of being thrown off the bases by being caught stealing or of wiping someone else off the base paths by grounding into a double play. The net on-base percentage is, of course, essentially the first part of the runs created formula.

When rating a player such as Tony Armas, with a low net on-base percentage but a high slugging percentage, the estimated runs produced formula gives a higher rating: 100 vs. 97 runs created. A player such as Eddie Murray, with the highest on-base percentage in the AL and a high slugging percentage, fares much better using the runs created method: 130 vs. only 120 estimated runs produced. The estimated runs produced method rates Armas 20 runs behind Murray, 120 to 100. The runs created formula rates Armas 33 runs behind Murray, 130 to 97. That's a big difference. Just about a game and a half in the standings.

So which method do you believe? Well, I've certainly got my preference. Although I might be slightly biased I'll take estimated runs produced every time. If you're a runs created fan you probably won't go too far wrong most of the time. But be a little wary of the runs created totals for players who hit over .300, those whose batting average added to slugging percentage exceeds about .700, those with many or very few walks, and for those who steal a mean base. When any of those conditions are in evidence, break out the estimated runs produced formula.

Speaking of which, this might be a good time to get a look at this as yet unseen creation. Here is the estimated runs produced formula:

(2 x (TB + BB + HP) + H + SB - (.605 x (AB + CS + GIDP - H))) x .16 = Runs

To get some understanding of the formula it's easiest to at first ignore the numbers in it. Then the formula can be viewed as two simple sides. The left-hand side consists of Total Bases, Bases on Balls, Hit by Pitcher, Hits, and Stolen Bases. It can be thought of as a collection of the positive contributions a batter or baserunner makes to his team. On the right-hand side are the negatives, the outs made by the batter or baserunner. The outs are figured by adding At Bats, Caught Stealing, and Grounded Into Double Plays, then subtracting the times the batter reached base safely on Hits.
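The two sides described above can be sketched in Python. This is a minimal illustration, not Johnson's own implementation: the function name and argument names are assumptions made for this example, and only the formula itself comes from the text.

```python
def estimated_runs_produced(tb, bb, hp, h, sb, ab, cs, gidp):
    """Estimated runs produced, following the formula quoted above."""
    # Left-hand side: positive contributions by the batter or baserunner.
    positives = 2 * (tb + bb + hp) + h + sb
    # Right-hand side: outs made (at bats that were not hits, plus
    # times caught stealing and double-play grounders).
    outs = ab + cs + gidp - h
    return (positives - 0.605 * outs) * 0.16


# A lone single in one at bat: no outs, so only the left side counts.
single = estimated_runs_produced(tb=1, bb=0, hp=0, h=1, sb=0, ab=1, cs=0, gidp=0)
# A lone out in one at bat: only the right side counts, so the value is negative.
out = estimated_runs_produced(tb=0, bb=0, hp=0, h=0, sb=0, ab=1, cs=0, gidp=0)
print(round(single, 4), round(out, 4))
```

Keeping the positives and the outs as separate intermediate values mirrors the "two simple sides" reading of the formula.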

So the formula is really quite simple. The left-hand side tracks the movement of batters and baserunners, the right-hand side keeps tabs on the number of outs made. The numbers exist only to put proper emphasis on the various events. They are essential to making the equation work, but there's no need for me to go into how they came to be what they are. I'll just tell you that it took a hell of a lot of experimenting to settle on the darned things.

I will explain just a bit about the origins of the left side of the formula. It's based on charts I made of the number of bases advanced by batters and base runners on various offensive plays. Among other things, two bits of information I got out of those charts were that home runs moved batters and baserunners three times as many bases as did the typical single, and that a base on balls advanced the batter and baserunners only two-thirds as many bases as did a single. Those two facts led to the design of the left-hand side. I just needed a simple way of saying that a home run was three times as good as a single and that a walk was only two-thirds as good as a single. Well, 2 x (TB + BB + HP) + H + SB does it. A single equals 3; 2 x (1 + 0 + 0) + 1 + 0. A home run equals three times that, 9; 2 x (4 + 0 + 0) + 1 + 0. And a walk equals two-thirds of the single, 2; 2 x (0 + 1 + 0) + 0 + 0. The relative values of doubles, triples, and stolen bases are similarly determined and reflect the actual values I found in my charting of movement around the bases.
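The arithmetic in the paragraph above can be checked event by event. The sketch below assumes nothing beyond the left-hand side of the formula; the table of events and the helper name `left_side` are invented for this illustration.

```python
# (total bases, walks, hit by pitch, hits, stolen bases) credited per event
EVENTS = {
    "single": (1, 0, 0, 1, 0),
    "double": (2, 0, 0, 1, 0),
    "triple": (3, 0, 0, 1, 0),
    "home run": (4, 0, 0, 1, 0),
    "walk": (0, 1, 0, 0, 0),
    "hit by pitch": (0, 0, 1, 0, 0),
    "stolen base": (0, 0, 0, 0, 1),
}


def left_side(tb, bb, hp, h, sb):
    """Left-hand side of the formula: 2 x (TB + BB + HP) + H + SB."""
    return 2 * (tb + bb + hp) + h + sb


weights = {name: left_side(*counts) for name, counts in EVENTS.items()}
for name, weight in weights.items():
    # Show each event's value and its size relative to a single.
    print(f"{name:12s} {weight}  ({weight / weights['single']:.2f} x single)")
```

The home run comes out at three times the single and the walk at two-thirds of it, which are the two ratios the left-hand side was designed to express.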

That's enough on the inner workings of my estimated runs produced formula. But before I go on I just want to mention why the runs created formula seems to overvalue high slugging percentage performances. Remember that the charts I did showed that a typical home run is three times as valuable to scoring as is a typical single. But the runs created formula credits the home run with being four times as valuable as the single:

Home Run	1 hit x 4 total bases / 1 at bat = 4
Single	1 hit x 1 total base / 1 at bat = 1

Too much emphasis is put on the home run. Doubles and triples are similarly overstated. This doesn't create a big problem in fairly accurately assessing run production for most players, however, as they have a pretty normal distribution of extra base hits versus singles. So the two formulas come up with similar results for a majority of players. But not so similar for players with high on-base and slugging percentages. Some more examples from the 1984 season: Don Mattingly had a .381 on-base percentage, slugged .537, and had 111 estimated runs produced vs. 121 runs created. Dave Winfield (.393, .515) had 104 estimated runs produced vs. 113 runs created. Keith Hernandez (.409, .449), 100 vs. 108. Mike Easler (.376, .516), 108 vs. 118. Tim Raines (.393, .437), 115 vs. 124.

At times the two systems produce even greater disagreement. A classic example is George Brett's 1980 season. Brett finished the year at .390 and he hit with power. His .664 slugging mark had been bettered only twice in the previous twenty years, by Hank Aaron's .669 in 1971, and by Mickey Mantle in 1961 when he cracked 54 home runs and slugged .687. Anyway, here's how the two methods measured Brett's year. George had 116 estimated runs produced, 135 runs created. 19 runs; that's some difference. About two wins in the standings. That is a rare case. But one-win differences are very common. Add a couple or three of them together and you might wind up thinking that your third-place talent could win the pennant this year. Or that your pennant contenders will end up in third place. It just depends on which formula you use.

The two formulas also differ somewhat on the value of the running game. The estimated runs produced formula finds stolen bases to be more valuable and caught stealing to be less critical than does the runs created method. For the Top 10 teams in steals from 1970 to 1984 the estimated runs produced formula comes closer to the actual average of 684 runs scored per team, beating out runs created 681 to 677. And it was more accurate for the most prolific base-stealing team in recent history, the '76 Oakland A's (341 stolen bases). They scored 686 runs, had 676 estimated runs produced, but only 663 runs created.

The two methods, not surprisingly, rate Rickey Henderson's 130-theft 1982 season a bit differently: 104 estimated runs produced, only 99 runs created. Not a profound difference, but significant. Especially when juxtaposed with their ratings of Robin Yount (.331, 29 HR) in that same year: 128 estimated runs produced vs. 136 runs created. Again, there is a 13-run difference in the two formulas' comparisons of players. If you're a bit confused about which equation to believe, don't worry too much. Using either one will prevent you from looking as inept as Major League executives sometimes have.

Here's one of the all-time classics as viewed by the estimated runs produced formula. After the '79 season two veteran National League second basemen declared themselves free agents. Over the prior two seasons one had hit .241, the other .243. Over the next two seasons they batted .242 each. They sound like mirror images, don't they? But they sure weren't. The first player was Rennie Stennett. In '78-79 he produced at a rate of 46 estimated runs produced per year. In '80-81 he continued at that rate. The other player did somewhat better. In '78-79 he produced at a rate of 95 estimated runs produced per year. He continued that production in '80-81. I wonder if the San Francisco Giants would've liked the estimated runs produced figures for '78-79 before they passed up Joe Morgan after the '79 season in favor of signing Stennett to a $2.5 million deal? It's a rare case where two players will produce so consistently from year to year, have such similar batting averages, and yet be light years apart in run-producing capabilities. But it did happen in real life. And somebody really blew it.

Of course every tool has its limits, and the estimated runs produced formula is no exception. I was trying to come up with the break-even stolen base percentages for various teams, but I ran into a big stumbling block: Timing. It makes a big difference whether a steal is being attempted with no outs or with two outs. It's a lot easier to justify the attempt with two outs, as doing it then isn't nearly as likely to break up a big inning if you don't succeed as if you tried with no outs. But the estimated runs produced formula can't really be used to calculate just how much more risk there is in trying the no-out steal. At least I haven't figured out how to do it.

I can make some very broad generalizations about break-even percentages for stolen bases. A team that hits for a normal batting average, hits with average power, is average in every other statistic, and steals an average number of times with none, one, and two outs (who knows what that average distribution is?) needs to succeed on about 61% of its steal attempts to make it worthwhile. A slugging team such as the '84 Tigers would have a break-even point of about 65%. A low-power bunch like the '84 Cardinals could justify attempting to steal with a 60% success rate. That's talking about teams, but it would be more appropriate to consider the individual batter at the plate.

With a weak-hitting shortstop at bat, a 55% success rate would be about the break-even point. With a Winfield or Murray at bat, a 70% success rate would be needed. I hesitate to try to get any more specific because as I said, timing is an important factor. I just don't know how important.

GREAT PERFORMANCES

Everybody who has a formula for evaluating offenses has their lists of all-time greats. And I'm no exception. But before I inflict my lists on you I need to explain briefly what they're based on. First, I calculate the player's estimated runs produced. Then I project that figure to a full season's play. This puts each performance into a context where they can be looked at and compared in much the same way as are batting averages. And without further ado, here's my first list.

TOP 10 LIFETIME PERFORMERS IN ESTIMATED RUNS PRODUCED

		Est. Runs Produced	Avg.
1.	Babe Ruth	202	.342
2.	Ted Williams	194	.344
3.	Lou Gehrig	176	.340
4.	Jimmie Foxx	162	.325
5.	Rogers Hornsby	158	.358
6.	Hank Greenberg	153	.313
7.	Mickey Mantle	148	.298
8.	Stan Musial	146	.331
9.	Ty Cobb	145	.367
10.	Joe DiMaggio	141	.325

TOP TEN SEASONAL PERFORMANCES IN ESTIMATED RUNS PRODUCED PER 162 GAMES

		AB	HR	Avg.	Est. Runs Produced/Season
1.	'20 Ruth	458	54	.376	274
2.	'23 Ruth	522	41	.393	266
3.	'41 Williams	456	37	.406	265
4.	'21 Ruth	540	59	.378	262
5.	'57 Williams	420	38	.388	245
6.	'26 Ruth	495	47	.372	239
7.	'24 Ruth	529	46	.378	237
8.	'25 Hornsby	504	39	.403	231
9.	'27 Ruth	540	60	.356	230
10.	'24 Hornsby	536	25	.424	228

I can't let this list pass without a comment to put it in perspective. Babe Ruth's 1920 performance put into a one-game context would look like this:

	AB	H	2B	3B	HR	BB
'20 Ruth	40	15	2	1	5	13

That is truly astonishing. 15 runs per game. 5 home runs per game. 13 walks per game. That is what 274 estimated runs produced means. (Runs created per 162 games would be 324, or 18 runs created per game.)

TOP TEN SEASONAL PERFORMANCES IN ESTIMATED RUNS PRODUCED PER 162 GAMES (1970-1983)

		AB	HR	Avg.	Est. Runs Produced/Season
1.	'80 Brett	449	24	.390	190
2.	'76 Morgan	472	27	.320	179
3.	'75 Morgan	498	17	.327	173
4.	'81 Schmidt	354	31	.316	172
5.	'70 McCovey	495	39	.289	168
6.	'71 Aaron	495	47	.327	168
7.	'79 Lynn	532	39	.333	168
8.	'70 Yastrzemski	565	40	.329	167
9.	'77 Carew	616	14	.388	166
10.	'70 Carty	478	25	.366	163

ADDITIONAL ESTIMATED RUNS PRODUCED FORMULAS

Besides the estimated runs produced formula already introduced there are a couple of other versions that I find useful. The first of these is the simplest. It works very well for groups of statistics for which you don't have all the minor statistics like caught stealing, hit by pitch, etc.

(2 x (TB + BB) + H + SB - (.615 x (AB - H))) x .16 = Runs

The second version works better, especially for players with high stolen base totals, for figuring estimated runs produced projected out to a full season's basis.

(2 x (TB + BB) + H + SB - (.610 x (AB + (SB/4) - H))) x .16 = Runs

Then:

Runs/(AB + (SB/4) - H) x 458 = Runs per 162 Games

(AB + (SB/4) - H) is the number of projected outs made by the player, and 458 would be the number of total outs in a season.

To project estimated runs produced from the original equation to a full season this is the conversion:

Runs/(AB + CS + GIDP - H) x 474 = Runs per 162 Games

(AB+CS+GIDP-H) is the number of outs made by the player and 474 would be the number of total outs in a season.
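The two shorter versions and the full-season projection can be sketched as follows. The function names are assumptions made for this example. The two stat lines are taken from the World Series table earlier in the article; the text does not say which version of the formula produced the totals it quotes for them, so the agreement noted in the comments is only a rounding-level check.

```python
def erp_simple(tb, bb, h, sb, ab):
    """First variant: needs no caught stealing, hit by pitch, or GIDP."""
    return (2 * (tb + bb) + h + sb - 0.615 * (ab - h)) * 0.16


def erp_speed(tb, bb, h, sb, ab):
    """Second variant: charges SB/4 as projected extra outs."""
    outs = ab + sb / 4 - h
    return (2 * (tb + bb) + h + sb - 0.610 * outs) * 0.16


def per_162_games(runs, outs, season_outs):
    """Scale a run total to a full season's worth of outs."""
    return runs / outs * season_outs


# Stat lines from the World Series table earlier in the article.
series_games = dict(ab=509, h=180, tb=347, bb=65, sb=3)
ruth_1929 = dict(ab=499, h=172, tb=348, bb=72, sb=5)

print(round(erp_simple(**series_games)))  # the text reports 129 for these games
print(round(erp_simple(**ruth_1929)))     # the text reports 131 for Ruth's 1929

# Full-season projection with the second variant and its 458-out season.
ruth_outs = ruth_1929["ab"] + ruth_1929["sb"] / 4 - ruth_1929["h"]
ruth_full_season = per_162_games(erp_speed(**ruth_1929), ruth_outs, 458)
print(round(ruth_full_season, 1))
```

For the original formula the same `per_162_games` helper would be called with AB + CS + GIDP - H as the outs and 474 as the season total.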

BILL JAMES' AFTERWORD

One thing that I should do here is to explain the details of the runs created formula in use this year, which I don't think I have done anywhere else in this book. The runs created estimates in this book are derived by what is called the technical version of the runs created formula, which is the same this year as was introduced in the 1984 Abstract on pages 12-15. The formula has three elements, an A element, a B element, and a C element. The three are put together in this way:

(A x B) / C

The A factor, which measures runners on base, is hits plus walks plus hit batsmen minus caught stealing and grounded into double plays (H + W + HBP - CS - GIDP).

The B factor, which measures advancement of baserunners, is total bases plus .26 times hit batsmen and non-intentional walks, plus .52 times stolen bases, sacrifice hits and flies (TB + .26(TBB - IBB + HBP) + .52(SB + SH + SF)).

The C factor, which measures the context in which these things occur, includes at bats, walks, sacrifice hits and flies, and hit batsmen (AB + TBB + SH + SF + HBP).

That's called the technical runs created formula: same as last year's. I don't know why it took four pages to explain last year.
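The three factors can be put together in a few lines of Python. This is a sketch under stated assumptions: the function and argument names are invented for this example, the C factor counts both sacrifice hits and sacrifice flies as the prose description says, and the stat line is made up for illustration and belongs to no real player.

```python
def technical_runs_created(h, tbb, ibb, hbp, cs, gidp, tb, sb, sh, sf, ab):
    """Technical runs created: (A x B) / C, with the factors described above."""
    # A: runners on base.
    a = h + tbb + hbp - cs - gidp
    # B: advancement of baserunners.
    b = tb + 0.26 * (tbb - ibb + hbp) + 0.52 * (sb + sh + sf)
    # C: context, i.e. the opportunities in which A and B occurred.
    c = ab + tbb + sh + sf + hbp
    return a * b / c


# Made-up stat line, for illustration only (not a real player).
example = dict(h=150, tbb=60, ibb=5, hbp=2, cs=4, gidp=10,
               tb=250, sb=10, sh=3, sf=5, ab=550)
print(round(technical_runs_created(**example), 1))
```

Note that A and B are multiplied, whereas every term in the estimated runs produced formula is simply added or subtracted.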

The first thing that I did to try to verify Paul's method was to run his formula on the 1984 season, the statistics of which were not even available at the time that he wrote the letter. His method, for the 1984 season, was a little more accurate than runs created. I checked 1983. Again Johnson's method finished a nose in front of runs created. At that point I thought I had better get serious about checking it out, so I designed a ten-league, 100-team test. The ten leagues were both leagues for the seasons 1955, 1960, 1965, 1970, and 1975.

Runs created beat estimated runs produced in seven of the ten leagues and 56 of the 100 teams; still, because the errors of runs created were significantly larger than those of Johnson's method, his system came out well ahead in the gross error. The gross error for the 100 teams was 1,840.2 runs for Johnson's method (18.4 runs per team); that for runs created was 1,934.8 (19.3 per team).
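The difference between winning more head-to-head comparisons and having the smaller gross error can be illustrated with a short sketch. It assumes that "gross error" means the sum of the absolute differences between estimated and actual runs, which the text does not spell out. The four teams are the ones whose figures are quoted elsewhere in this article; they are not the 100-team sample, and the helper names are invented for this example.

```python
# (team, actual runs, estimated runs produced, runs created),
# using the team figures quoted elsewhere in this article.
TEAMS = [
    ("1984 Tigers", 829, 830, 843),
    ("1984 Red Sox", 810, 816, 832),
    ("1975 Red Sox", 796, 768, 769),
    ("1976 A's", 686, 676, 663),
]
ACTUAL, ERP, RC = 1, 2, 3  # column positions within each row


def gross_error(rows, column):
    """Sum of absolute misses (assumed meaning of 'gross error')."""
    return sum(abs(row[column] - row[ACTUAL]) for row in rows)


def closer_count(rows, column, other):
    """Number of teams on which `column` misses by less than `other`."""
    return sum(
        abs(row[column] - row[ACTUAL]) < abs(row[other] - row[ACTUAL])
        for row in rows
    )


erp_error = gross_error(TEAMS, ERP)  # 1 + 6 + 28 + 10
rc_error = gross_error(TEAMS, RC)    # 14 + 22 + 27 + 23
print(erp_error, rc_error)
print(closer_count(TEAMS, ERP, RC), closer_count(TEAMS, RC, ERP))
```

A method can be closer on most teams and still lose on gross error if its misses, when they come, are larger.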

At that point, I decided I should ask him to present the formula in this book. I'm not convinced that his method would be more accurate than runs created in a larger study; I'm certainly not convinced that it wouldn't. Runs created seemed to be more accurate in the period before the stolen base revolution began; it "won" both leagues in 1955 and 1960 and the American League in 1965.

I don't know that the degree of accuracy involved makes a lot of difference. The real appeal of his method, to me, is its simplicity; it involves just seven categories of information and no calculations except addition, subtraction and multiplication. I was originally suspicious of the system when I saw that ".16" at the end of it. Wouldn't it seem more likely that the most accurate possible system would require multiplication by .15974 or something? My assumption, as I said, was that if better methods were to be developed, they would have to be more complex, more difficult to figure, and that they would grow out of the existing methods. The excitement of finding Johnson's method is that 1) it is so simple, and 2) it was developed entirely independently. These two things suggest that there probably are compromises between the two methods that will prove to be yet more accurate than either method.

But not that much more accurate. Another thing that I noticed in comparing the two methods is that the correlation between the two of them was even closer—much closer—than the correlation of either with actual runs. The two methods tend to make the same mistakes—that is, the 1975 Red Sox actually scored 796 runs. Johnson's system says they should have scored 768 runs, and mine 769. That happens often, and, since we are talking about two completely independent methods, that suggests that we are nearing the limits of the information that exists within traditional batting stats. The errors in Palmer's method, on the other hand, seem to be completely unrelated to the errors of runs created.

I feel certain that Paul's method will find many uses in sabermetrics. I've known for a little over a year that the runs created formula had a problem with players who combined high on-base percentages and high slugging percentages (he is certainly correct about that), and at the time that I heard from him I was toying with options to correct these problems. The reason that this happens is that the players' individual totals do not occur in an individual context. How do I explain this . . . visualize a player's runs created as a rectangle, of which the two dimensions are the ability to get on base and the ability to advance runners. The rectangle representing Eddie Murray is much larger both ways than that representing an ordinary player.

The increase in runs created that results from the extension of the vertical axis is real. The increase in runs created that results from the extension of the horizontal axis is real. The increase in runs created that results from the extension of the one acting upon the extension of the other is not real; it is a flaw in the runs created method, resulting from the player's offense being placed in an individual context. Does that make sense?

What I was thinking of doing was figuring "context" runs created—that is, Eddie Murray's runs created would be figured as the difference between the Orioles' team runs created with Murray, and their team runs created without him, with his statistics taken out. That method in effect prevents the two extensions from acting upon each other, and thus results in runs created estimates which are more accurate for the Eddie Murray, Babe Ruth type of hitter. The runs created estimates for Murray, Sandberg, Murphy, etc., that would be derived by the use of the "context" method would be almost identical to Paul Johnson's estimates for them.
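The "context" idea can be sketched as a with-and-without calculation. For brevity this example uses the bare (A x B) / C shape rather than the full technical factors, which is a simplification. The function names are assumptions, and the (A, B, C) totals are invented for illustration; they are not Murray's or the Orioles' actual figures.

```python
def basic_runs_created(on_base, advancement, opportunities):
    """Bare (A x B) / C form of runs created."""
    return on_base * advancement / opportunities


def context_runs_created(player, rest_of_team):
    """Team runs created with the player minus team runs created without him."""
    with_player = tuple(p + t for p, t in zip(player, rest_of_team))
    return basic_runs_created(*with_player) - basic_runs_created(*rest_of_team)


# Invented (A, B, C) totals, for illustration only.
star = (250, 300, 650)            # well above the team rate in both A and B
rest_of_team = (1800, 2000, 5600)

individual = basic_runs_created(*star)
in_context = context_runs_created(star, rest_of_team)
print(round(individual, 1), round(in_context, 1))
```

With these made-up totals the in-context figure comes out a few runs below the individual one, which is the direction of the correction described above for a hitter who is strong on both axes.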

So I know he's right about that. I suspect he's also right about the stolen base adjustments, though I'm less convinced. I'll make some adjustments to the runs created formula within the next year or so. Right now, I don't know what they will be.

Excerpted from 1985 Bill James Baseball Abstract. Ballantine Books. 1985. James, Bill. "Paul Johnson's Estimated Runs Produced".
