A perennial question in baseball deals with the relative importance of hitting and pitching. Sometimes pitching is said to be “75%” of baseball. This study looks at the run scoring ability and run preventing ability of division winning teams from 1969-1989. In general, division winners are better than the average team at scoring runs and preventing runs, but their biggest advantage over the average team tends to come in scoring runs. In fact, the bigger advantage that the division winners have in run scoring may be statistically significant. That is, hitting is more important than pitching.
In one study, James K. Skipper, Jr. surveyed the opinions of general managers, field managers, and broadcasters. Averaged across the three groups, respondents rated pitching as 59.5% of winning, hitting as 21.6%, fielding as 14.6%, and a variety of other factors as 4.2%. The survey respondents used no statistical analysis in arriving at these percentages.
In another study, A. Brian Ault performed a regression analysis covering the 1973-1983 period in which the number of games a team won was a function of E.R.A., runs scored and fielding percentage. In the A.L., E.R.A. accounted for 37.76% of wins, runs scored 41.06%, and fielding percentage 10.42%. In the N.L., E.R.A. accounted for 41.94% of wins, runs scored 35.31%, and fielding percentage 7.76%. Hitting, or run scoring, contributed much more than the survey data above suggest. It was almost equal to E.R.A. in importance.
Two potentially minor problems exist with Ault’s study. One is the actual validity of the data. Teams do not compile their statistics independently. When one team scores a run, another team gives up a run. Was that run scored due to poor pitching or good hitting? Did a team win a game because of its good pitching or the other team’s bad hitting? I don’t know. This may be a minor problem, but I don’t know how to tell how serious it might be.
The other problem is with how he assigned the percentages mentioned above. In regression analysis, the computer does not specifically tell us these percentages. They come from doing three separate regressions, in which the first sees wins as a function of E.R.A. only. The computer tells us what percentage of wins is explained by E.R.A. Then a second regression is performed in which wins is dependent on E.R.A. and runs. Then the computer tells us what percentage of wins is explained by E.R.A. and runs scored. The percentage from the first equation is subtracted from this second percentage. The remainder is then the percentage of wins explained by runs scored.
For example, if the first regression tells us that 40% of wins is explained by E.R.A. and the second tells us that 70% of wins is explained by E.R.A. and runs scored, the conclusion may be that runs scored contributes 30% of the wins. None of this is explained in the Ault article. It is possible that, had the first equation used runs scored and the second added in E.R.A., we would see runs explaining 40% of wins. Ault does not specify the order in which variables were added, or whether he ran regressions in which each variable stood alone and then averaged the percentages of wins explained. So the conclusion he reaches may not be warranted.
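The order problem can be illustrated with a minimal sketch. The figures for the eight teams are invented for illustration and are not taken from Ault or from any real season. The function names are hypothetical, and the two-variable R-squared is computed from the standard identity in terms of the three pairwise correlations rather than by running a regression package. The point is only that the share credited to a variable depends on whether it enters first or second.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r2_one(x, y):
    """R-squared of a least-squares regression of y on a single predictor."""
    return pearson(x, y) ** 2

def r2_two(x1, x2, y):
    """R-squared of a least-squares regression of y on two predictors,
    computed from the three pairwise correlations."""
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)

# Invented figures for eight hypothetical teams (not real data).
era = [3.2, 3.5, 3.8, 4.0, 4.1, 4.4, 4.6, 4.9]
runs = [720, 760, 690, 700, 650, 680, 640, 610]
wins = [98, 95, 88, 84, 79, 76, 70, 64]

r2_both = r2_two(era, runs, wins)

# Order 1: E.R.A. first, runs scored credited with whatever is left over.
era_first = r2_one(era, wins)
runs_second = r2_both - era_first

# Order 2: runs scored first, E.R.A. credited with whatever is left over.
runs_first = r2_one(runs, wins)
era_second = r2_both - runs_first

print(f"E.R.A. first: E.R.A. {era_first:.1%}, runs {runs_second:.1%}")
print(f"Runs first:   runs {runs_first:.1%}, E.R.A. {era_second:.1%}")
```

Both orders add up to the same total percentage explained, but whichever variable enters first is credited with the part of the explanation that the two variables share.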
Methodology of this study
Since coming in first place is an important objective of all teams, how hitting and pitching contribute to division winners is analyzed. This avoids the problem of data not being independently determined. The relevant question, therefore, is whether the division winners come in first place because of superior hitting or superior pitching.
To measure a team’s ability in scoring and pitching (or run preventing), data from road games only was used. This avoids home field distortions. A team may seem to score a lot of runs because it plays in a good hitter’s park while not actually having a good hitting team. The same is true for pitching.
Since road data is used, I had to use runs allowed to represent the ability of a team’s pitching staff to prevent runs from scoring since my baseball encyclopedia does not give E.R.A. for road games. Also, fielding percentage is not the best measure of fielding and runs scored and E.R.A. are generally related.
Two measures of a team’s ability were used. A team’s number of runs scored and the number of runs allowed were simply divided by the league average to show percentage differences. The other way was to see how many standard deviations from the mean each team was in scoring runs and preventing runs.
The farther a team is from the mean of either runs scored or runs allowed, the greater its ability in those two categories.
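The two measures can be sketched as follows. The league totals and the team are invented for illustration, the function names are hypothetical, and the use of the population standard deviation is an assumption, since the form of standard deviation is not specified here.

```python
from statistics import mean, pstdev

def pct_vs_league(team_value, league_values):
    """Percentage by which a team's total differs from the league average."""
    return (team_value / mean(league_values) - 1) * 100

def sd_from_mean(team_value, league_values):
    """Number of standard deviations a team's total lies from the league mean.
    Assumption: population standard deviation (pstdev)."""
    return (team_value - mean(league_values)) / pstdev(league_values)

# Invented road totals for a hypothetical six-team league (not real data).
league_runs_scored = [300, 320, 340, 360, 380, 400]
league_runs_allowed = [310, 330, 345, 355, 370, 390]

team_scored, team_allowed = 380, 330  # hypothetical division winner

# Positive numbers mean "better than the league" in both categories, so the
# sign is flipped for runs allowed, where a lower total is the better one.
hit_pct = pct_vs_league(team_scored, league_runs_scored)
pitch_pct = -pct_vs_league(team_allowed, league_runs_allowed)
hit_sd = sd_from_mean(team_scored, league_runs_scored)
pitch_sd = -sd_from_mean(team_allowed, league_runs_allowed)

print(f"hitting:  {hit_pct:+.1f}% vs league, {hit_sd:+.2f} SD")
print(f"pitching: {pitch_pct:+.1f}% vs league, {pitch_sd:+.2f} SD")
```

Flipping the sign for runs allowed puts both categories on the same footing, so a team's hitting advantage and pitching advantage can be compared directly.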
The strike year of 1981 was not included in the study.
Of the 80 division winners from this time period, the average team scored 11.45% more runs than the league average. The average division winner gave up 7.3% fewer runs than the league average. So on the percentage measure, division winners are clearly good pitching teams, yet they are even better hitting teams. Their advantage in hitting is 57% higher than it is in pitching (11.45 / 7.3 ≈ 1.57).
A few examples are pertinent. The 1982 Brewers scored 31% more than the average while actually giving up 6% more than the average. So this was a great hitting team and a poor pitching team. The 1971 Pirates scored 27% more than the average while giving up 1% less than average. This was a great hitting team and an average pitching team. On the other side, the 1989 A’s scored 1% fewer runs than average while giving up 16% fewer runs than average. A great pitching team, yet an average hitting team.
A more refined measure of relative strength in hitting and pitching would use standard deviations. Here the results are the same. The average division winner was .9605 standard deviations above the mean in runs scored and .764 standard deviations below the mean in runs allowed. In fact, 43 of the 80 division winners actually had a bigger advantage over the league in hitting than they did in pitching.
Looking at the three teams mentioned above, the 1982 Brewers were 2.298 standard deviations above the mean in runs scored while being .7059 standard deviations above the mean in runs allowed. The 1971 Pirates were 1.95 standard deviations above the mean in runs scored and .095 standard deviations below the mean in runs allowed. The 1989 A’s were .16 standard deviations below the mean in runs scored and 1.571 standard deviations below the mean in runs allowed. For these three teams, the results are the same as they were for the percentage analysis.
Finally, an important question is whether the difference between the average of .956 standard deviations above the mean for runs scored and the average of .767 standard deviations below the mean for runs allowed is statistically significant. If pitching and hitting are of equal importance, the average division winner would be just as far above the mean for runs scored as it is below the mean for runs allowed.
Let X1 = the average standard deviation above the mean for runs scored
Let X2 = the average standard deviation below the mean for runs allowed
The null hypothesis is X1 = X2. This also means that hitting and pitching are of equal importance.
Using a two sample test of means, the test statistic calculated was 1.40. The probability of getting a test statistic at least this far from zero in either direction, using a normal distribution, is .1616. This is not significant.
A similar test comparing the percentage differences in runs scored and runs allowed gave a test statistic of 2.85, which is significant at the .0044 level. In this case, the null hypothesis should be rejected.
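The step from a test statistic to a probability can be checked with a short sketch. It assumes a two-sided test under the standard normal distribution, which is consistent with the .1616 and .0044 figures. The helper names are hypothetical. The z statistic function treats the two samples as independent with unequal variances, which is one common form of a two sample test of means and is an assumption here; the sample values passed to it are invented only to show the call.

```python
from math import erfc, sqrt
from statistics import mean, variance

def two_sample_z(sample1, sample2):
    """z statistic for the difference between two sample means,
    treating the samples as independent with unequal variances."""
    se = sqrt(variance(sample1) / len(sample1) + variance(sample2) / len(sample2))
    return (mean(sample1) - mean(sample2)) / se

def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return erfc(abs(z) / sqrt(2))

# The two test statistics reported in the text.
p_sd = two_sided_p(1.40)    # standard-deviation measure
p_pct = two_sided_p(2.85)   # percentage measure
print(f"z = 1.40 -> p = {p_sd:.4f}")
print(f"z = 2.85 -> p = {p_pct:.4f}")

# Invented advantages for three hypothetical teams, only to show the call.
z_demo = two_sample_z([1.0, 2.0, 3.0], [0.0, 1.0, 2.0])
print(f"demo z = {z_demo:.3f}")
```

The two probabilities agree with the reported values to within rounding. Reproducing the test statistics themselves would require the 80 team-level figures, which are not listed here.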
Hitting was more important than pitching to the division winners in major league baseball from 1969-1989. This is only one time period in baseball history. It is important to look at all time periods to see if the relative importance of hitting and pitching changes over time. If it does, it would also be useful to know why. Some very cursory analysis shows that in earlier time periods, pitching might have been closer in importance to hitting, if not more important. I hope others will look into this. I will, when I get the time.