# Baseball Statistics Glossary

##### Offensive Statistics
AVG (also BA)
Batting Average: hits divided by at-bats. Probably the most common “rate” statistic in baseball, and historically the benchmark for great hitters. Advantages: commonly understood scale (everyone knows what a .300 hitter means), historically entrenched, reasonably easy to understand. Disadvantages: neglects walks and extra-base hits, and doesn’t correlate with run scoring as well as other measures.
OBP (also OBA)
On Base Percentage: originally created by Branch Rickey and Allan Roth in the 1950’s as OBP = (H+BB+HBP) / (AB+BB+HBP). The official stat, as adopted in 1984, includes sacrifice flies in the denominator: OBP = (H+BB+HBP) / (AB+BB+HBP+SF). It is thus possible to have a lower OBP than AVG if a player has very few walks and HBP relative to his sacrifice flies. OBP is an incredibly important statistic, since the one scarce resource in a baseball game is outs, and OBP measures the rate at which outs are avoided (1 - OBP is the rate at which they are consumed). League average OBP’s have recently been around .340, in contrast to the long-term average of .320-.330.
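The two OBP definitions can be compared directly. The sketch below is a minimal illustration, not an official implementation; the function names and the batting line are invented for the example. It shows a low-walk hitter whose official OBP falls below his AVG.

```python
def obp_original(h, bb, hbp, ab):
    """Rickey/Roth form: (H+BB+HBP) / (AB+BB+HBP)."""
    return (h + bb + hbp) / (ab + bb + hbp)


def obp_official(h, bb, hbp, ab, sf):
    """Official form since 1984: sacrifice flies join the denominator."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


# Hypothetical batting line, invented for illustration:
# 150 H, 2 BB, 0 HBP, 500 AB, 8 SF
avg = 150 / 500
print(f"AVG          {avg:.3f}")
print(f"OBP original {obp_original(150, 2, 0, 500):.3f}")
print(f"OBP official {obp_official(150, 2, 0, 500, 8):.3f}")
```

With only 2 walks against 8 sacrifice flies, the official figure drops under the batting average, while the original formula cannot.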
SLG (sometimes SA)
Slugging Average or Slugging Percentage.  Defined as Total Bases per At-bat.   Total bases are 1 for a single, 2 for a double, 3 for a triple, and 4 for a home run (no extra credit is given for runners on base).  Thus, the formulae for SLG are: SLG = TB/AB or SLG = (1B + 2*2B + 3*3B + 4*HR)/AB or SLG = (H+2B+2*3B+3*HR)/AB.  League average SLG in the 90’s has generally been around .420, though earlier eras were more like .390-.400.
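As a quick check that the SLG formulae above agree, here is a small sketch. The function names and the sample batting line are assumptions made for illustration only.

```python
def total_bases(singles, doubles, triples, hr):
    """TB: 1 per single, 2 per double, 3 per triple, 4 per home run."""
    return singles + 2 * doubles + 3 * triples + 4 * hr


def slg_from_components(singles, doubles, triples, hr, ab):
    """SLG = (1B + 2*2B + 3*3B + 4*HR) / AB."""
    return total_bases(singles, doubles, triples, hr) / ab


def slg_from_hits(h, doubles, triples, hr, ab):
    """SLG = (H + 2B + 2*3B + 3*HR) / AB, counting only the extra bases."""
    return (h + doubles + 2 * triples + 3 * hr) / ab


# Hypothetical line: 100 1B, 30 2B, 5 3B, 25 HR (160 H) in 550 AB
print(slg_from_components(100, 30, 5, 25, 550))
print(slg_from_hits(160, 30, 5, 25, 550))
```

Both calls work from the same 275 total bases, so they return the same value.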
ISO
Isolated Power: extra-bases per at-bat.  Invented by Rickey and Roth in the 1950’s.   Extra bases are 0 for a single, 1 for a double, 2 for a triple, and 3 for a home run.  ISO = SLG – AVG, or ISO = (TB-H)/AB
TBP
Total Base Percentage: Total Bases per Plate Appearance — similar to SLG, but uses the denominator from OBP (AB+BB+HBP) rather than just AB.  Not a commonly used stat.
OPS
Onbase Plus Slugging.  OPS = OBP + SLG.  Not a true rate statistic, since it combines rate stats with different denominators, but it is a very common, handy “index” to rank player production by.  Often, the decimal point is omitted, thus a .380 OBP + .450 SLG = .830 OPS is written as simply 830 OPS.  League average is roughly 750, top players will exceed 1000, while the worst regulars are generally around 600.
PRO+ (sometimes PRO or APRO)
Adjusted Production.  A park- and league-adjusted version of OPS.  Used by Total Baseball, and thus often quoted in comparing players from different eras.   Defined as PRO+ = (OBP / LgOBP) + (SLG / LgSLG) -1, where OBP and SLG have been adjusted for the player’s home park, and LgOBP and LgSLG are the league average OBP and SLG, respectively.  A PRO+ of 100 is league average  (as in OPS, the decimal point is omitted).  Often thought of as the “percentage” better than league average that a player’s rate of production was — e.g. a PRO+ of 120 is “20% better” than league average, while a 95 PRO+ is “5% worse”.  The percentages are not strictly true, but it’s a useful approximation.
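The PRO+ formula can be sketched as follows. This assumes the OBP and SLG passed in have already been park-adjusted (the adjustment itself is not shown); the function name and sample values are illustrative only.

```python
def pro_plus(obp, slg, lg_obp, lg_slg):
    """PRO+ = (OBP/LgOBP) + (SLG/LgSLG) - 1, reported with the decimal dropped.

    Assumes obp and slg are already park-adjusted.
    """
    return round(100 * (obp / lg_obp + slg / lg_slg - 1))


# Hypothetical hitter at .380 OBP / .450 SLG in a .340 / .420 league
print(pro_plus(0.380, 0.450, 0.340, 0.420))
# A hitter exactly at league average comes out at 100
print(pro_plus(0.340, 0.420, 0.340, 0.420))
```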
ABR or BR/A (or sometimes just BR)
Adjusted Batting Runs. Part of Total Baseball’s Linear Weights (LWTS) system. A measure of the number of runs a batter generates beyond what an average batter generates in the same number of plate appearances, park-adjusted. The formula is:
Runs = (.47)1B + (.78)2B + (1.09)3B + (1.40)HR + (.33)(BB + HB) – (.25)(AB-H) – (.50)OOB
The -0.25 value for an out is modified depending on the league and season in question such that the league ABR total is zero (thus an ABR of zero means the batter was exactly average).
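A minimal sketch of the formula above, before any park adjustment. The function name and batting line are invented; OOB is read here as outs on base (an assumption, since the text does not expand the abbreviation), and `out_weight` is exposed as a parameter because the text says the out value is tuned per league and season.

```python
def batting_runs(singles, doubles, triples, hr, bb, hbp, ab, h,
                 oob=0, out_weight=0.25):
    """Linear Weights batting runs with the coefficients quoted above.

    out_weight defaults to .25; per the text, it is adjusted by league and
    season so that the league total comes out to zero.
    """
    return (0.47 * singles + 0.78 * doubles + 1.09 * triples + 1.40 * hr
            + 0.33 * (bb + hbp)
            - out_weight * (ab - h)
            - 0.50 * oob)


# Hypothetical line: 100 1B, 30 2B, 5 3B, 25 HR, 60 BB, 5 HBP, 550 AB, 160 H
print(round(batting_runs(100, 30, 5, 25, 60, 5, 550, 160), 1))
```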
RC
Runs Created — Bill James’s model for offensive production.  In its most basic form, RC = OBP*TB.  The intuitive explanation is that scoring runs consists primarily of two actions: getting on base or creating baserunners (the OBP term), and power or advancing runners around the bases (the TB component).  Note that since SLG = TB/AB, RC can also be written as RC = OBP*SLG*AB, which is the justification for the use of OPS as a useful index.  There are a variety of more technical versions that include stolen bases, double plays, and correct for the lack of complete data in years past.   Runs Created is most accurate for teams, but is often used to measure individual production.  It has a tendency to overrate top offensive players, who get “double-credit” for “driving themselves in” when combining their own OBP and SLG.
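The basic form, and its equivalence to OBP*SLG*AB, can be sketched like this. Only the simplest version described above is covered (none of the technical variants), and the names and numbers are illustrative assumptions.

```python
def runs_created_basic(obp, tb):
    """Basic Runs Created: RC = OBP * TB."""
    return obp * tb


def runs_created_from_rates(obp, slg, ab):
    """The same quantity written as OBP * SLG * AB, since SLG = TB / AB."""
    return obp * slg * ab


# Hypothetical hitter: .380 OBP, 275 TB in 550 AB (so SLG = .500)
print(runs_created_basic(0.380, 275))
print(runs_created_from_rates(0.380, 275 / 550, 550))
```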
RC/25 and RC/27
Runs Created per 25 (27) Outs: the rate statistics associated with Runs Created. It is derived by taking total Runs Created, dividing by the number of outs the player made (AB-H, sometimes adding CS or GIDP depending on the available data), and multiplying by 27. The resulting figure is an estimate of how many runs a team made up of 9 clones of the player would score per game. Sometimes 25 outs is used instead of 27, since some outs (such as baserunning outs) do not appear in the batting line. On average, approximately 25 of the 27 outs in a full 9-inning game are accounted for by the traditional batting stats, hence the popularity of RC/25.
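A short sketch of the per-game conversion. The function name, the optional CS and GIDP arguments, and the sample figures are assumptions made for illustration.

```python
def rc_per_game(rc, ab, h, cs=0, gidp=0, outs_per_game=27):
    """Runs Created per `outs_per_game` outs.

    Outs are AB - H, plus CS and GIDP when that data is available.
    Pass outs_per_game=25 for RC/25.
    """
    outs = (ab - h) + cs + gidp
    return rc / outs * outs_per_game


# Hypothetical hitter: 104.5 RC, 550 AB, 160 H (390 batting outs)
print(round(rc_per_game(104.5, 550, 160), 2))                     # RC/27
print(round(rc_per_game(104.5, 550, 160, outs_per_game=25), 2))   # RC/25
```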
MLV and MLVr
Marginal Lineup Value (and rate): based on Runs Created, MLV corrects the shortcomings of using RC for individual players by estimating how many additional runs an average team scores if you replace an average batter with the individual. This eliminates the “batting yourself in” flaw in individual RC, and correctly estimates the impact of the extra team plate appearances a player creates with a high OBP. A full explanation is available in the Stathead Baseball Engineering Library at http://www.stathead.com. MLV is the total number of runs a player adds over the games he plays in a season, while MLVr is a rate stat that measures MLV per game.
SBR
Stolen Base Runs: SBR = .3*SB-.6*CS or SBR = .3*(SB-2*CS). Total Baseball’s approach to quantifying base-stealing. Numerous statistical studies show that the breakeven success rate for steals (the rate at which attempting to steal is neither helping nor hurting the team in terms of total runs scored) is about 67%; below that you are costing your team runs. Each successful steal adds about .3 runs to a team’s total, far less than is generally believed. SBR estimates the impact of base-stealers, which, other than for the elite base-stealers, rarely amounts to more than a few runs per year.
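The breakeven point falls straight out of the weights: with a caught stealing costing twice what a steal gains, SBR is zero when steals are exactly double the times caught, i.e. a 2-in-3 success rate. A minimal sketch, with an invented function name and sample totals:

```python
def stolen_base_runs(sb, cs):
    """SBR = .3*SB - .6*CS."""
    return 0.3 * sb - 0.6 * cs


# 20 SB and 10 CS is a 67% success rate: roughly zero net runs
print(round(stolen_base_runs(20, 10), 2))
# A hypothetical strong season of 40 SB and 10 CS (80%)
print(round(stolen_base_runs(40, 10), 2))
```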
MLE
Major League Equivalency: the Bill James/STATS method for translating minor league performance into what the player would have hit had he been in the majors. MLE’s normalize past performance, rather than predicting future performance. The method for MLE’s has never been fully published, though James describes a precursor to the current approach in his 1985 Baseball Abstract. It involves making a “league difficulty” adjustment that brings a player’s stats to a standard major league level, similar to how park adjustments normalize a player’s performance to a neutral park.
EqA (aka Davenport Translations)
Equivalent Average — Invented by Clay Davenport, and published in the Baseball Prospectus, EqA combines an alternate system to MLE for translating minor league performance with a normalizing step that converts translated performance to a single number on the same scale as batting average.  E.g. a minor league player may hit .300/.400/.500 in the AAA Pacific Coast League.  Given that the PCL is a high-offense league, that might translate to a .265/.330/.440 major league equivalent performance, which in turn is summarized as a .270 EqA, which is slightly above the average major league performance (set at .265).  EqA has an advantage over MLE’s in that EqA’s are computed down to A-ball, and thus can be used to track players throughout the minors.   There is also somewhat more information published on the EqA method in the Baseball Prospectus than on the James MLE method in the STATS Handbooks.

##### Pitching Statistics
ERA+ (and RA+)
Park and League Normalized ERA (or RA, which can be substituted for ERA throughout this explanation). ERA+ = LgERA / ERA, where ERA has been park-adjusted (and the decimal dropped). This is the pitcher’s counterpart to PRO+, and measures how well the pitcher prevented runs from scoring relative to the rest of the league (factoring in the home park he plays in). Note that while a lower ERA is better, the direction is reversed for ERA+: a higher ERA+ represents better performance. As with PRO+, 100 is league average, and 120 is 20% better than average.
APR or PR/A
Adjusted Pitching Runs.  APR = IP/9 * (LgERA – ERA), where ERA has been park adjusted.  Measures the number of runs a pitcher prevents from scoring compared to a league average pitcher in a neutral park over the same number of innings.  The quantitative counterpart to the ERA+ rate stat.
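ERA+ and APR use the same two inputs, one as a ratio and one as a difference scaled by innings. The sketch below assumes the ERA passed in is already park-adjusted; function names and sample values are illustrative only.

```python
def era_plus(era, lg_era):
    """ERA+ = LgERA / ERA, reported with the decimal dropped (100 = average)."""
    return round(100 * lg_era / era)


def adjusted_pitching_runs(ip, era, lg_era):
    """APR = IP/9 * (LgERA - ERA): runs prevented versus a league-average pitcher."""
    return ip / 9 * (lg_era - era)


# Hypothetical pitcher: 200 IP, 3.00 park-adjusted ERA, league ERA of 4.20
print(era_plus(3.00, 4.20))
print(round(adjusted_pitching_runs(200, 3.00, 4.20), 1))
```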
WHIP (aka Ratio)
Walks and Hits per Inning Pitched. WHIP = (H+BB)/IP, sometimes stated per 9 innings: WHIP = (H+BB)/IP*9. Also known as BaseRunners/Inning (or BR/9). Rose in prominence with the popularity of fantasy leagues, particularly Rotisserie leagues, where it is one of the pitcher’s statistical categories. This is analogous to OBP for pitchers (though a pitcher’s OBP can be computed given complete data, WHIP is a more commonly seen substitute).
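Both forms of WHIP in a short sketch. It assumes innings pitched is given as a true decimal (for example 200.333 for 200 1/3 innings, not the box-score notation 200.1); names and totals are invented.

```python
def whip(h, bb, ip):
    """WHIP = (H + BB) / IP, with ip as a true decimal number of innings."""
    return (h + bb) / ip


def baserunners_per_9(h, bb, ip):
    """The per-9-innings form: (H + BB) / IP * 9."""
    return whip(h, bb, ip) * 9


# Hypothetical season: 180 H, 60 BB in 200 IP
print(round(whip(180, 60, 200), 2))
print(round(baserunners_per_9(180, 60, 200), 1))
```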
Game Score
GS = 50 + 3*IP - 2*(H+R+ER) - BB + SO + 2*(each full inning completed, starting with the 5th)
Bill James invented this per-start pitching metric as a response to the adoption of Quality Starts, which James felt were too coarse a measurement for evaluating the quality of a pitcher’s outing. Game Score is a useful rule of thumb, since it is calculable from a box score, but it is not sabermetrically precise (e.g. strikeouts are weighted twice as much as other outs). It is commonly thought of as a measure of how “dominant” a pitcher was in the game (sometimes colloquially known as “Ryanicity”, since Nolan Ryan and his multiple no-hitters produced some of the best Game Scores measured). Kerry Wood’s 20-strikeout, 1-hit shutout in early 1998 was the highest Game Score ever recorded for a 9-inning game, coming in at 105. A variant of Game Score includes a -1 penalty for HBP and balks, though this gets away from James’s original intent, which was to have GS calculable from the items in a typical box score pitching line.
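The Game Score formula translates directly into code. This sketch takes outs recorded rather than innings pitched (3*IP is simply the number of outs), which sidesteps fractional innings; that input choice and the function name are assumptions made for the example.

```python
def game_score(outs, h, r, er, bb, so):
    """Game Score from a box-score pitching line.

    outs is the number of outs recorded (3 per inning pitched).
    """
    full_innings = outs // 3
    # +2 for each full inning completed, starting with the 5th
    inning_bonus = 2 * max(0, full_innings - 4)
    return 50 + outs - 2 * (h + r + er) - bb + so + inning_bonus


# Kerry Wood's 1998 start described above: 9 IP, 1 H, 0 R, 0 BB, 20 SO
print(game_score(outs=27, h=1, r=0, er=0, bb=0, so=20))  # 105
```

Because R and ER are both inside the -2 term, an earned run costs 4 points in total and an unearned run costs 2.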
QS (Quality Start)
A game started in which a pitcher lasts six innings or more and allows three earned runs or fewer. Some later variants make it 6 innings with <=2 runs allowed, or 7 or more innings with <=3 runs allowed. A rough rule of thumb for separating good starts from bad ones, with the noble intent of crediting pitchers with how well they pitched and removing run support from consideration (the major problem with W-L records). QS are controversial because the boundary case (6 IP, 3 ER) leads to an unimpressive ERA of 4.50, which critics argue is not “quality”. While the point is well-taken, it is still a useful dividing point because (a) most quality starts are not of the 6&3 variety, (b) 6&3 is an easy dividing line to remember and use, and (c) it is relatively close to what a league average performance (a natural dividing line between good and bad) would be, particularly in the higher offense era of the mid 90’s.
##### Defensive Statistics
DA (Defensive Average)
The Baseball Workshop/Project Scoresheet has been tracking the location of every batted ball in every game for the past several years. “Defensive Average”, or DA, is the rate at which fielders turn balls hit into their zones into outs. DA is analogous to a fielder’s “batting average against”, which would measure times reached base (by hit or error) per opportunity (ball hit into his zone); DA counts the outs rather than the times reached. The areas of responsibility for DA span the entire field: no portion of the field is considered to be beyond the reach of some fielder. More info on DA is available in the Defense section of the Stathead Baseball Engineering Library.
ZR (Zone Rating)
STATS devised their own system of zones to track locations of batted balls. STATS uses this data to measure a fielder’s “Zone Rating”, or ZR. Zone Rating differs from DA in several ways, notably in that an outfielder is credited with two outs on a double play (DA counts just the fact that a batted ball was converted into an out, without attempting to measure the value of the 2nd out). The areas of responsibility for ZR are smaller for each fielder than in DA. ZR areas of responsibility do not span the entire field — some areas (for example, deep in the gap between CF and RF) are considered to be a “no man’s land” that is ordinarily beyond the reach of fielders, and thus a ball hit there is not considered an opportunity.
RF
Range Factor, RF = (PO + A) / Innings Played * 9 (sometimes computed as RF = (PO+A)/Games Played). Range Factor is, essentially, the number of plays made per 9 innings (or game). It is a somewhat better measure of defense than Fielding Percentage, in that it does take into account how well a fielder gets to a batted ball. RF is calculable for most of baseball history, and as such is probably the best tool we have for assessing historical defensive performance. RF’s are generally only comparable at particular positions (i.e. you can compare a 3B to another 3B, but not to a SS or a CF). RF is sometimes adjusted for the number of strikeouts the pitching staff records in front of the fielders (more strikeouts means fewer outs made by the fielders). The major disadvantage of RF is that there is no real control for opportunities (you can’t tell whether, say, a shortstop’s RF is high because he was an excellent fielder, or because he played behind an extreme groundball pitching staff).
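Both versions of Range Factor, sketched with invented names and fielding totals. No strikeout adjustment is attempted here.

```python
def range_factor_per_9(po, a, innings):
    """RF = (PO + A) / Innings Played * 9."""
    return (po + a) / innings * 9


def range_factor_per_game(po, a, games):
    """The simpler form used when innings are unavailable: (PO + A) / Games."""
    return (po + a) / games


# Hypothetical shortstop: 250 PO, 450 A in 1300 innings over 150 games
print(round(range_factor_per_9(250, 450, 1300), 2))
print(round(range_factor_per_game(250, 450, 150), 2))
```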
Fielding Percentage (aka Fielding Average)
Defined as F% = (PO+A)/(PO+A+E). PO = Putouts, A = Assists, E = Errors. Fielding percentage is the most common defensive rating in baseball today, which is unfortunate because it is probably the least accurate of any major commonly used statistic in baseball. Why? It doesn’t measure what it claims to measure (success rate of fielding opportunities). If you accept that the primary role of a fielder is to convert batted balls into outs, then fielding percentage measures only those batted balls which the fielder could easily reach, and only those failures to convert them that the official scorer deemed should have been made with ordinary effort. A lack of range or mobility does not show up in fielding percentage. Indeed, a fielder who does not attempt to make a difficult play fares better than an Ozzie Smith-type who can get to the ball, but may get charged with an error in trying to complete the play. The difference in errors among major league players in the modern era is so low (amounting to a handful of balls per year) that differences in range dominate the true differences in defensive performance, yet fielding percentage essentially ignores them. The key here is that opportunities are not independent of the fielder’s own performance, and that the most frequent kind of failure is not penalized. Avoid using fielding percentage if at all possible.
Fielding Runs
Fielding Runs (or FR) are a generic term for any statistical treatment of fielding that converts a fielder’s performance to runs.  The most common approach is that used by Total Baseball. Basically, TB’s approach involves weighting the number of putouts, assists, and double plays made by each fielder, and comparing the totals to positional norms to come out with a figure above/below average.  Each extra out made (or hit allowed) is worth X runs in the LWTS model, which leads to the FR figure.  The disadvantages of the TB method are similar to those of Range Factor, and the FR figures are considered particularly unreliable for catchers and first basemen.  Career figures are also considered better indicators than individual seasons.

There are also FR figures based on Defensive Average, which are considered more reliable in the few years where DA is available.  In principle, FR could be done for ZR also, but I am not aware of any published attempts at doing so.
##### Advanced, Positional, Normalizing or Total Value Statistics
VORP (and VORPr)
Value Over Replacement Player (and rate) — developed by Keith Woolner of Stathead Consulting, VORP measures how many runs a player would contribute to a league average team compared to a replacement level player at the same position who was given the same percentage of team plate appearances as the original player had.  VORP, in its most advanced form, uses MLV to estimate the change in team run scoring attributable to the player’s performance.  A full description of VORP, MLV, and replacement value is available at the Stathead Baseball Engineering web site.
VORPD
VORP + Defense. VORP implicitly assumes all players are average defenders at the position they play. This clearly is not the case for all players. All versions of Fielding Runs in use today are intrapositional; that is, they compare a fielder to others at the same position. Thus, you can add their FR total directly to VORP to get a more accurate total reading on a player’s overall value (any of the FR methods will work, depending on your preference).
PMLV and PMLVr
Positional Marginal Lineup Value (and rate). Based on Marginal Lineup Value (MLV and MLVr), PMLV measures the runs produced by a player beyond what an average player at his position would produce (rather than a league average hitter, regardless of position). Computed simply as MLV(player) – MLV(positional average player with same PA%).
RIG family (BRIG, SPRIG, RPRIG)
Rate Index Grades: a family of “Report Card” rate statistics developed by Tom Fontaine of Stathead. The RIG method takes the pool of starting players (or starting pitchers, or relief pitchers), park-adjusts their stats, and measures the mean and standard deviation of performance for a variety of categories (e.g. for batters, some of the stats include AVG, OBP, SLG, but also walk rate, strikeout-to-walk ratio, etc.). Players are assigned a letter grade in each category based on how many standard deviations above or below the mean they are relative to their peers. The resulting “report card” of performance can be easily used to identify a player’s strengths and weaknesses, or trends over several seasons. A full description of the RIG family is available in the Stathead Baseball Engineering Library.
LWTS (or LW)
Linear Weights — Thorn and Palmer’s system of measuring baseball production by assigning a coefficient (or weight) to each event and adding up the sums (a linear combination).  Batter Runs (BR and ABR), Pitcher Runs (PR and APR), Fielding Runs (FR), Total Player Rating (TPR), etc. are based on this system.  LWTS is used extensively in Total Baseball, and is a convenient model for a variety of analyses.   The basic LWTS formula is:
Runs = (.47)1B + (.78)2B + (1.09)3B + (1.40)HR + (.33)(BB + HB) – (.25)(AB-H) – (.50)OOB
TPR
Total Player Rating — Total Baseball’s ultimate stat, which represents the number of wins above average attributable to a player.  This is roughly equal to the sum of a player’s Batting Runs, Pitching Runs, and Fielding Runs (park adjusted, and considering position) divided by 10.  The exact divisor depends on the level of league-wide offense in a given season, but is almost always between 9 and 11.
Park Factors
Park Factors (or PF) are used to control for the effect a player’s home park has on his overall raw statistics. There is a great deal of confusion about park factors, in part because they are used for two similar, but distinct purposes. The most common use of park factors is to adjust the value of a player’s performance by recognizing that individual runs are less valuable in hitter’s parks (because it takes more of them to earn a win on average) and more valuable in pitcher’s parks. This is sometimes thought of as the “run-currency” approach to park factors. Coors-runs are not the same as Astrodome-runs, because they purchase the desired result (wins) at different rates (e.g. it may take an average of 6 Coors-runs to purchase a win, but only 3.75 Astrodome-runs on average). Using park factors in this way translates all runs earned into a “common” currency, usually league-average runs.

A second and distinct use of park factors is to project what a player’s raw production would have been in a neutral park (or a particular park, say guessing what Dante Bichette would hit in Fenway). In general this involves looking at the components of a player’s production (i.e. how many of his hits were home runs, doubles, singles, how often he struck out, etc.) and looking at a park’s particular effect on that kind of production. When considering a player moving between parks, the park effects come into play twice: once in converting a player’s actual stats to a neutral park as a baseline, then in converting into the new park. This kind of analysis is considerably more difficult than the “value” use of park factors, and is done more rarely.

The most common form of park factor is computed by dividing the home total of some stat of interest by the road total (assuming equal numbers of games played at home and away). E.g. if Red Sox home games saw 847 runs scored (between both teams), while Red Sox road games saw 770 runs scored, then the Park Factor for Runs in Fenway would be 847/770 = 1.10 (usually written without the decimal point as 110). This means that 10% more runs were scored during games at Fenway than with the same teams (the Red Sox and their opponents) away from Fenway. Park factors can be computed this way for any stat (HR, SO, etc.), but the Park Factor for runs is the most common.
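The home/road ratio can be sketched in a few lines, using the Fenway figures from the example above. The function name is invented, and equal numbers of home and road games are assumed, as the text states.

```python
def park_factor(home_total, road_total):
    """Basic park factor: home total / road total, written without the decimal.

    Totals combine both teams, and home and road game counts are assumed equal.
    """
    return round(100 * home_total / road_total)


# 847 runs in Red Sox home games, 770 in Red Sox road games
print(park_factor(847, 770))  # 110
```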

When adjusting a player’s statistics, the Park Factor is usually halved, reflecting the fact that only half of the average player’s games are played at home, while the rest are in the other league parks (usually assumed to be league average when all road games are aggregated). E.g. if a player had a raw RC/27 of 7.40 and played in a hitter’s park with a PF of 108, the actual park factor used would be 104 (half of the 8% inflation in the hitter’s park), and the adjusted RC/27 would be 7.40 / 1.04 = 7.12. Similarly, a pitcher in the same park whose ERA was 3.80 would see his adjusted ERA improve to 3.65 ( = 3.80 / 1.04). Park Factors for pitchers’ parks are < 1 (or less than 100 when omitting the decimal), and serve to boost hitters’ numbers while penalizing pitchers’ numbers.
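The halving step and the two worked examples above can be reproduced with a short sketch. The function names are assumptions; the simple halving treats road games as league average, as the text describes.

```python
def halved_factor(pf):
    """Turn a home park factor (e.g. 108) into the multiplier actually applied.

    Only half of a player's games are at home, so half of the park's
    inflation is used: 108 -> 1.04.
    """
    return 1 + (pf / 100 - 1) / 2


def park_adjust(value, pf):
    """Divide a raw rate stat (RC/27, ERA, ...) by the halved park factor."""
    return value / halved_factor(pf)


print(round(park_adjust(7.40, 108), 2))  # hitter's RC/27: 7.12
print(round(park_adjust(3.80, 108), 2))  # pitcher's ERA: 3.65
```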

There are more advanced forms of park factors, which adjust for the differences between batting and pitching factors, compute road park factors to correct for the fact that treating the road as league average is not completely precise, and more.