Saturday, August 22, 2009

Advanced Statistics (Defense)

Probably the most overlooked, undervalued, and difficult part of the game of baseball is defense. Until recently (roughly the last five years), teams and fans did not realize how important good defense was. This ignorance toward defense can be attributed to two things. The first is the era we are living in. As you know, the last 15 years have been by far the most potent in terms of offensive production, due to advances in PEDs and the realization among players that staying in shape reduces injuries and lengthens careers (which amounts to more money).
Now I could go into a whole discussion on why the way we measure defense now (at least in the mainstream) is wrong, but instead I will just show you how advanced statistics are tackling defense.
Multiple ways to measure defensive value have been created in the last few years. Two of the major ones are Dewan's Plus/Minus system and Mitchel Lichtman's Ultimate Zone Rating, known as UZR. While the Plus/Minus system is a huge step up from fielding percentage and the like, I believe UZR is the best system that advanced statistics has in place, so that is what we will use. So the obvious question: how does UZR work?
The baseball field is first broken down into 64 zones. Almost all of this work is done by computers. On a league-wide basis, a computer tracks the number of hits in each zone, the run value of a hit in that zone, and the number of outs recorded in that zone (for each position). The computer then tracks each player at a fielding position and measures two things: the number of hits in that zone while the player was on the field at that position, and the number of outs recorded by that player, at that position, in that zone. Here is an example used by Mitchel Lichtman to help you understand how UZR works. It was written in 2003.

"Let's use the data to calculate Mike Bordick's UZR runs in zone 56 (the area between third base and shortstop). First we establish the out rate for all ground balls hit into zone 56. That is 1419 divided by 2474 (1419 plus 1055), or .57. That is, 57% of all ground balls hit into zone 56 in 2002 were turned into outs (by all fielders). Therefore, the "extra" value of a "caught ball" by a fielder in zone 56 is 1 minus .57, or .43 balls. Since Bordick caught 18 balls in zone 56, he has 18 times .43, or 7.7 "extra" caught balls so far.

Now what about the hits? There were 79 hits in zone 56 while Bordick was playing SS. Surely he is not responsible for all of those hits. How many is he responsible for? Well, since an average SS catches 294 balls in zone 56 out of 1419, or 20.7% of the outs, Bordick is responsible for 20.7% of the 79 hits as well, or 16.4 hits (the third baseman is responsible for the other 62.6 hits). I told you it was going to be tricky! Now, just like the "extra" positive value of a "caught ball" is 1 minus .57, the "extra" negative value of a hit is the .57 itself (an average ball hit into zone 56 gets caught 57% of the time, so when a ball isn't caught, the responsible fielders, in this case the SS and third baseman, get "docked" .57 balls). Since Bordick is responsible for 16.4 of the 79 hits in zone 56, he has 16.4 times .57, or 9.4 "negative" caught balls added to his 7.7 "positive" ones, for a total of -1.7 "extra" caught balls. In other words, given the number of balls hit into zone 56 while Bordick was at SS, he caught 1.7 fewer balls than the average SS in the AL in 2002.

Now we want to convert those "extra" balls into runs saved or cost. For that, we use the average run value of a hit in zone 56 - which is .47 runs. Since a 2002 AL out is worth -.29 runs, the "swing" between an out and a hit is .47 plus .29, or .76 runs. Since Bordick caught 1.7 fewer balls in zone 56 than an average SS, he has cost his team 1.7 times .76, or 1.3 runs so far (i.e., his UZR runs in zone 56 is 1.3). If we do this for every zone in which any SS made at least one out (i.e., the applicable SS zones), and we add up all the runs Bordick saved or cost in each zone, we get a total of +6.2 runs, or 6.2 runs saved by Bordick while playing SS (he must have done well in the other zones)."
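
For anyone who would rather see the zone 56 arithmetic as code, here is a minimal Python sketch of the steps in the quote above. The function name and parameter names are my own, not Lichtman's, and it only covers a single zone for a single fielder. It also uses unrounded intermediate values, so it agrees with the quoted figures only after rounding. A negative result means balls or runs below average.

```python
def zone_range_runs(league_outs, league_hits, position_outs, player_outs,
                    hits_while_on_field, hit_run_value, out_run_value):
    """Return (extra caught balls, runs) for one fielder in one zone.

    Hypothetical helper that follows the quoted walkthrough step by step.
    out_run_value is passed as a positive magnitude (0.29 for a -.29 out).
    """
    # Share of balls in this zone that the league turned into outs.
    out_rate = league_outs / (league_outs + league_hits)
    # Each ball the player caught is worth (1 - out_rate) extra balls.
    credit = player_outs * (1 - out_rate)
    # The position's share of league outs decides its share of the hits.
    share = position_outs / league_outs
    # Each hit the player is responsible for costs out_rate balls.
    debit = hits_while_on_field * share * out_rate
    extra_balls = credit - debit
    # Swing between a hit and an out, in runs.
    swing = hit_run_value + out_run_value
    return extra_balls, extra_balls * swing


# Bordick at SS, zone 56, with the 2002 numbers from the quote.
balls, runs = zone_range_runs(1419, 1055, 294, 18, 79, 0.47, 0.29)
print(round(balls, 1), round(runs, 1))  # roughly -1.7 balls and -1.3 runs
```

Repeating this for every applicable zone and adding up the runs gives the range total described at the end of the quote.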

A lot to read, I know, but absolutely incredible stuff. Other components feed into the total as well: errors, turning double plays for infielders, and arm strength for outfielders. Let's just focus on errors. Here is some more of Mitchel Lichtman's work:

"The average SS committed 169 ROE (reached on base errors) errors in 5218 balls gotten to (outs plus ROE's) in all zones. That is an error rate of 169 divided by 5218, or .032. Since Bordick got to a total of 277 balls in all zones, he should have committed .032 times 277, or 8.9 errors. Instead, Bordick committed only 1 error, for a net gain in errors of 7.9. Since an infield error is worth around .49 runs, the swing between an error and an out is .49 plus .29, or .78 runs. Therefore, Bordick saved another .78 times 7.9, or 6.2 runs, by virtue of his "good hands". So far, we have Bordick saving 6.2 runs with his range and another 6.2 runs with his sure hands.

There is one final thing to consider: Bordick's non-ROE errors. Like ROE errors, that is easily done.

The average SS committed 45 non-ROE errors and Bordick none. If we do the same calculations as above, using .3 as the value of a non-ROE error, we come up with Bordick saving another .72 runs. So it looks like even at the ripe old age of 36, Mike Bordick saved his team last year a total of 13 runs by virtue of his outstanding play (range and hands) at SS!"
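
The error math in the quote can be sketched the same way. Again, the names here are mine, and two things are assumptions on my part: that the non-ROE calculation uses the same 5218 balls gotten to as its denominator (the quote only says "the same calculations as above"), and that the .3 value is applied directly as the run multiplier. With those assumptions the quoted .72 falls out. The 6.2 range runs are simply taken from the quote.

```python
def error_runs(league_errors, league_chances, player_chances,
               player_errors, run_value):
    """Runs saved (positive) or cost (negative) relative to average.

    Hypothetical helper: expected errors at the league rate, minus actual
    errors, times the run value of avoiding one error.
    """
    league_rate = league_errors / league_chances
    expected_errors = league_rate * player_chances
    return (expected_errors - player_errors) * run_value


# ROE errors: the swing between an error and an out is .49 + .29 = .78 runs.
roe_runs = error_runs(169, 5218, 277, 1, 0.49 + 0.29)

# Non-ROE errors: assumes the same 5218 denominator and .3 runs per error.
non_roe_runs = error_runs(45, 5218, 277, 0, 0.3)

range_runs = 6.2  # range total, taken directly from the quote
total = range_runs + roe_runs + non_roe_runs
print(round(roe_runs, 1), round(non_roe_runs, 2), round(total))
```

This lands on about 6.2 runs for ROE errors, about .72 for non-ROE errors, and about 13 runs overall, in line with the quoted totals.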

So in determining infield defense, you add the range runs (RngR), error runs (ErrR), and double play runs (DPR) together and come up with a single number, which is UZR runs. For outfielders it's RngR, ErrR, and arm strength runs (ARM). For example, let's add up Nyjer Morgan's season this year in terms of UZR. Here are the numbers: ARM 10.1, RngR 17.9, ErrR 0.6. When you add these up you get about 28.6 runs above average. Nyjer Morgan, under UZR, is the best defensive outfielder in baseball.
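
As a small sketch of that final step, here is the component sum in Python. The function name and the dictionary layout are just for illustration. Note that the rounded components listed for Morgan sum to 28.6; a published total can differ by a tenth when it is computed from unrounded components.

```python
# Components that make up UZR for each kind of fielder, as described above.
INFIELD_COMPONENTS = ("RngR", "ErrR", "DPR")
OUTFIELD_COMPONENTS = ("ARM", "RngR", "ErrR")


def uzr_total(components, outfielder):
    """Add up the run components that apply to the player's position type."""
    names = OUTFIELD_COMPONENTS if outfielder else INFIELD_COMPONENTS
    return sum(components[name] for name in names)


# Nyjer Morgan's 2009 components as listed in the post.
morgan = {"ARM": 10.1, "RngR": 17.9, "ErrR": 0.6}
morgan_uzr = uzr_total(morgan, outfielder=True)
print(round(morgan_uzr, 1))
```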

A few things UZR doesn't account for, at least for now: pitcher and catcher defense, the park the game is played in, the speed of the ball, the pitcher's groundball/flyball tendencies, batter handedness, and the combination of outs and runners on base.

Fangraphs.com has the UZR ratings of every player in baseball in case you would like to see some more examples. The data only goes back to 2002, so don't go searching for Willie Mays' UZR in 1954. The next section will deal with positional adjustments.

Also, here is a link to Mitchel Lichtman's own article on UZR: http://www.baseballthinkfactory.org/files/primate_studies/discussion/lichtman_2003-03-14_0/
