A Replacement Level Baseball Blog

Monday, January 28, 2008

Lowered Expectations, Part II

In my last post, I looked at eBABIP (LD% + .117) as a predictor of actual BABIP, and my main findings were:

-There is some year-to-year stability in size of the residual between eBABIP and BABIP for a given player.

-eBABIP is slightly more effective on good hitters with relatively more playing time, and slightly less effective on below average hitters with fewer PAs. Specifically:

Best prediction quartile (means): .269/.336/.429, 430 plate appearances
Worst prediction quartile (means): .262/.326/.411, 304 plate appearances
League 2005-2007 (min. 100PA): .267/.333/.421, 391 plate appearances

In that post I also said that today I'd run some regressions to get a more precise idea of where eBABIP does its best predictive work. So I looked at playing time in 50-plate-appearance intervals, and I looked at offensive production by OPS in .20 intervals (I'd use EqA, but I couldn't find numbers for 2005-2006). Here's how it went.



As you can see, there is a big jump in predictive power around the .800 OPS line, with the highest R-square values coming from players with at least a .900 OPS (rare enough territory). In terms of plate appearances, there is a more steady increase in predictability starting around the 300 PA threshold and peaking at 500 PA.

From this, one would figure that the best test group for eBABIP would be .900 OPS guys with at least 500 PAs. Turns out that's true. The R-square for the eBABIP/BABIP regression in those cases is .255, which is the highest value I found using these two parameters.
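For anyone who wants to replicate this, here's a minimal sketch of how the bucketed regressions could be run. It assumes the player-season data lives in a pandas DataFrame called df with hypothetical column names ('ebabip', 'babip', 'ops', 'pa'); treat those names and the threshold sweep as my assumptions, not the exact script behind the numbers above.

```python
from scipy.stats import linregress

# df is assumed to hold one row per player-season with hypothetical columns:
# 'ebabip', 'babip', 'ops', 'pa'
def group_r_squared(df, ops_min, pa_min):
    """R-squared of BABIP on eBABIP for player-seasons above both cutoffs."""
    sub = df[(df['ops'] >= ops_min) & (df['pa'] >= pa_min)]
    slope, intercept, r, p, se = linregress(sub['ebabip'], sub['babip'])
    return r ** 2, len(sub)

# Sweep thresholds like the ones discussed above: OPS cutoffs and 50-PA steps
for ops_min in [0.700, 0.800, 0.900]:
    for pa_min in range(100, 651, 50):
        r2, n = group_r_squared(df, ops_min, pa_min)
        if n < 20:
            continue   # skip thin buckets
        print(f"OPS >= {ops_min:.3f}, PA >= {pa_min}: R^2 = {r2:.3f} (n = {n})")
```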

This certainly confirms the quick-and-dirty hypothesizing we did using the means of the upper and lower quartiles by residual: eBABIP works best on good hitters with a lot of playing time. Still, when you regress the residual on PA and OPS together, the two predictors account for only about 14% of the variance in the residual.

Over/Underestimated Players by BABIP

In my last post I also said that I'd eventually look at the signed value, as opposed to the absolute value, of the residual. This number tells you not only how far off eBABIP is, but whether it over- or underestimates a player's production. Think of it as giving a vector instead of a magnitude.

So, I ran Pearson correlations between the residual and about 20 different rate and count stats. I found statistically significant relationships with many--the highest was with batting average, which came in at -.490 (meaning that players with higher batting averages tend to have lower residuals and vice versa). OBP and SLG had similar negative relationships at -.358 and -.260, respectively. There was also a slight negative relationship with strikeout rate (-.180). But the correlations that really caught my eye were three "speed" numbers: SB (-.226), 3B (-.224), and GB% (-.355). If you don't see my leap of logic in including ground ball percentage as a speed-related number, read on.
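The correlations themselves are a one-liner per stat; here's a sketch, again assuming the same player-season DataFrame. The 'residual' column is eBABIP minus BABIP, and the stat column names are placeholders, not the exact labels in my spreadsheet.

```python
from scipy.stats import pearsonr

# df: one row per player-season; 'residual' = eBABIP - BABIP.
# The stat column names below are hypothetical placeholders.
for col in ['avg', 'obp', 'slg', 'so_rate', 'sb', 'triples', 'gb_pct']:
    r, p = pearsonr(df['residual'], df[col])
    flag = '*' if p < 0.05 else ''   # flag correlations significant at p < .05
    print(f"{col:>8}: r = {r:+.3f} {flag}")
```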

In interpreting these relationships, remember that I'm using the signed value of the residual. So, for instance, the residual's negative relationship with stolen bases, triples, and ground ball percentage tells you that guys with higher rates in those categories tend to be underestimated by eBABIP--that is, they tend to have negative residuals. Conversely, guys with lower SB, 3B, and GB% numbers tend to be overestimated by eBABIP--they tend to have positive residuals. What this tells me is that eBABIP underestimates fleet-footed, Punch-and-Judy types whose ability to beat out ground balls and take extra bases accounts for a large part of their offensive value. Likewise, eBABIP overrates line-drive hitters who are slow on their feet and don't get their share of ground ball hits. I could get a much better idea of the correlation here by computing Bill James' speed score and pitting it against the residuals, but they don't pay me enough (i.e., anything at all) to do it right now.

So instead, I'll look at a few extreme case studies, namely guys who beat their eBABIP by 50 points or more, and see informally if a pattern emerges. First, let's look at guys who were grossly underestimated by eBABIP--guys who had residuals of -.050 or less. Since my data is arranged by player-team-year and covers 2005-2007, a player can appear up to three times, and wouldn't you know it, three guys do:

Willy Taveras, Joey Gathright, and Luis Castillo all hit at least 50 points better than their line drive rates would suggest in each year from 2005-2007. As predicted, all three are slap-happy singles hitters with above average speed. In fact, they are more or less each other's closest active comparables. The guys who make the list twice mostly fit the bill as well:

Derek Jeter
Howie Kendrick
Kelly Shoppach
Pablo Ozuna
Preston Wilson
Ryan Shealy
Tadahito Iguchi
Victor Diaz
Willy Aybar

Don't ask me to explain Kelly Shoppach beyond small sample size, though: as a backup catcher, he never got more than 200 PAs.

On the other extreme, the list of guys who came up short of their expected BABIP by 50 points or more is topped by lead-footed St. Louis backup catcher Gary Bennett, who appears three times. Bennett has a slightly above-average LD%, but does absolutely nothing with it. In 2007, he managed just 54 total bases on 44 hits, good for a .221/.298/.271 line. Still, there is no shortage of good hitters who make the list, including Eric Byrnes, Barry Bonds, and Frank Thomas, all of whom appear twice. Like these three, many on the list are slow sluggers or, like Bennett, catchers. The complete stats for all members of the "50 Point Club" can be found here.
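For what it's worth, pulling the "50 Point Club" out of the spreadsheet is trivial once the residuals are computed; a sketch, assuming the same player-team-year DataFrame (column names hypothetical):

```python
# df: one row per player-team-year with 'player', 'year', and
# 'residual' (eBABIP minus BABIP) columns -- names are placeholders
under = df[df['residual'] <= -0.050]   # eBABIP undershot actual BABIP by 50+ points
over  = df[df['residual'] >=  0.050]   # eBABIP overshot actual BABIP by 50+ points

# Count how many of the three seasons (2005-2007) each player appears in
print(under['player'].value_counts())   # e.g. Taveras, Gathright, Castillo at 3
print(over['player'].value_counts())    # e.g. Bennett at 3
```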

The next step would be to get a better handle on the shape of the relationship between these speed factors and BABIP, and to come up with a more nuanced version of eBABIP that reflects them. Fat chance getting that out of me today.

Sunday, January 27, 2008

Lowered Expectations

How to beat the odds and hit above your line drive rate.

One of the more interesting insights to come out of Clay Davenport's marriage of baseball analysis with computer science is this: take one hundred major leaguers who are "true" .270 hitters and give them all 200 simulated at-bats, and by luck alone at least one of them will hit .350 and another .190.

There is a lot of random variance in baseball, enough for all-stars to look like Texas Leaguers and triple-A rejects to look like home run kings given a short enough timeline. And enough that finding ways to separate luck from talent in a given season can be tremendously profitable for general managers.

One of the more popular ways of doing this is by comparing actual to expected batting average on balls in play (BABIP to eBABIP). The former, of course, is calculated by removing home runs and strikeouts as outcomes of at-bats: you take (H-HR)/(AB-HR-SO). The latter is estimated by taking a hitter's line drive rate (expressed as a decimal like .182) and adding a constant to it--usually .120--to get a number like .302.

So far as I can tell, the constant is what it is because it's equal to league average BABIP minus league average LD% over some significant stretch of time (over the last three years this number is .117). If there is a more complex way of deriving it, I'm not aware of it. The idea eBABIP is intended to capture is that most of a player's hits are going to come off line drives, and that the number of hits a player gets off fly balls and grounders is fairly stable (since defensive efficiency on those outcomes is fairly stable).
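To make the definitions concrete, here's a minimal sketch of both formulas exactly as described above; the default constant is the .117 I derived for 2005-2007, with .120 being the number you usually see quoted.

```python
def babip(h, hr, ab, so):
    """Batting average on balls in play: (H - HR) / (AB - HR - SO)."""
    return (h - hr) / (ab - hr - so)

def ebabip(ld_rate, constant=0.117):
    """Expected BABIP: line drive rate plus a league-derived constant.
    The constant is league BABIP minus league LD% over some stretch
    (.117 for 2005-2007; .120 is the conventional figure)."""
    return ld_rate + constant

print(round(ebabip(0.182, constant=0.120), 3))   # 0.302, the example above
```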

But eBABIP can also be used as an excuse for using generalities to draw fallacious conclusions about individual players, since there will always be some who consistently beat their BABIP expectations by substantive margins.

Well, so I claimed anyway. I figure it's only fair to run some numbers and see whether I'm right or stupid. So what I did is this: I took all major leaguers with at least 100 plate appearances in each year from 2005 to 2007 and I calculated BABIP and eBABIP for all of them. I also collected about 100 other rate and count stats for these guys. Then I put the whole mess of it in a big-ass spreadsheet and ran a bunch of statistical tests.

First off, I wondered if there was any year-to-year stability in the size of the residual (the difference between a hitter's expected and actual BABIP, expressed as a decimal). If there was, it would indicate that how well or poorly eBABIP predicts BABIP for a given player is not random, but a function of some disposition or capacity that player has. I say "disposition" or "capacity" instead of "skill" because I don't care whether the thing that determines a player's residual is useful, only whether it's tangible.

OK, so I ran three-year intraclass correlations on the BABIP residuals and got an "R" number of .276. The R-number gives you something like the ratio of signal to noise in the sample; in this case it's just about 3-to-1 noise. Not great, but not terrible considering that the R-number for BABIP itself over the same stretch is just .369 (as a point of comparison, the number for home runs is .675). Moreover, the residual is actually more stable than eBABIP itself, which came in at .267. Since I'm not a statistician, I'm not sure what eBABIP's instability really means beyond the fact that LD% is itself unstable (and that's a freebie, given that eBABIP just is LD% plus a constant).
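Since "intraclass correlation" covers a family of formulas, here's a sketch of one common formulation, the one-way, single-measure ICC(1,1), computed over a players-by-seasons matrix. I won't swear it matches the exact variant my stats package spits out, so treat it as illustrative.

```python
import numpy as np

def icc_oneway(x):
    """One-way, single-measure intraclass correlation, ICC(1,1).
    x: 2-D array, one row per player, one column per season."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    row_means = x.mean(axis=1)
    # Between-player and within-player mean squares from one-way ANOVA
    ms_between = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((x - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# residual_matrix (assumed): one row per hitter, columns = 2005/2006/2007 residuals
# print(round(icc_oneway(residual_matrix), 3))
```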

Anyway, the upshot is there is at least something systematic in how well or poorly eBABIP predicts BABIP for a given player. And if that's so, it's then reasonable to infer that there are certain "types" of players for whom the eBABIP method is particularly accurate, and others for whom it is not.

So I took the absolute value of the residuals--the distance from zero, a perfect prediction--and ranked these by percentile. I then ran the means, standard deviations, skews, etc. on the lower quartile (the most accurate eBABIP predictions) and the upper quartile (the least accurate eBABIP predictions). The descriptive statistics are here and here. Two things stood out.

1) The "good predictor" group are better hitters than the "bad predictor" group. The AVG/OBP/SLG for the former is .269/.336/.429, compared to .262/.326/.411 for the latter. The good predictors also walked more (BB/PA) and struck out less (SO/PA).

2) The good predictor group played more than the bad predictor group, averaging more games (111 vs. 84), plate appearances (430 vs. 304) and at-bats (383 vs. 271).

Surely (1) and (2) are related. Good hitters get more playing time than bad ones, and the smart money says that, for what we're interested in, the causal chain goes from (2) --> (1). In other words, it isn't that eBABIP fails for poor hitters, but that it fails for smaller sample sizes. (Though, when you take into account that the good predictors are almost a year older on average, it becomes marginally trickier to guess which variable is doing most of the work, since older players tend to get more PAs than younger ones, regardless of talent.) I should also say that the good predictors averaged a higher VORP (15.0) than the bad predictors (8.92), which nicely parallels (1) and (2), since VORP takes both production and playing time into account.

So eBABIP can't get a handle on young, part-time, and/or poor hitters, kinda sorta. None of this is really revelatory, nor even that helpful, in part because there is more work to do. Tomorrow, or when I get around to it, I'll run regressions between eBABIP and BABIP at different OPS and PA thresholds just to get a better idea of where on the output/playing-time spectrum eBABIP does its best and worst work. I'll also run some numbers on players whose BABIPs were substantially over- and under-estimated by eBABIP. Basically, instead of worrying about the absolute accuracy of eBABIP (as measured by the absolute value of the residual), I'll be looking at whether there is any pattern to the way eBABIP shortchanges some players' output and overblows others'.

But that's tomorrow.

Tuesday, January 22, 2008

Snake Oil Salesmen







Last year, the Arizona Diamondbacks won 90 games and the NL West with a mix of young talent (Conor Jackson, Mark Reynolds, Chris Young) and peak-aged regulars having career years (Eric Byrnes, Orlando Hudson), plus another solid season from Brandon Webb (good for an ERA+ of 156) and innings ably eaten by Doug Davis and Livan Hernandez. In the postseason, the D-Backs looked pretty good running over the Cubs in the divisional series, and not terrible having the favor returned by the pick-of-destiny Rockies in the NLCS.


But to the statheads, none of this mattered as much as how definitively they beat their Pythagorean Projection. By +11 games, to be precise.
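For anyone rusty on the math: the Pythagorean projection estimates a team's winning percentage from its run differential. A quick sketch, using the D-Backs' approximate 2007 run totals (roughly 712 scored, 732 allowed); the exponent is an assumption on my part, since the classic version uses 2 and later refinements use about 1.83 or a dynamic exponent.

```python
def pythag_wins(runs_scored, runs_allowed, games=162, exponent=1.83):
    """Expected wins from run differential. The classic exponent is 2;
    later refinements use roughly 1.83 or a dynamic exponent."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return games * rs / (rs + ra)

# 2007 Arizona: roughly 712 runs scored, 732 allowed, 90 actual wins
expected = pythag_wins(712, 732)
print(round(expected), 90 - round(expected))   # ~79 expected, ~+11 over projection
```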


As the Snakes put up a 46-35 record to open the season (while being badly outscored), baseball analysts were falling all over themselves to explain, or explain away, this beating of the odds. Sal Baxamusa (whose real name used to be Sal Baxamusamusabaxa) at Hardball Times took this as an opportunity to rant.


When reading Sally's beef, I couldn't help but think about Dustin Hoffman's tantrums in Rain Man, especially the scene in which he mumbles "397 toothpicks, defnly defnly 397 toothpicks" over and over while gently rocking himself. Nothing upsets a left-brainer more than an outlier.




Baxamusa picks on guys like Jim Rosenthal, who made the seemingly coherent observation that the Diamondbacks tend to win their close games and lose in blowouts. Others were so audacious as to suggest that the relevant factor might be conscientious leveraging of a quality bullpen.


Mere superstition, cried the grad student. Cast it off like Biblical Creationism, the Female Orgasm, and so many other untenable myths. You see, for the sabermetrically devout, there are only numbers and noise. Any variance in the dependent variable not explained by the best predictor variable(s) must be random. Never mind that Pizza Cutter's study at MVN (which the guy cites!) shows that the Pythagorean residuals--the difference between the winning percentage predicted by the model and the actual winning percentage, basically a measure of over/under-performance--are correlated rather strongly with several of the factors Baxamusa calls "rationalizations": winning percentage in one-run games, offensive consistency, and inconsistency in run prevention.

Now, maybe Sal's point is just that these factors aren't representative of particular athletic or managerial "skills". But it seems that the first and last (performance in close games and inconsistency in run prevention) are rather closely related to bullpen leverage. An optimally levered bullpen is one in which the best arms pitch in the situations that have the highest impact on the outcome of the game. This is fairly obvious. What maybe isn't so obvious is that the opposite holds as well: an optimally levered bullpen is one in which the worst arms pitch in the situations with the least effect on the outcome of the game. In other words, mopup duty in blowouts.


And then there's offensive consistency. All offensive consistency means in this case is that a team's runs scored per game cluster tightly around the mean. There are infinitely many ways for this to happen, most of which are either random or at least unintentional and thus don't speak to any discernible "skill". Some ways of producing consistency--like managers using consistent lineups and batting orders, or GMs putting a premium on non-streaky hitters--are intentional but still don't necessarily reflect a skill (the idea of lineup/batting-order optimization is to score as many runs as possible given a certain group of hitters, not to score the same number of runs as often as possible).

In the present case, it doesn't matter to me whether the D-Backs' offensive consistency was reflective of a skill; I'm only interested in whether it was reflective of something besides randomness. That is, whether the outcomes that account for the error between the D-Backs' projected and actual win totals were more the result of choices or of chance. Even the best play-by-play data can't successfully quantify all the choices made within an organization over 162 games. But statistical analysts often make the mistake of confusing randomness with unmeasurability.

That's all for now.

Monday, January 21, 2008

Exit Sandman?

Is Mariano Rivera Slipping?


Mariano Rivera's late inning dominance has been as crucial to the Yankees' success over the last ten years as any single factor. But perhaps more impressive still is the consistency with which he has gone about it. In the four rate statistics believed to be more or less defense-independent—strikeout rate, walk rate, isolated slugging, and ground ball to fly ball ratio—and even in the notoriously defense-dependent or “luck” driven batting average on balls in play (BABIP), Rivera has been remarkably dependable. Most of the variation clusters closely around the mean: Rivera usually strikes out between 18 and 25 of every 100 batters he faces, walks between 4 and 7, and gives up 2 to 3 times as many ground balls as fly balls. Even in the metrics where there is a noticeable skew, it tends to be toward superior performances.
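For reference, here's how those five rates break down from basic counting stats. The denominators (per batter faced for strikeout and walk rates) and the variable names are my assumptions; different sources slice these slightly differently, e.g. per plate appearance or per nine innings.

```python
def pitcher_rates(bf, so, bb, h, hr, ab, tb, gb, fb):
    """Defense-independent(ish) rate stats from raw counting stats.
    bf = batters faced, tb = total bases allowed; names are placeholders."""
    return {
        'K%':    so / bf,                    # strikeouts per batter faced
        'BB%':   bb / bf,                    # walks per batter faced
        'ISO':   (tb - h) / ab,              # isolated slugging against (SLG - AVG)
        'GB/FB': gb / fb,                    # ground balls per fly ball
        'BABIP': (h - hr) / (ab - hr - so),  # batting average on balls in play
    }
```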

But in 2007, Rivera had what many regarded as an “off year”. Indeed, across both conventional and unconventional stats, he ranked as no more than an average closer. Of the thirty-six pitchers in baseball last year with at least 10 saves, Rivera fared thusly:

He faced more batters and gave up more hits (many of them line drives, more on this later) than almost any other closer in baseball. When the same thirty-six closers are given standardized "scores" that measure their value over or under the positional average, Rivera comes in at a measly .40 above the mean (by comparison, Takashi Saito and J.J. Putz topped the list at 10.69 and 7.28 above the mean, respectively). Unsurprisingly, Rivera also saw his ERA balloon to 3.15, the highest since his rookie season (and the first time it's been north of 2.00 since 2002).
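I haven't spelled out how those composite "scores" are built. One plausible construction--a guess on my part, not necessarily the one behind the 10.69/7.28/.40 figures--is to convert each stat to a z-score against the closer-pool average and sum them, sketched here with hypothetical column names.

```python
def composite_score(closers, stats):
    """Sum of per-stat z-scores against the mean of the closer pool.
    Lower-is-better stats (ERA, WHIP) should be negated before passing in."""
    z = (closers[stats] - closers[stats].mean()) / closers[stats].std()
    return z.sum(axis=1)   # total standard deviations above/below the average closer

# closers: a pandas DataFrame, one row per pitcher, with hypothetical columns
# closers['score'] = composite_score(closers, ['k9', 'neg_era', 'saves'])
```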


The most popular explanations of Rivera's down performance focused on two factors: age and misuse. The age argument seems unlikely on the face of it. While it's true that at 37, Rivera was the fourth-oldest closer in baseball last season—behind Todd Jones, Trevor Hoffman, and Bob Wickman—it's also true that six of the top ten closers last season were 30 or over (including Hoffman at 40 and Saito at 37). When power pitchers slip, the signs usually start to appear around age 32. But it was at that age that Rivera's most dominant five-year stretch began (between 2002 and 2006, opponents hit just .213 and slugged just .285 off Rivera, while he averaged over 40 saves per year with a 1.83 ERA and a WHIP of 0.98). While age eventually becomes a factor for any major league pitcher, starters tend to collapse more rapidly than relievers, and Rivera's overall stability and dominance through his mid-thirties provides little reason to think his decline phase will be anything more than gradual.

The misuse argument is a bit harder to assess. Was Rivera left in too long in some games? Perhaps. When pitching in the eighth or ninth inning, opponents hit just .250/.291/.337 off Rivera. In extra innings of work the numbers went to .227/.320/.500. But these increases are normal signs of fatigue for any pitcher, and the only 2007 figure substantially disparate from Rivera's career splits in those situations is in extra-inning slugging percentage (Rivera's career SLG-against in those situations is .340). And this is hardly unusual, as over a short period of time even one extra-inning home run can drastically inflate SLG (especially since closers tend to leave and/or games tend to end after extra-inning shots).

Was Rivera not used regularly enough? Was he over- and/or under-rested for significant stretches, and did this affect his performance? Again, there is no compelling pattern here. In 2007 Rivera pitched his poorest on zero days' rest (.294/.333/.451 against, with an ERA of 3.38), which is normal enough, but his ERA ballooned to 5.27 on three days of rest before coming back down with each subsequent day. His career splits are for the most part tightly clustered, and where there is a substantial difference it appears basically random: Rivera posts his best career ERAs on two, three, and five days of rest, and his worst on one, four, and six-plus days of rest. None of this supports the case for improper usage.

Did Rivera under-perform in non-save situations, as the common bit of closer psychology predicts? Somewhat. Rivera's ERA for the 30 appearances in which he received no decision in 2007 was 3.30, higher than his career mark of 2.24. But Rivera's ERA was up from his career averages across the board—from a ridiculous 0.68 in saves to a slightly less ridiculous 1.06, and from 16.17 to 24.30 in losses—so it is hard to pin the change on pitching to the score. It is also important to note that earned run average is a flawed measure of the underlying quality of a pitcher's performance, especially for relievers, for whom one or two bad outings can be unfairly distorting.

So if there is no smoking gun that points to age or misuse, what exactly caused Rivera's poor showing in 2007? The likeliest answer is luck. Pure, dumb luck. More on why next post.

Sunday, January 20, 2008

Bang for the Buck

Doug Pappas, the late, great former chair of SABR's "Business of Baseball" committee, created a formula for determining the marginal cost of a win, in terms of payroll dollars spent, for a given MLB team. He started by determining the minimum payroll required to field a team of "replacement level" players (essentially, what it costs to fill a 40-man roster at league minimum) and estimated the number of games that such a team could expect to win (Pappas set this at 49 games). The marginal cost per win was then the ratio of payroll dollars above minimum to wins above minimum.
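The arithmetic is simple enough to spell out in a few lines; the Yankees figures in the example are approximate, not the numbers from the table below.

```python
MINIMUM_PAYROLL = 380_000 * 40   # $15.2M: a 40-man roster at the 2007 league minimum
REPLACEMENT_WINS = 49            # Pappas' estimate for an all-minimum roster

def marginal_cost_per_win(payroll, wins):
    """Payroll dollars above the minimum for each win above 49."""
    return (payroll - MINIMUM_PAYROLL) / (wins - REPLACEMENT_WINS)

# e.g. a ~$190M payroll and 94 wins works out to roughly $3.9M per marginal win
print(round(marginal_cost_per_win(190_000_000, 94) / 1e6, 2))
```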

Using a minimum payroll of $15.2 million (literally just the league minimum of $380,000*40) and Pappas' benchmark of 49 wins, here is how the numbers break down for 2007.

As you can tell, I've highlighted the playoff teams and made some other modifications to the raw numbers. For one thing, I adjusted the Red Sox and Yankees numbers to reflect the Competitive Balance or "Luxury" tax, and just for grins I estimated the Yankees payroll minus Roger Clemens (assuming Clemens' starts would have been given to a "replacement level" player like Ian Kennedy; replacement level is in scare-quotes for obvious reasons).

A few things are readily apparent here. The Yankees spent an awful lot of money on 94 wins and a first-round playoff elimination, a fate made more embarrassing by the fact that the team that knocked the Yanks out shared the best record in baseball with the Red Sox at a third of what it cost John Henry.

Even though the Yankees look like the biggest wastrels in baseball, they at least got something for the money. Not so for the Orioles, White Sox and especially the Giants. On the bright side, Bonds' home run chase easily brought in enough additional revenue to offset his $15.5 million salary and 5.8 WARP1 performance, especially when you consider that Bonds only charged the Giants about $3 million for each of the wins he contributed, well under their regular marginal costs.

Things are even more interesting at the top of the chart. The Florida franchises once again justify their continued existence with efficiency instead of wins. The Marlins in particular demonstrate their excellent understanding of both the Wins Curve and the Success Cycle, and one begins to understand how a team could collect two World Championships in ten years despite an organizational ethos that kills a little part of baseball's soul each year.

The efficiency of the Indians and Diamondbacks is just one of the many reasons I think each is the best run franchise in its league (more on that in a future post).

The biggest macro lesson to be learned is that the market for wins is still incredibly inefficient, though some teams are getting better at exploiting this inefficiency. Here's the scatter plot and regression of average wins versus payroll from 2002-2006, courtesy of Dan Fox at BP:



So over the five years preceding last season, payroll explained about 43% of a team's success. Compare that to the regression I did with this year's data:

Last year, only 21% of a team's fate was determined by its financial resources. Good time to be a Rays fan, huh?
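(For the curious, the 43% and 21% figures are just the R-squared of a simple wins-on-payroll regression; something like the sketch below, with hypothetical column names, is all it takes.)

```python
from scipy.stats import linregress

# teams: a pandas DataFrame, one row per team-season, with hypothetical
# 'payroll' and 'wins' columns
fit = linregress(teams['payroll'], teams['wins'])
print(round(fit.rvalue ** 2, 2))   # ~0.43 for 2002-2006, ~0.21 for 2007
```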

Friday, January 18, 2008

A Man with a Plan

How Mike Lowell pulled his way into a big fat contract.

I was looking at some Balls in Play numbers for Mike Lowell, courtesy of Dan Fox's BIPChart 2.5.


In my first post, I talked about the intimate link between Lowell's 2007 success and the Green Monster. Here's what Lowell did on balls in the air to left versus the right-handed league average for 2003-2007:

On fly balls to left (PERCENT/AVG/SLG):

League: 28%/.443/1.455

Lowell: 43%/.425/1.238



On line drives to left:

League: 42.7%/.725/1.100

Lowell: 54.3%/.807/1.105


The numbers make it look like the Monster helped a couple of line drives turn into hits for Lowell, but that it actually hurt him a little on fly balls. What gives?

Well, these numbers don't give the whole picture. For one thing, it's highly likely that the home/road splits would look a lot different from these aggregates, based on nothing more than the visual evidence of the spray charts (all but two of Lowell's extra-base hits went to left and left-center). Too bad there's no data on this that I can find (any volunteers?).

But way more importantly, one has to remember that Lowell is basically a league-average hitter (.280 career AVG including '07), and that he didn't see dramatic improvements in his LD% from '05 to '07 (18.3%, 21.1%, and 20%, respectively).

The thing that did change dramatically was both the raw number and the percentage of balls he put in the air to left. In his worst year, '05 in Florida, he hit fly balls to left at a league-average rate and was rewarded with a miserable .255 cBA (contact batting average) there. He hit slightly more line drives than average to left, but again was rewarded with a below-average .700 cBA. Now that was an unlucky year, no doubt, but still.

Fast forward to '07. If Lowell had hit the same total number of fly balls and line drives, but with the spray chart of a league-average righty, you'd expect him to hit 52 fly balls to left; he actually hit 80. You'd expect him to hit 44 line drives to left; he actually hit 57. In other words, Lowell hit a net total of 41 more balls in the air to left than average (the arithmetic is sketched in code after the numbers below). The two years before that?


2006
FB(expected/actual): 45/60, LD(exp./act.): 50/54, Net: +19

2005
FB: 48/51, LD: 39/40, Net: +4
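Here's the arithmetic behind those expected/actual numbers. The league shares of fly balls and line drives going to left come from the 2003-2007 figures quoted earlier; Lowell's season totals are back-derived from the expected counts rather than pulled from the raw data, so treat them as approximations.

```python
# League-average shares of a right-handed hitter's fly balls and line drives
# that go to left field (from the 2003-2007 figures quoted above)
LEAGUE_FB_TO_LEFT = 0.28
LEAGUE_LD_TO_LEFT = 0.427

def net_pull(total_fb, total_ld, actual_fb_left, actual_ld_left):
    """How many more balls a hitter put in the air to left than an
    average righty with the same totals would have."""
    expected_fb = total_fb * LEAGUE_FB_TO_LEFT
    expected_ld = total_ld * LEAGUE_LD_TO_LEFT
    return (actual_fb_left - expected_fb) + (actual_ld_left - expected_ld)

# Lowell 2007: expected ~52 FB and ~44 LD to left, actual 80 and 57 -> net about +41
# (season totals below are back-derived from the expected counts)
print(round(net_pull(52 / 0.28, 44 / 0.427, 80, 57)))   # ~41
```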



It's true that Lowell has always been a pull hitter (numbers courtesy of Sean Campion).



The question is: is Lowell's natural swing just a perfect fit for Fenway? Or is this a man with a plan?

Thursday, January 17, 2008

What Sabermetrics Are Not

Marc Normandin: Sabermetrician or Frenchman?


The folks over at Baseball Prospectus are the standard-bearers of the Bill James School of Baseball Analysis. They've been doing it longer and better than just about anybody in town. And in a world where baseball journalism still trades, for the most part, in the kind of vague, subjective "gut feel" nonsense and loud, colorful metaphors of PTI or Mike and the Mad Dog, it's nice to know that BP is out there with a pile of data and a little rigor, takin' er easy for all us sinners.

Unfortunately, as BP has become more successful and therefore expanded its coverage, it has taken on a bit of dead wood. This is only natural. Expand the league enough and performance averages dip--it's why people talk about contracting the Rays or going back to the four-man rotation.

In BP's case it's Marc Normandin, a college kid from Massachusetts who through some combination of fellatio and nepotism managed to land a gig covering the "Fantasy Beat." You'd have to slog through a bit of his stuff--which manages to be both overly technical and shallow, go figure--to get the full effect of his mind-dulling mediocrity, but I can give you a taste.

In his recent "Third Base Review", Normandin talks about valuing the Red Sox' Mike Lowell for the '08 season. He says:

"Despite a pedestrian liner rate of 18.1 percent, Lowell's BABIP was .342, well above the league average and his expected BABIP of .301. Adjusting his line for this, Lowell should have hit around .283/.337/.460, which looks a lot like his 2006 performance. Let's also not forget that Lowell hit .373/.418/.575 at Fenway last year, and just .276/.339/.428 on the road. If you draft him, you might want a backup plan for road games. This also means you don't want to draft him too early, since he can't handle the position for you well enough on his own."

Ok, Marc, so Mike Lowell hit above his BABIP expectations (here using the quick and dirty .120 + LD% formula), and we should expect him to regress to the mean. But then you mention his home/road splits are out of whack...Any thought to the idea that there is something besides blind luck linking the two?

One obvious place to start is to note that the home/road splits in AVG/OBP/SLG are pretty tightly linked with his BABIP splits. Lowell hit .382 on balls in play at home versus just .293 on the road. That is, based on the "expected BABIP" he actually got cheated out of some hits on the road. Now, a curious mind might wonder why this should be so. If Normandin had ever, say, watched Lowell play a game in his life (he's allegedly a Sox fan), a moment's reflection might have resulted in a little white light--or perhaps a large, forest green wall--appearing somewhere in the imagination centers of his cortex.

Take a look at Lowell's extra base hits at Fenway Park:




You can find Lowell's complete spray charts here.

Notice anything interesting? Me either.

I should point out that I'm not the first, or the twelfth, guy to make this point. It's been a baseball blog staple for as long as Lowell's been doing it. Nor is my idea necessarily to skewer Normandin (at least 80% of my vitriol is sour grapes). Rather, I wanted to use this first post to suggest what this blog will NOT be.

Let me explain. Normandin essentially recommends Lowell as a "sell" because he assumes that the expected BABIP model is a good one and that divergences from it are the result of noise or "luck."

Yes, adding .120 to LEAGUE AVERAGE LD% does give something close to LEAGUE AVERAGE BABIP. But drawing conclusions from this about individual players is a pretty pedestrian instance of the ecological fallacy. (I'll do some more work later that shows the model only works well for hitters with certain other tendencies.) But for the moment let's focus on the second assumption: that divergences from past, mean, or "expected" performance levels are due mostly to luck.

Baseball history has shown this is an OK assumption to make--all other things being equal. But it didn't take much imagination to see that all other things weren't equal in the Mike Lowell case. The idea among casual sabermetricians (and, occasionally, professional ones) seems to be that a given player is essentially his Strat-O-Matic card--a pie sliced up into plate-appearance outcomes "weighted" to reflect that player's tendencies.

But the weights can change, not only as context changes and players fluctuate around their personal peaks and means, but also because they make adjustments. They have gameplans. They react. Mike Lowell didn't hit .320 because of noise in the system, or because of the confluence of luck. Mike Lowell hit .320 by shortening his swing and uppercutting everything toward the Monster in Left. Mike Lowell knows it, opposing pitchers know it, every two-bit baseball blogger in the world knows it. The only person who doesn't seem to know it is Marc Normandin.