Tuesday, March 14, 2006

The Stats Geek and lineup tools

The Stats Geek's column today deals with using a lineup tool to predict the best possible offense for the Pirates in 2006. I'd assume the tool he used can be found here. Since a tool like that can only take numbers like OBP and SLG into consideration, Sean Casey is viewed as the best leadoff hitter. Playing around with the suggestion a bit, the Stats Geek decides on Duffy, Casey, Bay, Burnitz, Randa, Castillo, Doumit, JWilson as the Pirates' best lineup, which I can't really argue with (I might suggest moving Randa up to two and bumping Casey, Bay, and Burnitz all back a slot to add some speed at the top, but the real truth is that Randa probably isn't much faster than Casey at this stage of his life).

Anyways, this gives me an opportunity to do something I had meant to do a couple weeks ago, but never got around to. Specifically, that is to use Cyril Morong and Ken Arneson's Lineup Analysis tool, which was posted on Baseball Musings a few weeks back, to look at the 2006 Pirates lineup. This tool gives the best lineup for a group of players, just like the one O'Neill used, but it also gives run expectancy. The first lineup I'll use is the Stats Geek's ideal lineup, using 2006 ZiPS projections for SLG and OBP. As a disclaimer, yes, I know just how theoretical all this is. I'm only using it as a statistical way to measure how certain players affect the lineup. Since the tool requires nine batters, I'll use Zach Duke's batting stats from last year. The tool projects that lineup out to 4.327 runs per game or 721 runs over the course of the year, or 41 runs more than last year. Assuming we give up the same number of runs as last year, 769 (a giant assumption, I know, but it's the best I have at the moment for a reference point), that projects out to a 76-86 record, using Baseball Reference's Pythagorean W/L formula (we actually projected out to 72 wins last year but underperformed a bit, if you can believe that).
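For anyone who wants to check the win totals, here is a rough sketch of the Pythagorean calculation. I'm assuming Baseball Reference's version uses an exponent of 1.83 rather than the classic 2; the function name and constants are my own, not anything from the site or the lineup tool.

```python
# Sketch of a Pythagorean W/L estimate.
# Assumption: an exponent of 1.83 (my understanding of Baseball Reference's
# version); the classic Bill James formula uses 2.
PYTHAG_EXPONENT = 1.83
GAMES = 162


def pythagorean_wins(runs_scored, runs_allowed,
                     games=GAMES, exponent=PYTHAG_EXPONENT):
    """Expected wins over `games` from season run totals."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


if __name__ == "__main__":
    runs_allowed_2005 = 769  # held constant, as in the post
    for label, runs_scored in [("2005 actual", 680),
                               ("Stats Geek lineup", 721)]:
        wins = round(pythagorean_wins(runs_scored, runs_allowed_2005))
        print(f"{label}: {wins}-{GAMES - wins}")
```

With 769 runs allowed, 721 runs scored rounds to 76 wins and last year's 680 rounds to 72, which matches the figures above.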

Since we can be fairly sure that Tracy will stick to a lineup very similar to the one the Stats Geek used, let's have fun with one simple change. By adding Craig Wilson's projections into the four spot for Burnitz, the tool ups our run expectation for 2006 to 752, which is dangerously close to equaling the number of runs we gave up last year (that projects out to 79 wins). Moving Randa up to the 2 slot and bumping Casey to 3, Bay to 4, and CWilson to 5 gives us 758 runs, while replacing Randa with Freddy Sanchez costs us four runs in that lineup, putting us at 754.

Granted, as I said above, nothing that I did here is set in stone. The lineup tool is certainly not perfect, nor are the ZiPS projections. I hate to beat a dead horse, but this helps confirm what we already suspected: Craig Wilson's bat is worth 31 runs more than Jeromy Burnitz's, while Freddy Sanchez's bat only costs us four compared to Joe Randa's. It could be argued that Freddy Sanchez will save four runs in the field next year compared to Joe Randa, but I highly doubt Jeromy Burnitz will be worth 31 runs more in right field than Craig Wilson.
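The swap values quoted above are nothing more than differences between the tool's season totals for two lineups. A minimal sketch of that bookkeeping, with dictionary labels and a helper name that are purely my own shorthand:

```python
# Season run totals reported by the lineup tool for each lineup in the post.
# The labels are shorthand invented for this sketch.
lineup_runs = {
    "stats_geek": 721,          # Burnitz batting fourth
    "cwilson_for_burnitz": 752,
    "randa_second": 758,        # Randa 2, Casey 3, Bay 4, CWilson 5
    "sanchez_for_randa": 754,
}


def swap_value(runs, before, after):
    """Runs gained (positive) or lost (negative) going from one lineup to another."""
    return runs[after] - runs[before]


if __name__ == "__main__":
    print(swap_value(lineup_runs, "stats_geek", "cwilson_for_burnitz"))
    print(swap_value(lineup_runs, "randa_second", "sanchez_for_randa"))
```

The first difference is the 31 runs attributed to Craig Wilson over Burnitz; the second is the four runs lost by playing Sanchez over Randa.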

Still, some of the numbers are promising. All of the lineups I used predicted more runs in 2006 than we scored (or were projected to score) in 2005. It's also possible that guys like Castillo and even Jack Wilson outplay their projections (it's also possible that people underperform, but I'm aiming for positive here). And even as dire as our pitching situation looks at the moment, we had four people last year who made at least 20 starts and had ERAs between 4.90 and 5.85 (Redman 4.90, Fogg 5.05, Wells 5.09, Perez 5.85). And Jose Mesa and Daryle Ward and Tike Redman are completely gone. I suppose all I'm saying is that it's possible that things aren't as bleak as some of us (myself especially) make them out to be.

EDIT (9:26 PM): I should point out that this all assumes that we use that exact same starting lineup for all 162 games, which doesn't have a snowball's chance in hell of happening. The tool gives its prediction in runs per game, which makes the differences seem trivial, but when multiplied out over a full season they're much more noticeable. For a good example, when the tool is used to analyze a typical starting lineup from last year (Lawton, Sanchez, Bay, Ward, Mackowiak, Castillo, Cota, JWilson, Duke), it gives us 698 runs. We actually scored 680, but given all of the roster and lineup shuffling we did last year, 698 is fairly close. For a more accurate prediction of how many runs we'll score, it may be a fair exercise to subtract 20. I didn't, simply because no one can say for sure who will get hurt, who will outperform expectations, etc., and I was using the run totals more for comparison's sake than anything (with Pythagorean win totals thrown in for fun, of course).
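To put numbers on that correction, here is a small sketch that treats the tool's 2005 miss as a flat offset. Applying last year's gap to this year's projections is my own simplification of the "subtract 20" idea, and the function names are made up for illustration.

```python
GAMES = 162

TOOL_2005 = 698    # tool's estimate for a typical 2005 lineup
ACTUAL_2005 = 680  # runs the Pirates actually scored in 2005

# How far the tool overshot last year, in season runs.
overshoot = TOOL_2005 - ACTUAL_2005


def adjusted_runs(projected, correction=overshoot):
    """Subtract a flat correction from a projected season run total.

    Assumption: last year's overshoot carries over unchanged to 2006.
    """
    return projected - correction


def per_game(season_runs, games=GAMES):
    """Convert a season run total (or difference) to runs per game."""
    return season_runs / games


if __name__ == "__main__":
    for projected in (721, 752):
        print(projected, "->", adjusted_runs(projected))
    # A flat correction leaves the gap between two lineups untouched.
    print(adjusted_runs(752) - adjusted_runs(721))
    # The same 31-run gap looks small when expressed per game.
    print(round(per_game(752 - 721), 3))
```

Because the correction is the same for every lineup, it shifts the totals without changing any of the comparisons between lineups, which is why skipping it doesn't affect the swap values.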