Wednesday, August 31, 2016

The Ballad of Jedd Gyorko

Come and listen to my story 'bout a man named Jedd,
Poor Mountaineer, tryin' to keep his team ahead...

Jedd Gyorko hit a solo homerun in the 6th inning last night to give Adam Wainwright and the Cardinals a 1-0 lead in Milwaukee against the Brewers. The Cards would go on to win 2-1 in ten innings.

More often than not, my FanDuel spreadsheet has spit out Gyorko's name to plug into my line-up (I had him in my line-ups for both the main and 8 pm slates last night).

Following are random facts about the life and career of Jedd Lindon Gyorko:

- He was born in Morgantown, West Virginia in 1988.

- At University High School in Morgantown, Jedd was All-State three times in baseball (hitting 40 homeruns with 231 RBI). Despite standing only 5'10", he was named first team All-State in basketball his senior year after averaging 18.2 PPG.

- At West Virginia University, Gyorko set or tied school records for batting average (.404), doubles (73), homeruns (35), and extra-base hits (113). He was a three-time All-Big East selection, making the first team twice, and won the 2010 Brooks Wallace Award as the best shortstop in NCAA Division I his junior year.

- Following his junior year, Gyorko entered the amateur draft. He was selected by the San Diego Padres in the 2nd round.

- Gyorko made the Padres' opening day roster in 2013. He played 125 games at second and third base his rookie year, batting .249 with 23 homeruns.

- On May 9th, 2014, he hit two homeruns, including a grand slam, and drove in six runs against Jose Fernandez as the Padres routed the Marlins, 10-1.

- He missed most of June and July in 2014 with plantar fasciitis, and finished the year with a disappointing .210 average and 10 homeruns in 111 games.

- Following the 2015 season, in which Gyorko batted .247 with 16 homeruns in 128 games, the Padres traded him with cash to the Cardinals for outfielder Jon Jay.

- Gyorko's homerun last night in Milwaukee gave him 24 on the year - a new career high. His 24 homeruns in just 305 at-bats - one homer every 12.7 at-bats - would be the best ratio in the league if he had enough plate appearances to qualify.

- Despite having the 2nd-most homeruns on the team (one fewer than Brandon Moss), Gyorko has mostly been the Cardinals' fifth infielder this year - playing 40 games at second base, 33 games at third, 10 games at first and 10 games at shortstop.

- Of players appearing in at least 10 games at all four infield positions, Gyorko ranks second all-time in single-season homeruns to Mark Bellhorn, who hit 27 for the 2002 Cubs. (Rich Aurilia, who hit 23 for the 2006 Reds, is the only other such player to have hit more than 12.)


Tuesday, August 30, 2016

The Odds Ratio Method

So the other day I wrote about how I break projections down into eight binary components. I do this for everything: batter projections, pitcher projections, league averages, and all kinds of splits.

Then for each set of components except for batters, I'd convert them to a factor of the league average, where 1 equals league average.

The first component is $BB, which = (BB + HBP) / PA.

Jered Weaver has a $BB factor of about .86 (7.3% / 8.6%). Angel Stadium in Anaheim has a $BB factor of .98 (yes, ballparks can affect walk rates, too). Left-handed batters facing right-handed pitchers have a $BB factor of 1.02, and batters on the road have a $BB factor of .96.

Then all those factors are multiplied by the batter's rate to find the probability for the match-up.

Joey Votto has a $BB of 19%. So when he faces Jered Weaver at Angel Stadium tonight, we would expect him to walk in 16% of his plate appearances against him (19% * .86 * .98 * 1.02 * .96 = 16%).

OK. Let's try another example: the Orioles' Chris Davis vs. the Cubs' Aroldis Chapman. Davis' $BB rate is 14%, and Chapman's $BB factor is 1.29. So the $BB of their matchup would be 14% * 1.29 = 18%. We'll ignore other factors this time and leave it at that. 18% seems pretty reasonable.

Next, $SO. Chris Davis' $SO rate is 38%, one of the highest in the majors for a regular. Chapman has a $SO factor of 1.94 - nearly twice the league average. So the $SO for their match-up would be 38% * 1.94 = 74%. I'm sure a Davis/Chapman match-up would produce a lot of whiffs, but 74% seems a little extreme. And that's without multiplying in the lefty-lefty platoon factor of 1.11, which brings the $SO rate for their match-up up to 82%.

Now imagine a hitter even more prone to strikeouts - a player with a 60% $SO rate against the MLB average. Against Chapman, his $SO rate would be.....116%! Obviously, no batter, no matter how inept, can strike out more times than he has opportunities. Anybody - if I went up there against Chapman, if your grandma did - would have a less than 100% chance of striking out, even if it was 99.9%.
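To see the breakdown in numbers, here is a minimal Python sketch of the old multiply-the-factors approach. The function name `factor_matchup` is made up for illustration; the inputs are just the rates and factors quoted above.

```python
def factor_matchup(batter_rate, *factors):
    """Old method: multiply the batter's rate by each league-relative factor."""
    rate = batter_rate
    for factor in factors:
        rate *= factor
    return rate


# Davis (38% $SO) vs. Chapman (1.94 $SO factor)
davis = factor_matchup(0.38, 1.94)
# Same match-up with the 1.11 lefty-lefty platoon factor multiplied in
davis_platoon = factor_matchup(0.38, 1.94, 1.11)
# Hypothetical 60% whiffer vs. Chapman
whiffer = factor_matchup(0.60, 1.94)

print(f"{davis:.0%} {davis_platoon:.0%} {whiffer:.0%}")
```

Nothing in the multiplication keeps the result under 100%, which is the whole problem.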

So my old system was breaking down at the extremes. High-strikeout batters facing high-strikeout pitchers were being underrated. Power hitters facing homer-prone pitchers were being overrated.

Back in the '80s, Bill James introduced the Log5 method to find the probability that Team A will defeat Team B, based on Team A's and Team B's winning percentages. Later, with the help of a colleague, he expanded on the formula to account for an average other than .500 (an average winning percentage), so the formula could be used to find the probability of any kind of match-up - batting average, on-base percentage, free throw percentage - where the league average is something other than .500.

Bill recently wrote about the method again on his website (again, subscription required), going into detailed description (as only Bill James can) of the logic behind the method and the steps involved in figuring it.

But as Tom Tango explained, the Odds Ratio method gives you identical results.

"You can use the Odds Ratio method for any mean.  For example, assume the league OBP is .333, you have a hitter who is .400 and the pitcher is .250.  What’s the resulting OBP?

"Odds(H) = .400/.600 = .667
Odds(P) = .250/.750 = .333
Odds(L) = .333/.667 = .500

"Odds = Odds(H) * Odds(P) / Odds(L)
= .667*.333/.500=.444

"If the Odds are .444 safe to 1 out, then the Rate is .444/(.444+1) = .308"

Let's try it with our Davis/Chapman matchup.

Davis' $SO rate is 38%, Chapman's is 43%, and the league average is 22%.

Odds(H) = .38/.62 = .613
Odds(P) = .43/.57 = .754
Odds(L) = .22/.78 = .282

Odds = Odds(H) * Odds(P) / Odds(L)
= .613 * .754 / .282 = 1.639

The odds of a strikeout are 1.639 to 1, or 1.639 / (1.639 + 1) = 62%.

62% is still very high, but more sane than 74%.

Now let's try our theoretical 60% whiffer.

Odds(H) = .60/.40 = 1.5
Odds(P) = .43/.57 = .754
Odds(L) = .22/.78 = .282

= 1.5 * .754 / .282 = 4.01

4.01 / (4.01 + 1) = 80%

Even a bad hitter (think an average college player, maybe) would run into one once every five times up against Chapman.
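The arithmetic above can be wrapped in two small functions. This is only a sketch of the Odds Ratio steps as Tango lays them out; the function names `odds` and `odds_ratio_matchup` are my own labels, and the inputs are the rates used in the examples above.

```python
def odds(p):
    """Convert a probability (0 < p < 1) into odds in favor."""
    return p / (1.0 - p)


def odds_ratio_matchup(batter, pitcher, league):
    """Odds(H) * Odds(P) / Odds(L), converted back into a rate."""
    matchup_odds = odds(batter) * odds(pitcher) / odds(league)
    return matchup_odds / (matchup_odds + 1.0)


tango_obp = odds_ratio_matchup(0.400, 0.250, 0.333)  # Tango's OBP example
davis_so = odds_ratio_matchup(0.38, 0.43, 0.22)      # Davis vs. Chapman
whiffer_so = odds_ratio_matchup(0.60, 0.43, 0.22)    # hypothetical 60% whiffer

print(f"{tango_obp:.3f} {davis_so:.0%} {whiffer_so:.0%}")
```

Because the result is always odds / (odds + 1), it stays below 100% no matter how extreme the inputs get.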

Tango goes on to expand the formula to include different league averages for the hitter and the pitcher...and honestly, I don't understand it:

"The full equation is:
  Odds(matchup)        Odds(H) * Odds(P)
----------------- = -----------------------
Odds(environment)   Odds(envH) * Odds(envP)

"So, you have a hitter with an OBP of .400 in a league of .300 facing a pitcher with an OBP of .250 in a league of .350, and they are both playing in a league (or park) where the OBP is expected to be .380 for the league average player.  What’s the resulting OBP?

"Odds(matchup)   (.400/.600) * (.250/.750)
-------------- = -------------------------
 (.380/.620)     (.300/.700) * (.350/.650)

"Odds(matchup) = .590
Matchup OBP = .590/1.590 = .371"

So if anyone more schooled in math than I am (or who can decipher the above equation) is reading, please explain Tango's formula to me. (edit: I've since figured this out - Tango just typed the formula in a confusing way. The odds(environment) - the .380/.620 - is not in the denominator, as it looks like in the formula.)
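Here is a rough Python sketch of the full equation, solved for Odds(matchup) so that the environment odds multiply the right-hand side. The function and parameter names (`matchup_odds`, `env_hitter`, and so on) are mine, not Tango's; the numbers are the ones from his example.

```python
def odds(p):
    """Convert a probability (0 < p < 1) into odds in favor."""
    return p / (1.0 - p)


def matchup_odds(hitter, pitcher, env_hitter, env_pitcher, env_matchup):
    """Odds(matchup) = Odds(env) * Odds(H) * Odds(P) / (Odds(envH) * Odds(envP))."""
    return (odds(env_matchup) * odds(hitter) * odds(pitcher)
            / (odds(env_hitter) * odds(env_pitcher)))


def rate(o):
    """Convert odds back into a probability."""
    return o / (1.0 + o)


# .400 hitter from a .300 league, .250 pitcher from a .350 league, .380 environment
o = matchup_odds(0.400, 0.250, env_hitter=0.300, env_pitcher=0.350, env_matchup=0.380)
print(f"odds {o:.3f}, OBP {rate(o):.3f}")
```

When all three environments are the same league average, this collapses to the simpler Odds(H) * Odds(P) / Odds(L) form used earlier.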

Anyways, Matt Haechrel expresses the modified Log5 formula like this (in an article on

((x * y) / z) / (((x * y) / z) + ((1 - x) * (1 - y)) / (1 - z))

...which is the formula I currently use in my spreadsheet. x = batter, y = pitcher, z = league. Gee, maybe I should test it to make sure it gives the same result as Tango's Odds Ratio Method.

Plugging the $SO rates for Davis and Chapman into it:

((.38*.43)/.22)/(((.38*.43)/.22)+((1-.38)(1-.43))/(1-.22)) =

(.1634/.22)/((.1634/.22)+(.62*.57)/(.78)) =

.7427 / (.7427 + .4531) = .7427 / 1.1958 = 62%

Ok, good. They're identical.
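One matching example doesn't prove much, so here is a quick Python sketch that compares the two formulas over a grid of rates. The function names are my own; `modified_log5` follows the x/y/z expression plugged in above, and `odds_ratio` follows Tango's steps.

```python
def modified_log5(x, y, z):
    """Modified Log5 with x = batter, y = pitcher, z = league."""
    numerator = (x * y) / z
    return numerator / (numerator + ((1 - x) * (1 - y)) / (1 - z))


def odds_ratio(x, y, z):
    """Odds Ratio form: Odds(H) * Odds(P) / Odds(L), back to a rate."""
    o = (x / (1 - x)) * (y / (1 - y)) / (z / (1 - z))
    return o / (o + 1)


# Rates from 5% to 95% in 5% steps
grid = [i / 20 for i in range(1, 20)]
largest_gap = max(
    abs(modified_log5(x, y, z) - odds_ratio(x, y, z))
    for x in grid for y in grid for z in grid
)
print(largest_gap < 1e-9)
```

The two are the same formula rearranged: divide the top and bottom of the Log5 expression by its second denominator term and you are left with odds / (odds + 1).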

So from there, I multiply this match-up rate by all the other factors (platoon, park, etc.) like I did before. So with platoon factor multiplied in, the Davis/Chapman $SO is 62% * 1.11 = 69%.

This probably still isn't right. You're probably supposed to include the other factors in the rates for the batter or pitcher. Tango commented on Bill's Log5 article:

"Now, the power of the Odds Ratio form is that you can extend it to include other variables. Say you want to include the Home field advantage. That's a .540 record for the average team, or .54 wins per .46 losses, or 1.17 wins per loss.

"A .6 v .4 team at home:
1.5 x 1.17 / .67 = 2.63 wins per 1 loss, or .725 win%

"A .6 v .4 team on road:
1.5 / (.67 x 1.17) = 1.92 wins per 1 loss, or .658 win%

"You can also use it for things other than a .500 baseline, and include even more things like batter v pitcher to include home field and platoon advantage, etc. It's a bit more complex to account for the non-.500 baseline, but it flows right in once you see it."

I don't see it...not yet anyway. But this method is more right than my old way of doing it.
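For the .500-baseline case at least, Tango's two home-field examples are easy to reproduce. A small sketch (variable names are mine), treating the home field advantage as one more odds factor:

```python
def odds(p):
    """Convert a probability (0 < p < 1) into odds in favor."""
    return p / (1.0 - p)


def rate(o):
    """Convert odds back into a probability."""
    return o / (1.0 + o)


home_edge = odds(0.540)  # about 1.17 wins per loss for the average home team

# A .600 team against a .400 team
at_home = odds(0.600) * home_edge / odds(0.400)
on_road = odds(0.600) / (odds(0.400) * home_edge)

print(f"home {rate(at_home):.3f}, road {rate(on_road):.3f}")
```

Carrying full precision gives results within a point or two of Tango's .725 and .658, which he worked out with rounded odds.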


Sunday, August 28, 2016

The Votto Interviews, part three

"In base-ball, as in many things, timing is everything. My father once told me that it is not enough for a man to be mighty; he must also come in the proper season, and through the proper channel. I thank whatever Divine Force guided my steps down from cloud-shrouded Canada to the Queen City, birthplace of professional sport, for summer is my season; and my channel, the glorious game of Ball. My approach is as follows: before ever visiting an opposing pitcher in the flesh, I take great pains to acquaint myself with his arsenal of pitches, his mannerisms, his favorite leisurely pursuits away from the diamond, his secret longings, his deepest sources of pain, and regret, and humiliation, etc. Indeed, my off days frequently find me alone in my parlor (or lodging if on the road), surrounded by likenesses of the pitchers I will meet in the coming home-stand or road-trip, trying to glean what I can from their very countenance. Then, when I meet a pitcher in a game, I observe some pitches, to the number of one-less-than-thirty, without offering my bat to a single one. During these instances, I observe not only the pitches, but study the expression of the pitcher also, his disappointment if he expected me to offer at his pitch, his relief if he was in hopes I would not, his anticipation when I pretend to swing, but do not, etc. Then, after I have examined one-less-than-thirty pitches, and I feel confident that the pitcher's very soul is bared to me, and it is often in the late innings of the game and the Reds behind a run or two, I fortify myself at the plate and resolve to strike a pleasing pitch into the field of play, and hope that good fortune is mine.

"I often find that my approach has a wondrous effect on the legions of river-people who turn out to witness our contests at the ball-park. My stubborn lack of engagement in the batter's box, throughout the evening, incites the greatest part of them to the most inconceivable howlings of lust and fury, they thinking that I should deliver the winning thrust earlier in the contest, and growing frantic at my delay. Their emotions may be readily imagined, but needless to say, when in the latter innings I finally strike true, and send the scoring balance into our favor, they erupt into an orgy of spirited huzzahing that continues uninterrupted until the ball-park is darkened and our loyal groundskeepers drive the joyful mob back into the streets; they winding themselves into such a pitch as they return to their tenements that, in their conversations, the Reds have won the pennant already and will soon be playing in the World Series. This is pleasing to those of us with more serious thoughts, for several gentlemen of our club are sickly, others poor and inconsequential batsmen, and our enemies in Milwaukee and St. Louis wax strong, etc."

- Joey Votto, Sports Illustrated Kids, April 2014 issue

Binary Components

If a hitter with a .320 batting average faces a pitcher who allows a .230 batting average in a league with a .260 batting average, what would the batting average of their match-up be?

The way I solved this on my old spreadsheet (that I built to win money on FanDuel) was to take the batter's rate and convert everything else to a factor of the league-average rate, and then multiply the batter rate by those factors - the park factor, the platoon factor, the home field advantage factor, and the opposing pitcher factor.

The pitcher in the example above allows a batting average of .230, or 88.5% of the league average (.230 / .260), so the opposing pitcher factor is .885. Multiply that by the batter's rate (.320), and you get the batting average for the match-up: .283. A .320 hitter would hit .283 against a .230 pitcher when the league average is .260 (and assuming all other factors are 1).

That's how I would have figured it on my old spreadsheet. Actually, there's a much better way (more on that later).

And that's if I figured batting average. But I didn't, even on my old spreadsheet. Never cared about it.

Batting average is the rate of hits - any kind of hit - per plate appearances that aren't bases on balls, hit batsmen, sacrifice flies, or sacrifice bunts. That's not a very intuitive way to look at things.

Back around 2000 or so, Voros McCracken turned the sabermetric community upside-down with his DIPS theory (Defense-Independent Pitching Statistics), which proposed that a pitcher has no control over hits in play. If contact is made, and it's not a homerun, the outcome is entirely determined by the batter and the fielders.

Voros overstated his case slightly - pitchers have some ability to limit the number of hits they give up on balls in play, as other sabermetricians quickly found out, but it takes a long time (years) to separate a pitcher's true BABIP (batting average on balls in play) ability from random variance. Therefore a pitcher's rates of walks, strikeouts, and homeruns allowed are much more reliable than their rates of other hits allowed and in-play outs.

Bill James mentioned Voros in a comment in his New Historical Abstract, which was published soon after. As he did for nearly all sabermetricians, Bill James taught me how to think, but recently I've gleaned more knowledge from comments left by Tom Tango in other writers' blogs than I have from entire books written by James.

I was already toying with the idea of separating batting events into individual yes-or-no components, starting with the three true outcomes - the walk, the strikeout, and the homerun - the events entirely determined by the batter and the pitcher. After weeding out the three true outcomes, then you can work your way down to fielding-dependent outcomes, like base hits and extra-base hits.

I was inspired either by James' comment on Voros and DIPS, or by some subsequent articles or comments by Bill James or Tom Tango or some other sabermetrician, or by a combination of all the above. (Like Bill James, Voros McCracken was hired to consult for the Red Sox in 2003. Unlike James, he left the Sox in '05 and never wrote about baseball again.)

But a comment made by Tango on this Bill James article (subscription required) affirmed my line of thinking:

"When I look at the data, I break it down into binary components, which is a method that I've adopted from Voros.

"For example, Voros would do:
$SO = SO/(PA-BB)
$nonHRH = (H-HR)/(PA-BB-SO-HR) or inplay BA
And you can continue
$2B3B = (2b+3b)/(H-HR)
$3B = 3b / (2b+3b)

"At every step, Voros would remove one component, so that each metric is independent of the others, a very binary tree approach."

Here is my modified version:

1. $BB = (BB + HBP) / PA

Did the pitcher miss the strike zone (or hit the batter), and did the batter lay off? (If so, then go to #7). If not, then...

2. $SO = SO / (PA - BB - HBP)

Did the batter make contact? If so, then...

3. $HR = HR / (PA - BB - SO - HBP)

Did the batter hit a homerun? If not, then...

4. $H = (H - HR) / (PA - HR - BB - SO - HBP)

Did the batter get a base hit? (This is roughly the same as BABIP). If so, then...

5. $XBH = (2B + 3B) / (H - HR)

Did the hit go for extra bases? (If not, go to #7). If so, then...

6. $3B = 3B / (2B + 3B)

Was the extra-base hit a triple?

7. $SA = (SB + CS) / (H - 2B - 3B - HR + BB + HBP)

If the batter is on first, did he attempt to steal? (I know not all steal attempts are from first base, but most of them are - unless Billy Hamilton is involved. Nor does a batter need a hit or a walk or a HBP to reach first, and sometimes when he's on base the next base is blocked. But this is a good approximation of stolen base "opportunities".)

If the batter attempted to steal, then:

8. $SB = SB / (SB + CS)

Was the steal successful?

This accounts for all statistics in Steamer projections for batters. I don't worry about sac flies, sac bunts, reached on errors, GIDPs, etc.

For an example, let's use...who else? Joey Votto.

Here is Steamer's up-to-date projection for Votto, per 600 PA:

Name        AB   H 2B 3B HR  R RBI  BB  SO HBP SB CS  AVG  OBP  SLG
Joey Votto 478 139 28  2 22 84  73 111 120   5  6  4 .290 .426 .496

Plugging those numbers into the binary component formulas:

1. $BB = (BB + HBP) / PA = (111 + 5) / 600 = 19%

2. $SO = SO / (PA - BB - HBP) = 120 / (600 - 111 - 5) = 25%

3. $HR = HR / (PA - BB - SO - HBP) = 22 / (600 - 111 - 120 - 5) = 6%

4. $H = (H - HR) / (PA - HR - BB - SO - HBP) = (139 - 22) / (600 - 22 - 111 - 120 - 5) = 34%

5. $XBH = (2B + 3B) / (H - HR) = (28 + 2) / (139 - 22) = 26%

6. $3B = 3B / (2B + 3B) = 2 / (28 + 2) = 7%

7. $SA = (SB + CS) / (H - 2B - 3B - HR + BB + HBP) = (6 + 4) / (139 - 28 - 2 - 22 + 111 + 5) = 5%

8. $SB = SB / (SB + CS) = 6 / (6 + 4) = 60%

Against average pitching, playing his home games at GABP, Joey Votto will walk or get hit by a pitch in 19% of his plate appearances. If he doesn't walk or get hit by a pitch, he will strike out 25% of the time. When he connects, 6% of his balls hit will leave the yard, and of the ones that stay in the park, 34% will go for hits. 26% of his hits will go for extra bases, and of those, 7% will be triples. When Votto is on first, he will attempt to steal 5% of the time, and he will be successful in 60% of those attempts.
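The eight formulas translate directly into code. Here is a minimal Python sketch; the function name `binary_components` and its argument names are my own, and the inputs are Votto's per-600-PA Steamer line from above.

```python
def binary_components(pa, h, doubles, triples, hr, bb, so, hbp, sb, cs):
    """Return the eight binary components for a batting line.

    Assumes the line has at least one extra-base hit and one steal attempt;
    otherwise $3B or $SB would divide by zero.
    """
    free = bb + hbp  # walks plus hit-by-pitches
    return {
        "$BB": free / pa,
        "$SO": so / (pa - free),
        "$HR": hr / (pa - free - so),
        "$H": (h - hr) / (pa - free - so - hr),
        "$XBH": (doubles + triples) / (h - hr),
        "$3B": triples / (doubles + triples),
        "$SA": (sb + cs) / (h - doubles - triples - hr + free),
        "$SB": sb / (sb + cs),
    }


votto = binary_components(pa=600, h=139, doubles=28, triples=2, hr=22,
                          bb=111, so=120, hbp=5, sb=6, cs=4)
for name, value in votto.items():
    print(f"{name:5s} {value:.0%}")
```

Each denominator is the previous one with one more outcome removed, which is the "binary tree" idea in Tango's comment.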

Here are the components for Votto and Billy Hamilton based on their Steamer projections, along with the rates for the 2016 MLB season to-date:

                 $BB $SO $HR  $H $XBH $3B $SA $SB
Votto (proj.)    19% 25%  6% 34%  26%  7%  5% 60%
Hamilton (proj.)  8% 20%  2% 30%  21% 15% 56% 76%
2016 MLB          9% 23%  4% 30%  25% 10%  8% 72%

I think Steamer's stolen base success rates ($SB) are a little pessimistic - Billy Hamilton's actual stolen base success rate in 2016 is 88%; the MLB average of 72% is almost as good as his projected 76%.

Anyways, to find the rates for pitchers, you can just plug in their batting against statistics. Unfortunately, Steamer only shows pitching stats in their projections (innings pitched instead of plate appearances and at bats, and no doubles, triples, stolen bases, etc.)

But we can still figure the first four components from what we have. The formulas are different - I have to estimate batters faced by multiplying innings by 3 to get outs and adding hits and walks:

1. $BB = BB / ((IP * 3) + H + BB)

2. $SO = SO / ((IP * 3) + H)

3. $HR = HR / ((IP * 3) + H - SO)

4. $H = (H - HR) / ((IP * 3) + H - HR - SO)
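As a sketch, the four pitcher formulas look like this in Python. The function name is mine, and the stat line in the example is completely made up for illustration (it is not Chapman's projection or anyone else's). Innings are assumed to be true decimals, so 60 1/3 innings would be entered as 60.333, not the box-score style 60.1.

```python
def pitcher_components(ip, h, hr, bb, so):
    """First four binary components from a pitching line.

    Batters faced is estimated as outs (IP * 3) plus hits plus walks.
    """
    outs = ip * 3
    return {
        "$BB": bb / (outs + h + bb),
        "$SO": so / (outs + h),
        "$HR": hr / (outs + h - so),
        "$H": (h - hr) / (outs + h - hr - so),
    }


# Hypothetical reliever line, invented for this example only
example = pitcher_components(ip=60, h=40, hr=5, bb=25, so=95)
for name, value in example.items():
    print(f"{name:4s} {value:.0%}")
```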

Here are Aroldis Chapman's Steamer-projected components, along with the MLB rates for 2016:

                $BB $SO $HR  $H
Chapman (proj.)  8% 42%  3% 29%
2016 MLB         8% 23%  4% 29%

Notice the MLB components are slightly different for batting totals and pitching totals due to the different formulas used, but they can easily be put back on the same scale by multiplying each component by the rate for batting totals and dividing by the rate for pitching totals.

The usefulness of this is in solving batter/pitcher matchups. Say Chapman is pitching to Votto, and he doesn't walk him. If Votto strikes out in 25% of non-walk PAs, and Chapman strikes out batters he doesn't walk at a rate of 1.83 (42% / 23%) times the league average, then Votto would strike out in 46% (.25 * 1.83) of his non-walk PAs against Chapman (and that's without multiplying in a platoon factor, which would be over 1 for lefty vs. lefty). You can work your way through each of the first four components this way.
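Both steps, putting a pitcher component on the batting scale and then applying it as a factor, fit in a few lines of Python. This is a sketch of the old factor method only; the function names are mine, and the rates come from the two tables above.

```python
def rescale_to_batting(component, mlb_batting_rate, mlb_pitching_rate):
    """Put a pitcher component on the same scale as the batter components."""
    return component * mlb_batting_rate / mlb_pitching_rate


def factor_matchup(batter_rate, pitcher_rate, league_rate):
    """Old method: batter rate times the pitcher's factor of league average."""
    return batter_rate * (pitcher_rate / league_rate)


# $BB: Chapman's 8% (pitching scale), with MLB at 9% for batting and 8% for pitching
chapman_bb = rescale_to_batting(0.08, mlb_batting_rate=0.09, mlb_pitching_rate=0.08)

# $SO: Votto 25%, Chapman 42%, MLB 23%
votto_vs_chapman_so = factor_matchup(0.25, 0.42, 0.23)

print(f"{chapman_bb:.0%} {votto_vs_chapman_so:.0%}")
```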

But like I said earlier, this method (converting pitcher components to factors of the league average, and multiplying them by batter components) is how I used to do it. In my next post, I'll talk about why this method doesn't work, and discuss a better method to turn my batter and pitcher components into match-up probabilities - the Odds Ratio Method.

Saturday, August 27, 2016

Hamilton, Cozart, Votto and WAR

Billy Hamilton, Zack Cozart, and Joey Votto, the Reds' usual 1-2-3 in the batting order (when all three are healthy) have been equally valuable so far this year. According to's Wins Above Replacement, they are tied for 3rd-most WAR on the team with 2.5 apiece. (Dan Straily has been the team MVP so far with 3.3 WAR, followed by Adam Duvall with 2.7.)

How is this possible, when Votto has been the best hitter in the league for three months now, whereas, despite improvements, Cozart and Hamilton are still average or below average hitters? Let's break it down.

On the "player value" table for every player ever, shows how many runs above or below average a player is in hitting, baserunning, grounding into double plays, and defense.

First, hitting:

Player   Rbat
Votto     +32
Cozart     -3
Hamilton  -14

So obviously this is Votto's strong suit. Hamilton has gotten better at getting on base this year, but overall he's still a below average hitter, while Cozart is pretty much average. So far, Hamilton is 46 runs less valuable than Votto, but we're just getting started.

Next, baserunning:

Player   Rbat Rbaser subtotal
Votto     +32     -2      +30
Cozart     -3     +1       -2
Hamilton  -14     +9       -5

Here's where Hamilton shines - he's created nine runs more than an average player just with his baserunning exploits. That narrows the gap, but he's still in last and 35 runs behind Votto.

Double plays grounded into:

Player   Rbat Rbaser Rdp subtotal
Votto     +32     -2  -1      +29
Cozart     -3     +1  -1       -3
Hamilton  -14     +9  +2       -3

With his speed, Hamilton is also especially hard to double up. He's now tied with Cozart, 32 runs behind Votto.

Defense:

Player   Rbat Rbaser Rdp Rfield subtotal
Votto     +32     -2  -1    -12      +17
Hamilton  -14     +9  +2    +13      +10
Cozart     -3     +1  -1    +10       +7

Now things are getting much tighter. While Votto has been Votto with the bat, his formerly solid defense at first base has been abysmal this year (at least according to WAR). Meanwhile, Cozart and Hamilton continue to be Gold Glove-caliber defenders.

But wait, we're not done. WAR also includes a positional adjustment. An average hitter who plays shortstop is more valuable than an average hitter at first base. WAR accounts for this, so players at more offensive positions, like first base, left field, and DH, get a penalty, while players at more defensive positions, like shortstop and catcher, get a bonus.

Player   Rbat Rbaser Rdp Rfield Rpos  RAA
Hamilton  -14     +9  +2    +13   +2  +12
Cozart     -3     +1  -1    +10   +5  +12
Votto     +32     -2  -1    -12   -7  +10

With positional scarcity accounted for, Hamilton and Cozart close the gap and pass Votto. But they're all pretty even - each has been worth 10 or 12 runs more than the average MLB player.

Finally, to find true value, WAR adds Replacement Runs to account for the difference between an average player and a "replacement" player. The question is: if a player went down with an injury, how many runs (and wins) would you be giving up by replacing him with your best option from AAA or waivers? An average player over a full season is much more valuable than two weeks of an above-average player, while a replacement-level player is, by definition, worthless.

Replacement runs is a set number based on a player's playing time. Votto has been in the line-up more often than Hamilton and Cozart this year, so he gets slightly more Replacement Runs. Runs Above Average and Replacement Runs are added together to get Runs Above Replacement, which is then converted to Wins Above Replacement (roughly one win for every 10 runs):

Player   RAA Rrep RAR  WAR
Hamilton +12  +14 +26 +2.5
Cozart   +12  +15 +27 +2.5
Votto    +10  +17 +27 +2.5
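The whole build-up is just addition, so it is easy to re-check. Here is a small Python sketch using the run values from the tables above; the 10-runs-per-win constant is the rough rule of thumb mentioned in the text, not the exact conversion behind the listed WAR.

```python
RUNS_PER_WIN = 10.0  # rough rule of thumb; an assumption, not the exact figure

players = {
    "Hamilton": {"Rbat": -14, "Rbaser": 9, "Rdp": 2, "Rfield": 13, "Rpos": 2, "Rrep": 14},
    "Cozart": {"Rbat": -3, "Rbaser": 1, "Rdp": -1, "Rfield": 10, "Rpos": 5, "Rrep": 15},
    "Votto": {"Rbat": 32, "Rbaser": -2, "Rdp": -1, "Rfield": -12, "Rpos": -7, "Rrep": 17},
}


def runs_above_average(r):
    """Hitting + baserunning + double plays + fielding + positional adjustment."""
    return r["Rbat"] + r["Rbaser"] + r["Rdp"] + r["Rfield"] + r["Rpos"]


def runs_above_replacement(r):
    """Runs above average plus the playing-time-based replacement runs."""
    return runs_above_average(r) + r["Rrep"]


for name, r in players.items():
    raa = runs_above_average(r)
    rar = runs_above_replacement(r)
    print(f"{name:9s} RAA {raa:+d}  RAR {rar:+d}  WAR ~{rar / RUNS_PER_WIN:.1f}")
```

Dividing by a flat 10 lands at 2.6 or 2.7 rather than the listed 2.5, so the runs-per-win figure actually used must be a bit above 10.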

I hope my brief explanation of Wins Above Replacement made sense to the uninitiated. If you're interested in a detailed explanation, here's's version.

Friday, August 26, 2016

The Votto Interviews, part two

"At first I was not fond of it; to bat second, when all the mighty batsmen of old - Ty Cobb, Babe Ruth, and my beloved Ted Williams - manned the third position in the batting order, and if not third, then fourth, etc.; and although I am no stranger to uncontrollable rage, on this occasion I do not think I ever felt a greater flow of anger, and it was some time before I was able to master my passion again. Mr. Price, to his credit, ignored my murderous threats and calmly bid me make the best I could of the situation. Finally determining to do so, I reflected on my proclivity for causing my enemies to expend their pitches like so much life-blood, and the admirable effect this would have on Mr. Hamilton, a speedy young gentleman and outfielder from Mississippi with a penchant for thievery on the base-paths, who frequently bats first in our order, should he be fortunate enough to find himself on first base, and me taking up my club and following him to the plate; by giving him ample opportunity to test his legs against the nerves and arms of the unfortunate defenders. Furthermore, Mr. Phillips and Mr. Bruce, and other such spirited gentlemen of our ballclub, though they sell outs less dearly than I (rather would I be bonded to seven years of servitude than part with one in vain), are perhaps more enamored with driving home baserunners; they would have the good fortune to appear at the plate after I have already made ready their way, and frequently with myself on base anxiously awaiting the issue of their contests. When batting third, as I formerly did, I would frequently fortify myself at home plate with two of our three allotted outs already having been expended, and not a friend to be seen on the base-paths among so many villainous faces, they becoming puffed up in their confidence that, even if I emerged victorious, they could eliminate my successor as easily as they had my predecessors, and win the greater battle; thus was my already considerable burden increased.
Since our starting nine bat in order, cyclically, the first man following the ninth, etc., I became sensible that I would confront the opposing pitcher just as frequently as I had heretofore, and sometimes more frequently, and never less; these ideas and many others similar led me to a long train of thinking, the result of which was that I began to hold the proposition in a most favorable point of view, and to as warmly espouse the idea as I had formerly opposed it, and to conclude that no strategy our managers could engage in would be of greater consequence, or be of better utility to our ballclub."

- Joey Votto, on being asked to bat second, Cincinnati Enquirer, April 16, 2014.

Votto being Votto

This is quickly turning into a Joey Votto blog. (The other two Votto Interviews are still to come.) Whether it's an admiration for the science of hitting, and Votto's mastery of it, or parodies of his eccentric personality, Votto provides much material.

This is a GREAT breakdown of a game Joey Votto had against Adam Wainwright and the Cardinals on August 2. Votto saw 15 pitches that day. 10 were balls, and Votto swung at zero of them. Of the five strikes he saw, he swung at four.

He was 4-for-4 on the day.

And since it's FanGraphs, they show you a GIF of all 15 pitches. :)

Wednesday, August 24, 2016

The Votto Interviews, part one

"The denizens of the Ohio River Valley are a particularly nasty breed of American; nevertheless, as I am contracted and well-paid to produce runs and, indeed, victories for the Cincinnati Base Ball Club, I am resolved to put aside every private view, blot from my conscious mind the howling troglodytes that nightly fill our park, and endeavor to engage only in conduct that results in my reaching base, thus avoiding such conduct that causes me to be put out; to inflict a maximum of fatigue of body and torment of mind upon the opposing pitcher, and to frustrate the designs of he and his co-defenders; for I must either strike my opponent's pitch, sending it beyond the reach of the scurrying defenders, and advance as far around the bases as I dare run; or, failing in that, foul his best offerings until he has delivered such number of balls that the impartial umpire, satisfied, declares me the victor of our little contest and awards me my rightful place on first base; until Mr. Hurdle and his Pittsburghs are vanquished (or whichever club opposes us that day) and I and my fellows are filled with high spirits and huzzahs for the Cincinnati Reds."

- Joey Votto, post-game interview, Great American Ballpark, April 15, 2014.

The St. Louis Cardinals Well-Rounded Mid-American (and Cuban) Home Run Hitting Base Ball Team

The St. Louis Cardinals have hit 174 homeruns this year, the most in the National League.

Brandon Moss, from Loganville, Georgia, has hit 23.

Jedd Gyorko, from Morgantown, West Virginia, has hit 20.

Matt Holliday, from Stillwater, Oklahoma, has hit 19.

Stephen Piscotty, from Pleasanton, California, has hit 18.

Randal Grichuk, from Rosenberg, Texas, has hit 16.

Matt Carpenter, from Missouri City, Texas, has hit 15.

Aledmys Diaz, from Santa Clara, Cuba, has hit 14.

Matt "Big City" Adams, from Philipsburg, Pennsylvania, has hit 12.

Jeremy Hazelbaker, from Muncie, Indiana, has hit 11.

Tommy Pham, from Las Vegas, Nevada, has hit 9.

The rest of the team has hit 17.

The Rockies' good hitting and poor pitching (and poor hitting and good pitching)

Through August 20th, Rockies batters were hitting .300 at Coors Field and .245 on the road.

Through August 20th, Rockies opponents were hitting .300 at Coors Field and .245 at home.

Votto-matic (once again)

(Note: This is my first new post of the blog, after copying over a few posts from old blogs I started and abandoned)

After last night's 2-for-3 in the Reds' 3-0 win over Texas, Joey Votto is batting .312, with an NL-leading .434 on-base percentage.

Leaving out the first 16 games of the season, when he hit .172 with one homerun (so from April 22nd on), Votto is hitting:

.334/.459/.568 (BA/OBP/SLG) in 106 games

Since May 30th:

.383/.500/.629 in 72 games

And since the All-Star Break:

.455/.545/.707, 32 RBI in 36 games


(Originally published in April 2013)

With all the weeping and gnashing of teeth that’s gone on over the Steroid Generation taking an unfair advantage to rewrite the home run record books, it’s easy to forget that other records remained completely out of their grasp.

In 1999, Larry Walker, a terrific hitter playing in an un-humidified Coors Field at the peak of the big-hitting boom, batted .379 for the season. Barring a miracle season (unlikely; Adrian Beltre led Gen-Xers with a .321 average in 2012), this will go down as the highest single-season batting average ever posted by a player of his generation (born 1961-1981). Yet on the list of the highest single-season batting averages of all time, Walker’s 1999 season ranks 90th.

The book Generations profoundly changed my worldview when I read it back in 2007. (Actually, it provided the framework of a worldview; the explanation for observations I had already made in my life up to that point). In the book, authors William Strauss and Neil Howe retell American history as a series of generational biographies. The main thrust of this blog is to apply the theory to another of my great loves: baseball.

But maybe you weren’t affected by the theory like I was. Maybe you don’t believe that every person in a society can be grouped into 20-odd-year-long blocks of birth years according to their common beliefs and behaviors (which were molded by their common age at historical events and trends). Maybe you don’t understand the theory, or maybe you just don’t care either way. Maybe you’re just here because you’re a baseball fan (I hope you are, anyway).

That’s fine, too, because the baseball leader-boards are cluttered, and categorizing every baseball player in history by 20-odd-year-long blocks of birth years will serve to un-clutter them.

“Un-clutter” the leader-boards? Yes, because while twenty-four of the forty-two 50-home run seasons in major league history were accomplished by players born between 1963 and 1980 (the Steroid Generation), all but one of the twenty men who batted .400 for a season were born before 1900, as were all seventeen of the players to hit 25 triples in a season.

Twelve players in big league history stole 100 or more bases a total of nineteen times, and they were all born either from 1858 to 1866 or from 1932 to 1961. But no one born in those two time spans ever slugged .700 for a season, even though sixteen players have done it a total of 35 times. All but one of them were born in the time spans of 1895-1920 or 1963-1968; all twelve men who eclipsed 1.200 in OPS for a season were also born in those 32 years.

Pitchers have posted ERAs under 1.50 a total of 36 times, but only Bob Gibson was born after 1892. No pitcher born after 1881 ever won 40 games in a season (only one pitcher born after 1910 ever won 30 games); similarly, no pitcher born before 1953 ever saved 40 games. Even 17 of the top 20 seasons of Wins Above Replacement were the performances of pitchers born between 1849 and 1871. The other three were by Walter Johnson (two) and Babe Ruth, both of whom were born in the 19th Century.

The Steroid Generation took advantage of banned (but under-tested) substances and small ballparks to dominate the home run leader lists, but other generations took advantage of the rules, strategies, and ballparks of their eras to dominate other leader lists, too. Separating ballplayers by generation restores meaning to the leader-boards, and highlights not only how the different generations performed, but which players excelled within their generation.
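For the curious, the grouping described above can be sketched in a few lines of Python. Only the 1961-1981 span comes from this post; the other boundaries and all of the labels are my assumption, following the ranges usually quoted from Strauss and Howe, and the `generation` function name is made up for illustration.

```python
from bisect import bisect_right

# (first birth year, label). Only the 1961-1981 span appears in the post;
# the remaining boundaries and labels are assumed for illustration.
GENERATIONS = [
    (1883, "Lost"),
    (1901, "G.I."),
    (1925, "Silent"),
    (1943, "Boom"),
    (1961, "Gen X"),
    (1982, "Millennial"),
]
STARTS = [start for start, _ in GENERATIONS]


def generation(birth_year):
    """Return the label of the birth-year block that contains birth_year."""
    i = bisect_right(STARTS, birth_year) - 1
    if i < 0:
        return "pre-1883"  # earlier blocks are left out of this sketch
    return GENERATIONS[i][1]


# Larry Walker was born in 1966, Ted Williams in 1918.
print(generation(1966), generation(1918))
```

Once every player carries a label like this, any leader-board can be filtered or ranked within a single generation.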

Return of the Mayor

(Originally posted July 22, 2012)

With Joey Votto recovering from surgery, and the Reds in need of some offense, and especially some left-handed-hitting offense, there's been talk around town (well, on 700 WLW, mainly) of the Reds signing Sean Casey.  A lifetime .302 hitter, Casey was the Reds' starting first baseman from 1998-2005, and following stints with the Pirates, Tigers, and Red Sox, retired after the 2008 season at age 34.  The beloved Casey has since been working as a broadcaster, and is a 2012 inductee into the Reds Hall of Fame.  Although I doubt the sincerity of the radio personalities who are pining for his return (Tracy Jones especially), the so-called justification for believing Casey could still contribute is Jim Thome (who blocked Casey at first base on the Cleveland Indians, before Casey was traded to Cincy for Dave Burba the day before the start of the '98 season).  Thome, of course, is 41 (42 in five weeks) and is still getting it done.  Since the start of the 2011 season, he has been to the plate 453 times, and is batting .256 with 22 homeruns, 70 rbi, and a .358 on-base percentage.  Casey, meanwhile, just turned 38 on July 2.

Casey has been out of the game for four years now, and any talk of him returning is probably just that. Still, I thought it would be fun to try and project what Casey's numbers would be if he did return to action in 2012. And the basis for this experiment will be the prototype for aging first basemen: Mr. Thome himself. Leaving out Casey's cup of coffee with the '97 Indians, where he went 2-for-10, here are Casey's career numbers:

Years     Ages  G    AB   R   H    2B  3B HR  RBI SB BA   OBP  SLG  OPS  GDP
1998-2008 23-33 1399 5056 689 1529 322 12 130 734 18 .302 .367 .448 .815 156

Casey's playing days lasted from age 23 to age 33. Here are Jim Thome's numbers over the same age range:

Years     Ages  G    AB   R    H    2B  3B HR  RBI  SB BA   OBP  SLG  OPS  GDP
1994-2004 23-33 1565 5357 1082 1535 299 21 413 1120 13 .287 .415 .581 .997 91

Sean Casey's baseball age this year is 37 (your baseball age is your age on June 30). Here is how Thome performed at age 37:

Year Age G   AB  R  H   2B 3B HR RBI SB BA   OBP  SLG  OPS  GDP
2008 37  149 503 93 123 28  0 34  90  1 .245 .362 .503 .865  17

It follows then, in this simple experiment, that if we divide each of Casey's career numbers by Thome's and multiply the result by Thome's number at age 37, we should be able to extrapolate 2012 statistics for Sean Casey. Well, I did just that:

Year Age G   AB  R  H   2B 3B HR RBI SB BA   OBP  SLG  OPS  GDP
2012 37  133 475 59 123 30  0 11  59  1 .258 .318 .388 .705  29

Remember, this projection assumes that Sean Casey is in shape and has been playing every season, not retired since 2008. That aside, if a tuned-up Casey were to play in 133 games this season, these numbers seem fairly realistic to me. Especially the 29 double plays.
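For anyone who wants to check the arithmetic, here is a minimal Python sketch of the ratio method for the counting stats. The numbers are copied from the three tables above; the `project` function and variable names are made up for illustration, and rounding to the nearest whole number is my assumption about how the table was produced.

```python
# Counting stats copied from the tables above (ages 23-33 for both careers).
COLUMNS = ["G", "AB", "R", "H", "2B", "3B", "HR", "RBI", "SB", "GDP"]
casey_career = [1399, 5056, 689, 1529, 322, 12, 130, 734, 18, 156]
thome_career = [1565, 5357, 1082, 1535, 299, 21, 413, 1120, 13, 91]
thome_age_37 = [149, 503, 93, 123, 28, 0, 34, 90, 1, 17]


def project(own_career, model_career, model_season):
    """Scale the model player's season by the career ratio, stat by stat."""
    return [
        season * own / model
        for own, model, season in zip(own_career, model_career, model_season)
    ]


# Round each projected stat to the nearest whole number (an assumption).
projection = {
    name: round(value)
    for name, value in zip(COLUMNS, project(casey_career, thome_career, thome_age_37))
}
print(projection)
```

Applying the same ratio to batting average gives .302 / .287 x .245, or about .258, which matches the table.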

The Year Babe Ruth hit .402 with 61 Homeruns, and other Year-long Hot Streaks

(Originally posted June 15, 2012)

On May 10, 1920, Babe Ruth was 20 games into his first season with the New York Yankees.  He had played in 18 of those games, and was batting .210 with 2 homeruns.  On May 11 in Chicago, he went 3-3 with a triple and two homeruns, kicking off arguably the greatest hot streak in baseball history, over the course of which he would become the most famous athlete in the history of American sports.  His 1920 season totals are enough for it to be ranked among the greatest seasons in baseball history: a .376 average, 54 homeruns (breaking the previous record of 29, set by Ruth the year before), an .847 slugging percentage (the record until 2001).  But from May 11 on, he batted .403 with a .924 slugging percentage.  Ruth was arguably even better in 1921, because unlike in 1920, he started the season hot, busting nine homeruns in his first 18 games.  Over the 365-day period beginning May 11, 1920, the Babe put up these outrageous numbers:

Babe Ruth (May 11, 1920 to May 10, 1921)

G   AB  R   H   2B 3B HR RBI BB  SO SB CS BA   OBP  SLG  OPS
142 463 169 186 37 9  61 146 154 74 13 14 .402 .553 .916 1.469

Ruth never actually hit .400 in any one season; he topped out at .393 in 1923.  Rogers Hornsby, however, accomplished the feat three times ... in fact, he hit .402 over the course of an entire five-year span (1921-1925).  Halfway through the 1924 season, Hornsby was having an okay year - he was hitting .389.  But then he decided to kick it up a notch.  He finished the season hitting .424 ... the highest batting average of the liveball era.  He continued his outrageous hitting into the following season, 1925, and cruised to his second triple crown, hitting .403 with 39 homers and 143 rbi.  Here's what he did from the second half of the 1924 season through the first half of '25:

Rogers Hornsby (July 1, 1924 to June 30, 1925)

G   AB  R   H   2B 3B HR RBI BB  SO SB CS BA   OBP  SLG  OPS
144 524 146 230 43 9  38 129 101 29 3  6  .439 .530 .773 1.303

No one has hit .400 since Ted Williams in 1941, and it is likely that no one ever will again; but for twelve months in the 1920s, Rogers Hornsby hit damn near .440, and with power too.  Speaking of players we may never again see the likes of: in the 1980s Eric Davis was supposed to be the second coming of Willie Mays.  And once he actually got into the regular line-up in June of 1986, for a year or so he was Willie Mays, only with more speed.

Eric Davis (June 15, 1986 to June 14, 1987)

G   AB  R   H   2B 3B HR RBI BB SO  SB CS BA   OBP  SLG  OPS
148 517 132 154 24 3  43 114 87 133 91 10 .298 .398 .605 1.003

Only four players in baseball history have reached the 40-40 club in homeruns and stolen bases.  But over a one-year period in 1986 and '87, Eric the Red was a 40-90 player.  It's mind-boggling how good he was before injuries took their toll on his game.

In 1998 Sammy Sosa hit 66 homeruns for the Cubs...yet he was second among Chicago outfielders in slugging percentage.  Albert Belle quietly had a spectacular season for the White Sox; he was overshadowed by Sosa and Mark McGwire as they both broke Roger Maris's 37-year-old homerun record.  But three years earlier, Belle was the most feared hitter in baseball, and the favorite to break Maris's record.  In 1995, he became the first (and still only) player to hit 50 doubles and 50 homeruns in the same season, and did so despite playing a shortened 144-game schedule.  He started well in 1995, then went on a tear beginning around the end of May, and continued his torrid hitting through the start of the '96 season.  The result is a year that Belle never actually had in any one season; he was Sammy Sosa from a few years later, but with more hits and doubles and half the strikeouts:

Albert Belle (May 31, 1995 to May 27, 1996)

G   AB  R   H   2B 3B HR RBI BB SO SB CS BA   OBP  SLG  OPS
162 608 143 203 51 2  65 159 89 88 5  2  .334 .420 .745 1.165

I cut Belle off a little early because he actually played 165 games for the year beginning May 31, 1995.  And speaking of the homerun record, we come full-circle to the man who holds both the single-season and career marks.  Barry Bonds hit 73 in 2001, but if not for a bad first week, he might have hit several more.  He went 3-for-29 in the Giants' first seven games, got benched in game 8, then basically went on an unprecedented tear that lasted through the rest of that season and all of the next three.  Here's what he did in that first year, from early in the 2001 season through the first week of 2002:

Barry Bonds (April 12, 2001 to April 11, 2002)

G   AB  R   H   2B 3B HR RBI BB  SO SB CS BA   OBP  SLG  OPS
154 470 136 162 33 2  77 148 185 88 14 3  .345 .535 .915 1.449

This is the highest level of dominance displayed by any player since Babe Ruth, eighty years before.  And for all the differences in the eras they played in - Ruth in day games in big ballparks against white-only competition, Bonds in small parks under the lights with PEDs against the best players in the world - their slugging percentages are nearly identical: .916 for Ruth, .915 for Bonds.  These are the two players who, more than anyone else who's ever played the game, reached the pinnacle of prowess for an entire year.
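The two slugging percentages can be reproduced from the counting numbers in the Ruth and Bonds lines above. This is a small Python sketch, assuming the second, fourth, fifth, sixth and seventh numbers in each line are at-bats, hits, doubles, triples and homeruns; the function names are made up for illustration.

```python
def total_bases(hits, doubles, triples, homers):
    # Every hit is worth one base; extra-base hits add 1, 2 or 3 more.
    return hits + doubles + 2 * triples + 3 * homers


def slugging(ab, hits, doubles, triples, homers):
    # Slugging percentage is total bases divided by at-bats.
    return total_bases(hits, doubles, triples, homers) / ab


ruth = slugging(ab=463, hits=186, doubles=37, triples=9, homers=61)
bonds = slugging(ab=470, hits=162, doubles=33, triples=2, homers=77)
print(f"Ruth {ruth:.3f}, Bonds {bonds:.3f}")  # Ruth 0.916, Bonds 0.915
```

That works out to 424 total bases in 463 at-bats for Ruth and 430 in 470 for Bonds.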