Sunday, October 16, 2016

The Generations of Baseball, part one

"A generation, like an individual, merges many different qualities, no one of which is definitive standing alone. But once all the evidence is assembled, we can build a persuasive case for identifying (by birthyear) eighteen generations over the course of American history. All Americans born over the past four centuries have belonged to one or another of these generations."
That is the claim made by authors William Strauss and Neil Howe on page 68 of Generations. The book, published in 1991, "retells the history of America as a series of generational biographies going back to 1584."

Below is a table of the last ten of those generations - the generations that played BASEBALL - along with their total MLB-playing population (through 2016) and best (or best-known) player:

GENERATION      BIRTH YEARS  MLB POP.  FAMOUS MEMBER
Transcendental  1792-1821    0         Alexander Cartwright
Gilded          1822-1842    21        Harry Wright
Progressive     1843-1859    688       Cap Anson
Missionary      1860-1882    2,129     Cy Young
Lost            1883-1900    2,573     Babe Ruth
G.I.            1901-1924    2,693     Ted Williams
Silent          1925-1942    1,818     Willie Mays
Boom            1943-1960    2,602     Rickey Henderson
Generation X    1961-1981    3,926     Barry Bonds
Millennial      1982-2004    2,332     Mike Trout

But these are the generations of AMERICAN history - social generations. Their members encountered "the same national events, moods, and trends at similar ages" (page 48) and felt "the ebb and flow of history from basically the same age or phase-of-life perspective" (page 64).

Baseball has its own national events, moods, and trends; its own ebbs and flows. Does it, then, have its own generations too? And can we find them by applying Strauss & Howe's methodology to sabermetrics?
"...We have to start with the building-block of generations: the ‘cohort.’ Derived from the Latin word for an ordered rank of soldiers, ‘cohort’ is used by modern social scientists to refer to any set of persons born in the same year; ‘cohort-group’ means any wider set of persons born in a limited set of consecutive years" (page 44).
"A GENERATION is a cohort-group whose length approximates the span of a phase of life and whose boundaries are fixed by peer personality" (page 60). There are four "phases of life" (on page 56), each 22 years long: youth (ages 0 to 21), rising adulthood (22 to 43), midlife (44 to 65), and elderhood (66 to 87).
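The four 22-year phases map cleanly onto integer division. Here is a minimal sketch in Python; the function and constant names are my own for illustration, not anything from the book.

```python
# Phase names and the 22-year length come from the passage above;
# the identifiers themselves are illustrative assumptions.
PHASES = ["youth", "rising adulthood", "midlife", "elderhood"]
PHASE_LENGTH = 22  # years per phase


def phase_of_life(age: int) -> str:
    """Return the phase of life for an age from 0 through 87."""
    if not 0 <= age <= 87:
        raise ValueError("the four phases only cover ages 0 through 87")
    # Ages 0-21 fall in slot 0, 22-43 in slot 1, 44-65 in slot 2, 66-87 in slot 3.
    return PHASES[age // PHASE_LENGTH]
```

For example, `phase_of_life(30)` falls in rising adulthood, and `phase_of_life(66)` is the first year of elderhood.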
"But how do we actually identify [a generation]? For that, we have to focus our attention on ‘peer personality’ - the element in our definition that distinguishes a generation as a cohesive cohort-group with its own unique biography. The peer personality of a generation is essentially a caricature of its prototypical member. It is, in its sum of attributes, a distinctly personlike creation" (page 63).
"A PEER PERSONALITY is a generational persona recognized and determined by (1) common age location; (2) common beliefs and behavior; and (3) perceived membership in a common generation" (page 64). But while Strauss & Howe (on page 67) "can apply no reductive rules for comparing the beliefs and behavior of one cohort-group with those of its neighbors," we can apply similarity scores.

Bill James devised similarity scores in the 1980s as a means to identify groups of comparable players based on their career statistics. "To compare one player to another, start at 1000 points and then subtract points based on the statistical differences of each player."
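The "start at 1000 and subtract" idea can be sketched in a few lines of Python. This is only an illustration of the general shape: the stat names, the divisors, and the two players below are hypothetical placeholders, not Bill's published values or real career lines.

```python
# Illustrative divisors only: one point is subtracted for every this-many
# units of difference in a stat. These are assumptions, not published values.
POINTS_PER_DIFFERENCE = {"games": 20.0, "hits": 15.0, "home_runs": 2.0}


def similarity_score(a: dict, b: dict, divisors: dict = POINTS_PER_DIFFERENCE) -> float:
    """Start at 1000 and subtract points for each statistical difference."""
    score = 1000.0
    for stat, per_point in divisors.items():
        score -= abs(a[stat] - b[stat]) / per_point
    return score


# Two made-up career lines, purely for demonstration.
player_a = {"games": 2000, "hits": 2100, "home_runs": 300}
player_b = {"games": 1900, "hits": 2010, "home_runs": 280}

print(similarity_score(player_a, player_b))  # 1000 - 5 - 6 - 10 = 979.0
```

A player compared with himself scores a perfect 1000, and the score falls as the two statistical lines drift apart.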

In July 2015, Bill wrote two articles that used similarity scores to compare seasons (like MLB 2009 and MLB 2010) and organize them into eras and epochs (a subscription is required to read either article).
"My thinking was that if I did these studies in an organized, systematic way, 'fault lines' would appear. Natural groups of seasons would be apparent in the data, I thought; we could call the smaller groups, 5 to 10 years, 'epochs' and the larger groups 'eras'....And, in fact, this approach does, in most cases, make it very clear where the lines should be drawn."
For Bill, a smaller group of seasons is an epoch and a larger group is an era. For Strauss & Howe, a smaller group of cohorts is a cohort-group and a larger group is a generation.

Bill "measured the Similarity of Seasons...by the similarity of their statistical image" to identify "natural groups of seasons" and find the "fault lines" separating them.

Strauss & Howe "focus...on 'peer personality'" to identify "a cohesive cohort-group" "and find the boundaries separating it from its neighbors" (page 64).

For a baseball player, then, his "personality" IS his statistical image.

So, ignoring the third part of peer personality ("perceived membership"), Strauss & Howe on page 64 offer us two possible paths to identifying baseball generations:

We can "look first at [a cohort-group's] chronology: its common age location, where its lifecycle is positioned against the background chronology of historic trends and events," meaning we can identify eras first, then place cohorts in generations based on which era they primarily played in.

Or, we can "look at [a cohort-group's] attributes: objective measures of its common beliefs and behavior," meaning we can apply Bill James' method to cohorts directly; that is, use similarity scores to find "natural groups" of cohorts - generations. I'm going with this approach.
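To make the second path concrete, here is one way "fault lines" between cohorts could be sketched in Python. Everything in it is an assumption made for illustration: the per-cohort stat profiles are made-up numbers, the divisors and the threshold are arbitrary, and cutting wherever adjacent birth years fall below a threshold is just one simple grouping rule, not necessarily the method used in part two.

```python
def similarity(a: dict, b: dict, divisors: dict) -> float:
    """Similarity-score style comparison: start at 1000, subtract differences."""
    return 1000.0 - sum(abs(a[s] - b[s]) / d for s, d in divisors.items())


def group_cohorts(profiles: dict, divisors: dict, threshold: float) -> list:
    """Split consecutive birth years into groups at each 'fault line'.

    A fault line is declared (by assumption) wherever the similarity between
    one cohort and the next drops below `threshold`.
    """
    years = sorted(profiles)
    if not years:
        return []
    groups = [[years[0]]]
    for prev, curr in zip(years, years[1:]):
        if similarity(profiles[prev], profiles[curr], divisors) < threshold:
            groups.append([curr])    # fault line: start a new group
        else:
            groups[-1].append(curr)  # same group as the previous cohort
    return groups


# Made-up cohort profiles keyed by birth year, for illustration only.
toy_profiles = {
    1901: {"hr": 10, "so": 300},
    1902: {"hr": 12, "so": 310},
    1903: {"hr": 11, "so": 305},
    1904: {"hr": 30, "so": 450},
    1905: {"hr": 32, "so": 460},
}
toy_divisors = {"hr": 2.0, "so": 10.0}

print(group_cohorts(toy_profiles, toy_divisors, threshold=990.0))
# [[1901, 1902, 1903], [1904, 1905]]
```

In the toy data, the 1903 and 1904 cohorts score 976 against each other while every other neighboring pair scores 998 or 999, so the only break lands between them.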

Continued in part two.
