System to forecast baseball players' performance

Nathaniel Read "Nate" Silver : writer
PECOTA forecasts a player's performance in all of the major categories used in typical fantasy baseball games; it also forecasts production in advanced sabermetric categories developed by Baseball Prospectus (e.g., VORP and EqA). In addition, PECOTA forecasts several summary diagnostics such as breakout rates, improve rates, and attrition rates, as well as the market values of the players. The logic and methodology underlying PECOTA have been described in several publications, but the detailed formulas are proprietary and have not been shared with the baseball research community.
PECOTA compares each player against a database of roughly 20,000 major league batter seasons since World War II. In addition, it also draws upon a database of roughly 15,000 translated minor league seasons (1997-2006) for players that spent most of their previous season in the minor leagues. . . . PECOTA considers four broad categories of attributes in determining a hitter's comparability:

1. Production metrics – such as batting average, isolated power, and unintentional walk rate for hitters, or strikeout rate and groundball rate for pitchers.
2. Usage metrics, including career length and plate appearances or innings pitched.
3. Phenotypic attributes, including handedness, height, weight, career length (for major leaguers), and minor league level (for prospects).
4. Fielding Position (for hitters) or starting/relief role (for pitchers).

. . . In most cases, the database is large enough to provide a meaningfully large set of appropriate comparables. When it isn't, the program is designed to 'cheat' by expanding its tolerance for dissimilar players until a reasonable sample size is reached.
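The idea of widening the tolerance until enough comparables are found can be illustrated with a short sketch. Because the actual PECOTA formulas are proprietary, everything below is an assumption made for illustration: the `Season` record, the use of Euclidean distance over standardized attributes, and the `tolerance`, `step`, `min_sample`, and `max_tolerance` parameters are all hypothetical and are not taken from PECOTA itself.

```python
import math
from dataclasses import dataclass


@dataclass
class Season:
    """One player-season (hypothetical record, not PECOTA's real schema)."""
    name: str
    features: tuple  # standardized attribute values, e.g. production and usage metrics


def distance(a: Season, b: Season) -> float:
    # Assumption: plain Euclidean distance stands in for the proprietary
    # similarity score.
    return math.dist(a.features, b.features)


def find_comparables(target, database, tolerance=1.0, min_sample=3,
                     step=1.5, max_tolerance=100.0):
    """Return (comparables, tolerance_used).

    The tolerance starts tight and is multiplied by `step` until at least
    `min_sample` seasons fall within it, or until `max_tolerance` is reached.
    """
    while True:
        comps = [s for s in database if distance(target, s) <= tolerance]
        if len(comps) >= min_sample or tolerance >= max_tolerance:
            return comps, tolerance
        tolerance *= step  # loosen the similarity requirement and retry


# Toy example with a single standardized attribute per season.
target = Season("target", (0.0,))
database = [
    Season("A", (0.5,)),
    Season("B", (1.2,)),
    Season("C", (2.0,)),
    Season("D", (5.0,)),
]
comps, used = find_comparables(target, database)
print([s.name for s in comps], used)
```

In this toy run only one season falls within the initial tolerance, so the search loosens the tolerance twice before three comparables are available. A real system would presumably also weight comparables by how similar they are rather than treating everyone inside the tolerance equally, but the source does not describe that step.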