Use Case: Predicting fantasy baseball

Predicting a player’s future offensive performance based on the past


What’s the challenge?

Millions of people play fantasy baseball in leagues that are typically draft- or auction-based. Picking players because they are your favorites, or because of last year’s performance alone, is likely to produce a weaker team. Baseball is one of the most “documented” of all sports, statistics-wise. With the wealth of collected information, you can derive a better estimate of each player’s true talent level and their likely performance in the coming year. This allows for better drafting and helps you avoid overpaying for players coming off one-of-a-kind (“career”) seasons.

The challenge and solution

When drafting players for fantasy baseball, you must make decisions based on each player’s performance over their career to date. Evaluating players by personal interpretation tends to give too much weight to their most recent season. In other words, it’s common to overvalue a player coming off a career year or undervalue a player coming off a bad year. The goal is to generate a better estimate of the player’s value in the next year based on what they have done in prior years. If you build a machine learning model to predict a player’s performance in the next year from their prior performance, you can build a system that helps you identify whether an over- or underperformance is a fluke or a sign of a change in the player’s future performance.
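The setup described above can be pictured with a small sketch. It is not the model this use case builds: a one-variable least-squares line stands in for a real machine learning model, and the `history` data, the player names, and the helper functions are all hypothetical, made-up illustrations. Each training row pairs the average of a player’s prior seasons with what they did in the following season.

```python
from statistics import mean

# Hypothetical toy history: player -> wOBA per season, oldest first (made-up numbers).
history = {
    "player_a": [0.320, 0.330, 0.390, 0.335],  # third season looks like a "career year"
    "player_b": [0.300, 0.290, 0.310, 0.305],
    "player_c": [0.360, 0.350, 0.340, 0.355],
    "player_d": [0.310, 0.280, 0.300, 0.295],
}


def make_training_rows(history, n_prior=2):
    """Build (mean of the n_prior preceding seasons, next season) pairs."""
    rows = []
    for seasons in history.values():
        for i in range(n_prior, len(seasons)):
            rows.append((mean(seasons[i - n_prior:i]), seasons[i]))
    return rows


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


rows = make_training_rows(history)
slope, intercept = fit_line([x for x, _ in rows], [y for _, y in rows])


def predict_next(seasons, n_prior=2):
    """Predict the coming season from the mean of the last n_prior seasons."""
    return intercept + slope * mean(seasons[-n_prior:])


for name, seasons in history.items():
    print(name, round(predict_next(seasons), 3))
```

A fitted slope below 1 means the prediction is pulled back toward the group average, which is the effect that tempers a single standout or poor season. A real model would use many more features and far more players than this toy example.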

For example…

Sam is a baseball fanatic with a good gut-feel for the players but nothing to back up the hunches. Winning her head-to-head fantasy baseball league brings both bragging rights and potential financial gains. Sam is going to use an existing baseball database of past performance to identify those players with impressive weighted on-base averages (wOBA), selecting that statistic because it is a catch-all metric that indicates overall offensive value at the plate.
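For readers unfamiliar with wOBA, the sketch below shows the general shape of the calculation: each way of reaching base gets a run-value weight, and the weighted sum is divided by a plate-appearance-style denominator. The weights here are illustrative placeholders only; published weights are recalculated every season. The function and parameter names are hypothetical and are not taken from the dataset’s data dictionary.

```python
# Illustrative linear weights (placeholders; real weights change every season).
WEIGHTS = {
    "ubb": 0.69,     # unintentional walk
    "hbp": 0.72,     # hit by pitch
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "hr": 2.10,      # home run
}


def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr, weights=WEIGHTS):
    """Weighted on-base average from season counting stats."""
    ubb = bb - ibb  # intentional walks are excluded
    numerator = (
        weights["ubb"] * ubb
        + weights["hbp"] * hbp
        + weights["single"] * singles
        + weights["double"] * doubles
        + weights["triple"] * triples
        + weights["hr"] * hr
    )
    denominator = ab + bb - ibb + sf + hbp
    if denominator <= 0:
        raise ValueError("no qualifying plate appearances")
    return numerator / denominator


# Made-up season line for a hypothetical hitter.
example = woba(ab=500, bb=50, ibb=5, hbp=5, sf=5,
               singles=100, doubles=30, triples=3, hr=20)
print(round(example, 3))
```

Because extra-base hits carry larger weights than walks or singles, the statistic rewards both getting on base and hitting for power, which is why it works as a single catch-all measure of offensive value.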


Training data: https://s3.amazonaws.com/datarobot-use-case-datasets/baseball_modeling.csv
Prediction data: https://s3.amazonaws.com/datarobot-use-case-datasets/baseball_scoring.csv
Data Dictionary: https://s3.amazonaws.com/datarobot-use-case-datasets/Fantasy+Baseball+-+Data+Dictionary.pdf

(Originally posted August 2017)

(Last updated 12-17-2019 02:48 PM, revision 6 of 6)