Millions of people play fantasy baseball, typically selecting players for their teams through classic drafts or auctions. Choosing players because they are your favorites--or simply on last year's performance, without regard for regression to the mean--is likely to produce a weaker team. Baseball is one of the best-documented sports, statistically speaking. With that wealth of collected information, you can use machine learning to derive a better estimate of each player's true talent level and likely performance in the coming year. This allows for better drafting and also helps you avoid overpaying for players coming off "career" seasons.


When drafting players for fantasy baseball, you must make decisions based on each player's performance over their career to date, as well as effects like aging. Relying on personal interpretation of that performance is likely to cause you to overweight the most recent season. In other words, it's common to overvalue a player coming off a career year or undervalue a player coming off a bad year.
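
To make the regression-to-the-mean idea concrete, here is a minimal sketch that blends a player's observed batting average with a league average, weighting by sample size. This is a hand-rolled illustration and not the accelerator's method; the function name, the .250 league average, and the 200 at-bat prior weight are all assumptions chosen for the example.

```python
def shrink_toward_mean(hits, at_bats, league_avg=0.250, prior_at_bats=200):
    """Blend a player's observed average with the league average.

    prior_at_bats is an assumed tuning constant: the larger it is,
    the more the estimate is pulled toward league_avg.
    """
    return (hits + league_avg * prior_at_bats) / (at_bats + prior_at_bats)


# Hypothetical player coming off a "career" season: .330 over 500 at-bats.
observed = 165 / 500
estimate = shrink_toward_mean(165, 500)
print(f"observed {observed:.3f}, shrunk estimate {estimate:.3f}")
# prints: observed 0.330, shrunk estimate 0.307
```

The shrunk estimate sits between the league average and the observed figure, which is the direction a draft valuation should move for a player coming off an unusually good or bad year.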


The goal is to generate a better prediction of each player's performance in the next year based on what they have done in prior years and on patterns you can learn from similar players in the past. A machine learning model trained to predict next-year performance from previous performance helps you identify when these over- or under-performances are flukes and when they are actual indicators of that player's future performance. Further, you'll be able to create ranked lists by position to help you on draft day.
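
The "predict next year from prior years" framing can be sketched as a small data-preparation step: each season becomes a target, and the same player's earlier seasons become features. The example below uses only the standard library; the row layout, the `hr` field, and the `build_training_rows` helper are hypothetical and are not taken from the accelerator, which relies on DataRobot's automated feature engineering instead.

```python
from collections import defaultdict

# Hypothetical player-season rows; names and numbers are illustrative only.
seasons = [
    {"player": "A", "year": 2020, "hr": 20},
    {"player": "A", "year": 2021, "hr": 35},
    {"player": "A", "year": 2022, "hr": 24},
    {"player": "B", "year": 2021, "hr": 10},
    {"player": "B", "year": 2022, "hr": 12},
]


def build_training_rows(rows, stat="hr", n_lags=2):
    """Pair each season's stat (the target) with the same player's prior seasons."""
    by_player = defaultdict(dict)
    for row in rows:
        by_player[row["player"]][row["year"]] = row[stat]

    training = []
    for player, history in by_player.items():
        for year in sorted(history):
            # Lag k is the stat from k years earlier, or None if that season is missing.
            lags = [history.get(year - k) for k in range(1, n_lags + 1)]
            if lags[0] is None:
                continue  # no prior season to learn from
            training.append(
                {"player": player, "year": year, "target": history[year], "lags": lags}
            )
    return training


training_rows = build_training_rows(seasons)
for row in training_rows:
    print(row)
```

Rows built this way could feed any regression model; a player's first recorded season is skipped because there is nothing earlier to predict it from.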


About this Accelerator

In this accelerator, we will leverage the DataRobot API to quickly build multiple models that work together to predict common fantasy baseball metrics for each player in the upcoming season. 


What you will learn  

  • How to query a rich dataset of MLB players from FanGraphs' API
  • How to set up a project with automated time-aware feature engineering (Automated Feature Discovery)
  • How to update the player data (i.e., secondary data) to re-predict without building a new project
  • How to loop over a project creation function to build many DataRobot projects automatically
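
The last item in the list above, looping over a project creation function, can be sketched generically as follows. The `build_projects` helper, the stand-in `fake_create_project` function, and the metric names are all assumptions made for illustration; in practice the creation function would wrap the relevant DataRobot client calls, which are deliberately not reproduced here.

```python
def build_projects(targets, create_project):
    """Call create_project once per target metric and collect the results by name."""
    projects = {}
    for target in targets:
        projects[target] = create_project(target)
    return projects


# Stand-in for a real project-creation function (hypothetical).
def fake_create_project(target):
    return f"project-for-{target}"


# Hypothetical list of fantasy metrics to model, one project per metric.
metrics = ["HR", "RBI", "SB"]
projects = build_projects(metrics, fake_create_project)
print(projects)
```

Passing the creation function in as an argument keeps the loop easy to test and makes it simple to swap in a different setup per metric.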

Last update: 09-05-2023 10:00 PM