NBA Game Simulator: A Monte-Carlo Simulation (Project Overview)
In this post, I give an overview of a project completed in R and discuss its applications and motivation.
Part 1: Project Overview
The National Basketball Association (NBA) simulator built in this project was developed with one major goal in mind: to be the most accurate simulation of NBA basketball that I could reasonably create. To do this, I built a data frame of advanced team statistics, team tendencies, and team per-game statistics. I then built functions to simulate a possession, a game, and ultimately many games. Together, these functions simulate two teams playing each other based on each team’s advanced stats, tendencies, and per-game stats. In addition to each team’s information, I also used league-wide NBA data (such as game pace, or possessions per game) to build these functions.
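To make the possession, game, and many-games layering concrete, here is a minimal sketch in R. It is not the project's actual code: the function names (sim_possession, sim_game, sim_many), the four team inputs, and every number are assumptions chosen for illustration, and the possession model is deliberately simplified (no free throws, rebounds, or overtime, so ties are possible).

```r
# Hypothetical sketch: all names and numbers are illustrative assumptions,
# not the project's real functions or data.
set.seed(42)

# Simulate one possession and return the points scored (0, 2, or 3).
sim_possession <- function(team) {
  # A turnover ends the possession with no points.
  if (runif(1) < team$tov_rate) return(0)
  # Pick a three-point or two-point attempt, then resolve make or miss.
  if (runif(1) < team$three_rate) {
    if (runif(1) < team$fg3_pct) 3 else 0
  } else {
    if (runif(1) < team$fg2_pct) 2 else 0
  }
}

# Simulate one game: each team gets the same fixed number of possessions.
sim_game <- function(home, away, possessions = 100) {
  home_pts <- sum(replicate(possessions, sim_possession(home)))
  away_pts <- sum(replicate(possessions, sim_possession(away)))
  c(home = home_pts, away = away_pts)
}

# Simulate many games: returns a matrix with one row per game.
sim_many <- function(home, away, n_games = 1000, possessions = 100) {
  t(replicate(n_games, sim_game(home, away, possessions)))
}

# Made-up team profiles for the usage example.
team_a <- list(tov_rate = 0.13, three_rate = 0.40, fg3_pct = 0.37, fg2_pct = 0.54)
team_b <- list(tov_rate = 0.15, three_rate = 0.30, fg3_pct = 0.34, fg2_pct = 0.50)

results <- sim_many(team_a, team_b, n_games = 500)
# Share of simulated games in which team_a outscores team_b.
win_share_a <- mean(results[, "home"] > results[, "away"])
```

The Monte-Carlo idea is that a single simulated game is noisy, while the share of wins across many simulated games approximates each team's chance of winning under the assumed inputs.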
In this paper, I will first discuss how a simulator of this type could be used and the benefits of this specific type of simulator. Next, I will discuss the data used in this project, and the process of creating a data frame that fits specific requirements. Then I will give an example of one of the simulation functions, and show how that one small idea (simulating free throws) can be expanded to simulate an entire NBA game. I will then explain each of the functions and game events that go into building a box score, and also give an example of two teams that illustrates the accuracy of this simulator. Subsequently, I will describe the process of turning the simulator into an R Shiny App that anyone can use, and finally I will discuss future studies related to this work as well as future improvements that could be made.
Potential Applications and Motivation
When it comes to motivations for creating the simulation in this project, several major applications come to mind. The first is for NBA front offices that want to build hypothetical “teams” from aggregate player tendencies and run game simulations. Additionally, because the stats are not NBA-specific, front offices could estimate how they expect players not currently on their team to perform and then make free agency and draft decisions based on the simulation results. All that would be required for this type of simulation is to make a data frame of the individual stats of every player the team is interested in, take the means (or means weighted by expected playing time), and run the “game” function with the calculated means.
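The aggregation step described above could look something like the following sketch. The roster, the column names (exp_min, fg3_pct, tov_rate), and the helper aggregate_team are all hypothetical and the values are made up; the project's real data frame and the signature of its “game” function are not shown in this post.

```r
# Hypothetical roster: column names and values are made up for illustration.
players <- data.frame(
  player   = c("A", "B", "C"),
  exp_min  = c(36, 30, 14),        # expected minutes per game
  fg3_pct  = c(0.40, 0.35, 0.30),  # three-point percentage
  tov_rate = c(0.10, 0.14, 0.18)   # turnovers per possession used
)

# Collapse the player rows into one "team" profile, weighting each stat
# by expected playing time (hypothetical helper, not the author's code).
aggregate_team <- function(df, stat_cols, weight_col = "exp_min") {
  w <- df[[weight_col]]
  as.list(sapply(df[stat_cols], weighted.mean, w = w))
}

team <- aggregate_team(players, c("fg3_pct", "tov_rate"))
```

The resulting named list holds one value per stat and could then be handed to a game-simulation function in place of a real team's row. Weighting by minutes follows the description above; weighting shooting percentages by shot attempts instead would be a reasonable alternative.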
Additionally, because the simulator can predict full box scores that include all team stats, it could also help NBA teams find weaknesses in their game plan and convey important findings to the coaches. For example, if simulations against an upcoming opponent show a team committing an unusually high number of turnovers, the coaches could emphasize ball security to their players before that game.
A second application of this simulation is in simulation video games such as “NBA 2K” or “Basketball GM”, including matching up teams from different eras. What if you could play a “dynasty” mode in your favorite basketball game, taking the best teams of all time, but instead of waiting half an hour to simulate a season, it could be done in two minutes? This simulator would excel in that scenario. Because each individual box score is still saved, gamers using the simulator would still be able to enjoy each individual game, while also comparing teams from different seasons. There is a whole sub-genre of sports video games dedicated to simulations. Because it is built on accurate rating systems for teams, the simulation designed in this project is realistic enough to satisfy gamers’ expectations, while also being efficient enough to run thousands of simulations at once.
The third application of this simulation is for sports fans who are interested in predicting game and playoff series outcomes. The most obvious example is sports betting (which I do not recommend and which was not the goal of this study), but the simulator could also be a good tool for making NBA playoff predictions and for evaluating playoff chances based on simulated season outcomes, similar to FiveThirtyEight’s game predictors.
The possible applications of this simulator are nearly endless, but the simulator that was built has three major benefits. The first is the speed of simulation: it can efficiently run thousands of simulations on a per-possession basis. The second is the ability to compare teams from different years. The first thing that anyone asked when told about this project? “How would the 1997 Bulls compare to the 2016 Warriors?” That leads us to the final benefit: because the simulator was turned into an R Shiny App, anyone can access it and test real-life NBA teams on their own, without having to understand the details of computer programming languages.
Hi, I am currently working on a project of my own somewhat similar to Moneyball but applied to the NBA so I find what you've done here very interesting. Naturally, at the end of the project I aim to test out the team I have built and see how it compares to expected and real life performance. I am looking for a good simulation program as I don't just want to rely on NBA2K's sim feature - if you've already built a successful model, would you be so kind as to share the app? also, could one put together a team they've built themselves or would it have to be a team that has at some point existed/currently exists? Thanks!