STEM and March Madness

How Statistics Affect March Madness

Surprisingly, statistics are all around us. Without many of us being aware, we use or are affected by statistics or statistical analysis almost on a daily basis.

Yet, during a certain time of year, all college basketball fans around the world go into statistical overdrive with the beginning of March Madness!

March Madness is the men’s NCAA Division I basketball tournament. The tournament was founded in 1939 and has expanded over the years to include more teams, which are matched against each other based on seeding.

In scheduling teams to play each other, the tournament considers statistics from their performance throughout the season in order to “fairly” match teams up.

However, where the statistics really come in is with the fans. Families, offices, communities, and friend groups from around the country try to correctly guess the outcome of the 64-team tournament using statistics. Without statistics, the first round of the tournament alone has 32 games, each with two possible winners, so there could be 2 to the 32nd power possible brackets…that is 4,294,967,296 different outcomes!
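To check that arithmetic yourself, here is a minimal Python sketch. It assumes only that every game has exactly two possible winners; the function name `bracket_count` is made up for illustration.

```python
def bracket_count(games: int) -> int:
    """Number of ways a set of games can play out if each has two possible winners."""
    return 2 ** games


first_round_games = 64 // 2  # 64 teams pair off into 32 first-round games
total_games = 64 - 1         # 63 games overall: every team but the champion loses once

print(f"First round only: {bracket_count(first_round_games):,}")  # 4,294,967,296
print(f"Whole tournament: {bracket_count(total_games):,}")        # 9,223,372,036,854,775,808
```

The second number shows why guessing blindly through all 63 games is hopeless, and why fans lean on statistics to rule out unlikely outcomes.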

With the use of statistics, fans are able to narrow down the field of plausible brackets significantly. Many fans build on the outcomes of past tournaments and trust the tournament officials’ logic when it comes to seeding. For example, take the #1 vs. #16 matchup: as of this writing, the #1 seed has won 100% of the time against the #16 seed. Not to say there haven’t been close calls!

In addition, other historical outcomes can help narrow a bracketer’s choices. For example, the lowest seed ever to win the tournament is a #8 seed.

Yet past performance is not the only indicator that can be used to narrow down the pool. You can also look at the statistics associated with each team’s performance throughout the year.

Some of those stats could include the highest-scoring team, the fastest team, the best free-throw shooting team, and the best 3-point shooting team, as well as the teams that rank lowest on those metrics.

While for some that is deep enough, others have gone deeper into this analysis to try to figure out the top 15 most important statistics for detecting upsets. The differences in those 15 statistics between the teams in past upset games create a profile of past upsets. That profile is then applied to the current year’s bracket to identify the most likely upsets for this year.
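The general idea of an upset profile can be sketched in a few lines of Python. This is only a rough illustration, not the analysts’ actual method: the three stat names, the team numbers, and the distance-based ranking are all made-up assumptions, and a real analysis would use its own statistics and put them on a common scale first.

```python
from statistics import mean

# Hypothetical stat names, chosen only for illustration.
STATS = ["points_per_game", "three_point_pct", "free_throw_pct"]


def stat_differences(underdog: dict, favorite: dict) -> dict:
    """Underdog's value minus favorite's value for each stat."""
    return {s: underdog[s] - favorite[s] for s in STATS}


def build_profile(past_upsets: list) -> dict:
    """Average the stat differences over past upsets, given as (underdog, favorite) pairs."""
    diffs = [stat_differences(u, f) for u, f in past_upsets]
    return {s: mean(d[s] for d in diffs) for s in STATS}


def distance_to_profile(underdog: dict, favorite: dict, profile: dict) -> float:
    """Straight-line distance between a matchup's differences and the profile."""
    d = stat_differences(underdog, favorite)
    return sum((d[s] - profile[s]) ** 2 for s in STATS) ** 0.5


def rank_matchups(matchups: dict, profile: dict) -> list:
    """Matchup names ordered from most to least similar to the upset profile."""
    return sorted(matchups, key=lambda name: distance_to_profile(*matchups[name], profile))


# Placeholder numbers, not real team data.
past_upsets = [
    ({"points_per_game": 74.0, "three_point_pct": 38.0, "free_throw_pct": 75.0},
     {"points_per_game": 78.0, "three_point_pct": 34.0, "free_throw_pct": 70.0}),
    ({"points_per_game": 70.0, "three_point_pct": 40.0, "free_throw_pct": 77.0},
     {"points_per_game": 76.0, "three_point_pct": 36.0, "free_throw_pct": 72.0}),
]
this_year = {
    "Game 1": ({"points_per_game": 72.0, "three_point_pct": 39.0, "free_throw_pct": 76.0},
               {"points_per_game": 77.0, "three_point_pct": 35.0, "free_throw_pct": 71.0}),
    "Game 2": ({"points_per_game": 65.0, "three_point_pct": 30.0, "free_throw_pct": 68.0},
               {"points_per_game": 80.0, "three_point_pct": 37.0, "free_throw_pct": 74.0}),
}

profile = build_profile(past_upsets)
ranked = rank_matchups(this_year, profile)
print(profile)
print(ranked)  # matchups that look most like past upsets come first
```

In this toy example, the matchup whose stat differences sit closest to the average of past upsets is flagged as the likeliest upset candidate.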

A paper by Sheldon Jacobson, Jason Sauppe, and Shouvik Dutta, published in the American Statistical Association’s Journal of Quantitative Analysis in Sports, explains a technique to detect potential upsets using a small number of publicly available statistics. This framework is named Balance Optimization Subset Selection (or BOSS) and can actually be applied to data in many areas of the social sciences and medicine. To learn more about the collision of statistics, basketball, and this interesting new statistical framework, read more here.

Are you filling out a bracket this year? How do you plan to apply the ideas of statistics to making the right choice?
