Normal Distribution

Editor’s Note: In this section, I’ll break down some of the key concepts in probability theory that form the basis of this website. First, I look at the basic concept behind normal distribution. Please note these explanations won’t be 100% up to mathematical textbook standards, simply because they need to be shaped to fit a sports context. If there are any concerns or criticisms about the process of applying probability theory to a sports context, please contact me at tabmathletics@gmail.com.

Pivotal to statistical analysis, normal distribution gets relatively overlooked by the stats freaks. That’s quite a surprise, considering that normal distribution may be one of the best assets available to any analyst looking to make a projection based on years of statistical research.

FanGraphs uses normal distribution to help build a range for its player projections each MLB season. Meanwhile, there is some normal distribution applied to player evaluations in baseball. Most of it sits behind a paywall (which I’m not in the financial position to pay for), so I’ll admit my examples are limited. Luckily, I found one blog post that references evaluating player skills with normal distribution in mind.

For the record, I am absolutely not in the camp that skills should be quantified. A “skill” isn’t something that can be tangibly measured; only the outcomes of applying that skill in a real-life situation, like an actual baseball game, can be. (Isn’t that what Sabermetrics are for?)

Even with those examples, there’s one point truly missing from sports analysis. Thanks to normal distribution, we can infer which statistical events are so rare that they are unlikely to happen again. It certainly won’t apply to all situations, but it can help greatly in putting certain quality stats in their proper context.

For these purposes, understanding normal distribution is simple. When a stat is normally distributed, the probability density (loosely, the chance of something happening) is highest at the mean (the statistical average) and falls off as you move away from it. Therefore, the statistical events that are farthest from the mean are the least likely to happen. Look at the graph below:
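For readers who prefer numbers to pictures, here is a minimal sketch of the same idea in Python. The mean of 100 and standard deviation of 15 are made-up values for a hypothetical stat, and `share_within` is a helper name I introduce for illustration. Under those assumptions, roughly 68% of outcomes land within one standard deviation of the mean, about 95% within two, and about 99.7% within three.

```python
from statistics import NormalDist

# Hypothetical stat: mean 100, standard deviation 15 (made-up numbers).
dist = NormalDist(mu=100, sigma=15)

def share_within(dist, k):
    """Share of outcomes within k standard deviations of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(dist, k):.1%}")

# The density is highest at the mean and shrinks as we move away from it.
densities = [dist.pdf(dist.mean + k * dist.stdev) for k in range(4)]
print(densities)
```

The further out you go, the less room is left: whatever falls outside three standard deviations makes up only a small fraction of a percent of all outcomes.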

Now, there’s one caveat to this, and it involves the Central Limit Theorem. More or less, the theorem states that the averages (or totals) taken from sufficiently large samples will be approximately normally distributed, even when the individual data points are not. Therefore, treating a stat as normally distributed is more reliable with larger samples and more research. Without understanding this, some basic errors will be made when making statistical projections.
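To see the theorem at work, here is a rough simulation sketch. It assumes a deliberately lopsided source of data (an exponential distribution, chosen only for illustration, not because any sports stat follows it) and checks how the averages of repeated samples behave. The function names and the choice of 5,000 repetitions are my own. The yardstick is the share of values within one standard deviation of the mean, which is about 68% for a normal distribution.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the simulation is repeatable

def sample_means(sample_size, n_samples=5000):
    """Average of `sample_size` skewed draws, repeated `n_samples` times."""
    return [
        sum(random.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(n_samples)
    ]

def share_within_one_sd(values):
    """Fraction of values within one standard deviation of their mean."""
    m, s = mean(values), stdev(values)
    return sum(abs(v - m) <= s for v in values) / len(values)

for n in (1, 5, 50):
    share = share_within_one_sd(sample_means(n))
    print(f"sample size {n:>2}: {share:.1%} within one SD")
```

With a sample size of 1 (the raw, skewed data), the share sits well above the normal benchmark. As the sample size grows, it should drift toward 68%, which is the theorem showing up in practice.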

Throughout this website, I will use my sports and mathematical judgment to find certain statistical events that reasonably fall on the outer edges of the normal distribution (think the lightest blue areas in the above graph). Finding these rarities, better known as outliers, will allow us to see which statistics are likely to regress to the mean.
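As a rough sketch of what hunting for outliers can look like, the snippet below measures how many standard deviations each value sits from the average (its z-score) and flags anything at least two standard deviations out. The season totals are invented purely for illustration, the "Year" labels are placeholders, and the cutoff of 2 is an assumption on my part rather than a fixed rule.

```python
from statistics import NormalDist, mean, stdev

# Made-up season totals for a hypothetical stat; this is not real data.
seasons = {
    "Year 1": 24, "Year 2": 27, "Year 3": 25, "Year 4": 26, "Year 5": 23,
    "Year 6": 28, "Year 7": 25, "Year 8": 24, "Year 9": 26, "Year 10": 41,
}

avg = mean(seasons.values())
sd = stdev(seasons.values())
threshold = 2.0  # assumed cutoff, in standard deviations from the mean

outliers = {}
for label, value in seasons.items():
    z = (value - avg) / sd  # distance from the mean in standard deviations
    if abs(z) >= threshold:
        # Chance of landing at least this far from the mean (either side),
        # if the stat really were normally distributed.
        tail = 2 * (1 - NormalDist().cdf(abs(z)))
        outliers[label] = (round(z, 2), tail)

print(outliers)
```

Here only the "Year 10" total gets flagged. With a sample this small, a flag like that is a prompt for closer judgment, not proof that the number will regress.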

That will come later in this section, but first, let’s discuss exactly why the NFL brings such a unique advantage to those who use probability theory in their seasonal projections.