Wednesday, December 29, 2021

Standardization

 The linked study answers three questions: 

  1. Does it make a difference whether you standardize your variables before running your regression model or standardize the regression coefficients after you run your model? 
  2. Does the scale of the respective original non-standardized variables affect the resulting standardized coefficients? 
  3. Does using non-standardized variables vs. standardized variables have an impact when conducting regularization? 

The study uncovers the following answers to those three questions:

  1. It makes no difference whether you standardize your variables first or instead standardize your regression coefficients afterwards (see the sketch below). 
  2. The scale of the original non-standardized variables does not make any difference.
  3. Using non-standardized variables does not work at all when conducting regularization (Ridge Regression, LASSO).  For regularization you have to use standardized variables. 
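
On points 1 and 2, here is a minimal numpy/scikit-learn sketch (my own illustration, not part of the 7 slides): regressing on z-scored variables and rescaling the raw-scale coefficients by sd(x)/sd(y) yield the same standardized coefficients, regardless of the original scales.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * [1, 50, 1000]          # deliberately very different scales
y = X @ [0.5, 0.02, 0.001] + rng.normal(size=200)

# Route 1: standardize the variables first, then run the regression
Xs = StandardScaler().fit_transform(X)
ys = (y - y.mean()) / y.std()
beta_standardized_first = LinearRegression().fit(Xs, ys).coef_

# Route 2: run the regression on the raw variables, then standardize the coefficients
b_raw = LinearRegression().fit(X, y).coef_
beta_standardized_after = b_raw * X.std(axis=0) / y.std()

print(np.allclose(beta_standardized_first, beta_standardized_after))   # True: both routes agree
```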

To check out the complete study (very short, just 7 slides), go to the following link.  

Standardization study at Slideshare.net

Thursday, December 23, 2021

Is Tom Brady the greatest quarterback?

 If you want to review the entire study, you can view it at the following links: 

Football study at Slideshare.net

Football study at SlidesFinder.com 

The above studies include extensive use of the binomial distribution, which allows differentiating how much of the quarterbacks' respective records is due to randomness vs. how much is due to skill.  This statistical analysis is not included within this blog post.  (The study at SlidesFinder may not include this complete section right now, but it should within a few days). 

The quarterbacks I looked at include Brady, Manning, Marino, Brees, Favre, Montana, and Elway. 

Performance during the Regular Season.

If we look at Brady's performance during the regular season at mid-career (34 years old), he is actually far behind many of his peers.  

First, let's look at cumulative passing yards by age 34. 


Next, let's look at the number of touchdowns by age 34. 


As shown above, in both yards and touchdowns, at 34 years old Brady is way behind Manning, Marino, Brees, and Favre.  

At this stage of his career and on those specific counts, Brady does not yet look earmarked to become the legendary number 1.  

However, Brady's career longevity and productivity are second to none.  And, when you compare the respective records over an entire career, the picture changes dramatically. 

 

 Brady's ability to defy the traditional sports aging curve is remarkable.  He just has not shown any decline in performance with age.  At 44, he is just as good as at 34... unlike any of his peers, who have been out of the game for years.  They all retired at 41 or earlier. 
 
 
Track record during the Post-Season.  

During the Post-Season it is a very different story.  Brady has been dominant throughout and since early on in his career.  He leads in the number of playoff appearances. 

He is way ahead in the number of Super Bowl appearances. 


And, way ahead in Super Bowl wins. 


The table below discloses the performance of the players during the Post-Season. 

Given the number of teams in the NFL (32) and the number of seasons played, the above players have a random proportional probability of winning one single Super Bowl ranging from 50% (for Montana) to 66% (for Brady).  That probability based on just randomness drops rapidly to close to 0% for winning 2 Super Bowls.  Notice that Marino's, Brees's, and Favre's actual records are in line with this random proportional probability.  This underscores how truly difficult it is to win more than one Super Bowl.  Manning and Elway do not perform much above this random probability.  Only Montana and Brady perform a heck of a lot better than random probabilities would suggest based on the number of seasons they played.  And, as shown, Brady with 7 wins is way ahead of Montana.  And, he is not done!
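
The binomial calculations themselves are in the slides rather than in this post, but a minimal sketch of that kind of pure-chance baseline could look like the following (my own illustration; the season counts and the exact model used in the study are assumptions here):

```python
from scipy.stats import binom

p = 1 / 32                               # chance a given team wins the Super Bowl in any one season
seasons = {"Montana": 16, "Brady": 21}   # approximate seasons as a starter, assumed for illustration

for qb, n in seasons.items():
    expected_titles = n * p                   # expected number of titles under pure chance
    p_two_or_more = 1 - binom.cdf(1, n, p)    # binomial probability of 2+ titles by chance alone
    print(f"{qb}: expected titles ~ {expected_titles:.2f}, P(2+ titles by chance) = {p_two_or_more:.3f}")
```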

When looking at the Post-Season track record, there is no doubt that Brady is the greatest.  Under pressure, and when it counts, he scores.  Also, interestingly, even when he loses a Super Bowl game, it is a close game.  He does not get wiped out.  By contrast, some of the other quarterbacks (including Marino and Elway, among others) suffered truly humiliating, lopsided defeats in the Super Bowl... not Brady.


Friday, December 10, 2021

Why you should avoid Regularization models

 This is a technical subject that may warrant looking at the complete study (a 33-slide PowerPoint).  You can find it at the following two links. 

Regularization study at Slideshare.net

Regularization study at SlidesFinder.com 

If you have access to Slideshare.net, it reads better than at SlidesFinder. 

Just to share a few highlights on the above.  

The two main Regularization models are LASSO and Ridge Regression, as defined below. 
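
Since the slide with the formal definitions is not reproduced in this post, here are the textbook formulations (standard notation, not copied from the slides):

```latex
\text{OLS:}\quad \hat{\beta} = \arg\min_{\beta}\ \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^{2}

\text{Ridge:}\quad \hat{\beta} = \arg\min_{\beta}\ \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^{2} + \lambda\sum_{j=1}^{p}\beta_j^{2}

\text{LASSO:}\quad \hat{\beta} = \arg\min_{\beta}\ \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^{2} + \lambda\sum_{j=1}^{p}\lvert\beta_j\rvert
```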

The above regularization models are just an extension of OLS Regression (the sum-of-squared-errors term) plus a penalization term that penalizes the size of the coefficients.  

Regularization models are deemed to have many benefits (left column of table below).  But, they often do not work as intended (right column of table below).

 

In terms of forecasting accuracy, the graphs below show the penalization or Lambda level on the X-axis.  As the Lambda level increases from left to right, penalization increases (regression coefficients are shrunk and eventually even zeroed out in the case of LASSO models).  And, the number of variables left in the LASSO model decreases (top X-axis).  The Y-axis shows the Mean Squared Error of those LASSO models within a cross-validation framework. 
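
As a rough illustration of how such a cross-validated MSE vs. Lambda curve can be produced, here is a minimal scikit-learn sketch on synthetic data (my own code, not the study's; note that scikit-learn calls Lambda "alpha"):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a real dataset (hypothetical, for illustration only)
X, y = make_regression(n_samples=200, n_features=46, n_informative=5, noise=10.0, random_state=0)
X = StandardScaler().fit_transform(X)      # regularization requires standardized variables

# Cross-validated LASSO fitted over a whole path of Lambda (alpha) values
lasso = LassoCV(cv=10, random_state=0).fit(X, y)

# Mean cross-validated MSE at each Lambda, averaged over the folds
mean_mse = lasso.mse_path_.mean(axis=1)

plt.semilogx(lasso.alphas_, mean_mse)
plt.axvline(lasso.alpha_, linestyle="--", label="Lambda with lowest CV MSE")
plt.xlabel("Lambda (alpha)")
plt.ylabel("Mean cross-validated MSE")
plt.legend()
plt.show()
```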

[Figure: cross-validated MSE vs. Lambda for two LASSO models: a successful one (left) and a failed one (right)]

The above graph on the left shows a very successful LASSO model.  It eventually keeps only 1 variable out of 46 in the model, and achieves the lowest MSE by doing so.  By contrast, the LASSO model on the right very much fails.  Its best model is obtained when Lambda is close to zero, which corresponds to the original OLS Regression model before any Regularization (before any penalization resulting in shrinkage of the regression coefficients). 

Revisiting these two graphs and giving them a bit more meaning is insightful.  The LASSO model depicted in the left graph was successful: as it increased penalization and reduced the number of variables in the model, it clearly reduced model over-fitting as intended.  The LASSO model on the right failed: it increased model under-fitting the minute it started to shrink the original OLS regression coefficients and/or eliminate variables.

Based on firsthand experience, the vast majority of the Ridge Regression and LASSO models I have developed resulted in increased model under-fitting (right graph) instead of reduced model over-fitting (left graph). 

Also, when you use Regularization models, they often destroy the explanatory logic of the original OLS Regression model. 

The two graphs below capture the regression coefficient paths as Lambda increases, penalization increases, and regression coefficients are progressively shrunk down to close to zero.  The graph on the left shows Lambda or penalization increasing from left to right.  The one on the right shows Lambda increasing from right to left.  Depending on what software you use, those graphs' respective directions can change.  This is a common occurrence.  Yet, the graphs still remain easy to interpret and are very informative. 
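
Here again is a minimal scikit-learn sketch of how such coefficient paths can be generated (my own illustration on synthetic data, not the study's code):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import lasso_path
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a real dataset (hypothetical, for illustration only)
X, y = make_regression(n_samples=200, n_features=10, n_informative=4, noise=5.0, random_state=1)
X = StandardScaler().fit_transform(X)   # standardized variables, as required for regularization

# One column of coefficients per Lambda (alpha) value along the regularization path
alphas, coefs, _ = lasso_path(X, y)

for j in range(coefs.shape[0]):
    plt.semilogx(alphas, coefs[j, :])   # one path per variable
plt.xlabel("Lambda (alpha)")
plt.ylabel("Standardized coefficient")
plt.title("LASSO coefficient paths")
plt.show()
```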

[Figure: coefficient paths for a successful Ridge Regression model (left) and a failed one (right)]

The above graph on the left depicts a successful Ridge Regression model (from an explanatory standpoint).  At every level of Lambda, the relative weight of each coefficient is maintained.  And, the explanatory logic of the original underlying OLS Regression model remains intact.  Meanwhile, on the right graph we have the opposite situation.  The original explanatory logic of the model is completely dismantled.  The relative weights of the variables change dramatically as Lambda increases.  And, numerous variables' coefficients even flip sign (from + to - or vice versa).  That is not good. 

Based on firsthand experience, several of the Regularization models I have developed did dismantle the original explanatory logic of the underlying OLS Regression model.  However, this unintended consequence is a bit less frequent than the increased model under-fitting shown earlier. 

Nevertheless, for a Regularization model to be successful, it needs to fulfill both of the following conditions: 

a) Reduce model overfitting; and

b) Maintain the explanatory logic of the model.  

If a Regularization model does not fulfill both conditions, it has failed.  I suspect it is rather challenging to develop or uncover a Regularization model that meets both criteria.  I have yet to come across one. 

Another source of frustration with such models is that you can get drastically different results depending on what software package you use (much info on that subject within the linked PowerPoint). 

One of the main objectives of Regularization is to reduce or eliminate multicollinearity.  But this is a simple problem to solve: just eliminate the variables that appear superfluous within the model and are multicollinear with each other (much info on that within the PowerPoint).  This is a far better solution than using Regularization models, which are highly unstable (different results with different packages) and which more often than not fail for the reasons mentioned above.
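
For instance, a common and much simpler way to spot the multicollinear variables is the Variance Inflation Factor.  Here is a minimal statsmodels sketch (hypothetical data, my own illustration):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

# Hypothetical data with two nearly collinear predictors (for illustration only)
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)    # almost a copy of x1
x3 = rng.normal(size=200)
X = add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# Variance Inflation Factors: values above roughly 5 to 10 flag multicollinear variables
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])],
    index=X.columns[1:],
)
print(vif)   # drop one of the offending variables (x1 or x2) and refit the plain OLS model
```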

Compact Letter Display (CLD) to improve transparency of multiple hypothesis testing

Multiple hypothesis testing is most commonly undertaken using ANOVA.  But, ANOVA is an incomplete test because it only tells you ...