Statistical Modelling 18 (5-6) (2018), 388–410

Nuclear penalized multinomial regression with an application to predicting at bat outcomes in baseball

Scott Powers
Department of Statistics,
Stanford University,
Stanford, CA,
USA.
e-mail: saberpowers@gmail.com

Trevor Hastie
Department of Statistics,
Stanford University,
Stanford, CA,
USA.


Robert Tibshirani
Department of Statistics,
Stanford University,
Stanford, CA,
USA.


Abstract:

We propose the nuclear norm penalty as an alternative to the ridge penalty for regularized multinomial regression. This convex relaxation of reduced-rank multinomial regression has the advantage of leveraging underlying structure among the response categories to make better predictions. We apply our method, nuclear penalized multinomial regression (NPMR), to Major League Baseball play-by-play data to predict outcome probabilities based on batter–pitcher matchups. The interpretation of the results meshes well with subject-area expertise and also suggests a novel understanding of what differentiates players.
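
To make the idea concrete, here is a minimal sketch of proximal gradient descent for a nuclear-norm-penalized multinomial objective, written with NumPy. It is an illustration under stated assumptions, not the authors' implementation. The function names (`fit_npmr`, `nuclear_prox`, `objective`), the omission of an intercept, the averaging of the log-likelihood over observations and the fixed step size are all choices made for this sketch. The proximal step soft-thresholds the singular values of the coefficient matrix, which is what allows the fitted matrix to have low rank across response categories.

```python
import numpy as np


def log_softmax(Z):
    """Row-wise log-softmax, computed stably."""
    Z = Z - Z.max(axis=1, keepdims=True)
    return Z - np.log(np.exp(Z).sum(axis=1, keepdims=True))


def nuclear_prox(B, t):
    """Proximal operator of t * nuclear norm: soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt


def objective(X, Y, B, lam):
    """Mean multinomial negative log-likelihood plus lam * nuclear norm of B."""
    nll = -np.sum(Y * log_softmax(X @ B)) / X.shape[0]
    return nll + lam * np.linalg.svd(B, compute_uv=False).sum()


def fit_npmr(X, Y, lam, n_iter=500):
    """Hypothetical helper: fixed-step proximal gradient descent.

    X is (n, p) predictors, Y is (n, K) one-hot responses.
    No intercept is fitted in this sketch (an assumption for brevity).
    """
    n, p = X.shape
    K = Y.shape[1]
    # Conservative step size: n / ||X||_2^2 is at most 1 / L, where L is the
    # Lipschitz constant of the gradient of the mean negative log-likelihood.
    step = n / np.linalg.norm(X, 2) ** 2
    B = np.zeros((p, K))
    for _ in range(n_iter):
        P = np.exp(log_softmax(X @ B))      # fitted class probabilities
        grad = X.T @ (P - Y) / n            # gradient of the smooth part
        B = nuclear_prox(B - step * grad, step * lam)
    return B
```

With a sufficiently large `lam` every singular value is thresholded to zero and the sketch returns the zero matrix. Smaller values of `lam` leave some singular values nonzero, so the penalty acts as a continuous control on the rank of the coefficient matrix.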

Keywords:

multinomial regression; reduced-rank regression; baseball; nuclear norm; proximal gradient descent.

Downloads:

Example data and code in zipped archive.