Dear BEMCers,
We are pleased to be able to share a summary prepared by a student earning credit for participating in the BEMC. A warm thank you to Ana Sofia Oliveira Gonçalves for letting us share her summary!
Maarten’s slides can be found online: https://www.slideshare.net/MaartenvanSmeden/regression-shrinkage-better-answers-to-causal-questions
As a reminder, we will be having BEMC JClub on Wednesday this week. A link to the article is available under JCLUB.
On Wednesday, March 6th, Maarten van Smeden, a senior researcher at the Leiden University Medical Center (NL), shared valuable insights on coefficient shrinkage in regression, in both prediction and causal research contexts, with a focus on the latter. The lecture started by questioning the appropriateness of traditional ways of computing odds ratios (e.g. from a 2×2 table or by standard logistic regression based on maximum likelihood estimation). Maarten explained that the maximum likelihood estimators of regression coefficients in generalized linear models are biased in finite samples but consistent: the bias shrinks toward zero as the sample size grows.
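As a small illustration of the traditional calculation, the sketch below computes an odds ratio from a 2×2 table as the cross-product ratio. The counts and the function name odds_ratio_2x2 are made up for this example and do not come from the lecture. For a single binary exposure, the logarithm of this ratio coincides with the maximum likelihood slope of a logistic regression.

```python
import math

def odds_ratio_2x2(a, b, c, d):
    """Odds ratio from a 2x2 table (cross-product ratio).

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is not finite when b or c is zero")
    return (a * d) / (b * c)

# Hypothetical counts, chosen only for illustration.
or_hat = odds_ratio_2x2(a=12, b=8, c=6, d=14)
log_or = math.log(or_hat)  # same value as the ML logistic slope for a binary exposure
print(f"OR = {or_hat:.2f}, log OR = {log_or:.3f}")
```

Note that the estimate breaks down as soon as one of the cells in the denominator is zero, which is more likely in small samples.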
Throughout the lecture, Maarten used a simple logistic regression model as an example. Based on this model, he showed graphical representations of his simulations to illustrate the properties of such estimators. With these simulations he intended to stress the difference between two concepts: unbiasedness and consistency. He then presented a solution for reducing the bias of maximum likelihood estimators: Firth’s correction.
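The difference between bias and consistency can be sketched with a small simulation. This is not Maarten's simulation code: the true slope of 1, the standard normal covariate, the sample sizes, the number of replications and all function names are assumptions chosen for illustration. Data sets for which the fit fails (for example because of separation) are simply dropped here.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_slope_ml(x, y, max_iter=50, tol=1e-8):
    """ML fit of logit P(y=1) = b0 + b1*x by Newton-Raphson.

    Returns the slope b1, or None if the fit fails (e.g. separation).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            u0 += yi - p            # score for the intercept
            u1 += (yi - p) * xi     # score for the slope
            i00 += w                # Fisher information entries
            i01 += w * xi
            i11 += w * xi * xi
        det = i00 * i11 - i01 * i01
        if det < 1e-10:
            return None
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            return b1
    return None

def mean_slope(n, beta1, reps, rng):
    """Average ML slope over simulated data sets of size n (failed fits dropped)."""
    estimates = []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [1 if rng.random() < sigmoid(beta1 * xi) else 0 for xi in x]
        b1 = fit_slope_ml(x, y)
        if b1 is not None:
            estimates.append(b1)
    return sum(estimates) / len(estimates)

rng = random.Random(2019)  # arbitrary seed
true_beta1 = 1.0           # assumed true slope
for n in (30, 300):
    avg = mean_slope(n, true_beta1, reps=300, rng=rng)
    print(f"n = {n:3d}: mean ML slope = {avg:.3f} (true value {true_beta1})")
```

With settings like these, the average slope at the small sample size tends to lie further from the true value, in the direction away from zero, than at the larger one. That gap is the finite sample bias; its disappearance as n grows is what consistency refers to.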
Firth’s correction is a penalized estimation procedure that shrinks regression coefficients, thereby removing a large part of the finite sample bias. It can be readily implemented in a causal research context, and packages for it already exist in statistical programs. Maarten mentioned that other shrinkage estimators, such as ridge or LASSO, can be used for prediction purposes, since deliberately biased coefficients can be better suited to that goal. Nevertheless, he warned the audience against their use in a causal inference context, since these approaches are designed to introduce bias in coefficient estimators rather than to remove it.
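To make the idea of the penalty concrete, here is a minimal sketch of Firth's correction for a logistic regression with one covariate, written from the modified score equations in Firth (1993). It is an illustration under assumptions (hypothetical data, the function name fit_logistic, a plain Newton-type iteration without step-halving), not the implementation used in the lecture or in any statistical package.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(x, y, firth=False, max_iter=200, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x and return (b0, b1).

    With firth=True each residual y_i - p_i gets the extra term
    h_i * (0.5 - p_i), where h_i is the i-th diagonal element of
    the weighted hat matrix (Firth's modified score).
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        ps = [sigmoid(b0 + b1 * xi) for xi in x]
        ws = [p * (1.0 - p) for p in ps]
        # Fisher information of the unpenalized model (2x2 matrix).
        i00 = sum(ws)
        i01 = sum(w * xi for w, xi in zip(ws, x))
        i11 = sum(w * xi * xi for w, xi in zip(ws, x))
        det = i00 * i11 - i01 * i01
        if det < 1e-12:
            raise RuntimeError("information matrix is singular (possible separation)")
        u0 = u1 = 0.0
        for xi, yi, p, w in zip(x, y, ps, ws):
            r = yi - p
            if firth:
                # h_i = w_i * [1, x_i] I^{-1} [1, x_i]^T
                h = w * (i11 - 2.0 * i01 * xi + i00 * xi * xi) / det
                r += h * (0.5 - p)
            u0 += r
            u1 += r * xi
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            return b0, b1
    raise RuntimeError("did not converge")

# Hypothetical data: 10 exposed (3 events) and 10 unexposed (1 event).
x = [1] * 10 + [0] * 10
y = [1] * 3 + [0] * 7 + [1] * 1 + [0] * 9
_, slope_ml = fit_logistic(x, y)
_, slope_firth = fit_logistic(x, y, firth=True)
print(f"ML log odds ratio:    {slope_ml:.3f}")
print(f"Firth log odds ratio: {slope_firth:.3f}")
```

In this example the Firth estimate is pulled toward zero relative to the maximum likelihood estimate. For a single binary exposure, the Firth estimate coincides with the log odds ratio obtained after adding 0.5 to each cell of the 2×2 table, so it stays finite even when one of the cells is zero.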