L1 regularization
Nov. 25th, 2008 08:57 pm

L2 regularization is seen as a way to avoid overfitting when doing regression, and nothing more.
L1 regularization tends to give sparse results. If the truth is sparse, this is seen as a way to get to the truth (although the Lasso's model selection is not always consistent, which is why we have Bolasso; see the sketches below).
Even if the truth is not sparse, L1 may be seen as a form of Occam's razor. Is this a valid view?
Even if the truth is not sparse, L1 is a way to select a small number of variables, which can be useful for those of us concerned with scarce computational resources (although it's not clear why you'd choose L1 over PCA or Partial Least Squares).
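To make the sparsity contrast concrete, here is a minimal sketch (in Python with scikit-learn; the data, sizes, and penalty strengths are all illustrative choices of mine, not anything from the post) fitting L1- and L2-penalized regressions to the same data with a sparse ground truth:

```python
# Minimal sketch: L1 (Lasso) vs. L2 (Ridge) on a sparse ground truth.
# All sizes and penalty strengths are hand-picked for illustration.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, -1.0, 0.5]   # sparse truth: 5 of 50 nonzero
y = X @ beta + rng.normal(scale=0.5, size=n)

lasso = Lasso(alpha=0.1).fit(X, y)       # L1 penalty
ridge = Ridge(alpha=1.0).fit(X, y)       # L2 penalty

print("nonzero Lasso coefficients:", int(np.sum(lasso.coef_ != 0)))
print("nonzero Ridge coefficients:", int(np.sum(ridge.coef_ != 0)))
```

On data like this the Lasso typically zeroes out most of the 45 irrelevant coefficients exactly, while Ridge shrinks all 50 toward zero without setting any of them to it.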
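And here is a rough sketch of the Bolasso idea (Bach, 2008) mentioned above: run the Lasso on bootstrap resamples, keep only the variables selected in (nearly) every run, then refit unpenalized least squares on that stable support. The resample count and keep threshold below are my illustrative choices, not the paper's settings.

```python
# Rough sketch of Bolasso: intersect Lasso supports across bootstrap runs.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

def bolasso_support(X, y, alpha=0.1, n_boot=32, keep_frac=0.9, seed=0):
    """Indices of variables whose Lasso coefficient is nonzero in at
    least keep_frac of n_boot bootstrap runs."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    hits = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)              # bootstrap resample
        coef = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        hits += (coef != 0)
    return np.flatnonzero(hits >= keep_frac * n_boot)

# Same kind of synthetic data as in the previous sketch.
rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, -1.0, 0.5]
y = X @ beta + rng.normal(scale=0.5, size=n)

support = bolasso_support(X, y)
ols = LinearRegression().fit(X[:, support], y)        # unpenalized refit
print("variables kept by Bolasso:", support)
```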