
| fit (a.k.a. noise) penalty \ regularization penalty | none | L2 | L1 |
|---|---|---|---|
| L2 | "ordinary" regression (MLE under Gaussian noise) | ridge regression (MAP under Gaussian noise + white Gaussian prior) | Lasso (MAP under Gaussian noise + Laplace prior) |
| L1 | | | |
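As a quick sketch of why the MAP readings hold (notation mine: σ² for the noise variance, τ² for the prior variance), the negative log-posterior under Gaussian noise and a Gaussian prior on β is

```latex
-\log p(\beta \mid y)
  \;=\; \frac{1}{2\sigma^2}\,\lVert y - X\beta \rVert_2^2
  \;+\; \frac{1}{2\tau^2}\,\lVert \beta \rVert_2^2
  \;+\; \text{const},
```

so the MAP estimate is exactly ridge regression with λ = σ²/τ². Swapping the Gaussian prior for a Laplace prior with scale b turns the second term into (1/b)·‖β‖₁, i.e. the Lasso.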


Mixtures of L1 and L2 on the regularization penalty are called "elastic nets".
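In the usual formulation (with non-negative weights λ₁, λ₂, names mine), the elastic-net objective is

```latex
\min_{\beta}\;
  \lVert y - X\beta \rVert_2^2
  \;+\; \lambda_1 \lVert \beta \rVert_1
  \;+\; \lambda_2 \lVert \beta \rVert_2^2 .
```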

L2/L2 (ridge) can be implemented just as easily as L2/none, by adding fake data points; L2/L1 (Lasso) cannot. See the sketch below.
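A minimal sketch of the fake-data trick, assuming numpy (the function name and the penalty weight `lam` are mine): appending √lam · I as extra rows of X, with zero targets, makes the OLS objective on the augmented data equal the ridge objective.

```python
import numpy as np

def ridge_via_fake_data(X, y, lam):
    """Solve min ||y - X b||^2 + lam * ||b||^2 by ordinary least squares.

    The fake rows sqrt(lam) * I (with target 0) contribute exactly
    lam * ||b||^2 to the squared-error objective.
    """
    p = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
    y_aug = np.concatenate([y, np.zeros(p)])
    beta, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return beta
```

No such trick exists for the L1 penalty: an absolute value cannot be written as the squared residual of a fake data point, which is why the Lasso needs its own solvers.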

You may notice that the second row is empty. This is because I've never seen regression with a noise penalty other than L2 (a.k.a. "squared error").

(no subject)

Date: 2009-07-22 03:29 pm (UTC)
From: [identity profile] bhudson.livejournal.com
If L1 is your error metric, the average is all you get out, isn't it?

(no subject)

Date: 2009-07-22 06:38 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
First of all: we're assuming we're doing "regression" with only one x-point.

Minimizing the usual error metric (L2), you get the mean. (Physics analogy: if you want to minimize the moment of inertia, put your pivot at the mean; see previous post.)

I believe minimizing L1 gives you the median (in fact, for an even number of points, the objective is flat between the two central points).

I think that if each x has an even number of ys, then L1 regression has such a "fat minimum" with probability 1 (though the bigger the dataset, the smaller the region).
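A small numerical check of both claims (the values in `ys` are made up):

```python
import numpy as np

ys = np.array([1.0, 2.0, 5.0, 9.0])        # an even number of y-values
cs = np.linspace(0.0, 10.0, 1001)          # candidate constants c

l2 = np.array([np.sum((ys - c) ** 2) for c in cs])   # squared error
l1 = np.array([np.sum(np.abs(ys - c)) for c in cs])  # absolute error

print(cs[np.argmin(l2)], ys.mean())        # L2 minimizer: the mean, 4.25

# The L1 objective has a "fat minimum": flat between the two central
# points (here 2 and 5).
flat = cs[np.isclose(l1, l1.min())]
print(flat.min(), flat.max())              # 2.0 and 5.0
```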

(I went back to bed after you told me this, and my dream-genie said: "to break ties, use an infinitesimal penalty")
