In Bayesian stats the posterior distributions are almost invariably too gnarly to compute anything analytically, so you have to resort to sampling if you want to find the mean or the mode or whatever.
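(To make that concrete — this sketch is mine, not part of the post — here's about the simplest possible Metropolis sampler. The target density is a made-up unnormalized Gaussian; the point is that you only ever evaluate the posterior up to a constant, draw samples, and average them to get the posterior mean.)

```python
import math
import random

# Toy unnormalized posterior: exp(-(x - 3)^2 / 2), i.e. an N(3, 1) shape.
# In practice you'd only know the posterior up to a normalizing constant,
# which is exactly the situation Metropolis sampling handles.
def unnorm_posterior(x):
    return math.exp(-0.5 * (x - 3.0) ** 2)

def metropolis(n_samples, step=1.0, x0=0.0, seed=0):
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # accept with probability min(1, p(proposal) / p(x))
        if rng.random() < unnorm_posterior(proposal) / unnorm_posterior(x):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(20000)
burned = samples[5000:]  # drop burn-in before the chain has settled
post_mean = sum(burned) / len(burned)
print(post_mean)  # close to 3, the mean of the target
```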
I've not done any learning with sampling but have used it for inference.
(no subject)
Date: 2007-07-14 08:27 am (UTC)

(no subject)
Date: 2007-07-14 08:33 am (UTC)

So, if you have a prior, and you observe some data, and now you want to update your beliefs, would you use sampling?
(no subject)
Date: 2007-07-15 06:39 pm (UTC)

I guess there are a couple of ways to do it:
You could infer the posterior on the "learned" variables like a regular inference problem using samples, then use the samples to reconstruct an approximate posterior. You could assume some parametric form and use the samples as observation points to update it. Or do some nonparametric approximation, like putting radial basis functions around each point. In the limit of this, you can use the set of samples itself as the posterior -- then you have a Particle Filter. This is typically used in real-time apps (e.g. music tracking) where you don't have time to do anything clever, but you can propagate a set of samples over time as your hypothesis space. (See my paper "How to Be Lost" on 5m.org.uk for an example of this.)
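(A minimal bootstrap particle filter, to illustrate the "set of samples as the posterior" idea -- the 1-D random-walk model, noise levels, and observations here are all made-up assumptions of mine, not from the comment. Each step is: propagate particles through the dynamics, weight them by the likelihood of the new observation, resample.)

```python
import math
import random

random.seed(1)

N = 1000  # number of particles
# start with particles drawn from an assumed N(0, 1) prior over the state
particles = [random.gauss(0.0, 1.0) for _ in range(N)]

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

observations = [0.5, 1.0, 1.4, 2.1, 2.0]  # made-up measurements

for y in observations:
    # 1. propagate each particle through the (assumed) random-walk dynamics
    particles = [x + random.gauss(0.0, 0.3) for x in particles]
    # 2. weight each particle by the likelihood of the new observation
    weights = [gauss_pdf(y, x, 0.5) for x in particles]
    # 3. resample: the particle set itself now approximates the posterior
    particles = random.choices(particles, weights=weights, k=N)

estimate = sum(particles) / N  # posterior mean of the current state
print(estimate)  # near the recent observations (around 2)
```

The weighted-then-resampled particle cloud is the whole posterior representation -- no parametric form anywhere, which is why this fits real-time use.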
Alternatively to all of the above -- another method would be to use standard gradient descent to learn the parameters, using your sampler to estimate P(Data|params) for each param setting that you search through. See Will Penny's paper in the "Bayesian Brain" book for an example of this.
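(Here's one way that outer-loop idea can look in code -- my sketch, not Penny's method. The model, data ~ N(theta + z, 1) with a latent z ~ N(0, 1), and all the numbers are invented for illustration: the likelihood has to be estimated by averaging over sampled latents, and plain finite-difference gradient ascent runs on top of that estimate.)

```python
import math
import random

rng = random.Random(0)
data = [2.1, 1.8, 2.5, 2.0, 1.9]  # made-up observations

# Fix one set of latent samples up front (common random numbers), so the
# Monte Carlo likelihood estimate is a smooth deterministic function of theta
# and the finite-difference gradient isn't swamped by sampling noise.
Z = [rng.gauss(0.0, 1.0) for _ in range(2000)]

def mc_log_likelihood(theta):
    # log P(data | theta), with P(x | theta) approximated by averaging the
    # conditional density over the sampled latents z
    ll = 0.0
    for x in data:
        p = sum(math.exp(-0.5 * (x - theta - z) ** 2) for z in Z) / len(Z)
        ll += math.log(p)
    return ll

theta, lr, eps = 0.0, 0.5, 0.1
for _ in range(30):
    # finite-difference gradient of the sampled likelihood, then a plain ascent step
    grad = (mc_log_likelihood(theta + eps) - mc_log_likelihood(theta - eps)) / (2 * eps)
    theta += lr * grad
print(theta)  # converges near the data mean
```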
charles