Statistics

Oct. 20th, 2007 02:34 pm
gusl: (Default)
[personal profile] gusl
Yesterday, I caught the end of a Bayesian statistics meeting celebrating Steve Fienberg's 65th birthday.


DeGroot Lecture, given by Larry Brown:

Goal: to predict a male's height given the height of a female sibling. The data is a collection of families, each with at least 1 son and at least 1 daughter.

* linear regression on all brother-sister pairs kinda works, but is not "clean", because there are dependencies between pairs (the same brother or the same sister can appear in several pairs).
* randomly picking one pair per family is clean, but throws out data.
* I think he explained the RightWay at some point, but I didn't get it. My intuitive solution is to average all males and all females within each family, and take that as one pair, weighted appropriately (#males * #females ?).
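To make the three options concrete, here is a rough sketch on synthetic data. Everything in it is my own assumption for illustration: the simulated family effect, the noise levels, the helper names, and especially the `#males * #females` weighting, which is just the guess from the last bullet and not the method from the lecture.

```python
import numpy as np

def simulate_families(n_families, rng):
    """Synthetic data (assumed model): shared family effect plus individual noise, in cm."""
    families = []
    for _ in range(n_families):
        shared = rng.normal(0.0, 5.0)          # family-level effect
        n_sons = rng.integers(1, 4)            # 1 to 3 sons
        n_daughters = rng.integers(1, 4)       # 1 to 3 daughters
        sons = 178.0 + shared + rng.normal(0.0, 5.0, n_sons)
        daughters = 165.0 + shared + rng.normal(0.0, 5.0, n_daughters)
        families.append((sons, daughters))
    return families

def fit_line(x, y, w=None):
    """(Weighted) least squares for y = a + b*x; returns (a, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)                            # scale rows by sqrt(weight)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1]

def all_pairs(families):
    """Option 1: every brother-sister pair; pairs within a family are dependent."""
    x, y = [], []
    for sons, daughters in families:
        for s in sons:
            for d in daughters:
                x.append(d)
                y.append(s)
    return fit_line(x, y)

def one_pair_per_family(families, rng):
    """Option 2: one random pair per family; independent, but discards data."""
    x = [rng.choice(d) for _, d in families]
    y = [rng.choice(s) for s, _ in families]
    return fit_line(x, y)

def family_means(families):
    """Option 3 (my guess): one averaged pair per family, weighted by #sons * #daughters."""
    x = [d.mean() for _, d in families]
    y = [s.mean() for s, _ in families]
    w = [len(s) * len(d) for s, d in families]
    return fit_line(x, y, w)

rng = np.random.default_rng(0)
fams = simulate_families(2000, rng)
print("all pairs:       ", all_pairs(fams))
print("one pair/family: ", one_pair_per_family(fams, rng))
print("family means:    ", family_means(fams))
```

One caveat with the averaging idea: a mean of several sisters is a less noisy predictor than a single sister, so the slope from the family-means fit need not match the slope for predicting from one sister's height. In this simulated model the first two options target a slope of 0.5, while the third tends to come out higher.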

People I spoke to:


* Elly Kaizar
* Cosma Shalizi
* Edoardo Airoldi
* K Sham
* Adrian Dobra
* Pantelis Vlachos
* some people from Sheffield
* Brad Carlin

Statistics terms I understand:


* "dominates" relation between decision procedures (standard game theory definition of "dominates", if you view decision procedures as strategies)
* admissible estimator (i.e. not dominated by anything)
* the risk of a decision procedure (i.e. expected loss)
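For reference, the three terms above in symbols. The notation is my own choice (the usual textbook convention, as far as I know): $\theta$ is the unknown parameter, $X$ the data, $\delta$ a decision procedure, and $L$ the loss function.

```latex
% Risk: expected loss of procedure \delta when the true parameter is \theta
R(\theta, \delta) = \mathbb{E}_{\theta}\bigl[ L(\theta, \delta(X)) \bigr]

% Dominance: \delta_1 is never worse than \delta_2, and strictly better somewhere
\delta_1 \text{ dominates } \delta_2
\iff
R(\theta, \delta_1) \le R(\theta, \delta_2) \ \text{for all } \theta
\quad\text{and}\quad
R(\theta_0, \delta_1) < R(\theta_0, \delta_2) \ \text{for some } \theta_0

% Admissibility: nothing dominates \delta
\delta \text{ is admissible}
\iff
\text{there is no } \delta' \text{ that dominates } \delta
```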

Statistics terms I don't fully understand:


* shrinkage
* empirical Bayes
* Stein's paradox
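As far as I can tell, these three are closely related, and a small simulation helps. Stein's paradox is that when estimating the mean vector of a normal distribution in 3 or more dimensions under squared-error loss, the obvious estimator (the observation itself) is dominated by the James-Stein estimator, which shrinks the observation toward the origin. The James-Stein estimator can also be read as an empirical Bayes estimator, where the prior variance is estimated from the data. The sketch below assumes unit variance and one observation per trial; the function names and the choice of `theta` are mine.

```python
import numpy as np

def james_stein(x):
    """James-Stein estimate: shrink x toward the origin (unit variance assumed, p >= 3)."""
    p = x.shape[-1]
    norm_sq = np.sum(x**2, axis=-1, keepdims=True)
    return (1.0 - (p - 2) / norm_sq) * x

def compare_risk(theta, n_trials, rng):
    """Monte Carlo estimate of squared-error risk for the raw observation vs. James-Stein."""
    theta = np.asarray(theta, dtype=float)
    # One draw X ~ N(theta, I) per trial, one trial per row.
    x = theta + rng.standard_normal((n_trials, theta.size))
    raw_loss = np.sum((x - theta)**2, axis=1)
    js_loss = np.sum((james_stein(x) - theta)**2, axis=1)
    return raw_loss.mean(), js_loss.mean()

rng = np.random.default_rng(1)
raw_risk, js_risk = compare_risk(np.ones(10), 20000, rng)
print("risk of raw observation:", raw_risk)   # theoretical value is p = 10
print("risk of James-Stein:    ", js_risk)    # smaller, for any theta when p >= 3
```

Trying other values of `theta` shows the gain is largest when the true mean is near the shrinkage target and fades as it moves away, but the James-Stein risk never exceeds that of the raw observation.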


About half of the posters were about atmospheric sciences. There was a reasonable number on text statistics. My favorite poster was about addressing the problem of MCMC getting stuck when posteriors are multimodal. Someone told me that most of the money for statisticians today comes from the FDA and biostatistics.
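On the MCMC problem from that poster, here is a toy illustration of what "getting stuck" looks like. This is not the poster's method (I don't have the details); it is only a plain random-walk Metropolis sampler on a made-up bimodal target, with a step size I chose to be too small to cross between modes.

```python
import numpy as np

def log_bimodal(x, sep=10.0):
    """Log-density (up to a constant) of an equal mixture of N(-sep, 1) and N(+sep, 1)."""
    return np.logaddexp(-0.5 * (x + sep)**2, -0.5 * (x - sep)**2)

def random_walk_metropolis(log_density, x0, n_steps, step, rng):
    """Random-walk Metropolis with Gaussian proposals of standard deviation `step`."""
    samples = np.empty(n_steps)
    x, lp = x0, log_density(x0)
    for i in range(n_steps):
        proposal = x + step * rng.standard_normal()
        lp_proposal = log_density(proposal)
        # Accept with probability min(1, density ratio).
        if np.log(rng.uniform()) < lp_proposal - lp:
            x, lp = proposal, lp_proposal
        samples[i] = x
    return samples

rng = np.random.default_rng(0)
samples = random_walk_metropolis(log_bimodal, x0=-10.0, n_steps=5000, step=0.5, rng=rng)
print("fraction of samples in the right-hand mode:", np.mean(samples > 0))
```

The target puts half its mass in each mode, but a chain started in the left mode essentially never visits the right one, because it would have to walk through a region of vanishingly small density.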


All these papers and posters have so many symbols! I think it's possible to come up with easier-to-understand notations for statistics. Maybe some conventions about types (e.g. estimators must be in italics). I wish Gerry Sussman would write a book titled "Structure and Interpretation of Mathematical Statistics".


After dinner, there were some very entertaining speeches. Paraphrasing Bill Eddy:
"In 1979, I predicted that the religion of Bayesianism would die by the end of the century. It would die not the death of Sodom and Gomorrah, smitten by God, but rather the death of disuse. I'm now pleased to say that I was wrong."

Hearty applause followed.

At the time, Sodom & Gomorrah was Las Fuentes (apparently a conference site near Valencia). At some later point, it was CMU. Nowadays, it is Duke University.


Bayesian Statistics vs Machine Learning: cultural differences

* journal papers (influence of Mathematics) vs conference papers (influence of Computer Science)

* statisticians tend to be more skeptical, and write skeptical papers. Machine Learning has more of an engineering culture, and relies more on empirical evaluations than on theoretical challenges. Statistics invites theoretical argumentation and debate. (my impression)

* statisticians are interested in summarizing data (maybe also making it more interpretable?)

* Machine Learning people like to draw graphical models, using plate notation, etc. I was surprised to see one or two posters about hierarchical Bayesian models that had no pictures of graphical models anywhere.


Priors can be:

* proper, improper
* informative, uninformative
* subjective
