My advisor Rens Bod is a computational linguist and the major proponent of a theory of language called "Data-Oriented Parsing" (DOP), which is due to Remko Scha, another colleague of ours. DOP says that people produce language by reusing chunks they've seen before, whether concrete word sequences or abstract rules.
While grammars attempt to be minimalistic, DOP proposes "maximalism": a theory of language has to incorporate a learner's whole language experience. It takes a child ~1000 days' worth of information to learn a language, and the DOP philosophy is that children are not being informationally inefficient: we need most of that information in order to teach the same language to a computer. (OTOH, if you take a rule-based grammar + a dictionary, this will always take up less information.)
Rens likes to generalize DOP to all kinds of human cognitive artifacts. I claim that all human artifacts can be parsed: language, music, film... and scientific knowledge. The latter is the subject of my thesis. Information produced by non-intelligent processes, OTOH, cannot be parsed. Parsing may be an intelligence universal: beings that handle a lot of information need to be able to organize it somehow. It would be an interesting project to make an algorithm that distinguishes human artifacts from data produced by non-intelligent processes (weather, geological, astronomical) or nature-made designs (plants, animals). I wonder if gzip can tell the difference, since compression is a kind of universal learning.
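The gzip idea is easy to poke at. Here is a minimal sketch, assuming Python's standard `gzip` module and two toy inputs I made up for illustration: a stream of words drawn from a tiny vocabulary (a stand-in for a human artifact) and uniformly random bytes (a stand-in for structureless data). The helper name `compression_ratio` is hypothetical, and a compression ratio is only a crude proxy for "parsability", not a parser.

```python
import gzip
import random


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more regularity found)."""
    return len(gzip.compress(data)) / len(data)


# Toy inputs, both invented for this sketch (not real corpora).
rng = random.Random(0)  # fixed seed so the run is repeatable
words = ["the", "child", "hears", "a", "phrase", "and", "reuses", "it", "later"]

# Stand-in for a human artifact: text built by reusing a small stock of chunks.
human_like = " ".join(rng.choice(words) for _ in range(5000)).encode("ascii")

# Stand-in for structureless data: uniformly random bytes of the same length.
noise = bytes(rng.randrange(256) for _ in range(len(human_like)))

print("human-like text:", round(compression_ratio(human_like), 3))
print("random bytes:   ", round(compression_ratio(noise), 3))
```

The reused-chunk text should compress far better than the random bytes. That alone doesn't settle the question, though: smooth weather or astronomical series are also very compressible, so a real test would have to look at what kind of structure the compressor finds, not just how much.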
Michael Tomasello is an influential cognitive scientist who has a complementary view. While we say that maximalism is necessary for language acquisition, he says that it's sufficient (contra Pinker). (See also his "Understanding and sharing intentions: The origins of cultural cognition".)
For the second time in my life, I am siding against Pinker. And I really like the guy!
(The other time was about logical reasoning, when he promoted Cosmides's "cheater detection" theory. See here for why.)