gusl: (Default)
[personal profile] gusl
Today, after theory lunch, I got an impromptu, 5-minute private lesson from Daniel Sleator (of splay-tree fame), on fast matrix multiplication.

I was surprised to learn that you can do better than O(n^3) in the worst case, in general. FWIU, Volker Strassen was the first to propose an improvement, in the late 60's.

If you want to multiply a pair of 2x2 matrices, you would ordinarily do 8 multiplications (the result matrix has 4 entries, each a sum of 2 products). But it's possible to do it with 7: basically, you multiply sums of entries, and combine the resulting products with additions and subtractions to get what you need.
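These are Strassen's seven products. Here is a minimal Python sketch with scalar entries; in the real algorithm the entries are themselves (n/2)x(n/2) blocks, which is where the asymptotic savings come from:

```python
def strassen_2x2(A, B):
    """Multiply 2x2 matrices (given as nested tuples) with 7 multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # The seven products of sums/differences of entries.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Each entry of the result is recovered with additions/subtractions only.
    c11 = m1 + m4 - m5 + m7
    c12 = m3 + m5
    c21 = m2 + m4
    c22 = m1 - m2 + m3 + m6
    return ((c11, c12), (c21, c22))
```

Counting the `*` operations above: seven, instead of the usual eight.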

If you're using such a divide-and-conquer approach, the fundamental (information-theoretic?) bound is: how many multiplications are necessary to form a basis from which to compute our answer?

To illustrate with a simpler example, imagine you have two 10-digit binary numbers, x and y. To do the divide-and-conquer, we split each of them into two halves.

Let x = [01010|00111] = [a | b]
As a number, x = 2^5 a + b
Likewise, y = 2^5 c + d

x*y is going to be a 20-digit number:

x*y = 2^10 ac + 2^5 (ad+bc) + bd (note that we can compute the last 5 digits of x*y from b and d alone)

So it looks like we need 4 multiplications: a*c, a*d, b*c, b*d.

But in fact, if we compute (a+b)*(c+d) = ac + ad + bc + bd,

we don't need to compute a*d and b*c separately: ad + bc = (a+b)*(c+d) - ac - bd, so three multiplications (plus a few extra additions and subtractions) suffice.
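The three-multiplication trick above, as a small Python sketch (the split point `half_bits=5` matches the 10-digit example; `three_mult_product` is just an illustrative name):

```python
def three_mult_product(x, y, half_bits=5):
    """Multiply x*y using only three multiplications of half-size numbers."""
    shift = 1 << half_bits          # 2^5 = 32 for the example above
    a, b = divmod(x, shift)         # x = 2^5 a + b
    c, d = divmod(y, shift)         # y = 2^5 c + d

    ac = a * c
    bd = b * d
    cross = (a + b) * (c + d) - ac - bd   # equals ad + bc, no extra multiply

    return ac * shift * shift + cross * shift + bd
```

Applied recursively to the halves, this is Karatsuba's algorithm for integer multiplication.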

At first sight, this may not seem like a very good deal, since additions cost the same as multiplications.

But the point is that this trick generalizes to matrix multiplication, where the cost of a multiplication asymptotically dominates the cost of an addition. Therefore, the fewer multiplications per level of recursion, the more efficient you are in the limit.

The complexity of matrix multiplication is O(n^omega). Strassen's scheme gives omega = log2(7) ≈ 2.81, which is what you get by solving a recurrence; more elaborate schemes have since brought omega down to around 2.38.
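The exponent comes from the divide-and-conquer recurrence: Strassen does 7 multiplications of half-size matrices plus O(n^2) additions, so T(n) = 7 T(n/2) + O(n^2), which solves to Theta(n^log2(7)). A one-liner to check the exponent:

```python
import math

# T(n) = 7*T(n/2) + O(n^2) solves to Theta(n^log2(7)) by the master theorem.
strassen_exponent = math.log2(7)
print(round(strassen_exponent, 3))  # prints 2.807
```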

---

Given the kind of problem that this is, I'm surprised these algorithms aren't being discovered by computers. Going to these theory talks, and seeing clever (but hairy) tricks just keeps feeding my desire to automate the discovery of these theorems/algorithms.

The search for algorithms must constrain the hypothesis space to algorithms that *might* possibly work (do there exist kinds of genetic programming where the search is constrained by logical constraints, such as symmetry?). Given such a constrained search, some empirical testing may still be needed to "debug" the hypothesis, do some parameter-adjusting, etc.

Also, there's no reason we can't feed it some background knowledge that human computer scientists benefit from. For instance, I would like to formalize strategies like "divide-and-conquer". It's an interesting problem how to make such a general idea applicable to different kinds of representations / data structures.

As a cognitive AI person, I feel like it's my duty to relieve mathematicians from the burden of thinking. But in my experience, theory people don't like to hear this.

(no subject)

Date: 2006-11-09 02:17 pm (UTC)
From: [identity profile] jcreed.livejournal.com
Speaking as a theory person, I absolutely would love if computers did more of my current thinking for me, so I could think about more interesting, high-level things. In fact, my current research is towards precisely this goal.

I still can't understand your surprise that algorithm invention isn't yet trivially automated. Seriously, again, if you think it is easy to do, what algorithm do you have in mind that you think would successfully do it? Why not spend the time you spend writing livejournal entries trying to code up this algorithm instead? :)

(also, just as a performance note, multiplications do not at all cost the same amount as additions, generally; I suppose usually now on modern architectures multiplications and additions of floating point numbers are comparable --- but in the matrix example, multiplying two matrices is basically always going to be harder than adding two matrices, and those are the things you're divide-and-conquering over)

(no subject)

Date: 2006-11-09 02:18 pm (UTC)
From: [identity profile] mdinitz.livejournal.com
It's not that I don't like to hear what you're saying, it's that I'm extremely skeptical that you can do something like this. Algorithms in particular seem like a hard domain to do automated theorem proving on, since sometimes it's not clear at all what the right approach is. Should you try a greedy algorithm? Dynamic programming? LP rounding? A primal-dual scheme? Or even a semidefinite program? The last three are of particular importance in approximation algorithms, and seem fairly difficult to develop automatically. The automatic scheme would also have to be able to find a good LP or SDP, prove that it is a relaxation of the problem, devise a rounding scheme for it, and prove that you don't lose too much in rounding. There's been a ton of research on various problems with interesting LP relaxations to see exactly what the integrality gap of the LP is, and so far computer searches have been pretty much useless for proving integrality gaps (even lower bounds, which just require finding a bad example).

But for what it's worth, Allan Borodin at Toronto has done some really cool work on formalizing greedy algorithms. He hasn't done divide-and-conquer yet, but he might be working on it. The formalization of greedy algorithms is pretty cool, and can be used to prove theorems of the form "no greedy algorithm can give an approximation ratio of less than alpha" for various interesting problems (facility location and set cover spring to mind). But the point of his formalization is to go the other way -- instead of using it to find good greedy algorithms, he uses it to prove that greedy algorithms can't work well.

(no subject)

Date: 2006-11-09 02:38 pm (UTC)
From: [identity profile] simrob.livejournal.com
Has the subject of AI-completeness come up (have I raised it)? Because at some point, I think that this is possible, but exactly as hard as making a boring but completely intelligent human. If nothing else, the computer has to be able to pick up on subtleties in the way the statement of the program was written and adapt intelligently to those to do what you're really imagining.

On the other hand, I can more easily imagine a "simulated undergraduate" - Mike is working on a problem and tells his computer not to "think about solving this problem" but "think about using dynamic programming for this problem" for the afternoon - just like there's apparently a "think about using a greedy algorithm for this problem" computer.

I hazard that this particular AI-building strategy would be more useful to Jcreed and me than to Mike - for our field, the query we'd be sending the simulated undergraduate is almost always "Induct over about a billion things to try to show this induction goes through, and if you always get stuck describe that to me." Then, we could do the more interesting things that we usually have to hold off on doing until we make the darn inductive proof go through. (Unsurprisingly, there is already a group of provers that do this.)

Do you think there's a way to your goal without going through this "write lots of robotic undergraduates to solve increasingly interesting groups of problems" phase?

(no subject)

Date: 2006-11-09 02:46 pm (UTC)
From: [identity profile] mdinitz.livejournal.com
AI-completeness is a tricky thing. It's basically the idea behind CAPTCHAs -- use problems that humans can solve but computers can't, in order to distinguish humans from computers. But as far as I know, no one has made any progress towards formalizing any kind of AI-completeness.

I should also point out that Borodin's characterization of greedy algorithms so far has not led to a greedy-algorithm-finding technique, i.e. as far as I know there's no "think about using a greedy algorithm" computer. From what I remember, his formalization is a machine-like restriction, in that he basically states how greedy algorithms have to operate, then shows that all algorithms that we normally think of as greedy can be modified to work like this. A formalization like this doesn't necessarily lead to being able to find such algorithms -- look at Turing machines for example. We have a perfectly good formalization of TMs, but we still can't automatically find TMs to do what we want.

(no subject)

Date: 2006-11-09 04:42 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
but in the matrix example, multiplying two matrices is basically always going to be harder than adding two matrices


This is exactly my point. This is why minimizing multiplications creates asymptotically better algorithms.

(no subject)

Date: 2006-11-09 06:03 pm (UTC)
From: [identity profile] simrob.livejournal.com
AI-completeness, I've always assumed, is more of a "talking point" than a "things you formalize" - although with Luis and Sam formalizing "fun" the last time I checked, maybe this is no longer an accurate assumption.

And your point is well taken - at most, what the formal characterization does is define a problem space that you could then write search tactics in. Assuming someone figured out how to define these tactics, they wouldn't have to be complete, just as provers that try to write inductive proofs aren't complete (just as the strategy of "have humans work to find inductive proofs and greedy algorithms" is not complete).

(no subject)

Date: 2006-11-09 06:41 pm (UTC)
From: [identity profile] brkvw.livejournal.com
Now you are talking my language.
But it may be possible to do much of this with brute-force attacks.



(no subject)

Date: 2006-11-09 06:47 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
what are you referring to?

A brute-force approach can in principle do anything: just search through all possible algorithms. For each one, just search through all possible proofs. Once you find a proof for an algorithm, the problem is solved.

But in practice, such an approach is disastrous. Intelligence boils down to effective search strategies, i.e. strategies that do a good job of pruning the search space: pruning a lot of unproductive paths while keeping enough of the productive ones.

(no subject)

Date: 2006-11-09 07:51 pm (UTC)
From: [personal profile] neelk
Linky link to fun formalization, kplzthx?

(no subject)

Date: 2006-11-09 09:53 pm (UTC)
From: [identity profile] simrob.livejournal.com
Mike, do you have more information - Borodin's webpage appears to be here, and the first linked paper seems to be on this subject, but I only glanced at the first page - here.
