gusl: (Default)
[personal profile] gusl


Let A be a linear space.
Let A_m be the subset of A on which a given non-linear constraint is satisfied (drawn above as a curved 2D surface).

Goal: maximize an objective D within the manifold A_m.

Ideally, we would have a coordinate-system for addressing points in A_m, and optimize as if it were a Euclidean space. But I don't know how to do that.

Still, I can see how I could explore A_m by exploring tangent directions, evaluating D, and optimizing in a fashion similar to how it's done in the standard setting (e.g. to do gradient descent: for any point in A_m, compute the gradient in A, project it onto the tangent plane, and this gives the gradient in A_m... when following this gradient, we will leave A_m slightly, and we'll need to snap back to A_m somehow). I suspect there's room for cleverness in these details.

(no subject)

Date: 2008-12-28 06:30 am (UTC)
From: [identity profile] en-ki.livejournal.com
If A_m is not self-intersecting, you can safely just use the coordinate system of A. Do you need to worry about self-intersecting constraint surfaces?

You can discuss tangent spaces to manifolds independently of an embedding in a Euclidean space by defining tangent vectors as equivalence classes of curves in the manifold. This also avoids the issue you seem to be raising, in which following the gradient tangent vector leaves the surface: instead of following a vector in A, follow a member of the corresponding equivalence class of curves in A_m selected in some reasonable way. Of course, I don't have a reasonable way on tap...

(no subject)

Date: 2008-12-28 07:35 pm (UTC)
From: [identity profile] aristopheles.livejournal.com
This is a really big and general problem that many people have worked on over decades.
A book that I like pretty well is Practical Methods of Optimization (http://www.amazon.com/Practical-Methods-Optimization-R-Fletcher/dp/0471494631) by Fletcher (the "F" in BFGS).
There are descriptions of the main methods in various places around the internet.

What exactly are you trying to optimize? What is the constraint manifold? Sometimes there are tricks for specific kinds of constraints.

Bear in mind that most of the other methods are faster than gradient descent because they somehow take the second derivative into account (e.g. Gauss-Newton, conjugate gradient, BFGS, even Barzilai-Borwein).

There's room for lots of cleverness in the details, but you may have to take advantage of the specifics of your system.

One different trick that works sometimes: suppose your constraint is given by f(x)=0 with f(x)>0 outside of A_m. For example, on the 2-sphere x^2+y^2+z^2=1 you could use f = (x^2+y^2+z^2-1)^2. Then try an unconstrained minimization on D + k*f where k is some positive constant. After it converges, increase k. When k is big enough it should force all the test points to stay close to A_m.

(no subject)

Date: 2008-12-28 08:08 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
Thanks.


<< Then try an unconstrained minimization on D + k*f where k is some positive constant. After it converges, increase k. When k is big enough it should force all the test points to stay close to A_m. >>

It seems that all we need to do is penalize distance from the manifold (i.e. penalize big failures to meet the constraint). To overcome the maxima outside, we need big k.

(no subject)

Date: 2008-12-28 08:10 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
it seems wasteful to use the coordinate system of A.

(no subject)

Date: 2008-12-28 08:40 pm (UTC)
From: [identity profile] en-ki.livejournal.com
In that you'd be using n > m coordinates to describe an m-dimensional space? It's wasteful of numbers, but numbers are free. Doing otherwise is costly in human attention.

(no subject)

Date: 2008-12-29 03:20 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
Q and R are 2-state Markov Chains.
A is the linear 4D space of joint Markov Chains whose state-space is the Cartesian product of the state-spaces of Q and R and that are compatible with the constraint that Q and R are partial views.
A_m is the subset of A in which the mutual information of the Q and R views is equal to m.
The function I'm trying to optimize is the mixing rate of the joint Markov Chain (Doeblin contraction coefficient).

(no subject)

Date: 2008-12-29 03:28 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
<< In that you'd be using n > m coordinates to describe an m-dimensional space? >>

yes. And computation is not free.

But there are situations where you can avoid the waste (e.g. if the constraint yields a linear subspace)... and to be a bit more general, where the curvature is small, you can use the linear spline idea: approximate the constraint manifold by linear meshes.

(no subject)

Date: 2008-12-29 07:47 pm (UTC)
From: [identity profile] aristopheles.livejournal.com
What is a "partial view"? What is the Q or R view?

(no subject)

Date: 2008-12-29 08:41 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
Let Z be our joint MC.
Let the state-space of Q be {a,b}, and of R be {c,d}, then the state-space of Z is {(a,c),(a,d),(b,c),(b,d)}.

Z is in state (x,y) <=> Q is in state x and R is in state y.

We call Q and R "partial views" of Z.
Edited Date: 2008-12-29 08:42 pm (UTC)

(no subject)

Date: 2008-12-29 08:42 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
btw, have we met?

how did you find my blog?

(no subject)

Date: 2008-12-29 09:42 pm (UTC)
From: [identity profile] aristopheles.livejournal.com
You are friends with [livejournal.com profile] qatar.

(no subject)

Date: 2008-12-29 09:49 pm (UTC)
From: [identity profile] aristopheles.livejournal.com
I don't see how this says more than your original definition that A (or Z) is the product of Q and R.
In your original post you said something like "compatible with the constraint that Q and R are partial views of A"
which does not seem to add to the original definition of A.
Or are you saying that the state space is the product, and the transition probabilities have to induce the correct probabilities for Q and R?
Am I missing anything?

(no subject)

Date: 2008-12-29 10:08 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
yes, the transition probabilities of Z have to induce the correct probabilities for Q and R.

February 2020

S M T W T F S
      1
2345678
9101112131415
16171819202122
23242526272829

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags