Is there a name for the following gradient ascent algorithm?

L2norm <- function(v) sqrt(sum(v^2))

gradientAscent <- function(f, computeGrad, state, stepSize, epsilon) {
  oldState <- state
  oldObj <- f(state)          ## objective at the starting point
  increment <- Inf            ## so the loop body runs at least once
  while (L2norm(increment) > epsilon) {
    grad <- computeGrad(f, state)
    increment <- grad * stepSize
    state <- state + increment
    newObj <- f(state)        ## objective
    if (newObj < oldObj) {    ## in case the objective degrades: go back, decrease step size
      state <- oldState
      stepSize <- stepSize * 0.9
    } else {                  ## otherwise, keep going
      oldObj <- newObj
      oldState <- state
    }
  }
  return(state)
}
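A minimal sketch of how this might be called on a toy concave objective, using a central-difference gradient (numericGrad is just an illustrative stand-in for whatever computeGrad actually computes):

numericGrad <- function(f, state, h = 1e-6) {
  sapply(seq_along(state), function(i) {
    e <- rep(0, length(state))
    e[i] <- h
    (f(state + e) - f(state - e)) / (2 * h)   ## central difference in coordinate i
  })
}

f <- function(x) -sum((x - c(1, 2))^2)        ## maximum at (1, 2)
gradientAscent(f, numericGrad, state = c(0, 0), stepSize = 0.1, epsilon = 1e-10)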


The trick of shrinking the step size only when the objective degrades was due to [livejournal.com profile] random_walker. (hm, maybe I should use 0.5 rather than 0.9)

Although my epsilon is 1e-10 (for both the gradient computation and the above optimization algorithm), different starting points give answers that agree only to within about 1e-3 of each other... I can't see why this would happen... if a path has reached an increment of size 1e-10, that should mean that steps larger than that were overshooting, and therefore the true local maximum can't be further than 1e-10 away. Where am I going wrong?
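One way to narrow this down might be to check, at the end of a run, whether the tiny final increment comes from a tiny gradient or from a stepSize that has shrunk a lot (finalState and finalStepSize here are hypothetical values saved from a finished run):

grad <- computeGrad(f, finalState)       ## finalState: hypothetical converged state
cat("||grad|| =", L2norm(grad),
    " stepSize =", finalStepSize,        ## finalStepSize: hypothetical final step size
    " ||increment|| =", L2norm(grad) * finalStepSize, "\n")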

---

I'm finding that this algorithm often keeps moving in the same direction for a long time, a sign that the step size has gotten too small... so I'm thinking I should also increase the step size whenever the objective is not degraded, only not as fast as the decrease that happens when it is degraded: e.g. multiply by 0.5 when it's degraded, and by 1.2 otherwise. If we have smoothness, then near the optimum we should see degradation about half the time, and since 0.5 * 1.2 = 0.6 < 1 the step size still shrinks on average, so this way we still get convergence.
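A sketch of that variant, reusing L2norm from above (only the two branches of the if change; the 0.5 / 1.2 factors are just the ones suggested above):

gradientAscentAdaptive <- function(f, computeGrad, state, stepSize, epsilon,
                                   shrink = 0.5, grow = 1.2) {
  oldState <- state
  oldObj <- f(state)
  increment <- Inf
  while (L2norm(increment) > epsilon) {
    grad <- computeGrad(f, state)
    increment <- grad * stepSize
    state <- state + increment
    newObj <- f(state)
    if (newObj < oldObj) {    ## degraded: go back, shrink the step fast
      state <- oldState
      stepSize <- stepSize * shrink
    } else {                  ## improved: keep going, grow the step slowly
      oldObj <- newObj
      oldState <- state
      stepSize <- stepSize * grow
    }
  }
  return(state)
}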
