Markov Chains are everywhere
Apr. 7th, 2009 11:31 pm

My lab printer tells me there's a concept called a Semi-Markov Chain (a chain whose jumps are Markov, but whose holding time in each state can follow an arbitrary distribution, rather than the geometric one an ordinary chain implies).
---
There's also something called a Controlled Markov Chain, in which the transition matrix evolves over time. If the matrix converges to a fixed limit, I think the chain will converge to a stationary distribution under the usual conditions (irreducibility and aperiodicity of the limit). This comes up in Hinton-style neural networks (Google "contrastive divergence").
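A minimal sketch of that intuition (the two-state matrices and the interpolation schedule are invented for illustration): simulate a chain whose transition matrix drifts toward a limit P_inf, then compare the empirical state frequencies against the stationary distribution of P_inf.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state example: the transition matrix drifts from a
# flat P0 toward a limit P_inf (rows sum to 1).
P_inf = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
P0 = np.array([[0.5, 0.5],
               [0.5, 0.5]])

def P(t, tau=50.0):
    """Transition matrix at time t, interpolating from P0 to P_inf."""
    w = np.exp(-t / tau)
    return w * P0 + (1.0 - w) * P_inf

# Run the time-inhomogeneous chain and count visits after a burn-in.
state, counts = 0, np.zeros(2)
for t in range(20000):
    state = rng.choice(2, p=P(t)[state])
    if t > 5000:
        counts[state] += 1

# Stationary distribution of the limit matrix: the left eigenvector
# for eigenvalue 1, normalized to sum to 1.
vals, vecs = np.linalg.eig(P_inf.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi /= pi.sum()

print("empirical frequencies:", counts / counts.sum())
print("stationary dist of P_inf:", pi)  # should roughly agree
```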
---
Stochastic Local Search is a very general family of optimization heuristics (the term seems most common for discrete problems such as SAT). These processes are (usually) Markov Chains. In many cases, "Random Restarting" chooses a new point based on where you left off, so you're still inside one Markov Chain; the restart is just another kind of step. I'm imagining a hierarchy of restarting steps.
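Here's a toy sketch of that point (the OneMax objective and all the rates are made up for illustration): a bit-flip local search where the occasional "restart" move depends on the current state, so ordinary moves and restarts together still form a single Markov chain.

```python
import random

random.seed(0)
N = 30  # number of bits in the state

def f(x):
    """Toy OneMax objective: count of ones, a stand-in for satisfied clauses."""
    return sum(x)

x = [random.randint(0, 1) for _ in range(N)]
best = f(x)
for step in range(5000):
    if random.random() < 0.01:
        # "Restart" that depends on the current point: flip a random third
        # of the bits. The next state is still a function only of the
        # current state plus fresh randomness, so the whole process
        # (ordinary moves plus restarts) remains one Markov chain.
        for i in random.sample(range(N), N // 3):
            x[i] ^= 1
    else:
        # Ordinary local-search move: flip one bit, keep it if not worse.
        i = random.randrange(N)
        y = x[:]
        y[i] ^= 1
        if f(y) >= f(x):
            x = y
    best = max(best, f(x))

print("best objective found:", best, "of", N)
```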
I would like to frame Local Search meta-optimization in terms of reinforcement learning: mapping explore/exploit actions to (possibly delayed) rewards. (I'm taking a class on Empirical Algorithmics, which tends to inspire such ideas in me.)
The point is that these search algorithms are making complex decisions, similar to what you'd want to model with an MDP (really a POMDP, since the search only partially observes the landscape it's optimizing over). Most existing algorithms correspond to an agent with a fixed, hand-coded policy.
(This may have been unconsciously inspired by this)
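To make the framing concrete, here's a sketch under heavy assumptions: the 1-D objective, the two meta-actions, and the epsilon-greedy rule are all invented, and it treats the problem as a bandit with immediate rewards, ignoring the "possibly delayed" part. The agent learns whether "exploit" (small local steps) or "explore" (big restart-like jumps) pays off more.

```python
import random

random.seed(1)

ACTIONS = ["exploit", "explore"]  # small local step vs. big restart-like jump
q = {a: 0.0 for a in ACTIONS}     # running value estimate per meta-action
n = {a: 0 for a in ACTIONS}
EPS = 0.1                         # epsilon-greedy exploration rate

def objective(x):
    return -(x - 3.0) ** 2        # invented 1-D landscape, maximum at x = 3

x = 0.0
for t in range(2000):
    # Epsilon-greedy choice of meta-action.
    if random.random() < EPS:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=q.get)
    # The meta-action sets the step-size regime of the underlying search.
    sigma = 0.1 if a == "exploit" else 2.0
    x_new = x + random.gauss(0.0, sigma)
    reward = objective(x_new) - objective(x)  # immediate improvement
    if reward > 0:
        x = x_new
    # Incremental-mean update of the chosen action's value estimate.
    n[a] += 1
    q[a] += (reward - q[a]) / n[a]

print("final x:", round(x, 3))
print("learned action values:", {a: round(v, 4) for a, v in q.items()})
```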