Dr. Dobb's Journal March 1997: Genetic Algorithms

Genetic Algorithms

By Satinder Singh, Peter Norvig, and David Cohn


Figure 6: The program's experience consists of a trajectory through state space. At time step t, the state is s_t and the agent faces a choice of actions; the action it chooses to execute is a_t. The reward at step t, Reward_t, is a function of s_t and a_t. The next state, s_{t+1}, depends on s_t, a_t, and many random events, such as passengers arriving at floors and pushing buttons. Reinforcement learning allows a program to use such a trajectory to incrementally improve its policy.
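
As a rough illustration of how a trajectory of (s_t, a_t, Reward_t, s_{t+1}) steps might drive incremental improvement, the following sketch applies a tabular Q-learning update at each step. The caption does not name a specific learning rule, so Q-learning is an assumption here, and the toy one-car "elevator" (its state, the action names, the reward values, the arrival probability, and every function name) is hypothetical and is not the article's elevator simulator.

```python
import random
from collections import defaultdict

ACTIONS = ("idle", "serve")   # hypothetical action set
MAX_WAITING = 5               # hypothetical cap on waiting passengers

def reward(state, action):
    # Reward_t depends only on s_t and a_t: each waiting passenger costs 1,
    # and moving the car to serve costs a little extra (assumed values).
    return -float(state) - (0.5 if action == "serve" else 0.0)

def next_state(state, action, rng):
    # s_{t+1} depends on s_t, a_t, and a random event (a new arrival).
    if action == "serve" and state > 0:
        state -= 1
    if rng.random() < 0.3:    # assumed arrival probability
        state = min(MAX_WAITING, state + 1)
    return state

def choose_action(q, state, rng, epsilon):
    # Epsilon-greedy: usually follow the current policy, sometimes explore.
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def run_trajectory(q, steps, rng, alpha=0.1, gamma=0.9, epsilon=0.1):
    # Walk through state space, updating q after every step of experience.
    trajectory = []
    state = 0
    for _ in range(steps):
        action = choose_action(q, state, rng, epsilon)
        r = reward(state, action)
        s_next = next_state(state, action, rng)
        # Q-learning update (assumed rule): nudge q toward the observed
        # reward plus the discounted value of the best next action.
        best_next = max(q[(s_next, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
        trajectory.append((state, action, r, s_next))
        state = s_next
    return trajectory

q = defaultdict(float)
trajectory = run_trajectory(q, steps=1000, rng=random.Random(0))
```

Each tuple in `trajectory` corresponds to one time step in the figure, and the table `q` is revised after every step rather than once the whole run has finished, which is the sense in which the policy improves incrementally.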


Copyright © 1997, Dr. Dobb's Journal