RL Notes 5: Augmented Random Search

Previous week’s notes

(You can find the notebook teaching a 2D robot to walk here)

Suppose we want to teach a robot how to walk. At each time-step, we have to tell it how much to rotate each joint, and with what velocity; in other words, we have to give it a vector that controls its joint movements. At each time-step we also receive some information from the environment: where the robot is, how fast it is moving, and so on.

A shallow neural net is a simple architecture for mapping the input vector to the required output vector. The challenge is choosing appropriate weights.
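To make this concrete, here is a minimal sketch of such a policy as a single linear layer (no hidden units or bias); the input and output sizes are hypothetical and depend on the environment:

```python
import numpy as np

# Hypothetical sizes: 24 observation values in, 6 joint commands out
# (the real dimensions depend on the environment).
n_inputs, n_outputs = 24, 6

# The policy's parameters: one weight matrix, nothing else.
theta = np.zeros((n_outputs, n_inputs))

def policy(state, theta):
    """Map an observation vector to a joint-command vector."""
    return theta @ state
```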

Normally we train a neural net with gradient descent. But gradient descent may not always be possible: we might not have a gradient of our loss function, or it may simply be computationally expensive. Augmented random search is an alternative.

The basic algorithm for augmented random search is similar to finite differences (a code sketch follows the list):

  1. Create a random perturbation $\delta$ of the same shape as our parameter matrix $\theta$ (small positive or negative random amounts).
  2. Make two copies of our parameter matrix: one in which we add $\delta$, one in which we subtract it (resulting in $\theta_+ = \theta + \delta$ and $\theta_- = \theta - \delta$).
  3. Simulate our agent with these two new matrices, and record the rewards ($r_+$ and $r_-$).
  4. Update our parameter matrix by $\theta \leftarrow \theta + \alpha (r_+ - r_-) \delta$, where $\alpha$ is the learning rate.
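Here is a minimal sketch of one such update step. It assumes a `run_episode(theta)` function (not shown) that simulates the agent with the given parameters and returns its total reward; the learning rate and noise scale are illustrative values, not tuned:

```python
import numpy as np

def ars_step(theta, run_episode, alpha=0.02, nu=0.03):
    # 1. Random perturbation with the same shape as theta,
    #    scaled by nu so the amounts stay small.
    delta = np.random.randn(*theta.shape)
    # 2. Two perturbed copies of the parameter matrix.
    theta_plus = theta + nu * delta
    theta_minus = theta - nu * delta
    # 3. Simulate the agent with each copy and record the rewards.
    r_plus = run_episode(theta_plus)
    r_minus = run_episode(theta_minus)
    # 4. Move theta in the direction that earned more reward.
    return theta + alpha * (r_plus - r_minus) * delta
```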

The algorithm is quite simple. To make it more effective, we can take a couple of additional steps: sample several perturbation directions per update and average their contributions, scale the update by the standard deviation of the collected rewards, normalize the observed states, and keep only the top-performing directions.
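Below is a sketch of the update with several directions, reward-std scaling, and top-direction selection, again assuming the hypothetical `run_episode` from above (state normalization is omitted to keep the example short); all numeric values are illustrative:

```python
import numpy as np

def ars_update(theta, run_episode, n_dirs=16, top_b=8, alpha=0.02, nu=0.03):
    deltas, rewards = [], []
    for _ in range(n_dirs):
        delta = np.random.randn(*theta.shape)
        r_plus = run_episode(theta + nu * delta)
        r_minus = run_episode(theta - nu * delta)
        deltas.append(delta)
        rewards.append((r_plus, r_minus))

    # Keep only the top_b directions whose better-performing copy
    # earned the highest reward.
    best = sorted(range(n_dirs), key=lambda i: max(rewards[i]), reverse=True)[:top_b]

    # Scale the step by the standard deviation of the collected rewards,
    # so the effective step size adapts as reward magnitudes change.
    sigma = np.std([r for i in best for r in rewards[i]]) + 1e-8

    step = np.zeros_like(theta)
    for i in best:
        r_plus, r_minus = rewards[i]
        step += (r_plus - r_minus) * deltas[i]
    return theta + alpha / (top_b * sigma) * step
```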
