# RL Notes 5: Augmented Random Search


Previous week’s notes

(You can find the notebook teaching a 2D robot to walk here)

Suppose we want to teach a robot how to walk. At each time-step, we have to tell it how much to rotate each joint, with what velocity, etc. In other words, we have to give it a vector that controls its joint movements. At each moment we also receive some information from the environment: where our robot is, what speed it is going, etc.

A shallow neural net is a simple architecture that maps the input vector to the necessary output vector. However, we still have to choose appropriate weights.
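As a minimal sketch, the policy can be as simple as a single parameter matrix multiplied by the observation vector (the dimensions below are hypothetical placeholders, not taken from any particular environment):

```python
import numpy as np

obs_dim, act_dim = 24, 6                # e.g. sensor readings in, joint commands out
theta = np.zeros((act_dim, obs_dim))    # the parameter matrix we must choose

def policy(theta, observation):
    # Map the environment's observation vector to an action vector.
    return theta @ observation

action = policy(theta, np.zeros(obs_dim))
```

Everything that follows is about how to pick `theta` without computing gradients.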

Normally we train a neural net with gradient descent. But gradient descent may not always be possible: we might not have a gradient of our loss function, or computing it may simply be too expensive. Augmented random search is an alternative.

The basic algorithm for augmented random search is similar to finite differences:

1. Create a random perturbation $\delta$ of the same shape as our parameter matrix $\theta$ (small positive or negative random amounts).
2. Make two copies of our parameter matrix, one in which we add $\delta$, one in which we subtract it (resulting in $\theta^+$ and $\theta^-$).
3. Simulate our agent with these two new matrices, and record the rewards ($r^+$ and $r^-$).
4. Update our parameter matrix by $\theta_{new} = \theta + \alpha(r^+ - r^-)\delta$, where $\alpha$ is the learning rate.
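The four steps above can be sketched as a single update function. Here `rollout_reward` is a stand-in for simulating the agent and returning its total reward, and `alpha` and `noise` are hypothetical hyperparameter values, not ones from the original paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_search_step(theta, rollout_reward, alpha=0.5, noise=0.1):
    """One update of the basic algorithm described above."""
    delta = noise * rng.standard_normal(theta.shape)   # step 1: random perturbation
    r_plus = rollout_reward(theta + delta)             # steps 2-3: evaluate theta+
    r_minus = rollout_reward(theta - delta)            #            and theta-
    return theta + alpha * (r_plus - r_minus) * delta  # step 4: update

# Toy check in place of a real simulator: reward is highest when all
# parameters equal 1, so theta should drift toward 1.
toy_reward = lambda th: -np.sum((th - 1.0) ** 2)
theta = np.zeros((2, 3))
for _ in range(2000):
    theta = random_search_step(theta, toy_reward)
```

Note the intuition behind step 4: if $r^+ > r^-$, moving along $\delta$ helped, so we step in that direction; if $r^+ < r^-$, the sign flips and we step the opposite way.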

The algorithm is quite simple. To make it more effective, we can take a couple of additional steps:

• Normalizing inputs before feeding them into the neural net.
• Simulating n perturbations instead of one, then keeping the top k performers and averaging their updates.
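Both refinements can be sketched briefly. The ranking criterion (best reward seen in either direction) and all hyperparameter values here are assumptions for illustration; `rollout_reward` again stands in for a simulation:

```python
import numpy as np

rng = np.random.default_rng(1)

class RunningNorm:
    """Online mean/std estimate (Welford's method) to normalize observations."""
    def __init__(self, dim):
        self.n, self.mean, self.m2 = 0, np.zeros(dim), np.zeros(dim)
    def __call__(self, x):
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self.m2 += d * (x - self.mean)
        std = np.sqrt(self.m2 / max(self.n - 1, 1)) + 1e-8
        return (x - self.mean) / std

def augmented_step(theta, rollout_reward, n=8, k=4, alpha=0.5, noise=0.1):
    """Sample n perturbations, keep the top k directions, average their updates."""
    deltas = noise * rng.standard_normal((n,) + theta.shape)
    r_plus = np.array([rollout_reward(theta + d) for d in deltas])
    r_minus = np.array([rollout_reward(theta - d) for d in deltas])
    # Rank each direction by the better of its two rewards, keep the top k.
    top = np.argsort(np.maximum(r_plus, r_minus))[-k:]
    step = np.mean([(r_plus[i] - r_minus[i]) * deltas[i] for i in top], axis=0)
    return theta + alpha * step

# Same toy reward as before: theta should still drift toward 1.
toy_reward = lambda th: -np.sum((th - 1.0) ** 2)
theta = np.zeros((2, 3))
for _ in range(500):
    theta = augmented_step(theta, toy_reward)
```

Averaging over only the best directions filters out perturbations that happened to explore unpromising regions, which tends to reduce the variance of the update.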