Marty Learns to Swing – Part 2

In Part 1 of this series I made a swing for Marty the Robot and timed how long it took for him to complete a cycle. It probably won’t come as a surprise (when you consider the title of this series) that my ultimate plan is for Marty to keep the swing going on his own by pumping at the appropriate time, just as you would on a playground swing.

There are going to be a few challenges getting Marty to learn to swing so I decided that it would be a good idea to test out some of these on a computer model of Marty swinging rather than try everything on the physical setup described in Part 1. It is very common to test ideas on a model and there are lots of modelling frameworks to make it simpler to build computer models. So (while taking a break from very hot weather during some days of holiday) I have been working on modelling Marty’s swing using the OpenAI Gym environment which will make it easier to evaluate swinging strategies in the future – and that will make it easier to help Marty learn to swing.

Modelling Marty on a Swing

The pendulum animation at the start of Part 1 is a kind of model – as are all computer-based games and animations. That particular animation used knowledge of the oscillatory way a pendulum swings to pretend that it was drawing the motion of a real pendulum. Like all models though, it is flawed in some ways. It does a good job when the angle of swing is very small because then a pendulum’s motion follows a sine curve and this fact will be used further in Part 3. But the main issue with our existing animation is that it doesn’t handle external influences like pushes or pumping the swing.

Now that we are thinking about keeping a swing going we need to make a better model because we are going to need to take account of what happens when Marty pumps the swing. And that means we’re going to need to understand the physics of riding a playground swing a bit more directly.

The Energy of a Swing

A very good way to understand what is happening with any swing is to consider the energy of the “closed system”. In this case the word “system” refers to Marty, the rod and strings he’s suspended from and the air around him as he moves. The total energy in a closed system remains constant: energy cannot be created or destroyed, and it can’t enter or leave because our system is “closed”. Marty’s swing isn’t really a perfectly closed system as air is free to move around and be replaced by other surrounding air, but it is sufficiently good for us to model the swing to a reasonable degree.

The kinds of energy present in a Marty-swing are:

Potential Energy – when Marty swings high he has more potential energy as he is further “up” in the earth’s gravitational field
Kinetic Energy – the energy of Marty’s motion and this is related to how fast Marty is moving
Heat (thermal) and other energy – as Marty swings through the air energy is transferred to air molecules – but we’re going to ignore this and other “loss” effects as we think these are relatively small (based on the fact that it takes a while for Marty to stop altogether if we just leave him swinging)

So what happens to these two main kinds of energy (Potential and Kinetic) when Marty swings? If the GIF has stopped playing then try refreshing the page or clicking on it to get it going again …

Let’s start by imagining Marty being held at the start of his swing – this is the case at time 0 (the left of the x axis). He has no motion so his Kinetic Energy (blue curve) is zero. But he is at his highest point (unless he pumps or you push him later on) so his Potential Energy (green curve) is at its highest. We’re ignoring the air-resistance that would slow Marty down by converting some of his energy to heat.
Now we let go of Marty and he starts to move downwards in an arc and, in moving downwards, he is losing some Potential Energy. So if the total energy in our closed system can’t change then where does the lost Potential Energy go? The answer is that it is converted to Kinetic Energy at exactly the same rate that Potential Energy is reduced. And the amount of Kinetic Energy at any time is dependent on the speed that Marty is moving at.
As Marty reaches the bottom of his swing (around t = 0.35) the Kinetic Energy is at its maximum and Potential Energy at its minimum. This is the point where Marty is going fastest and, as he starts upwards again the cycle of energy conversion reverses and some of his speed is lost as Kinetic Energy is converted back to Potential Energy.
This repeats over and over again as you can see in the animation.
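
The energy exchange described above can be sketched in a few lines of Python. This is only an illustration and not the code used to make the animation: the mass, pendulum length and release angle are assumed values, and air resistance is ignored just as in the text.

```python
import math

G = 9.81        # gravitational acceleration (m/s^2)
MASS = 1.0      # assumed mass (kg), not Marty's real mass
LENGTH = 1.0    # assumed pendulum length (m), not the real swing
DT = 0.001      # small time-step (s) to keep the approximation accurate


def energy_trace(release_angle_deg=30.0, duration=2.0):
    """Return a list of (time, potential, kinetic) for a freely swinging pendulum."""
    theta = math.radians(release_angle_deg)  # angle from the vertical
    v = 0.0                                  # tangential speed, starts at rest
    trace = []
    for i in range(int(duration / DT)):
        height = LENGTH * (1.0 - math.cos(theta))  # height above the lowest point
        potential = MASS * G * height
        kinetic = 0.5 * MASS * v * v
        trace.append((i * DT, potential, kinetic))
        # Step forward: gravity changes the speed, the speed changes the angle
        v -= G * math.sin(theta) * DT
        theta += v * DT / LENGTH
    return trace


if __name__ == "__main__":
    for t, pe, ke in energy_trace()[::250]:
        print("t={:.2f}s  PE={:.3f}J  KE={:.3f}J  total={:.3f}J".format(t, pe, ke, pe + ke))
```

Printing the trace shows Potential Energy falling as Kinetic Energy rises, while the total stays very nearly constant.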

Simulating a Marty Swing

As mentioned I chose the OpenAI Gym environment for my simulation. The beauty of this is that it is used by a large number of people who develop and evaluate machine-learning algorithms. So if I were to consider doing an insane project to let an Artificial Intelligence swing Marty a few thousand times in order to learn how to swing – full disclosure: I’m considering it! – then this would be a good way to start 🙂

The OpenAI Gym has a very simple interface and it proved to be pretty easy to create a new gym. There is more explanation of making a gym in this post. My new gym code has a folder structure like this:

├── gym_martyswing
|   ├── gym_martyswing
|   |   ├── envs
|   |   |   __init__.py
|   |   |   martyswing_env.py
|   |   __init__.py
|   setup.py

The main code in the MartySwing-v0 gym handles the step() function which moves the simulation on by one time-step. In our gym we define a time-step as 0.05 seconds and this was chosen so that the calculations remain reasonably accurate. At each time-step we approximate the motion of the swing by adding the current acceleration (along the tangent to the arc of the swing), multiplied by the time-step, to the velocity that Marty is currently swinging at. This would be very inaccurate if we stepped forwards in time by a large amount (say 1 second) at each time-step because by this time Marty would have swung through a large angle and the acceleration would have changed by a significant amount. Conversely, if we made the time-step very small, each simulation would take longer to run.

The code which handles the velocity update is as follows:

        # Update tangential velocity based on acceleration
        tangentialAcc = self.g * np.sin(self.theta)
        newV = self.v - tangentialAcc * self.dt

        # Calculate arc-angle traversed at current v in time dt
        arcLen = newV * self.dt
        thetaDiff = arcLen / self.l
        newTheta = self.theta + thetaDiff
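
To see why the size of the time-step matters, here is a minimal standalone sketch (not the gym's actual code) that applies the same update rule with two different values of dt and measures how far the total energy strays from its starting value. The pendulum length, the release angle and the helper name max_energy_error are assumptions made for illustration.

```python
import math

G = 9.81       # gravitational acceleration (m/s^2)
LENGTH = 1.0   # assumed pendulum length (m)


def max_energy_error(dt, release_angle_deg=30.0, duration=10.0):
    """Largest relative error in total energy (per unit mass) during a free swing."""
    theta = math.radians(release_angle_deg)
    v = 0.0
    start_energy = G * LENGTH * (1.0 - math.cos(theta))
    worst = 0.0
    for _ in range(int(round(duration / dt))):
        # Same update as the snippet above: velocity first, then angle
        v = v - G * math.sin(theta) * dt
        theta = theta + v * dt / LENGTH
        energy = 0.5 * v * v + G * LENGTH * (1.0 - math.cos(theta))
        worst = max(worst, abs(energy - start_energy) / start_energy)
    return worst


if __name__ == "__main__":
    for dt in (0.05, 1.0):
        print("dt = {:.2f}s -> worst energy error = {:.1%}".format(dt, max_energy_error(dt)))
```

With no pumping and no losses the energy should stay constant, so the size of this error is a rough measure of how trustworthy a given time-step is.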

In addition each step() takes an action defined by the controlling code. There are two possible actions that the MartySwing gym supports:

Kick
Straight

At each step() the action is passed as an argument and the gym assumes that Marty’s legs are “Straight” initially.

Increasing MartySwing Gym Energy by Pumping

In the step() code the MartySwing gym models Marty’s pumping by making the pendulum length shorter when a Kick action is chosen and reverting to the normal length with a Straight action. Referring back to Part 1 of this series, we found that a longer pendulum generates a higher speed of motion at the lowest (central) point of the swing. Conversely, for a pendulum moving at a given central speed, a shorter pendulum will take less time to reach its highest point.
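
Both observations follow from standard pendulum formulas, which the short sketch below evaluates for two lengths. The lengths and release angle are made-up values for illustration, and the bottom-to-top time uses the small-angle approximation, so treat the numbers as indicative only.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def bottom_speed(length, release_angle_deg):
    """Speed at the lowest point when released from rest at the given angle."""
    drop = length * (1.0 - math.cos(math.radians(release_angle_deg)))
    return math.sqrt(2.0 * G * drop)


def time_bottom_to_top(length):
    """Small-angle time from the lowest point to the highest point (a quarter period)."""
    return (math.pi / 2.0) * math.sqrt(length / G)


if __name__ == "__main__":
    for length in (1.0, 0.9):  # assumed "straight" and "kick" lengths in metres
        print("length {:.1f}m: bottom speed {:.2f} m/s, bottom-to-top {:.3f} s".format(
            length, bottom_speed(length, 30.0), time_bottom_to_top(length)))
```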

So if we combine these two observations we can see that a Straight action near to the highest point of motion will make the Marty-pendulum as long as possible and he will gain speed at the highest rate possible. If we then Kick close to the middle of the motion we will convert this speed to height at the fastest rate possible – hence gaining more height than we had at the other extreme of motion.

Repeating this over and over will cause Marty to swing higher and higher at each opportunity.
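
A minimal simulation of this pumping strategy is sketched below. It is not the MartySwing gym code: the two lengths are invented, the top of the swing is detected by a change in the direction of motion, and the speed is rescaled on each length change so that angular momentum is conserved. That last point is an assumption of this sketch and not something taken from the gym.

```python
import math

G = 9.81            # gravitational acceleration (m/s^2)
L_STRAIGHT = 1.0    # assumed pendulum length with legs straight (m)
L_KICK = 0.98       # assumed shorter effective length while kicking (m)
DT = 0.005          # time-step (s)


def swing_peaks(pump, release_angle_deg=10.0, duration=20.0):
    """Return the peak angle (degrees) reached at each end of the swing."""
    theta = math.radians(release_angle_deg)
    v = 0.0
    length = L_STRAIGHT
    peaks = []
    for _ in range(int(duration / DT)):
        new_v = v - G * math.sin(theta) * DT
        new_theta = theta + new_v * DT / length

        if v * new_v < 0.0:
            # Direction of motion reversed: we are at the top of the swing
            peaks.append(math.degrees(abs(new_theta)))
            if pump and length == L_KICK:
                # "Straight": lengthen the pendulum (assumed to conserve angular momentum)
                new_v *= L_KICK / L_STRAIGHT
                length = L_STRAIGHT
        elif pump and theta * new_theta <= 0.0 and length == L_STRAIGHT:
            # Crossing the bottom: "Kick" shortens the pendulum and speeds it up
            new_v *= L_STRAIGHT / L_KICK
            length = L_KICK

        theta, v = new_theta, new_v
    return peaks


if __name__ == "__main__":
    for label, pump in (("no pumping", False), ("pumping", True)):
        peaks = swing_peaks(pump)
        print("{}: first peak {:.1f} deg, last peak {:.1f} deg".format(label, peaks[0], peaks[-1]))
```

Without pumping the peaks stay essentially where they started, while with pumping each swing ends a little higher than the one before.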

If we consider the energy it goes like this:

So we can see that on each swing we add some energy to the system. The energy added comes from work done by Marty lifting his legs (using his motors for a real-world Marty). Marty is working against the force of gravity and he does that at the bottom of his swing – where it is hardest to do – and straightens his legs when he is at an angle – when it is not so hard to do. To understand this better consider how hard it would be to lift your own legs up to your chest if you were hanging from a bar compared with bringing your legs up to your chest when lying on your side. And then think of the extreme case when Marty is swinging to 90 degrees. Bending his legs doesn’t involve (much) work against gravity at this point as the movement is (mainly) horizontal.

The model we are using isn’t quite right because when a real Marty kicks it will move his centre of mass (sometimes called centre of gravity) and we don’t take this into account. My assumption is that this isn’t a significant issue but I don’t really have a way to test this. While we’re on the subject of things that aren’t quite right about the model, another thing is that when Marty kicks the full motion is completed within one time-step of the step() function. This is currently 0.05 seconds (50 ms) and that wouldn’t be enough time for the real-world Marty’s motors to act. There are no doubt many other flaws in the model that I haven’t even considered but it does a good enough job to get a better understanding of the way pumping a swing might work and what Marty might need to do in order to keep the real swing going.

Testing the MartySwing Gym

In my test program called martySwingGymTest.py there are hard-coded rules about what action to take as follows:

Kick if we are near the middle of a swing and haven’t kicked already
Straight if we are near the top of our swing and we are not straight-legged already
Repeat the previous action in any other case

This results in the swinging behaviour shown in the animation at the top of this post.

import gym
import gym_martyswing
import time
import numpy as np

# Create the MartySwing
env = gym.make('MartySwing-v0')

# First action is to remain straight-legged
nextAction = 1

# Reset the Gym
observation = env.reset()

# Go through a number of swings
while True:
    # This displays the MartySwing in a window
    env.render()

    # Take the next action
    observation, reward, done, info = env.step(nextAction)

    # Select a new action based on the position in the swing cycle
    if (info["theta"] < np.radians(5) and info["theta"] > np.radians(-5)) and info["kickAngle"] < 0.01:
        # Kick if we are near the middle and haven't kicked already
        nextAction = 0
    elif (info["theta"] < np.radians(-29) or info["theta"] > np.radians(29)) and info["kickAngle"] > 0.01:
        # Straighten if we are near the top of our swing and not straight-legged already
        nextAction = 1

    # Check if we completed this test
    if done:
        print("Test completed after {} secs".format(info["t"]))
        break

    # A short delay to make the display look "normal"
    time.sleep(.1)

# Close the swing environment
env.close()

The animations were made with similar code in the files:

martySwingGymEnergy.py
martySwingGymAngle.py

All of the code is on the GitHub repository.

So have fun playing with it!