Mini-Project 2: Simulated Double Pendulum Control

In this mini-project, students will work in pairs to develop a control function to produce a novel movement or behavior of a simulated two-link arm. Students will then apply parametric optimization to improve the result.

The primary tool will be the Double Pendulum Simulation, a prototype for a two-link arm control simulation with visualization in Max/MSP. Please note that you may modify it however you wish, but the result should still represent a plausible dynamic system; e.g., you may not simply suppress the physical dynamics calculations.

Learning Objectives

  1. Tuning PID feedback control gains interactively on a simulation.
  2. Formulating a physical goal as feedback control calculations.
  3. Formulating an optimization objective reward function.
  4. Applying optimization to automatically improve performance of a simulated control system.

Deliverables

  1. In-class demonstration of simulated behavior.
  2. Short blog post outlining goals, outcomes, and final code.

Technical Approaches

The sample code includes several control modes, including PD control of a single pose, a heuristic ‘swing-up’ pumping function, and an optimization-driven PID pose control.
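
For reference, a PD pose controller computes joint torques from position and velocity errors. The sketch below is illustrative Python, not the provided Max/MSP code; the gain values and variable names are assumptions to be tuned against the simulation:

    import numpy as np

    # PD gains for the shoulder and elbow joints (illustrative values only)
    kp = np.array([20.0, 8.0])   # proportional gains
    kd = np.array([2.0, 0.8])    # derivative (damping) gains

    def pd_torque(q, qd, q_target):
        """Return joint torques driving the arm toward q_target.
        q, qd: current joint angles (rad) and velocities (rad/s), length-2 arrays."""
        err = q_target - q            # position error at each joint
        return kp * err - kd * qd     # push toward the target while damping velocity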

Some suggested objectives you might choose to solve:

  1. Pose to pose. Apply PD control to cycle between a set of fixed poses. Optimize the gains to minimize torques. (A sketch of one possible cost function follows this list.)
  2. Trajectory tracking. Apply PD control to move along a prescribed kinematic orbit and optimize gains. The path can be described as a function of time using closed-form expressions or a lookup table.
  3. Giant Swing. Pump the motion until the shoulder is freely revolving while the elbow remains approximately straight; optimize to minimize torques.
  4. Balancing. Drive the arm to a vertical pose using only shoulder torques. (Difficult)
  5. Acrobot. Drive the arm to a vertical pose using only elbow torques. (Expert)
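
As a concrete example of the first objective, a trial can be scored by how closely the final pose matches the target while penalizing the total torque effort expended along the way. The following cost function is only a sketch under assumed logging conventions; the simulator's actual data interface may differ:

    import numpy as np

    def pose_to_pose_cost(q_log, tau_log, q_target, torque_weight=0.01):
        """Cost for one trial: terminal pose error plus an integrated torque penalty.
        q_log:   (T, 2) array of joint angles sampled over the trial
        tau_log: (T, 2) array of applied joint torques over the trial
        Lower is better; the optimizer would minimize this over the PD gains."""
        pose_error = np.sum((q_log[-1] - q_target) ** 2)    # accuracy at the end of the trial
        effort     = torque_weight * np.sum(tau_log ** 2)   # proxy for actuation effort
        return pose_error + effort

The relative weighting between accuracy and effort is itself a design choice and strongly shapes the optimized motion.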

In all cases, the simulator acts like a physical machine by not providing a ‘reset’ method; each optimization trial must conclude by guiding the system back to its initial start state.
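
In practice, this means each trial is bracketed by a ‘return to start’ phase rather than an instantaneous reset. A rough outline of one evaluation follows, with a hypothetical sim object standing in for whatever interface your patch exposes:

    def run_trial(sim, gains, goal_pose, start_pose, evaluate,
                  dt=0.01, trial_seconds=10.0, settle_seconds=5.0):
        """Evaluate one gain setting without resetting the simulator state.
        sim is assumed (hypothetically) to provide set_gains, set_target, and step;
        evaluate maps the logged trial data to a scalar cost."""
        sim.set_gains(gains)
        sim.set_target(goal_pose)
        logs = [sim.step() for _ in range(int(trial_seconds / dt))]   # run the behavior
        score = evaluate(logs)

        # Guide the arm back toward the start pose so the next trial begins
        # from (approximately) the same initial condition.
        sim.set_target(start_pose)
        for _ in range(int(settle_seconds / dt)):
            sim.step()
        return score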

Background

Most common definitions of a robot include some element of sensing and responding to the world. In many actual robots, this sensing is purely internal, in the form of control feedback that regulates movement to achieve a specified position goal, target velocity, or trajectory. This feedback uses proprioceptive sensors to measure joint positions, applies a control model to compute appropriate actuator commands for a known physical system, and then applies force via the actuators. If it works, the mechanism moves closer to the goal state and errors are reduced. If the disturbances are too great, the process can fail to reach the target or become unstable.
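
Stated as code, one cycle of this loop senses the joint state, computes a command from the control model, and actuates. The interface names below are hypothetical placeholders, not part of the provided simulation:

    def control_loop_step(sim, controller, target):
        """One closed-loop cycle: sense, compute, actuate.
        sim.read_joints and sim.apply_torques are hypothetical stand-ins for the
        mechanism's sensor and actuator interfaces."""
        q, qd = sim.read_joints()         # proprioceptive measurement of joint state
        tau = controller(q, qd, target)   # control model, e.g. a PD law
        sim.apply_torques(tau)            # apply force; the dynamics respond at the next step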

The ideas behind feedback predate robotics: a classic example is the flyball governor, invented in 1788 to regulate steam engine speed. The ideas have been generalized broadly beyond mechanical systems in the study of cybernetics.

A controller ‘closes the loop’ by creating a cyclical energetic and informational pathway including both a process and a representation of that process in the controller. For this exercise, the process will involve a simulated mechanism with torque motors and the representation will involve the idealized system state.

In this view, cause and effect are not cleanly separated: if the mechanism encounters a disturbance, the controller responds with increased torque; likewise, if the program chooses a new position target, the controller responds with torque. In equilibrium, both disturbances and goals affect the balance of the system, with the intent of converging in a stable way toward an objective, and the resulting behavior is the composite of both ‘intent’ and ‘reaction’.

One outcome is that, in the right context, even simple feedback-controlled devices can easily be interpreted as ‘alive’ or as having ‘purpose’; this effect is what this exercise is intended to explore.

Relevant terms: proportional control, PID control, degree of freedom (DOF), closed-loop control, position target, goal, disturbance, error signal, stability, bandwidth, lag, set point.