Oxford Robotics Institute - Next Steps: Learning a Disentangled Gait Representation for Versatile Quadruped Locomotion


PUBLISHED 30 JUN 2022

The Oxford Robotics Institute (ORI) is built from collaborating and integrated groups of researchers, engineers and students all driven to change what robots can do for us. Their current interests are diverse, from flying to grasping - inspection to running - haptics to driving, and exploring to planning. This spectrum of interests leads to researching a broad span of technical topics, including machine learning and AI, computer vision, fabrication, multispectral sensing, perception and systems engineering.


Project Background

Quadruped locomotion is rapidly maturing to a degree where robots now routinely traverse a variety of unstructured terrains. However, while gaits can typically be varied by selecting from a range of pre-computed styles, current planners are unable to vary key gait parameters continuously while the robot is in motion. The on-the-fly synthesis of gaits with unexpected operational characteristics, or even the blending of dynamic manoeuvres, lies beyond the capabilities of the current state of the art. In this case study, ORI addresses this limitation by learning a latent space that captures the key stance phases of a particular gait, via a generative model trained on a single trot style. The use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework. ORI evaluated its approach on an ANYmal quadruped robot and demonstrated that its method achieved a continuous blend of dynamic trot styles while being robust and reactive to external perturbations.


Quadruped locomotion has advanced significantly in recent years, extending its capability towards applications of significant value to industry and the public domain. Driven primarily by advances in optimisation-based [1-4] and reinforcement learning-based methods [5-7], quadrupeds are now able to traverse a wide variety of terrains, making them a popular choice for tasks such as inspection, monitoring, search and rescue, or goods delivery in difficult, unstructured environments. However, important limitations remain. Because the system is so complex, the models used for gait planning and control are often overly simplified and handcrafted for particular gait types such as crawl, trot or gallop [1-8].

Project Approach

Inspired by recent work on a quadruped that achieves a crawl gait via the traversal of a learned latent space [9], ORI approached the challenge of continuous contact-schedule variation from the perspective of learning and traversing a structured latent space. This is enabled by learning a generative model of locomotion data which, in addition to capturing relevant structure in the space, enables the detection and mitigation of disturbances to provide a versatile and robust planning framework. In particular, ORI trains a variational auto-encoder (VAE) [10,11] on short sequences of state-space trajectories taken from a single gait type (trot), and uses it to predict a set of future states.
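For reference, the standard VAE training objective from [10] is the evidence lower bound (ELBO) on the data log-likelihood. The notation below is the conventional one and is an assumption here, not taken from ORI's whitepaper: x is an input sequence, z the latent variable, q_phi the encoder and p_theta the decoder. ORI's model also predicts future states and foot contacts, so its actual loss may well contain further terms that this summary does not describe.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
  \;\le\; \log p_\theta(x)
```

The first term rewards accurate reconstruction, while the KL term keeps the approximate posterior close to the prior p(z), which is what gives the latent space its structure.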


Figure 2: Using a variational auto-encoder (VAE), the ORI approach learns a structured latent space capturing the key stance phases that constitute a particular gait. The space is disentangled to such a degree that applying a drive signal to a single dimension of the latent variable induces gait styles that can be seamlessly interpolated. ORI encodes raw sensor information with g_enc to infer the robot's gait phase, applies the drive signal, decodes the augmented latent variable together with the base twist action a_k via g_dec, and predicts the feet in contact using g_pp. The drive signal's amplitude and phase provide continuous control over the robot's cadence, full-support duration and foot swing height.
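The idea of driving a single latent dimension can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ORI's implementation: the sinusoidal form of the signal, the choice to overwrite (rather than add to) the driven dimension, the latent size, and every function name below are hypothetical. The trained networks g_enc and g_dec are not reproduced; a zero vector stands in for the encoded latent.

```python
import numpy as np


def drive_signal(amplitude, phase):
    """Oscillatory drive value. A sinusoid is an assumed form."""
    return amplitude * np.sin(phase)


def apply_drive(z, drive_value, drive_dim=0):
    """Return a copy of latent z with one dimension set to the drive value.

    Overwriting the dimension is an assumption; the input z is not modified.
    """
    z_driven = np.array(z, dtype=float, copy=True)
    z_driven[drive_dim] = drive_value
    return z_driven


def ramp(t, start, end, duration):
    """Linearly blend a drive parameter from start to end over duration seconds."""
    alpha = np.clip(t / duration, 0.0, 1.0)
    return start + (end - start) * alpha


if __name__ == "__main__":
    z = np.zeros(4)          # stand-in for the latent inferred by the encoder
    dt, rate_hz = 0.1, 1.0   # hypothetical control period and drive rate
    phase = 0.0
    for step in range(5):
        t = step * dt
        amplitude = ramp(t, start=1.0, end=2.0, duration=0.4)
        z_driven = apply_drive(z, drive_signal(amplitude, phase))
        print(step, z_driven)  # z_driven would be passed to the decoder
        phase += 2.0 * np.pi * rate_hz * dt
```

Ramping the amplitude gradually, rather than switching it, is one simple way to move between gait styles without a discontinuity in the decoded trajectory.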

The VAE is fast enough to act as a planner in a closed-loop controller. Thus, the ORI approach can react to external disturbances and mitigate real-world effects such as unmodelled dynamics and hardware latency. For closed-loop control, ORI began by encoding a history of robot states from the raw sensor measurements to infer the current gait phase, storing a buffer of past robot states to create the encoder's input. This setup proved able to both detect and react to disturbances: because the VAE is trained only on canonical feasible trajectories, any disturbance is characterised as out of distribution with respect to the training set. Given the generative nature of this approach, this discrepancy is quantified during operation by the trained model via the Evidence Lower Bound (ELBO).
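A rough sketch of how a state buffer and an ELBO-based check could fit together is shown below. It assumes a diagonal-Gaussian posterior, a standard-normal prior and a Gaussian likelihood with fixed standard deviation, which is a common VAE setup but is not confirmed by the source. The class name, the fixed threshold and the buffer length are hypothetical, and the encoder and decoder outputs are passed in as plain arrays rather than computed by a trained network.

```python
from collections import deque

import numpy as np


def gaussian_elbo(x, x_recon, mu, log_var, recon_std=1.0):
    """Single-sample ELBO estimate under the assumptions named above."""
    x, x_recon = np.asarray(x, float), np.asarray(x_recon, float)
    mu, log_var = np.asarray(mu, float), np.asarray(log_var, float)
    # Gaussian log-likelihood of x given the reconstruction.
    log_lik = -0.5 * np.sum(
        ((x - x_recon) / recon_std) ** 2 + np.log(2.0 * np.pi * recon_std**2)
    )
    # KL divergence between N(mu, exp(log_var)) and N(0, I).
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return log_lik - kl


class DisturbanceMonitor:
    """Keeps a rolling history of states and flags low-ELBO inputs."""

    def __init__(self, history_len, elbo_threshold):
        self.states = deque(maxlen=history_len)
        self.elbo_threshold = elbo_threshold  # hypothetical, tuned offline

    def push_state(self, state):
        self.states.append(np.asarray(state, float))

    def encoder_input(self):
        """Flattened state history, or None until the buffer is full."""
        if len(self.states) < self.states.maxlen:
            return None
        return np.concatenate(list(self.states))

    def is_disturbed(self, elbo):
        """An ELBO below the threshold is treated as out of distribution."""
        return elbo < self.elbo_threshold
```

In use, each control step would push the latest state, build the encoder input, run the trained model, and pass the resulting ELBO to `is_disturbed`. A fixed threshold is the simplest choice; a threshold relative to a running average of the ELBO would be an alternative.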


Figure 3: The ELBO trace for three push events, shown along with the robot's contact schedule. The widths of the white spaces in the contact schedule halve as the cadence increases to mitigate the disturbance. The robot images above the trace are snapshots taken from the first push and show the robot's recovery. The robot recovers successfully, usually within three to four steps.
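The cadence response to a detected push can be illustrated with a small state machine. This is only a sketch of the general idea: the boost factor, the hold duration in control steps and the class name are all hypothetical values chosen for illustration, and the source does not say how ORI schedules the return to the nominal cadence.

```python
class CadenceBoost:
    """Raises the cadence while a disturbance is active and for a while after."""

    def __init__(self, nominal_hz, boost_factor=2.0, hold_steps=50):
        self.nominal_hz = nominal_hz
        self.boost_factor = boost_factor  # illustrative value
        self.hold_steps = hold_steps      # illustrative value
        self._remaining = 0

    def update(self, disturbed):
        """Return the cadence to command for this control step."""
        if disturbed:
            # Restart the hold window on every detection.
            self._remaining = self.hold_steps
        boosted = disturbed or self._remaining > 0
        if not disturbed and self._remaining > 0:
            self._remaining -= 1
        return self.nominal_hz * (self.boost_factor if boosted else 1.0)
```

With this logic the cadence stays raised for `hold_steps` undisturbed updates after the last detection, then drops back to the nominal value.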

Further information on the robotic experiments and their results can be seen in the below video from ORI.

ORI Robotics Video

Conclusions

ORI presented a robust and flexible approach for locomotion planning via traversal of a structured latent space, utilising a deep generative model to capture features from locomotion data and to enable detection and mitigation of disturbances. The resulting latent space is disentangled such that key locomotion features are automatically discovered from a single style of trot gait. This disentanglement is exploited using an oscillatory drive signal, whose amplitude and phase directly control the gait parameters, namely the cadence, swing height and full-support duration. Once deployed, modulating the drive signal gives rise to seamless interpolation between gait parameters. Utilising a generative model affords detection of disturbances as out of the distribution seen during training. The VAE planner is able to reject a wide range of impulses applied to the robot's base. This operating window is enlarged by increasing the robot's cadence once a disturbance is detected - a rudimentary response inspired by research which reports that humans increase their cadence to recover from slippage [12].

References

  [1] C. D. Bellicoso, F. Jenelten, C. Gehring, and M. Hutter, "Dynamic locomotion through online nonlinear motion optimization for quadrupedal robots," IEEE Robot. Automat. Lett. (RA-L), vol. 3, no. 3, pp. 2261–2268, 2018.
  [2] C. Mastalli, W. Merkt, J. Marti-Saumell, H. Ferrolho et al., "A direct-indirect hybridization approach to control-limited DDP," arXiv:2010.00411, 2021.
  [3] O. Melon, R. Orsolino, D. Surovik, M. Geisert et al., "Receding-horizon perceptive trajectory optimization for dynamic legged locomotion with learned initialization," in IEEE Int. Conf. Rob. Autom. (ICRA), 2021.
  [4] A. W. Winkler, C. D. Bellicoso, M. Hutter, and J. Buchli, "Gait and trajectory optimization for legged systems through phase-based end-effector parameterization," IEEE Robot. Automat. Lett. (RA-L), vol. 3, no. 3, pp. 1560–1567, July 2018.
  [5] J. Hwangbo, J. Lee, A. Dosovitskiy, D. Bellicoso et al., "Learning agile and dynamic motor skills for legged robots," Science Robotics, vol. 4, no. 26, 2019.
  [6] S. Gangapurwala, A. Mitchell, and I. Havoutis, "Guided constrained policy optimization for dynamic quadrupedal robot locomotion," IEEE Robot. Automat. Lett. (RA-L), vol. 5, no. 2, pp. 3642–3649, 2020.
  [7] S. Gangapurwala, M. Geisert, R. Orsolino, M. Fallon, and I. Havoutis, "RLOC: Terrain-aware legged locomotion using reinforcement learning and optimal control," arXiv preprint arXiv:2012.03094, 2020.
  [8] A. W. Winkler, F. Farshidian, D. Pardo, M. Neunert, and J. Buchli, "Fast trajectory optimization for legged robots using vertex-based ZMP constraints," IEEE Robot. Automat. Lett. (RA-L), vol. 2, no. 4, pp. 2201–2208, Oct 2017.
  [9] A. L. Mitchell, M. Engelcke, O. Parker Jones, D. Surovik et al., "First steps: Latent-space control with semantic constraints for quadruped locomotion," in IEEE/RSJ Int. Conf. Intell. Rob. Sys. (IROS), 2020, pp. 5343–5350.
  [10] D. Kingma and M. Welling, "Auto-encoding variational Bayes," in Int. Conf. on Learn. Repr. (ICLR), 2014.
  [11] D. J. Rezende, S. Mohamed, and D. Wierstra, "Stochastic backpropagation and approximate inference in deep generative models," in Int. Conf. on Mach. Learn. (ICML), 2014.
  [12] B. E. Moyer, A. J. Chambers, M. S. Redfern, and R. Cham, "Gait parameters as predictors of slip severity in younger and older adults," Ergonomics, vol. 49, pp. 329–343, 2006.

The Scan Partnership

Scan has been supporting ORI robotics research as an industrial member since 2020. Scan provides a cluster of NVIDIA DGX and EGX servers and AI-optimised PEAK:AIO NVMe software-defined storage to further robotic knowledge and accelerate development. This cluster is overlaid with Run:ai cluster management software, which virtualises the GPU pool across the compute nodes to maximise utilisation and provides a mechanism for scheduling and allocating ORI workflows across the combined GPU resource. Access to this infrastructure is delivered via the Scan Cloud platform, hosted in a secure UK datacentre.

Project Wins


Presentation of locomotion planning using a deep generative data-based model

Successful demonstration that an increase in cadence helps to negate slippage, as it does in humans


Time and cost savings generated due to access to GPU-accelerated cluster

Professor Ingmar Posner

Head of the Applied AI Group, ORI

"Using the Scan cluster, we are able to iterate over multiple learned models in parallel on their dedicated Deep Learning hardware. This translates to more time testing on our real robots, and less time waiting for models to train."

Elan Raja

CEO, Scan

"Being able to support such innovation in the field of robotics makes the Scan team very proud. If our hardware can contribute even a little to shortening the time until these technologies improve human lives, then we see the investment very worthwhile."

Speak to an Expert

You’ve seen how Scan continues to help the Oxford Robotics Institute further its research into the development of truly useful autonomous machines. Contact our expert AI team to discuss your project requirements.

Phone: 01204 474210

Email: [email protected]
