A robotic dog picking up a tennis ball from a lawn. Credit: Yuchen Song/UC San Diego.
Quadruped robots equipped with manipulator arms could potentially perform tasks that require manipulating objects while moving quickly through their environment, such as picking up trash around the house, fetching specific objects and bringing them to people, or dropping target objects at specific locations.
Many approaches for training robots to perform these tasks rely on imitation learning. In this paradigm, the algorithms that plan the robot's actions learn policies for accomplishing a task by processing demonstration data showing how other agents, typically humans, accomplished it.
Although some existing methods for training robots on tasks that combine locomotion and object manipulation have shown promising results in simulation, they often fall short "in the wild"; that is, they do not allow robots to generalize well across tasks when tested in real-world environments.
Researchers at UC San Diego recently introduced WildLMa, a new framework that could improve the long-horizon loco-manipulation skills of quadruped robots in the wild. This framework, described in a paper posted to the arXiv preprint server, has three components that collectively enhance the generalizability of skills acquired via imitation learning.
“Rapid advances in imitation learning have enabled robots to learn from human demonstrations,” Yuchen Song, co-author of the paper, told Tech Xplore.
“However, these systems often focus on isolated, specific skills and have difficulty adapting to new environments. Our work aims to overcome this limitation by training robots to learn generalizable skills using vision-language models (VLMs), then leveraging large language models (LLMs) to chain these skills into sequences that allow robots to accomplish complex tasks.”
WildLMa, the framework designed by Song and colleagues, first provides a simple way to collect expert demonstration data. This is achieved through a virtual reality (VR)-based teleoperation system in which a human operator can leverage pre-trained robot control algorithms and control all of the robot's body movements with a single hand.
“These pre-trained skills are then enhanced by LLMs, which break down complex tasks into manageable steps, similar to how a human might approach a challenge (e.g. ‘pick, navigate, place’),” explained Song. “The result is a robot capable of performing long, multi-step tasks efficiently and intuitively.”
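The decomposition Song describes, an LLM breaking a high-level command into a chain of learned skills, can be illustrated with a toy planner. Everything below is a hypothetical sketch: the skill names and the `plan` function stand in for an LLM's output and are not WildLMa's actual interface.

```python
# Hypothetical sketch of LLM-style task decomposition into skill primitives.
# Skill names and the planner are illustrative, not WildLMa's API.

SKILLS = {"pick", "navigate", "place"}


def plan(command):
    """Stand-in for an LLM planner: map a high-level command to a
    sequence of (skill, target) steps. A real system would query an
    LLM here; we hardcode one example decomposition."""
    if command == "put the tennis ball in the bin":
        return [
            ("navigate", "tennis ball"),
            ("pick", "tennis ball"),
            ("navigate", "bin"),
            ("place", "bin"),
        ]
    raise ValueError(f"no plan for command: {command!r}")


def execute(steps):
    """Dispatch each step to its skill; here we just log the calls,
    where a robot would run the corresponding learned policy."""
    log = []
    for skill, target in steps:
        assert skill in SKILLS, f"unknown skill: {skill}"
        log.append(f"{skill}({target})")
    return log


print(execute(plan("put the tennis ball in the bin")))
```

The key design point this illustrates is the separation of concerns: the planner only sequences a small, fixed vocabulary of skills, while each skill itself is a policy learned from demonstrations.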
A distinctive feature of the team's approach is that it also integrates attention mechanisms, which allow the robot to stay focused on a target object while performing a specific task.
“Integrating attention mechanisms plays a critical role in making the robot’s skills more adaptable and generalizable,” Song said. “Potential applications for WildLMa include practical household tasks, such as putting away or retrieving items. We have already demonstrated some of these capabilities.”
Song and his colleagues demonstrated the potential of their framework in a series of real-world experiments, during which they successfully trained a four-legged robot to perform various tasks. These tasks included cleaning up trash in UC San Diego's hallways and outdoor areas, collecting food deliveries, and rearranging items on a shelf.
“Although our system works well, it can still be affected by unexpected disruptions, such as people moving,” Song added. “Our next steps will be to make the system more robust in dynamic environments. Ultimately, our goal is to create affordable home assistant robots that are accessible to everyone.”
More information:
Ri-Zhao Qiu et al, WildLMa: Long Horizon Loco-Manipulation in the Wild, arXiv (2024). DOI: 10.48550/arxiv.2411.15131
© 2024 Science X Network
Quote: Imitation learning framework improves loco-manipulation skills of quadruped robots in the wild (December 6, 2024) retrieved December 7, 2024 from
This document is subject to copyright. Except for fair use for private study or research purposes, no part may be reproduced without written permission. The content is provided for informational purposes only.