Imagine a cup of coffee placed on a table. Now imagine a book partially obscuring the cup. As humans, we still recognize the cup even though we can't see all of it. But a robot might be confused.
Robots in warehouses and even around our homes have difficulty identifying and picking up items that sit too close together or in a cluttered space. This is because robots lack what psychologists call "object unity," our ability to identify things even when we can't see all of them.
Researchers at the University of Washington have developed a way to teach robots this skill. The method, called THOR for short, allowed an inexpensive robot to identify objects, including a mustard bottle, a can of Pringles and a tennis ball, on a cluttered shelf. In a recent article published in IEEE Transactions on Robotics, the team demonstrated that THOR outperformed current state-of-the-art models.
UW News contacted senior author Ashis Banerjee, a UW associate professor in the departments of industrial and systems engineering and mechanical engineering, for more details on how robots identify objects and how THOR works.
How do robots perceive their environment?
We perceive the world around us through vision, sound, smell, taste and touch. Robots detect their surroundings using one or more types of sensors. Robots “see” things using standard color cameras or more complex stereo or depth cameras. While standard cameras simply record colored and textured images of the environment, stereo and depth cameras also provide information about the distance of objects, just like our eyes.
However, sensors alone cannot enable robots to make "sense" of their environment. Robots need a visual perception system, similar to the visual cortex of the human brain, to process images and detect where all objects are, estimate their orientations, identify what the objects might be, and analyze any text written on them.
Why is it difficult for robots to identify objects in cluttered spaces?
There are two main challenges here. First, there are likely a large number of objects of varying shapes and sizes. This makes it difficult for the robot's perception system to distinguish between different types of objects. Second, when multiple objects sit close together, they block the robot's view of one another. Robots have trouble recognizing objects when they don't have a full view of them.
Are there any types of objects that are particularly difficult to identify in cluttered spaces?
This largely depends on the objects present. For example, smaller objects are harder to recognize when the objects vary widely in size. It is also more difficult to differentiate between objects with similar or identical shapes, such as different types of balls or boxes. Additional challenges arise with soft or squishy objects that can change shape as the robot collects images from different vantage points around the room.
So how does THOR work and why is it better than previous attempts at solving this problem?
THOR is actually the brainchild of lead author Ekta Samani, who conducted this research as a doctoral student at UW. At its core, THOR lets the robot imitate the way we humans know that partially visible objects are not broken or entirely new objects.
THOR does this by using the shape of objects in a scene to create a 3D representation of each object. From there, it uses topology, a branch of mathematics that studies how the different parts of an object are connected, to assign each object a "most likely" object class by comparing its 3D representation to a library of stored representations.
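To make that matching step concrete, here is a minimal sketch in Python of how a shape descriptor for a segmented object could be compared against a library of stored per-class descriptors using nearest-neighbor search. This is not the authors' implementation: the descriptor function below is a toy stand-in for THOR's persistent-homology-based features, and the library contents and class names are hypothetical placeholders.

# Minimal sketch of descriptor-library matching (not THOR's actual code).
# Assumes each object's 3D point cloud has already been segmented out of the scene.

import numpy as np

def shape_descriptor(point_cloud: np.ndarray) -> np.ndarray:
    """Placeholder: map an (N, 3) point cloud to a fixed-length feature vector.
    In THOR this role is played by a topology-based (persistent homology) descriptor."""
    # Toy stand-in: coarse histogram of pairwise point distances.
    d = np.linalg.norm(point_cloud[:, None, :] - point_cloud[None, :, :], axis=-1)
    hist, _ = np.histogram(d[np.triu_indices_from(d, k=1)], bins=16, density=True)
    return hist

# Library of stored descriptors: a few reference descriptors per object class,
# computed offline from images of each object on its own (no cluttered scenes needed).
library = {
    "mustard_bottle": [np.random.rand(16) for _ in range(3)],  # hypothetical entries
    "chips_can":      [np.random.rand(16) for _ in range(3)],
    "tennis_ball":    [np.random.rand(16) for _ in range(3)],
}

def classify(point_cloud: np.ndarray) -> str:
    """Assign the 'most likely' class: the nearest stored descriptor wins."""
    query = shape_descriptor(point_cloud)
    best_class, best_dist = "unknown", np.inf
    for cls, refs in library.items():
        for ref in refs:
            dist = np.linalg.norm(query - ref)
            if dist < best_dist:
                best_class, best_dist = cls, dist
    return best_class

# Usage: classify a segmented, possibly partially visible object from the scene.
segment = np.random.rand(200, 3)  # stand-in for a segmented 3D point cloud
print(classify(segment))

Because the stored references describe each whole object, a descriptor computed from a partial view can still land closest to the right class, which is the intuition behind object unity.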
THOR does not rely on machine learning models trained on images of cluttered rooms; all it needs are images of each of the objects on their own. Nor does it require the robot to be equipped with specialized, expensive sensors or processors: it works well with conventional cameras.
This means that THOR is very easy to build and, more importantly, readily usable in completely new spaces with different backgrounds, lighting conditions, object layouts and degrees of clutter. It also performs better than existing 3D shape recognition methods because its 3D representation of objects is more detailed, allowing objects to be identified in real time.
How could THOR be used?
THOR can be used with any indoor service robot, whether the robot operates in a home, office, store, warehouse or manufacturing plant. In fact, our experimental evaluation demonstrates that THOR is equally effective for warehouse, living room and family room type spaces.
Although THOR is significantly more effective than other existing methods for all kinds of objects in these cluttered spaces, it is best at identifying kitchen-style objects, such as a cup or pitcher, which typically have distinctive but regular shapes and only moderate size variations.
What's next?
There are several additional issues that need to be resolved, and we are working on some of them. For example, for now THOR only considers the shape of objects, but future versions might also pay attention to other aspects of appearance, like color, texture or text labels. It is also worth exploring how THOR could handle squishy or damaged objects, whose shapes differ from their expected configurations.
Additionally, some spaces may be so cluttered that certain objects are not visible at all. In these scenarios, a robot must be able to decide whether to move to a different position to "see" better or, if allowed, to move certain objects to get a clearer view of the obstructed ones.
Last but not least, the robot must be able to handle objects it has never seen before. In these scenarios, it should be able to place such objects into a "miscellaneous" or "unknown" category and then ask a human for help in identifying them correctly.
More information:
Ekta U. Samani et al, Persistent Homology Meets Object Unity: Object Recognition in Clutter, IEEE Transactions on Robotics (2023). DOI: 10.1109/TRO.2023.3343994
Provided by the University of Washington