Model-Free Transformer Framework for 6-DoF Pose Estimation of Textureless Tableware Objects.

Summary: Imagine a robot trying to clear tables at a restaurant. It’s hard for robots to pick up plates, bowls, and cups because they are smooth and plain and come in many different shapes. Older robot eyes needed patterns, colors, or exact 3D models to know how to grab them. Scientists have built a new "brain" for robots using a computer model called a transformer. Instead of looking for colors, it uses a 3D depth camera to look at the curves and edges of the dishes. It breaks the shape into a grid and figures out exactly how the dish is sitting in space, meaning both where it is and which way it is turned. In tests, the robot was very accurate, making only tiny mistakes of about 3.5 degrees or half an inch. It even worked perfectly on a real robot moving around to collect dishes!
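
To make the "3.5 degrees or half an inch" figures concrete, here is a minimal sketch of how mistakes in a pose estimate are commonly measured: the angle between the predicted and true orientation, and the straight-line distance between the predicted and true position. The summary does not say which metrics the authors used, so the formulas and the function names (`rot_z`, `rotation_error_deg`, `translation_error`) are illustrative assumptions, not the paper's method.

```python
import math

def rot_z(deg):
    """Hypothetical helper: 3x3 rotation matrix for a turn of `deg` degrees about the z-axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def rotation_error_deg(r_pred, r_true):
    """Angle in degrees of the rotation that takes r_pred to r_true.

    Assumed metric (geodesic distance on rotations):
    angle = arccos((trace(r_pred^T r_true) - 1) / 2).
    """
    # trace(A^T B) equals the sum of element-wise products of A and B.
    trace = sum(r_pred[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error(t_pred, t_true):
    """Euclidean distance between predicted and true positions (same units as the inputs)."""
    return math.dist(t_pred, t_true)

# Example: a dish whose estimated orientation is off by 3.5 degrees about the
# vertical axis and whose estimated position is off by 5 mm (values in meters).
angle_err = rotation_error_deg(rot_z(3.5), rot_z(0.0))
dist_err = translation_error((0.003, 0.004, 0.0), (0.0, 0.0, 0.0))
```

Together the two numbers cover all six degrees of freedom in "6-DoF": three for rotation and three for position.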

