Robots in factories are very good at picking up objects they’ve been pre-programmed to handle, but it’s a different story when new objects are thrown into the mix. This is mainly because of the limitations of computer vision, a machine’s ability to see objects or images and understand something about them.
Researchers have been trying to instill this skill in robots for some time, with limited success. To overcome this long-standing problem, a team at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has developed a computer vision system that can identify objects it has never seen before.
The new system enables robots not only to recognize objects visually, but to understand them well enough to carry out related tasks, all without any previous exposure to those objects.
The work could have a big impact on robotics and self-driving cars, helping to make machines that can learn how to act more intelligently in the real world. This approach will let robots better understand and manipulate items and allows them to even pick up a specific object among a clutter of similar objects – a valuable skill for the kinds of machines that companies like Amazon and Walmart use in their warehouses.
“An exciting and important trend is the move in learning-based vision systems from just doing things with images to doing things with three-dimensional objects,” says Josh Tenenbaum, a professor in MIT’s Department of Brain and Cognitive Sciences. “That includes seeing objects in depth and modeling whole solid objects—not just recognizing that this pattern of pixels is a dog or a chair or table.”
The system was created by a team of MIT researchers led by Peter Florence, and the study has been published on the arXiv preprint server. The team used a self-supervised machine-learning technique to have the computer learn about the properties of three-dimensional space on its own.
The MIT team calls its system “Dense Object Nets” (DON). DON works by analyzing objects as collections of points that form a visual roadmap, allowing the system to understand an object’s components even if it has never seen that object before.
It can “see” because it treats objects as collections of points, which the robot processes into a three-dimensional “visual roadmap.” That means scientists don’t have to tediously hand-label the massive datasets that most computer vision systems require.
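The core idea behind this point-based representation can be illustrated with a toy sketch. In a dense-descriptor approach like DON's, every pixel of an image is mapped to a descriptor vector, and a point of interest (say, a mug handle) is found in a new view by nearest-neighbor search in descriptor space. The code below is a minimal illustration of that matching step only, assuming the descriptors have already been produced by a trained network; the function name and array shapes are illustrative, not from the paper.

```python
import numpy as np

def find_correspondence(descriptor_image, query_descriptor):
    """Find the pixel in a dense-descriptor image whose descriptor
    best matches a query descriptor (nearest neighbor in descriptor space)."""
    # descriptor_image: (H, W, D) array, one D-dimensional descriptor per pixel.
    # query_descriptor: (D,) descriptor of a point chosen in a reference image.
    diffs = descriptor_image - query_descriptor      # broadcast over all pixels
    dists = np.linalg.norm(diffs, axis=-1)           # (H, W) distance map
    row, col = np.unravel_index(np.argmin(dists), dists.shape)
    return row, col

# Toy example: a 4x4 "image" of 2-D descriptors, distinctive at one pixel.
desc = np.zeros((4, 4, 2))
desc[2, 3] = [1.0, 1.0]
print(find_correspondence(desc, np.array([1.0, 1.0])))  # -> (2, 3)
```

Because the descriptor for a given physical point is learned to be stable across viewpoints, the same query descriptor can locate that point even when the object has moved or rotated between images.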
This approach allows DON to autonomously perform very specific tasks, such as grabbing an object by just one of its corners or parts, an ability previous systems lacked.
“Many approaches to manipulation can’t identify specific parts of an object across the many orientations that object may encounter,” researcher Lucas Manuelli said in a press release. “For example, existing algorithms would be unable to grasp a mug by its handle, especially if the mug could be in multiple orientations, like upright, or on its side.”
In the future, a more sophisticated version of DON could be used in a variety of settings, such as collecting and sorting objects in warehouses, working in dangerous environments, and performing odd clean-up tasks in homes and offices. Looking ahead, the researchers would like to refine the system so that it knows where to grasp an object without human intervention.
“DON solves that problem which means that we can now start to build increasingly more complex systems of smart agents that can teach themselves how to recognize and interact with different objects. I believe that Tedrake lab’s results are going to start a new wave of computer vision applications from robotic manipulation and process control to new intelligent automation solutions,” Luchici said.