Advancing Robot Perception for Smarter, More Adaptive Machines

Robots have come a long way since the Roomba. Today, drones are starting to make home deliveries, self-driving cars are navigating some roads, robot dogs are helping first responders, and still more robots are doing backflips and helping out on factory floors. But Luca Carlone believes the best is yet to come.

Carlone, recently appointed associate professor in MIT’s Department of Aeronautics and Astronautics (AeroAstro), leads the SPARK Lab, where he and his students are bridging a crucial gap between humans and robots: perception. The team conducts theoretical and experimental research to bring robots’ awareness of their environment closer to that of humans. As Carlone often says, perception is about more than detection.

While robots have made great strides in their ability to sense and recognize their surroundings, there is still much to learn in terms of higher-level understanding of their environment. As humans, we intuitively perceive objects not only by their shape and labels, but also by their physics—how they can be manipulated and moved, and how they relate to each other, to the larger environment, and to ourselves.

Carlone and his team hope to give robots this human-level perception, along with the ability to interact safely and seamlessly with people in homes, workplaces, and other unstructured environments.

Since joining the MIT faculty in 2017, Carlone has led his team in developing and implementing perception and scene-understanding algorithms for a wide range of applications, including autonomous underwater search and rescue vehicles, drones that can pick up and manipulate objects while flying, and self-driving cars. The algorithms could also be useful for home robots that follow natural language commands and anticipate human needs based on high-level contextual cues.

“Perception is a big bottleneck to getting robots to help us in the real world,” Carlone said. “If we can add elements of cognition and reasoning to robot perception, I believe they can do a lot of good.”

Expanding horizons

Carlone was born and raised near Salerno, Italy, close to the Amalfi Coast, the youngest of three brothers. His mother is a retired elementary school teacher who taught math, and his father is a retired history professor and publisher who has always taken an analytical approach to his historical research. All three brothers became engineers, perhaps unconsciously following their parents’ way of thinking: the older two studied electronics and mechanical engineering, while Carlone chose robotics, or mechatronics, as it was known at the time.

He did not come to the field, however, until late in his undergraduate studies. Carlone studied at the Polytechnic University of Turin, initially focusing on theoretical work, especially control theory, a field that uses mathematics to develop algorithms that automatically control the behavior of physical systems such as power grids, aircraft, cars, and robots. Then, in his senior year, Carlone enrolled in a robotics course that explored advances in manipulation and how robots can be programmed to move and function.

“It was love at first sight. Developing a robot’s brain, using algorithms and mathematics to make it move and interact with its environment, is one of the most satisfying experiences,” Carlone said. “I immediately decided that’s what I wanted to do with my life.”

He continued in a dual-degree program at the Polytechnic University of Turin and the Polytechnic University of Milan, where he earned master’s degrees in mechatronics and automation engineering, respectively. As part of this program, called the Alta Scuola Politecnica, Carlone also took management courses, in which he and students from various academic backgrounds had to collaborate on the design, development, and marketing of a new product. Carlone’s team created a touchless desk lamp that responded to the user’s hand gestures. The project pushed him to think about engineering from different perspectives.

“It was like having to speak different languages,” he said. “It was an early recognition of the need to look beyond the engineering bubble and think about how to create technical work that can impact the real world.”

Next generation

Carlone remained in Turin to pursue his PhD in mechatronics. During this time, he was free to choose his thesis topic, which he did, as he recalls, “in a somewhat naive way.”

“I was studying a topic that the community considered well understood, and about which many researchers felt there was nothing more to say,” Carlone said. “I underestimated how established the topic was and thought I could still add something new to it, and I was lucky enough to do just that.”

The topic in question was “simultaneous localization and mapping,” or SLAM, the problem of building and updating a map of a robot’s environment while at the same time tracking the robot’s position within that environment. Carlone found a way to reformulate the problem so that algorithms could create more accurate maps without needing an initial guess, as most SLAM methods did at the time. His work helped reopen a field in which most roboticists thought existing algorithms could not be improved upon.

“SLAM is about figuring out the geometry of things and how a robot moves among them,” Carlone said. “I was now part of a community asking, ‘What is the next generation of SLAM?’”

Seeking answers, he accepted a postdoctoral position at Georgia Tech, where he dove into coding and computer vision, a field that, in retrospect, may have been inspired by a brush with blindness: while finishing his PhD in Italy, he suffered a medical complication that seriously affected his vision.

“For a year, I could easily have lost my eyesight,” Carlone said. “It’s something that made me think about the importance of vision, and of artificial vision.”

He received good medical care and fully recovered, allowing him to continue working. At Georgia Tech, his advisor, Frank Dellaert, showed him how to write computer vision code and formulate elegant mathematical representations of complex, three-dimensional problems. Dellaert was one of the original developers of the open-source SLAM library GTSAM, and Carlone quickly realized that it was an invaluable resource. More broadly, he came to believe that making software accessible to everyone opens up a huge opportunity for advances in robotics.

“Traditionally, progress in SLAM was very slow because everyone kept their code proprietary and each team had to essentially start from scratch,” Carlone said. “Then open-source pipelines started to emerge, and that was the turning point that largely contributed to the progress we’ve seen over the last 10 years.”

Spatial AI

After Georgia Tech, Carlone came to MIT in 2015 as a postdoctoral fellow in the Laboratory for Information and Decision Systems (LIDS). During that time, he worked with Sertac Karaman, professor of aeronautics and astronautics, on developing software to help palm-sized drones navigate their environments using very little onboard power. A year later, he was promoted to research scientist, and in 2017, Carlone accepted a faculty position at AeroAstro.

“One of the things I love about MIT is that every decision is about: What are our values? What is our mission? It’s never about the bottom line. The real motivation is about how to improve society,” Carlone said. “It’s very refreshing to have that mindset.”

Today, Carlone’s team is developing ways to represent a robot’s surroundings that go beyond geometric shape and semantics. He is using deep learning and large language models to develop algorithms that allow robots to perceive their environment through a higher-level lens. Over the past six years, his lab has released more than 60 open-source repositories, which are used by thousands of researchers and practitioners around the world. Much of his work falls within a growing field called “spatial artificial intelligence.”

“Spatial AI is like SLAM on steroids,” Carlone said. “In short, it’s about enabling robots to think and understand the world like humans.”

It’s a huge undertaking with far-reaching implications: more intuitive, interactive, and responsive robots at home, at work, on the road, and in remote and dangerous areas. Carlone says there is a lot of work ahead to get closer to how humans perceive the world.

“I have twin 2-year-old girls, and I’ve seen them manipulate objects, carry 10 different toys at once, navigate messy rooms with ease, and adapt quickly to new environments. Robot perception is nowhere near what a toddler can do,” Carlone said. “But we have new tools in our arsenal. And the future is bright.”

Advancing Robot Perception for Smarter, More Adaptive Machines | MIT News
