University of Southern California researchers have revealed a counter-intuitive result from working with machine-learning robots in simulation. They showed that giving robot arms “tough love” by knocking objects out of their hands actually helps them learn to grasp objects better. USC roboticist Professor Stefanos Nikolaidis, a co-author of a new paper describing the work, described the surprising result.
He told Express.co.uk: “What we have observed in our recent work is actually having a human acting in an adversarial manner can help a lot with the learning process.
“What this means is you can have a robot try to learn how to pick up objects by itself and the robots will occasionally succeed, then it gets a positive reward, then succeed more and more.”
The researcher explained the two-fold benefit of using simulations.
He said: “You can very quickly benchmark and test different approaches.
“You can try different algorithms and you can also have multiple users interacting with the robot system, so it’s something you can easily test in a lab setting.
“The other benefit of using simulations, especially if you have a formal computer interface, is you can record very precisely what actions the users take, and what type of behaviours users have.”
Researchers can then go back and analyse those behaviours.
The new research employed a brand new method of machine learning in the simulations.
The approach traditionally employed is known as learning from demonstration.
A robot, for example, is calibrated and instructed to pick up a cup, and then next time the robot sees the cup, it follows a similar trajectory.
Although tremendous progress has been made with this approach, it is less useful when the robot encounters a new environment, due to issues with generalising.
A second approach – learning by experience – was used in the University of Southern California research.
This involved the robot attempting to learn through trial and error.
Professor Nikolaidis said: “A very common class of approach in this domain is reinforcement learning.
“So the robot tries to pick up something and receives a positive reward if successful and it receives a negative reward if it fails.
“And then they try to reinforce behaviours that receive the positive rewards, like dog training.”
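The reward loop Professor Nikolaidis describes can be illustrated with a toy example. The sketch below is not the researchers' code: the three candidate grasps, their success probabilities and the function name `train_grasp_policy` are all made up for illustration. It shows a simple epsilon-greedy learner that tries grasps, scores +1 for a successful pick-up and -1 for a failure, and gradually favours the grasp that earns the most reward.

```python
import random

def train_grasp_policy(success_prob, episodes=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error over a fixed set of candidate grasps.

    success_prob: hypothetical chance that each grasp picks the object up.
    Returns the learned average reward for each grasp.
    """
    rng = random.Random(seed)
    n = len(success_prob)
    value = [0.0] * n  # running average reward per grasp
    count = [0] * n    # how many times each grasp has been tried
    for _ in range(episodes):
        # Explore occasionally; otherwise repeat the grasp that looks best so far.
        if rng.random() < epsilon:
            g = rng.randrange(n)
        else:
            g = max(range(n), key=lambda i: value[i])
        # Positive reward for a successful pick-up, negative reward for a failure.
        reward = 1.0 if rng.random() < success_prob[g] else -1.0
        count[g] += 1
        value[g] += (reward - value[g]) / count[g]  # incremental mean update
    return value

# Three made-up grasps that succeed 20%, 50% and 90% of the time.
values = train_grasp_policy([0.2, 0.5, 0.9])
best = max(range(len(values)), key=lambda i: values[i])
print("preferred grasp index:", best)
```

Even in this tiny setting the learner needs many attempts before its estimates settle, which hints at why such methods can require thousands of trials on realistic tasks.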
If the robot finds a good grasp, the human uses a graphical interface to click on the object it is gripping and apply a force in a certain direction.
That disturbance basically tests how good the grasp really is, and it helps the robot rule out the less effective ones.
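A simplified sketch shows how such a disturbance can separate grasps that merely lift an object from grasps that hold it firmly. Everything here is an assumption made for illustration, including the `Grasp` class, the force values and the rule that a push simply adds to the load on the grip; the article does not describe the reward or physics actually used in the study.

```python
from dataclasses import dataclass

@dataclass
class Grasp:
    name: str
    holding_force: float  # hypothetical: largest load (in newtons) the grip resists

def lift_reward(grasp, object_weight):
    """+1 if the grasp can hold the object at all, otherwise -1."""
    return 1 if grasp.holding_force > object_weight else -1

def adversarial_reward(grasp, object_weight, push_force):
    """+1 only if the grasp still holds after the human's push, otherwise -1."""
    if lift_reward(grasp, object_weight) < 0:
        return -1
    # Simplification: the push adds directly to the load the grip must resist.
    return 1 if grasp.holding_force > object_weight + push_force else -1

candidates = [
    Grasp("fingertip pinch", 3.0),
    Grasp("loose wrap", 6.0),
    Grasp("firm wrap", 12.0),
]
weight, push = 2.0, 5.0

# Without a disturbance, every candidate looks equally good.
plain = [g.name for g in candidates if lift_reward(g, weight) > 0]
# With the push applied, the weaker grasps are ruled out.
robust = [g.name for g in candidates if adversarial_reward(g, weight, push) > 0]
print(plain)
print(robust)
```

In this toy setup all three grasps earn a reward when nobody interferes, but only the firm wrap keeps its reward once the push is applied.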
Professor Nikolaidis added: “With these approaches, they don’t scale up super well, so they need thousands of iterations, thousands of examples of trial and error for the robot to learn something really, really complex and really useful.
“So one way to mitigate that is to have a human in the loop, to have a human provide some type of guidance to the robot system.
“I think it’s important to be realistic in how people may interact with robotic systems and also how the robot can leverage that in the learning process.
“I think what we saw in this study is that even if the human is an adversary, this can actually be a really good thing.
“Because the human has, especially in manipulation tasks, but not just in manipulation, the human has a very good intuition of how to challenge the robot.”
Professor Nikolaidis believes his research can have applications in the driving domain.
He said: “There has been work on people evaluating driving systems, so using a human to create challenging scenarios, but I think there is a lot of potential in training systems to be safer and more robust just by designing scenarios that challenge the autonomous car, so it can be prepared for these kinds of ‘edge’ situations that are hard to capture automatically.”