Last week, car companies from around the world lined up at the Consumer Electronics Show (CES) in Las Vegas to announce their latest technology and investment in autonomous driving.
General Motors said it would invest $500 million in ride-hailing service Lyft to build “an integrated network of on-demand autonomous vehicles in the US”. Toyota is building a new research institute to work on autonomy, while Audi, BMW, Ford and others also announced progress in their efforts to develop driverless cars. Mercedes has developed a self-driving research vehicle, and taxi app Uber has already announced billions of dollars of investment in autonomous cars.
To operate safely, driverless cars need to be able to sense obstacles and hazards, such as other cars and pedestrians. They do this using a range of sensors, but to pick out hazards effectively, the car must be able to separate them from the background. This means the car needs a detailed, static picture of the world with which to compare what it sees – in other words, a map.
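To make that comparison concrete, here is a toy sketch in Python; the voxel size, data and function names are all invented for illustration, not any manufacturer's real pipeline. If the car holds a static map of previously observed geometry, then whatever appears in a fresh scan but not in the map is a candidate hazard.

```python
# Toy sketch: separate hazards from background by comparing a live scan
# against a prior static map, both snapped onto a coarse voxel grid.
import numpy as np

VOXEL = 0.5  # metres; an assumed grid resolution for this illustration

def voxelise(points: np.ndarray) -> set:
    """Snap 3-D points (N x 3) onto integer voxel coordinates."""
    return set(map(tuple, np.floor(points / VOXEL).astype(int)))

def find_dynamic_points(scan: np.ndarray, static_map: set) -> np.ndarray:
    """Return the points of a live scan that fall outside the static map;
    these are candidate obstacles such as other cars and pedestrians."""
    keys = np.floor(scan / VOXEL).astype(int)
    unknown = np.array([tuple(k) not in static_map for k in keys])
    return scan[unknown]

rng = np.random.default_rng(0)
background = rng.uniform(0.0, 50.0, size=(5000, 3))  # stand-in for mapping drives
static_map = voxelise(background)
# A live scan: mostly known background, plus 100 new points (the "hazards").
scan = np.vstack([background[:900], rng.uniform(0.0, 50.0, size=(100, 3))])
print(len(find_dynamic_points(scan, static_map)), "points flagged as hazards")
```

A real system would add localisation, occupancy probabilities and far richer map layers, but the background-subtraction idea is the same.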
You might think this would give Google an edge – it already has a vast set of maps and the infrastructure for updating them. Its prototype autonomous cars have driven farther than any other company’s offering, even on some public roads in California and Texas. But Google has a way to go: the digital maps the company does so well must be augmented with more data if they are to guide autonomous cars.
At the moment, Google doesn’t have much of this extra data and neither does anyone else. But now, car manufacturers are banding together to make their own maps, and it looks as if they may be in a better position to do so than the search giant.
Raj Rajkumar of Carnegie Mellon University in Pittsburgh, Pennsylvania, identifies several ways in which companies can build maps for their robot cars. The first is Google’s “do everything” approach: the company controls its entire driverless car operation, gathering the map data itself and processing it for the intelligent software that drives its cars. Google could use Street View, collected by a dedicated fleet of cars, to get its map data. But it’s expensive and impractical to run Street View cars for the sole purpose of repeatedly scanning roads to keep maps up to date. For example, Street View has mapped a busy street in London, the A201 next to St Paul’s Cathedral, seven times since it began in 2008. But the street I live on, in a Greenwich suburb, has only ever been mapped once, last year.
Another is the approach taken by Here Maps, a former division of Nokia. Like Google, Here drives its own mapping cars around cities and motorways. But Here aims to license its data to car manufacturers that are planning to build autonomy into their vehicles.
Here was recently bought by a consortium of big car-makers – and if that wasn’t worrying enough for Google, the companies have another trick up their sleeves. They plan to use the sensors on new car models to collectively gather mapping data, rather than having an expensive fleet of dedicated vehicles.
Announcing their acquisition of Here Maps last year, Daimler, Volkswagen and BMW stated that “the high-precision cameras and sensors installed in modern cars are the digital eyes for updating mobility data and maps”.
“For the automotive industry this is the basis for new assistance systems and ultimately fully autonomous driving,” the companies said.
This data will help autonomous cars deal with the unexpected; instead of being stumped by roadworks, cars using Here will have had their maps updated in real time and will know to simply go around.
And the companies working with Here aren’t the only ones with this idea: General Motors is plugging sensors into its new cars to crowdsource data itself. The company has joined forces with an Israeli start-up called Mobileye, which makes software for driverless cars using cameras and image processing, rather than Google’s more expensive lidar system. Tesla also uses Mobileye’s technology in its autonomous driving software, which was recently released to all of its Model S cars through a software update.
General Motors is well placed for crowdsourcing because it has the scale to collect large amounts of data quickly. The company sells several million cars every year, many of which have cameras and networking equipment that will let them contribute to maps. The firm’s prospects for fully autonomous driving improve every minute that its customers drive around.
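In outline, crowdsourced updating can be as simple as counting agreeing reports. The sketch below is a guess at the general shape rather than GM's or Here's actual design; the threshold and data structures are assumptions made for illustration.

```python
# Hedged sketch of crowdsourced map updating: a change to a map tile is
# accepted only once enough independent cars have reported the same thing.
from collections import defaultdict

CONFIRMATIONS_NEEDED = 5  # assumed threshold before a change is trusted

class CrowdMap:
    def __init__(self):
        self.tiles = {}                   # tile id -> accepted features
        self.pending = defaultdict(set)   # (tile, feature) -> reporting cars

    def report(self, car_id: str, tile: tuple, feature: str) -> None:
        """A production car's sensors spotted `feature` in `tile`."""
        self.pending[(tile, feature)].add(car_id)
        if len(self.pending[(tile, feature)]) >= CONFIRMATIONS_NEEDED:
            self.tiles.setdefault(tile, set()).add(feature)

m = CrowdMap()
for i in range(5):
    m.report(f"car-{i}", tile=(51501, -83), feature="roadworks")
print(m.tiles)  # {(51501, -83): {'roadworks'}}
```

The point of the threshold is that one car's misreading shouldn't rewrite the map; a fleet of millions can afford to wait for agreement.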
“Crowdsourcing is the traditional car companies’ very, very big advantage,” says Rajkumar. “There’s an interesting competition ahead.”
It’s an unusual position for Google, whose power has always come from having more data than its competition. “When it comes to information about the physical world that we live in, Google doesn’t really have a presence,” says Rajkumar.
Indeed it is Toyota, not Google, that has the lion’s share of the patents for self-driving cars, according to Thomson Reuters. The Japanese company recently pumped $1 billion into a new research institute dedicated to using artificial intelligence to make cars safer.
In a speech at CES, Gill Pratt, CEO of the new institute, indirectly challenged Google’s achievements so far. “Most of what has been collectively accomplished has been relatively easy because most driving is easy,” he said. “Where we need autonomy to help us is when the driving is difficult. And it’s this hard part that we intend to address.”
Using Grand Theft Auto to gather simulated data
Spending thousands of hours playing Grand Theft Auto might have questionable benefits for humans, but it could help make computers significantly more intelligent.
Several research groups are now using the hugely popular game, which features fast cars and various nefarious activities, to train algorithms that might enable a self-driving car to navigate a real road.
There’s little chance of a computer learning bad behavior by playing violent computer games. But the stunningly realistic scenery found in Grand Theft Auto and other virtual worlds could help a machine perceive elements of the real world correctly.
A technique known as machine learning is enabling computers to do impressive new things, like identifying faces and recognizing speech as well as a person can. But the approach requires huge quantities of curated data, and it can be challenging and time-consuming to gather enough. The scenery in many games is so fantastically realistic that it can be used to generate training data as good as data gathered from real-world imagery.
Some researchers already build 3-D simulations using game engines to generate training data for their algorithms (see “To Get Truly Smart, AI Might Need to Play More Video Games”). However, off-the-shelf computer games, featuring hours of photorealistic imagery, could provide an easier way to gather large quantities of training data.
A team of researchers from Intel Labs and Darmstadt University in Germany has developed a clever way to extract useful training data from Grand Theft Auto.
The researchers created a software layer that sits between the game and a computer’s hardware, automatically classifying different objects in the road scenes shown in the game. This provides the labels that can then be fed to a machine-learning algorithm, allowing it to recognize cars, pedestrians, and other objects shown, either in the game or on a real street. According to a paper posted by the team recently, it would be nearly impossible to have people label all of the scenes with similar detail manually. The researchers also say that real training images can be improved with the addition of some synthetic imagery.
The software scans a road scene and assigns each object a label, such as road, sidewalk, or building.
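The team's actual system hooks into the game's rendering pipeline, and the details are in their paper. As a much-simplified illustration of the output, suppose the renderer exposed a buffer recording which object produced each pixel; mapping those object IDs to classes then yields a pixel-perfect segmentation mask with no manual labelling. Every name and ID below is invented.

```python
# Simplified illustration: turn a per-pixel object-ID buffer (as a game
# renderer could expose) into one boolean segmentation mask per class.
import numpy as np

ID_TO_CLASS = {0: "sky", 7: "road", 12: "car", 31: "pedestrian"}  # hypothetical

def masks_from_id_buffer(id_buffer: np.ndarray) -> dict:
    """One boolean mask per semantic class, straight from object IDs."""
    return {name: id_buffer == obj_id for obj_id, name in ID_TO_CLASS.items()}

frame_ids = np.random.choice(list(ID_TO_CLASS), size=(270, 480))  # stand-in frame
masks = masks_from_id_buffer(frame_ids)
print({name: int(m.sum()) for name, m in masks.items()})  # pixels per class
```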
One of the big challenges in AI is how to slake the thirst for data exhibited by the most powerful machine-learning algorithms. This is especially problematic for real-world tasks like automated driving. It takes thousands of hours to collect real street imagery, and thousands more to label all of those images. It’s also impractical to go through every possible scenario in real life, like crashing a car into a brick wall at high speed.
“Annotating real-world data is an expensive operation and the current approaches do not scale up easily,” says Alireza Shafaei, a PhD student at the University of British Columbia who recently coauthored a paper showing that video games can be used to train a computer vision system, in some cases as effectively as real data. Together with Mark Schmidt, an assistant professor at UBC, Shafaei showed that video games also provide an easy way to vary the environmental conditions found in training data.
“With artificial environments we can effortlessly gather precisely annotated data at a larger scale with a considerable amount of variation in lighting and climate settings,” Shafaei says. “We showed that this synthetic data is almost as good, or sometimes even better, than using real data for training.”
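That kind of variation is straightforward to express in code. In the sketch below, `render_scene` is a hypothetical stand-in for whatever engine hook a real pipeline would call; the point is that one base scene multiplies into many precisely annotated samples across conditions.

```python
# Sketch: sweep lighting and weather in a simulator to multiply one scene
# into many annotated training samples. `render_scene` is hypothetical.
from itertools import product

LIGHTING = ["dawn", "noon", "dusk", "night"]
WEATHER = ["clear", "rain", "fog", "snow"]

def render_scene(scene_id: int, lighting: str, weather: str) -> dict:
    """Stand-in for an engine call returning an (image, labels) record."""
    return {"scene": scene_id, "lighting": lighting, "weather": weather}

dataset = [render_scene(42, l, w) for l, w in product(LIGHTING, WEATHER)]
print(len(dataset), "annotated variants of one scene")  # 16
```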