The shift to self-driving cars is coming from two directions.
Google is developing a top-down solution. Their cars are designed from the start to be completely autonomous.
Car manufacturers are working on gradual improvements to their existing platforms, so that cars will steadily become more autonomous in certain situations, such as freeway driving and parking.
The Technology Is Already Here
The New Division of Labor, by Frank Levy and Richard Murnane, put information processing on a spectrum in 2004. At one end were simple arithmetic tasks that require only the application of clear rules; if that was all your job required, computers had already taken it. At the other end of the spectrum are jobs that can't be boiled down to simple rules, especially ones that require the human skill of pattern recognition. As an example, they cited driving a car in traffic. Such a job, they said in 2004, was never going to be taken by a computer in the foreseeable future.
That prediction seemed to be confirmed by the first DARPA (Defense Advanced Research Projects Agency) Grand Challenge: to build a completely autonomous vehicle that could complete a 150-mile course through the Mojave Desert. The first run, in 2004, was a debacle. Fifteen vehicles entered; two didn't make it to the start line, one flipped over in the start area, and the "winner" managed only 7 of the 150 miles before veering off course and into a ditch.
But by 2012 Google's automated cars could safely drive anywhere Google had meticulously mapped the environment, and were licensed for the road in Nevada.
We are at an inflection point - a time when the rules change dramatically.
The effect of exponential growth. Moore's Law says, basically, that the amount of computing power per dollar doubles roughly every eighteen months to two years. Every so often there's a prediction that Moore's Law will stall because of some fundamental physical limit. But "brilliant tinkering" has found a way around the roadblocks every time.
At no time in history have cars or planes got twice as fast, or twice as efficient, every year or two, so we have trouble understanding the impact of this constant doubling. The best way to explain it is the old story of the grain of rice doubled on each square of a chessboard.
The chessboard story is important because the board has two halves. We can get our heads around the numbers on the first half of the chessboard. After 32 squares the emperor had 'only' given away about 4 billion grains of rice - roughly the yield of one large field. But when we get into the second half of the board, the numbers quickly get weird - way past our comprehension.
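The arithmetic behind the story is easy to check in a few lines of Python:

```python
# Rice on a chessboard: one grain on square 1, doubling on each square after.
first_half = sum(2**i for i in range(32))   # squares 1-32
whole_board = sum(2**i for i in range(64))  # all 64 squares

print(f"{first_half:,}")   # 4,294,967,295 - about 4 billion grains
print(f"{whole_board:,}")  # 18,446,744,073,709,551,615 - past comprehension
```

The first half stays within everyday magnitudes; the full board is roughly a thousand years of current world rice production.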
Consider a computer called ASCI Red, designed to be the first supercomputer to process more than one teraflop. A 'flop' is a floating-point operation, i.e. a calculation involving numbers that include decimal points (these are computationally much more demanding than calculations on whole numbers). A teraflop is a trillion such calculations per second. Once Red was up and running at full speed, by 1997, it really was a specimen: it could process 1.8 teraflops. That's 18 followed by 11 zeros. Red remained the most powerful supercomputer in the world until about the end of 2000.
I was playing on Red only yesterday - I wasn't really, but I did have a go on a machine that can process 1.8 teraflops. This Red equivalent is called the PS3: it was launched by Sony in 2005 and went on sale in 2006. Red was only a little smaller than a tennis court, used as much electricity as eight hundred houses, and cost $55 million. The PS3 fits underneath a television, runs off a normal power socket, and you can buy one for under two hundred quid. Within a decade, a computer able to process 1.8 teraflops went from being something that could only be made by the world's richest government for purposes at the furthest reaches of computational possibility, to something a teenager could reasonably expect to find under the Christmas tree.
All the technology is in place: the sensors, the hardware, and the software have all been invented.
They are not yet cheap enough, nor reliable enough in less-than-ideal conditions, for complete autonomy.
But all the hard stuff has been done in the last 10 years. The refining won't take another ten.
How Tech Advances
The sensors need to improve, and this is how it's happening.
Superfine visual accuracy for driverless cars:
The new chip uses an established detection-and-ranging technology called LIDAR (Laser Illuminated Detection And Ranging), used in autonomous vehicles, among other applications. With LIDAR, a target object is illuminated with scanning laser beams, and the reflected light is analyzed to determine the object's size and its distance from the laser, building up an image of the surroundings.
However, the new camera chip has an entire array of tiny LIDARs on a coherent imager, so "we can simultaneously image different parts of an object or a scene without the need for any mechanical movements within the imager," Hajimiri says.
Hajimiri says the current array of 16 pixels could also easily be scaled up to hundreds of thousands. With such vast arrays of tiny LIDARs, the imager could one day be applied to a broad range of applications: from very precise 3D scanning and printing, to helping driverless cars avoid collisions, to improving motion sensitivity in superfine human-machine interfaces, where the slightest movements of a patient's eyes and the most minute changes in a patient's heartbeat can be detected on the fly.
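The ranging principle itself is simple geometry. Here is a minimal time-of-flight sketch (a deliberate simplification: the coherent imager described above measures the phase of the reflected light rather than timing discrete pulses):

```python
# Time-of-flight ranging: a laser pulse travels to the target and back,
# so the distance is half the round trip at the speed of light.
C = 299_792_458.0  # speed of light in m/s

def distance_m(round_trip_s: float) -> float:
    return C * round_trip_s / 2.0

# A reflection arriving 200 nanoseconds after the pulse left
# puts the target roughly 30 metres away.
print(round(distance_m(200e-9), 1))  # prints 30.0
```

Sweeping that measurement across millions of directions per second is what turns single distances into the 3D point clouds a driverless car navigates by.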
Google's self-driving cars can tour you around the streets of Mountain View, California.
I know this. I rode in one this week. I saw the car's human operator take his hands from the wheel and the computer assume control. "Autodriving," said a woman's voice, and just like that, the car was operating autonomously, changing lanes, obeying traffic lights, monitoring cyclists and pedestrians, making lefts. Even the way the car accelerated out of turns felt right.
It works so well that it is, as The New York Times' John Markoff put it, "boring." The implications, however, are breathtaking.
Perfect, or near-perfect, robotic drivers could cut traffic accidents, expand the carrying capacity of the nation's road infrastructure, and free up commuters to stare at their phones, presumably using Google's many services.
But there's a catch.
Today, you could not take a Google car, set it down in Akron or Orlando or Oakland and expect it to perform as well as it does in Silicon Valley.
Here's why: Google has created a virtual track out of Mountain View.
Google's maps are probably best thought of as ultra-precise digitizations of the physical world, all the way down to tiny details like the position and height of every single curb. A normal digital map would show a road intersection; these maps have a precision measured in inches.
But the "map" goes beyond what any of us know as a map. "Really, [our maps] are any geographic information that we can tell the car in advance to make its job easier," explained Andrew Chatham, the Google self-driving car team's mapping lead.
"We tell it how high the traffic signals are off the ground, the exact position of the curbs, so the car knows where not to drive," he said. "We'd also include information that you can't even see like implied speed limits."
Google has created a virtual world out of the streets their engineers have driven. They pre-load the data for the route into the car's memory before it sets off, so that as it drives, the software knows what to expect.
"Rather than having to figure out what the world looks like and what it means from scratch every time we turn on the software, we tell it what the world is expected to look like when it is empty," Chatham continued. "And then the job of the software is to figure out how the world is different from that expectation. This makes the problem a lot simpler."
It might make the in-car problem simpler, but it vastly increases the amount of work required for the task: a whole virtual infrastructure has to be built on top of the road network.
Very few companies, maybe only Google, could imagine digitizing all the surface streets of the United States as a key part of the solution to self-driving cars. Could any car company claim that kind of data collection and synthesis as part of its core competency?
By contrast, Chris Urmson, a former Carnegie Mellon professor who runs Google's self-driving car program, oozed confidence when asked about mapping every single street where a Google car might want to operate. "It's one of those things that Google, as a company, has some experience with our Google Maps product and Street View," Urmson said. "We've gone around and we've collected this data so you can have this wonderful experience of visiting places remotely. And it's a very similar kind of capability to the one we use here."
So far, Google has mapped 2,000 miles of road. The US road network has something like 4 million miles of road.
"It is work," Urmson added, shrugging, "but it is not intimidating work." That's the scale at which Google is thinking about this project.
All this makes sense within the broader context of Google's strategy. Google wants to make the physical world legible to robots, just as it had to make the web legible to robots (or spiders, as they were once known) so that they could find what people wanted in the pre-Google Internet of yore.
In fact, it might be better to stop calling what Google is doing mapping, and come up with a different verb to suggest the radical break they've made with previous ideas of maps. I'd say they're crawling the world, meaning they're making it legible and useful to computers.
Self-driving cars sit perfectly between Project Tango—a new effort to "give mobile devices a human-scale understanding of space and motion"—and Google's recent acquisition spree of robotics companies. Tango is about making the "human-scale" world understandable to robots; the robotics companies are about creating the means for taking action in that world.
The more you think about it, the more the goddamn Googleyness of the thing stands out: the ambition, the scale, and the type of solution they've come up with to this very hard problem. What was a nearly intractable "machine vision" problem, one that would require close to human-level comprehension of streets, has become a much, much easier machine vision problem thanks to a massive, unprecedented, unthinkable amount of data collection.
Last fall, Anthony Levandowski, another Googler who works on self-driving cars, went to Nissan for a presentation that immediately devolved into a Q&A with the car company's Silicon Valley team. The Nissan people kept hectoring Levandowski about vehicle-to-vehicle communication, which the company's engineers (and many in the automotive industry) seemed to see as a significant part of the self-driving car solution.
He parried all of their queries with a speed and confidence just short of condescension. "Can we see more if we can use another vehicle's sensors to see ahead?" Levandowski rephrased one person's question. "We want to make sure that what we need to drive is present in everyone's vehicle and sharing information between them could happen, but it's not a priority."
What the car company's people couldn't or didn't want to understand was that Google does believe in vehicle-to-vehicle communication, but serially over time, not simultaneously in real-time.
After all, every vehicle's data is being incorporated into the maps. That information "helps them cheat, effectively," Levandowski said. With the map data—or as we might call it, experience—all the cars need is their precise position on a super accurate map, and they can save all that parsing and computation (and vehicle to vehicle communication).
There's a fascinating parallel between what Google's self-driving cars are doing and what the Andreessen Horowitz-backed startup Anki is doing with its toy car racing game. When you buy Anki Drive, you get a track for the cars to race on, with positioning data embedded in it. The track is the physical manifestation of a virtual racing map.
Last year, Anki CEO (and like Urmson, a Carnegie Mellon robotics guy) Boris Sofman told me knowing the racing environment in advance allows them to more easily sync the state of the virtual world in which their software is running with the physical world in which the cars are driving.
"We are able to turn the physical world into a virtual world," Sofman said. "We can take all these physical characters and abstract away everything physical about them and treat them as if they were virtual characters in a video game on the phone."
Of course, when there are bicyclists and bad drivers involved, navigating the hybrid virtual-physical world of Mountain View is not easy: the cars still have to "race" around the track, plotting trajectories and avoiding accidents.
The Google cars are not dumb machines. They have their own set of sensors: radar, a laser spinning atop the Lexus SUV, and a suite of cameras. And they have some processing on board to figure out what routes to take and avoid collisions.
This is a hard problem, but Google is doing the computation with what Levandowski described at Nissan as a "desktop" level system. (The big computation and data processing are done by the teams back at Google's server farms.)
What that on-board computer does first is integrate the sensor data. It takes the data from the laser and the cameras and integrates them into a view of the world, which it then uses to orient itself (with the rough guidance of GPS) in virtual Mountain View. "We can align what we're seeing to what's stored on the map. That allows us to very accurately—within a few centimeters—position ourselves on the map," said Dmitri Dolgov, the self-driving car team's software lead. "Once we know where we are, all that wonderful information encoded in our maps about the geometry and semantics of the roads becomes available to the car."
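A toy version of that alignment step, assuming made-up curb positions and a one-dimensional search (real localization matches full 3D scans against the map, not three numbers):

```python
# Scan matching in one dimension: find the offset that best lines up
# the curb positions the car senses with those stored in the map.
map_curbs = [2.00, 7.00, 12.00]   # curb positions in the prior map (metres)
scan_curbs = [2.30, 7.30, 12.30]  # the same curbs as currently sensed

def alignment_error(offset: float) -> float:
    # Sum of squared mismatches after shifting the scan by `offset`.
    return sum((s - offset - m) ** 2 for s, m in zip(scan_curbs, map_curbs))

# Grid search at centimetre resolution, matching the accuracy Dolgov cites.
candidates = [i / 100 for i in range(-100, 101)]
best_offset = min(candidates, key=alignment_error)
print(best_offset)  # 0.3
```

The best offset tells the car it is 30 centimetres from where dead reckoning thought it was; once corrected, all the map's stored knowledge snaps into place around it.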
Once they know where they are in space, the cars can do the work of watching for and modeling the behavior of dynamic objects like other cars, bicycles, and pedestrians.
Here, we see another Google approach. Dolgov's team uses machine learning algorithms to create models of other people on the road. Every single mile of driving is logged, and that data is fed into computers that classify how different types of objects act in all these different situations. While some driver behavior could be hardcoded in ("When the lights turn green, cars go"), they don't exclusively program that logic; it is also learned from actual driver behavior.
Just as we know that a car pulling up behind a stopped garbage truck is probably going to change lanes to get around it, 700,000 miles of logged driving data have taught the Google algorithm that the car is likely to do such a thing.
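A crude sketch of learning behavior from logs follows; the situations and counts are invented for illustration, and Google's models are far richer than frequency tables:

```python
from collections import Counter, defaultdict

# Each record pairs a situation with what a real driver actually did.
logged_drives = [
    ("behind_stopped_garbage_truck", "change_lane"),
    ("behind_stopped_garbage_truck", "change_lane"),
    ("behind_stopped_garbage_truck", "wait"),
    ("light_turns_green", "go"),
    ("light_turns_green", "go"),
]

# Tally how often each action follows each situation.
model = defaultdict(Counter)
for situation, action in logged_drives:
    model[situation][action] += 1

def most_likely_action(situation: str) -> str:
    # Predict whatever drivers most often did in this situation.
    return model[situation].most_common(1)[0][0]

print(most_likely_action("behind_stopped_garbage_truck"))  # change_lane
```

The point is the pipeline, not the model: log everything, aggregate it, and let observed behavior stand in for hand-written rules.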
Most driving situations are not hard to comprehend, but what about the tough or unexpected ones? In Google's current process, a human driver takes control and (so far) safely guides the car. But fascinatingly, whenever a human driver has to take over, what the Google car would have done is also recorded, so that engineers can test how it would have handled extreme situations without endangering the public.
So, each Google car is carrying around both the literal products of previous drives—the imagery and data captured from crawling the physical world—as well as the computed outputs of those drives, which are the models for how other drivers might behave.
There is, at least in an analogical sense, a connection between how the Google cars work and how our own brains do. We think of seeing as taking in sensory input and acting accordingly. Really, our brains are making predictions all the time, and those predictions guide our perception. The actual sensory input—the light falling on retinal cells—is secondary to the priors we've built into our brains through years of being in the world.
That Google's self-driving cars are using these principles is not surprising. That they are having so much success doing so is.
Peter Norvig, the head of AI at Google, and two of his colleagues coined the phrase "the unreasonable effectiveness of data" in an essay to describe the effect of huge amounts of data on very difficult artificial intelligence problems. And that is exactly what we're seeing here. A kind of Googley mantra concludes the Norvig essay: "Now go out and gather some data, and see what it can do."
Even if it means continuously, unendingly driving 4 million miles of roads with the most sophisticated cars on Earth and then hand-massaging that data, they'll do it.
That's the unreasonable effectiveness of Google.
Driverless City
The streets of a new neighborhood on the edge of Ann Arbor, Michigan, seem remarkably clean and peaceful. For automated cars, however, they represent daunting challenges.
The University of Michigan opened the area, called Mcity, this week, as a place for automotive companies and suppliers to test technology for automated and connected driving. The facility was built in collaboration with the Michigan Department of Transportation and with funding from numerous automakers and suppliers, who will have access to the area to test their technology in different situations.
A mocked-up set of busy streets in Ann Arbor, Michigan, will provide the sternest test yet for self-driving cars. Complex intersections, confusing lane markings, and busy construction crews will be used to gauge the aptitude of the latest automotive sensors and driving algorithms; mechanical pedestrians will even leap into the road from between parked cars so researchers can see if they trip up onboard safety systems.
The urban setting will be used to create situations that automated driving systems have struggled with, such as subtle driver-pedestrian interactions, unusual road surfaces, tunnels, and tree canopies, which can confuse sensors and obscure GPS signals.
“If you go out on the public streets you come up against rare events that are very challenging for sensors,” says Peter Sweatman, director of the University of Michigan’s Mobility Transformation Center, which is overseeing the project. “Having identified challenging scenarios, we need to re-create them in a highly repeatable way. We don’t want to be just driving around the public roads.”
Google and others have been driving automated cars around public roads for several years, albeit with a human ready to take the wheel if necessary. Most automated vehicles use accurate digital maps and satellite positioning, together with a suite of different sensors, to navigate safely.
Highway driving, which is less complex than city driving, has proved easy enough for self-driving cars, but busy downtown streets—where cars and pedestrians jockey for space and behave in confusing and surprising ways—are more problematic.
“I think it’s a great idea,” says John Leonard, a professor at MIT who led the development of a self-driving vehicle for a challenge run by DARPA in 2007. “It is important for us to try to collect statistically meaningful data about the performance of self-driving cars. Repeated operations—even in a small-scale environment—can yield valuable data sets for testing and evaluating new algorithms.”
Mcity is meant to include just about every possible traffic situation that could cause trouble for an automated car’s sensors and algorithms. Automated driving technology is improving, but computers still struggle with certain situations where visibility is degraded, where scenes are extremely complicated, or where there is a lot of other traffic that must be negotiated. The facility could also provide a good place for regulators to certify that different automated driving technologies meet the required standards (assuming such standards are established).
Across 32 acres of urban and suburban sprawl, Mcity includes traffic circles, tunnels, construction sites, highway on- and off-ramps, even faded lane markings and traffic signs defaced with graffiti. The snow that visits Michigan in winter should provide another important challenge.
This week, companies with a stake in Mcity were there to showcase their technologies and give visitors a ride around the facility. For example, the mapping company Here, a division of Nokia, showed one of the cars it uses to scan streets with high-resolution cameras and laser ranging instruments. This data is valuable to automotive companies because it can help vehicles navigate along streets (recent reports, in fact, suggest Here may soon be acquired by a consortium of German carmakers). Here has scanned Mcity and made the data available to other companies.
The main street in Mcity consists of stores and restaurants painted along a wall.
Several automotive technology companies were at the launch. Denso, a Japanese component supplier, showed off hardware that allows cars to communicate with one another to help prevent accidents and congestion (see “10 Breakthrough Technologies: Car-to-Car Communications”).
The German supplier Bosch demonstrated a system that allows cars to drive for themselves on straight stretches of road. And Xerox and Honda showed a system that allows cars to detect and communicate with pedestrians or cyclists, via a suitably modified smartphone, to help prevent accidents.
Lorraine Novak, an application developer at Denso who also works on automated driving systems, said she was very keen to use Mcity to try different approaches to particular problems. She noted that traffic circles remain a big problem for automated cars: in busy traffic, self-driving cars, which are usually configured to be quite cautious, can end up going around a traffic circle many times before escaping, or they may simply come to a standstill, she said.