How Self-Driving Cars Work
An 8-minute read
A self-driving car processes more data every second than a human brain handles in a day. And it still struggles with a plastic bag.
A licensed human driver can pass the DMV road test after about 45 hours of practice. Waymo’s vehicles, by contrast, had driven more than 25 million real-world miles by the end of 2024, per a Waymo/Swiss Re safety study, and simulated billions more virtually, and they still encounter situations that require careful handling. The gap between “drive around a city block” and “handle everything a human driver handles” turns out to be almost impossibly wide. Understanding why reveals something surprising about how intelligence itself works.
The short answer
A self-driving car works by combining sensors that see the world, software that understands what it sees, and planning systems that decide what to do next. The car takes in data from cameras, lasers, and radar, builds a model of its surroundings, predicts what other road users will do, and plans a safe path forward.
Sounds simple. It is not.
The full picture
The sensor stack: what the car can see
A fully equipped autonomous vehicle typically carries five types of sensors. Each one fills in gaps that the others leave.
Cameras are the most familiar. They work like human eyes, capturing visual information in color. Modern cars use 8 to 12 cameras positioned around the vehicle to see in every direction. Cameras excel at reading signs, detecting traffic lights, and identifying objects by shape and color. But they struggle in fog, heavy rain, or direct sunlight, and they cannot measure distance directly.
LiDAR (Light Detection and Ranging) shoots out millions of laser pulses every second and measures how long they take to bounce back. This creates a 3D point cloud, essentially a detailed map of everything around the car. LiDAR is excellent at measuring distances precisely and works well in darkness. The downsides are cost (traditional LiDAR units once cost tens of thousands of dollars, though prices have dropped dramatically) and sensitivity to rain, snow, or dust, which scatter the laser pulses.
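The distance math behind each pulse is simple time-of-flight. A minimal sketch (the 200-nanosecond round trip is an illustrative number, not a spec from any real unit):

```python
# Time-of-flight distance estimate for a single LiDAR pulse.
# The pulse travels to the target and back, so divide the round trip by two.
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def lidar_distance(round_trip_seconds: float) -> float:
    """Distance to a target from the round-trip time of one laser pulse."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A pulse that returns after 200 nanoseconds hit something about 30 m away.
print(round(lidar_distance(200e-9), 2))  # 29.98
```

Doing this millions of times per second, across a spinning or scanning array of lasers, is what builds the point cloud.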
Radar uses radio waves to detect objects and their speed. It has been in cars for decades as part of adaptive cruise control. Radar is robust in bad weather and can measure velocity directly, but it has lower resolution than LiDAR and cannot distinguish between a truck and a signpost as easily.
Ultrasonic sensors are the cheap little sensors used for parking assistance. They detect nearby objects using sound waves and work reliably in all weather, but their range is limited to just a few meters.
Microphones are an emerging addition, used to detect emergency-vehicle sirens and to pick up horns or shouted warnings from pedestrians.
No single sensor does everything. The magic is in sensor fusion, combining all these inputs into a coherent picture. A plastic bag looks like a solid obstacle to LiDAR, which is why some cars brake erratically for harmless objects. A camera might correctly identify it as a bag, but fusion algorithms must decide which sensor to trust.
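The bag-versus-obstacle decision can be caricatured as a weighted vote between sensors. A toy sketch, with made-up confidences and weights (real fusion stacks are probabilistic and far more elaborate):

```python
# A toy sensor-fusion rule for the plastic-bag case: each sensor reports a
# confidence that the object is a solid obstacle, and we weight each sensor
# by how much we trust it on that question. All numbers are illustrative.

def fuse_solid_obstacle(lidar_conf: float, camera_solid_conf: float,
                        lidar_weight: float = 0.4,
                        camera_weight: float = 0.6) -> float:
    """Return a fused probability that the detected object is solid."""
    total = lidar_weight + camera_weight
    return (lidar_weight * lidar_conf + camera_weight * camera_solid_conf) / total

# LiDAR sees a solid-looking return (0.9), but the camera classifies the
# object as a plastic bag, i.e. very unlikely to be solid (0.1).
fused = fuse_solid_obstacle(lidar_conf=0.9, camera_solid_conf=0.1)
print(round(fused, 2))  # 0.42 -> below a 0.5 braking threshold, no hard brake
```

Flip the weights toward LiDAR and the same inputs cross the threshold, which is exactly the erratic-braking behavior described above.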
Perception: knowing what is out there
Sensors give raw data. Perception turns that data into understanding. This is where machine learning shines and where the hardest problems hide.
The car must do three things simultaneously. First, it must detect objects, drawing bounding boxes around cars, pedestrians, cyclists, and obstacles. Second, it must classify those objects, identifying a stopped vehicle as a car versus a delivery truck. Third, it must track objects over time, following a pedestrian as they move across the road.
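The tracking step can be sketched as a nearest-neighbor association loop: match each new detection to the closest existing track, or start a new one. Real systems use Kalman filters and learned association models; the threshold and coordinates here are illustrative.

```python
# A minimal multi-object tracker: associate each detection in the new frame
# with the nearest existing track (by center distance), else start a track.
import math
from itertools import count

_track_ids = count(1)

def track_frame(tracks: dict, detections: list, max_dist: float = 2.0) -> dict:
    """tracks: {id: (x, y)}; detections: [(x, y)]; returns updated tracks."""
    updated = {}
    unmatched = dict(tracks)
    for dx, dy in detections:
        best_id, best_d = None, max_dist
        for tid, (tx, ty) in unmatched.items():
            d = math.hypot(dx - tx, dy - ty)
            if d < best_d:
                best_id, best_d = tid, d
        if best_id is not None:                # continue an existing track
            del unmatched[best_id]
            updated[best_id] = (dx, dy)
        else:                                  # new object entered the scene
            updated[next(_track_ids)] = (dx, dy)
    # Tracks left in `unmatched` had no detection this frame and are dropped.
    return updated

tracks = track_frame({}, [(0.0, 0.0), (10.0, 5.0)])      # two new tracks
tracks = track_frame(tracks, [(0.5, 0.1), (10.2, 5.1)])  # both re-associated
print(sorted(tracks))  # [1, 2] -- same two identities persist across frames
```

Keeping identities stable across frames is what lets the prediction layer reason about where a particular pedestrian is heading, rather than treating each frame as a fresh scene.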
Deep learning models handle most of this work. Neural networks trained on millions of labeled images can identify objects with remarkable accuracy, often better than human drivers in controlled conditions. But perception systems can fail in unexpected ways. Research published at IEEE security conferences has found that adding small stickers to a stop sign could cause some systems to misread it as a speed limit sign. These are called adversarial attacks, and they reveal how fragile machine learning can be.
The real challenge is not the obvious objects. It is the unusual ones. A mattress fallen off a truck, a person pushing a bicycle, a child chasing a ball. These edge cases are what keep engineers up at night.
HD maps versus mapless driving
Before a car can drive itself, it needs to know where it is. High-definition (HD) maps are incredibly detailed digital models of roads, lanes, signs, and infrastructure. They include precise lane markings, the exact location of curbs, and even the shapes of nearby buildings.
Waymo, the Alphabet (Google)-owned autonomous vehicle company, relies heavily on HD maps. Their cars map every road they plan to drive on in advance, down to centimeter accuracy. When the car drives, it compares what its sensors see to the map, confirming its position. This approach works well in mapped areas but requires constant updates as roads change.
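The map-matching idea can be sketched as a correction step: the car observes a surveyed landmark and pulls its drifting position estimate toward where the map says it must therefore be. A toy example with invented coordinates and a fixed blend factor standing in for a proper Kalman update:

```python
# A toy localization correction. GPS/odometry drifts, so we observe one
# mapped landmark (e.g. a signpost with a surveyed position) and shift our
# estimate by the discrepancy. Real systems fuse many landmarks
# probabilistically; every number here is made up.

def correct_pose(estimate, observed_offset, landmark_map_pos):
    """Correct an (x, y) position estimate using one landmark.

    observed_offset: landmark position relative to the car, from sensors.
    landmark_map_pos: landmark position from the HD map (global frame).
    """
    ex, ey = estimate
    ox, oy = observed_offset
    mx, my = landmark_map_pos
    implied = (mx - ox, my - oy)   # where the sensors imply the car really is
    alpha = 0.7                    # trust placed in the landmark measurement
    return (ex + alpha * (implied[0] - ex), ey + alpha * (implied[1] - ey))

pose = correct_pose(estimate=(100.0, 50.0),
                    observed_offset=(5.0, 2.0),
                    landmark_map_pos=(104.0, 52.5))
print(pose)  # drifted estimate pulled toward the implied (99.0, 50.5)
```

This is also why the map must stay current: if the signpost moved and the map was not updated, the "correction" would inject error instead of removing it.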
Tesla takes a different path. Its mapless approach relies almost entirely on what the car can see in real time. No pre-mapped roads, no reliance on infrastructure updates. The argument is that a truly autonomous car should not need hand-crafted maps. It should drive like a human, using only what it sees. Critics say this makes the problem much harder than it needs to be.
Both approaches have merit. HD maps provide a safety net, a rich source of prior knowledge. Mapless driving is more flexible but demands far more from the perception system.
Prediction: what will happen next
A self-driving car cannot just react to where other road users are. It must predict where they are going. These prediction systems use the same generative modeling principles found in AI image generation, learning statistical patterns from vast datasets to predict plausible futures. This is called behavior prediction, and it is one of the hardest problems in robotics.
A pedestrian standing on the curb might be waiting to cross, might be checking their phone, or might be waiting for a bus. A car approaching an intersection might turn, might go straight, might stop. The car must assign probabilities to each possibility and update those probabilities as new information arrives.
Modern prediction systems use machine learning to model human behavior. They learn from vast datasets of how people actually drive and walk. But humans are unpredictable. A driver might suddenly change lanes without signaling. A cyclist might wobble unpredictably. The car must plan for multiple futures, not just the most likely one.
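Planning for multiple futures starts with keeping a probability for each hypothesis and updating it as evidence arrives. A toy Bayesian update for the curbside pedestrian, with invented prior and likelihood numbers:

```python
# A toy Bayesian update over a pedestrian's intent. We hold a prior over
# three hypotheses and update when a new cue arrives (the pedestrian turns
# to face the road). Likelihoods are illustrative, not from any dataset.

def bayes_update(prior: dict, likelihood: dict) -> dict:
    """Posterior is proportional to prior * likelihood, renormalized."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

prior = {"cross": 0.3, "wait_for_bus": 0.4, "checking_phone": 0.3}
# P(turns to face road | hypothesis):
likelihood = {"cross": 0.8, "wait_for_bus": 0.2, "checking_phone": 0.1}

posterior = bayes_update(prior, likelihood)
print(max(posterior, key=posterior.get))  # cross
```

A single cue flips "waiting for a bus" from most likely to a distant second, which is why the planner never commits to one future: the distribution can shift every tenth of a second.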
Planning: deciding what to do
Once the car understands the world and predicts what others will do, it must plan its own actions. This happens at multiple levels, from route planning (choosing which roads to take, the job a navigation app does) down to split-second maneuvers.
Motion planning figures out the specific path the car will follow. Given the current position, the destination, and the surrounding traffic, what trajectory should the car follow? This involves math more complex than most people realize, optimizing for safety, comfort, and efficiency simultaneously.
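One common approach is to sample candidate trajectories and score each against weighted safety, comfort, and efficiency costs. A heavily simplified sketch with made-up weights, one obstacle, and candidates reduced to lateral offsets:

```python
# A toy trajectory selector: sample a few candidate lateral offsets, score
# each by a weighted sum of safety (clearance from an obstacle), comfort
# (steering effort), and efficiency (deviation from the lane center), and
# pick the cheapest. Weights and geometry are illustrative.

def score(offset: float, obstacle_offset: float,
          w_safety: float = 10.0, w_comfort: float = 1.0,
          w_eff: float = 2.0) -> float:
    clearance = abs(offset - obstacle_offset)
    safety_cost = w_safety / (clearance + 0.1)  # closer to obstacle = worse
    comfort_cost = w_comfort * abs(offset)      # bigger swerve = worse
    efficiency_cost = w_eff * abs(offset)       # off lane center = worse
    return safety_cost + comfort_cost + efficiency_cost

candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]  # lateral offsets in meters
obstacle = 0.6                            # obstacle 0.6 m right of center
best = min(candidates, key=lambda o: score(o, obstacle))
print(best)  # -1.0: moves away from the obstacle despite the steering cost
```

The hard part in practice is not the optimization itself but choosing cost terms and weights that produce behavior humans find both safe and natural.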
Control is the lowest level. Once a trajectory is chosen, the car must actually steer, accelerate, and brake to follow it. This is relatively well-understood and has been in production cars for years as lane-keeping assist and adaptive cruise control.
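A minimal flavor of that layer is a proportional-derivative steering loop: correct in proportion to the lateral error, damped by how fast the error is changing. Gains and the sign convention here are illustrative:

```python
# A minimal proportional-derivative (PD) steering controller: given the
# car's lateral error from the planned trajectory, output a steering
# command. Real controllers are carefully tuned and add feedforward terms.

class PDSteering:
    def __init__(self, kp: float = 0.5, kd: float = 0.2):
        self.kp, self.kd = kp, kd
        self.prev_error = 0.0

    def step(self, lateral_error: float, dt: float = 0.05) -> float:
        """lateral_error in meters -> steering command (opposite sign)."""
        derivative = (lateral_error - self.prev_error) / dt
        self.prev_error = lateral_error
        return -(self.kp * lateral_error + self.kd * derivative)

ctrl = PDSteering()
# The car starts 0.4 m off the path; the error shrinks over three ticks:
for err in [0.4, 0.3, 0.2]:
    cmd = ctrl.step(err)
print(round(cmd, 3))  # 0.3: the derivative term damps the correction
```

The derivative term is what keeps the car from oscillating around the path: as the error closes, the controller eases off rather than overshooting.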
The harder question is decision making in tricky situations. Should the car yield to a car that is technically violating the rules? Should it nudge forward into an intersection to signal intent? These are judgment calls that human drivers make instinctively, and teaching a machine to make them is extraordinarily difficult.
The SAE levels: what they mean
When people talk about self-driving cars, they often reference SAE levels. The Society of Automotive Engineers defined a scale from 0 to 5 in its J3016 standard (most recently updated in 2021) that describes how much the car is driving itself.
Level 0 is no automation. The human does everything, even if the car has warning systems.
Level 1 is driver assistance. The car can control either steering or speed, but not both. Lane-keeping assist or adaptive cruise control are Level 1.
Level 2 is partial automation. The car can control both steering and speed simultaneously, but the human must stay engaged and ready to take over at any moment. Tesla’s Autopilot, GM’s Super Cruise, and most similar systems are Level 2. The human is legally responsible.
Level 3 is conditional automation. The car can handle most driving tasks on its own, but the human must be ready to intervene when asked. The car will warn the driver with enough time to take over. Mercedes-Benz has a Level 3 system, DRIVE PILOT, approved in Germany and some US states, but only on highways under specific conditions.
Level 4 is high automation. The car can drive itself in defined conditions without any human input. If something goes wrong, the car handles it itself. No human intervention is needed. But Level 4 is usually geofenced, meaning it only works in mapped areas under favorable conditions. Waymo’s robotaxis in Phoenix, San Francisco, and Austin are Level 4.
Level 5 is full automation. No geographic limits, no conditions. The car can drive anywhere a human can. No Level 5 system exists yet, and many experts believe it is decades away.
The jump from Level 3 to Level 4 is enormous. At Level 3, the human is a fallback, ready to take over when the car gets confused. At Level 4, the car must handle everything itself. That means Level 4 systems need to be far more robust, capable of handling a much wider range of situations without asking for help.
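The levels condense into a small lookup, and the key question at each one is who the fallback is. A sketch of that distinction, not the standard's full J3016 definitions:

```python
# The SAE levels as a small lookup: a condensed sketch of J3016, reduced to
# the one question that matters most in practice -- who is the fallback when
# the driving task exceeds the system's ability?

SAE_LEVELS = {
    0: ("no automation", "human drives"),
    1: ("driver assistance", "human drives"),
    2: ("partial automation", "human supervises constantly"),
    3: ("conditional automation", "human must take over when asked"),
    4: ("high automation", "system handles fallback, within its domain"),
    5: ("full automation", "system handles fallback, everywhere"),
}

def fallback_party(level: int) -> str:
    """Who is responsible when the system reaches its limits?"""
    return "human" if level <= 3 else "system"

for level in (2, 4):
    name, role = SAE_LEVELS[level]
    print(f"Level {level} ({name}): fallback = {fallback_party(level)}")
```

The Level 3 to Level 4 jump shows up as the flip in `fallback_party`: below it, a human must always be recoverable; above it, the machine has no one to hand off to.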
The Waymo versus Tesla debate
The two most visible approaches to autonomous driving come from Waymo and Tesla, and they represent fundamentally different philosophies.
Waymo uses a full sensor suite including LiDAR, relies on pre-mapped areas, and limits its cars to operating in cities where it has extensive data. By the end of 2024, Waymo had completed over 4 million fully driverless rides. The approach is conservative and expensive, but the safety record is impressive.
Tesla dismisses LiDAR as unnecessary and expensive, relying on cameras and neural networks alone. It has millions of cars on the road collecting data, giving it an unparalleled dataset for training its systems. But its Autopilot and Full Self-Driving (FSD) systems have been subject to scrutiny from the National Highway Traffic Safety Administration (NHTSA) over crash investigations, raising questions about whether a camera-only approach can ever be safe enough for all conditions.
The debate is not settled. LiDAR costs have dropped dramatically, and most experts believe some form of redundancy is essential. Tesla’s approach might work for highway driving, where conditions are simpler, but city streets present a far harder challenge.
The long tail problem
A self-driving car might drive perfectly for 10,000 miles. Then it encounters a construction zone with a temporary traffic light that looks different from any it has seen before. Or a deer standing in the road. Or a person in a wheelchair chasing a dog.
These are edge cases, and they are the bottleneck. The car has seen millions of cars and pedestrians, but it has seen relatively few of the weird situations that cause accidents. Collecting enough data to handle all possible edge cases is extraordinarily difficult.
This is called the long tail problem. Most driving is routine, but the last few percent of unusual situations require disproportionately more data and engineering effort. A car that is 99.9% reliable might still cause accidents regularly, because that 0.1% of unusual situations happens frequently enough when you drive millions of miles.
Why 99.9999% is not good enough
According to the NHTSA, the human fatality rate in the US was approximately 1.20 deaths per 100 million vehicle miles traveled in 2024, the lowest recorded rate since 2019. To be considered genuinely safe, autonomous vehicles must match or exceed that number across the full range of conditions.
The math is brutal. The US alone sees roughly three trillion miles driven per year. A system that is 99.9% reliable would still fail on three billion of those miles annually. Industry engineers often talk about needing to approach “nine nines” of reliability for critical subsystems: failure rates below one in a billion.
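The arithmetic in that paragraph, spelled out:

```python
# Failure-miles per year at a given per-mile reliability, using the rough
# figure of three trillion US vehicle miles driven annually.
miles_per_year = 3e12

for reliability in [0.999, 0.999999, 0.999999999]:
    failures = miles_per_year * (1 - reliability)
    print(f"{reliability:.9f} reliable -> {failures:,.0f} failure-miles/year")
```

Even at "nine nines" (the last row), thousands of failure-miles remain each year; the question becomes whether those failures are benign hesitations or dangerous mistakes.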
No current system comes close to achieving this across all scenarios, though Waymo’s urban robotaxi fleet has logged millions of driverless miles with a safety record that compares favorably to human drivers in similar urban conditions, per the company’s own published safety reports.
This is why progress has been slower than many expected. The easy stuff was done years ago. What remains is the hard stuff, the edge cases, the long tail.
Why it matters
Self-driving cars are not just a technology problem. They are a social transformation waiting to happen.
If autonomous vehicles become truly safe and widespread, they could reshape cities. Parking garages could become parks. Elderly and disabled people could gain mobility independence. Delivery costs could plummet. Millions of driving jobs would disappear.
We are not there yet. But every month, the technology gets better. The question is not whether self-driving cars will work, but when, and what trade-offs we are willing to accept along the way.
The plastic bag problem is not a bug. It is a symptom of a fundamental challenge: teaching machines to navigate a world designed for humans. That turns out to be one of the hardest problems we have ever tried to solve.
The liability question: who’s responsible when an autonomous car crashes?
Technology problems and legal problems move at very different speeds. Autonomous vehicle technology has advanced faster than the legal framework needed to handle it, and the gap is creating real complications.
When a human driver causes an accident, the liability framework is clear: the driver is responsible, backed by their insurance policy. When a Waymo robotaxi causes an accident with no human driver on board, the question becomes genuinely complicated. Is it the manufacturer? The fleet operator? The software company that built the perception system? The company that created the training data?
Current US law still largely treats autonomous vehicles as if a human were in control, which creates awkward fictions. In most states, a “driver” must be present and capable of taking control, which is why many autonomous vehicle operators keep a human safety driver in the car during testing. But as systems become genuinely driverless, this framework breaks down.
Insurance models are also under pressure. Traditional auto insurance prices risk based on individual driver behavior. In a world of AV fleets operated by corporations, insurance looks more like product liability: the same model that applies to defective toasters or pharmaceutical side effects. This shifts risk from individuals to manufacturers, which changes incentives dramatically: if a car company is directly liable for every accident its vehicles cause, it has a very strong financial incentive to solve safety before deploying at scale. Some researchers argue this liability model would produce faster, more careful safety progress than the current regulatory approach.
The first manslaughter case involving a fully autonomous vehicle is likely coming. How the legal system resolves it will shape the development of this technology as much as any engineering breakthrough.
Common misconceptions
Self-driving cars are already here. They are not. No commercially available car can drive itself anywhere without human supervision. Robotaxis exist in limited areas, but they are not the general-purpose vehicles that the term implies.
LiDAR is the only way. It is one way. Tesla argues cameras are sufficient and that LiDAR is a crutch that masks inferior software. The debate continues, but most autonomous vehicle developers now use some combination of sensors including LiDAR.
The main problem is technical. The technical problems are enormous, but there are also regulatory, legal, and social challenges. Who is liable when an autonomous car crashes? How should insurance work? How do we ensure the technology is equitable?
Once it works, it will work everywhere. Most Level 4 systems are geofenced to specific areas with favorable conditions. Snow, heavy rain, dirt roads, and unmapped areas remain challenges. A car that drives in Phoenix might be helpless in Mumbai.