Tesla’s Director of Artificial Intelligence and Computer Vision is training Autopilot to allow faster and more accurate image detection.


Tesla moves toward full self-driving

For over three years now, all new Tesla models have included cameras, along with radar and ultrasonic proximity sensing, to provide more than just driving assistance. What the company terms full self-driving has been slowly evolving with incremental improvements.

Tesla’s approach is based on advanced artificial intelligence for vision and vehicle movement planning, supported by efficient use of its on-board inference hardware. The company believes this is the only way to achieve a general solution to full self-driving, and coupled with GPS and map data, it has seen impressive results.

Karpathy offers insights

In mid-April, Andrej Karpathy, Director of Artificial Intelligence and Computer Vision on Tesla’s Autopilot Team, spoke at the 5th Annual Scaled Machine Learning Conference 2020. In his talk, Karpathy explains how Tesla trains Autopilot on data collected by its customers’ cars, focusing on long-tail examples such as stop signs painted on buildings or occluded by tree branches. To improve Autopilot’s reaction in these instances, Tesla sends an over-the-air software detector to each of its 800,000+ roving vehicles, whether or not their owners have purchased the software. Such a detector can, for example, identify images of occluded stop signs.
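As a rough illustration of how such a fleet-side detector might decide which frames to send back, the sketch below flags frames that look like occluded stop signs. Everything in it is an assumption made for illustration: the `Frame` fields, the `should_upload` and `select_for_upload` helpers, and the threshold values are hypothetical and are not taken from Karpathy’s talk or from Tesla’s software.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """One camera frame with hypothetical detector scores in [0, 1]."""
    frame_id: str
    stop_sign_score: float   # confidence that a stop sign is present
    occlusion_score: float   # confidence that the sign is partly hidden


def should_upload(frame: Frame,
                  sign_threshold: float = 0.5,
                  occlusion_threshold: float = 0.7) -> bool:
    """Return True when the frame looks like an occluded stop sign.

    Both thresholds are illustrative values, not real settings.
    """
    return (frame.stop_sign_score >= sign_threshold
            and frame.occlusion_score >= occlusion_threshold)


def select_for_upload(frames: List[Frame]) -> List[str]:
    """Collect the ids of frames worth sending back for labeling."""
    return [f.frame_id for f in frames if should_upload(f)]


frames = [
    Frame("a", stop_sign_score=0.9, occlusion_score=0.1),   # clearly visible sign
    Frame("b", stop_sign_score=0.8, occlusion_score=0.9),   # sign hidden by branches
    Frame("c", stop_sign_score=0.2, occlusion_score=0.95),  # probably no sign at all
]
print(select_for_upload(frames))
```

In this toy run only frame `b` is selected, since it is the only one that scores high on both counts. The point of the design is that the filter runs on the car, so only the rare, interesting frames need to leave it.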

In contrast, GM’s Cruise Automation and Waymo have limited access to such data: their fleets number in the hundreds and operate in several cities, while Tesla’s hundreds of thousands of vehicles operate across the country.

As described by Karpathy, the clever use of a growing fleet to gather data gives Tesla an impressive head start. These cars are, in effect, sensing robots networked together to gather data from the remote corners of wherever they take their drivers. The entire new fleet already has active safety capabilities enabled, including emergency braking and warnings to drivers when needed.

Karpathy explains how Tesla bridges the gap between cameras and LiDAR, and plans a major rewrite of the foundational core system to allow the “neural net to absorb more and more of the problem.” The rewrite also enables rapid updates, including the use of 3D video labeling (as opposed to 2D image labeling), which in turn allows faster, more accurate image detection and vehicle path planning.

LiDAR measures the depth of objects directly, with a higher level of accuracy than is achieved with cameras. To clarify the difference, Karpathy explains that cameras use a two-step indirect process: first capturing an image, then employing software to gauge depth by analyzing changes in the pixels. Mistakes involving a few pixels can translate into meters of inaccuracy. By labeling 3D videos of driving scenes, Tesla is compensating for the camera’s weakness as the primary sensor. With the recent upgrades to on-board processing software, this can apparently be accomplished in a more timely fashion.
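To see why a few pixels matter so much, consider a simplified two-view triangulation model, in which depth equals focal length times baseline divided by the pixel shift (disparity) of an object between two views. This is only a sketch of the general geometry and not Tesla’s method, which relies on neural networks; the focal length, the baseline, and the function names below are assumed values chosen for illustration.

```python
FOCAL_PX = 1000.0   # assumed focal length, in pixels
BASELINE_M = 0.3    # assumed distance between the two viewpoints, in meters


def depth_from_disparity(focal_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Triangulated depth in meters: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(true_depth_m: float, pixel_error_px: float) -> float:
    """Depth error in meters caused by mis-measuring disparity by some pixels."""
    true_disparity = FOCAL_PX * BASELINE_M / true_depth_m
    measured_depth = depth_from_disparity(
        FOCAL_PX, BASELINE_M, true_disparity + pixel_error_px)
    return abs(measured_depth - true_depth_m)


# The same one-pixel mistake grows quickly with distance.
for depth_m in (10.0, 50.0, 100.0):
    error_m = depth_error(depth_m, pixel_error_px=1.0)
    print(f"object at {depth_m:5.1f} m: 1 px error -> {error_m:.2f} m off")
```

Under these assumed numbers, a one-pixel mistake costs roughly 0.3 m for an object 10 m away, about 7 m at 50 m, and 25 m at 100 m, because the depth error grows with the square of the distance.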

Local mapping with lower definition

Tesla uses far less detail than the high definition maps employed by its competitors. While a Waymo vehicle drives with preloaded information about the exact location of a stop sign, within centimeters of accuracy, a Tesla would detect only the presence of a stop sign somewhere in the vicinity. In addition to vague local maps and its camera-based approach, 3D video labeling separates Tesla from its competitors, enabling improved recognition of corner cases in solving for full autonomy.

This approach of using ‘curated unit test sets’ sets Tesla apart (pun intended), making it nearly impossible for a competitor to replicate. While autonomous driving is an extremely complex problem to solve, Tesla could enjoy a near-monopoly in autonomous ride hailing if it is successful. The most recent rollout of stop sign and traffic light recognition has been seen as a significant improvement, yet drivers should still remain vigilant and cautious. Carelessness could easily bring the same results as Covid-19!

Even those lacking significant skills in software may want to check out Tesla’s recruiting webpage. There, one can view live images of what the car’s system “sees.”