False Prophets, False Predictors of Failure, and the Brave New World of Code Ghosts

The Black Swan, symbol of low-probability events, thanks to Nassim Nicholas Taleb’s book of the same name

A short while ago there was a rash of alleged autocruise accidents with Teslas, in which the owners claimed their cars had driven themselves into the sides of buildings or off the Pennsylvania Turnpike, as if through a fundamental failing of the car. These stories are hot news when they break. There are YouTube videos and blogs, and the more technical web journals such as Ars Technica and Slate covered the incidents. Then, after the 24-hour news cycle, a week or four later, the facts come out and show who pushed what buttons to make the car do what it did. The claims that “the autopilot autonomously drove my car into a wall” were shown to be falsehoods engineered to cover up careless behavior by the owner.
There is a truth here: the autopark interface is not foolproof, and the autopilot must be human-monitored. If the human tells the car to go until it hits a wall, the car can do that. But there are also cases in which the owner claimed the autopilot was driving, and two weeks later the “tale of the tape” showed the autopilot was not even on.
There is another truth here: it is ridiculously difficult to access the factual, technical details of HOW the Tesla works. The young salesforce who provide the customer interface for people who buy Teslas are poorly informed, or sleazy with the truth, or both. The Tesla appeals to technical people and engineers, so many buyers know better when a salesman feeds them a line that is patently false, such as “the axles of the car turn backward to effect regenerative braking.” Even as an owner it is really difficult to find factual information about how the Tesla actually works. The car’s manual is loaded into software inside the car, which sounds handy, but you have to sit inside the car to look something up. It is not available through the owner’s web account, through the owner’s smartphone point of access, on Tesla’s website, or on the Tesla owners’ association website, nor is it in a printed booklet in the glovebox.

A colleague asked me whether the Tesla autopilot uses GPS for lane-keeping, and I said no, because the release notes didn’t mention GPS and lane-keeping appears to work when the GPS signal is down, but in truth I don’t actually know. There are plausible reasons for such arrangements: the owner’s manual has to be updated with every software release, and if the manual were on the web it could be printed and the technology reverse-engineered by competitors. But another outcome is that people can hear that the Tesla does this or that, believe it, and have no way to fact-check the statement. A sales rep once assured me that the Tesla would autobrake if aimed into a wall, but then declined to demonstrate it, saying that capability was “turned off” for test drives.
As for the autopark incidents, the statistician in me ran a quick fraud check when they came out. When you hear about a single accident being caused by a piece of automation, followed by a second, and a third, within a short time span, after millions and millions of miles of accident-free history, you can be pretty confident the claim is a fraud. One person makes a fraudulent claim, for money, fame, or attention, like saying the syringe was in the Pepsi can; then a few people with similar needs copycat the claim. The autopark incidents sounded a bit like that: three were raised in just six months as Tesla rounded the 700,000-vehicles-sold mark. Tesla autocruise and autopark were introduced in the fall of 2015, so there had been only a year of use. Across approximately 600,000-700,000 vehicles, at a conservative 10,000 miles per year, that is some 6 to 7 billion miles of experience. All the vehicles are manufactured the same way. If there were a flaw in the code or in the manufacturing, then in billions of identical use miles, the flaw should have shown up far more than 3 times.
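Here is that back-of-the-envelope check in code. This is a sketch using this article’s rough figures for fleet size and annual mileage, plus an assumed defect rate, not official Tesla data: if a genuine systematic flaw fired even once per 100 million miles, the fleet should have produced dozens of incidents, and seeing only three would be astronomically unlikely.

```python
# A rough sketch of the fraud check described above. The fleet and mileage
# figures are this article's estimates; the defect rate is an assumption.
from math import exp

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for a Poisson random variable with mean lam."""
    total, term = 0.0, exp(-lam)
    for i in range(k + 1):
        total += term            # add P(X = i)
        term *= lam / (i + 1)    # advance to P(X = i + 1)
    return total

fleet_miles = 6_500_000_000      # ~650,000 cars x 10,000 miles/yr x 1 yr
observed_claims = 3

# Suppose a real systematic defect triggered once per 100 million miles.
defect_rate = 1 / 100_000_000
expected = defect_rate * fleet_miles           # ~65 expected incidents
print(f"expected incidents: {expected:,.0f}")
print(f"P(seeing <= 3): {poisson_cdf(observed_claims, expected):.1e}")
```

Under those assumptions the expected count is about 65, and the probability of seeing three or fewer genuine incidents is on the order of 10^-24: either the flaw is not systematic, or the claims are not genuine.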
The same thing has come up before, with the Prius incidents. Toyota is a prominent, well-run, well-funded company with a robust and diversified product line. The Prius had been on the market for 10 years when the unintended accelerations allegedly occurred, and there were 10 million Priuses on the road. They were being accused of a systematic failure that occurred across all model years, yet there were fewer than a dozen claimed cases of unintended acceleration. It was statistically impossible for the unintended acceleration to be a flaw in all 10 million Priuses, given the mileage and the number of Priuses on the road; the accusations would likely prove false in time. This reasoning may seem obvious, even facile, but at the time of the accusations enough people were worried that Toyota’s stock price fell to $47/share from its usual perch of $79/share. The U.S. Department of Transportation even convened a committee to investigate the allegations.
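Running the same arithmetic on those figures (fleet size from the text; the 10,000 miles per year is my assumption) shows how implausibly rare a “systematic” flaw would have to be:

```python
# Implied per-mile incident rate if the dozen claims were all genuine.
# Fleet size is from the text; average annual mileage is an assumption.
fleet, years, miles_per_year = 10_000_000, 10, 10_000
claims = 12
total_miles = fleet * years * miles_per_year        # 1 trillion miles
rate = claims / total_miles
print(f"one incident per {1 / rate:,.0f} miles")    # one per ~83 billion miles
```

A defect present in every vehicle, yet manifesting once per roughly 83 billion miles, is not how systematic hardware or code flaws behave.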

There were two outcomes of the Toyota unintended-acceleration allegations. One proved the claims were false, and one proved they were true.

The unintended accelerations in the Prius were by and large shown to be caused by driver error, a classic one: stomping on the accelerator when the driver means to be stomping on the brake. The Prius was cleared of unintended acceleration in 2009; see this Green Car Reports post: <http://www.greencarreports.com/news/1020362_prius-sudden-acceleration-much-ado-about-nothing>. By classic I mean that drivers are human, and as humans we are imperfect and make mistakes, sometimes the same mistake over and over. As engineers, we then turn our attention to improving the driver’s situational awareness, or to making the acceleration interface more distinctive, so that the repeated error becomes less likely.

However, in 2014, Lexus automobiles also began to show unintended acceleration, which Toyota initially blamed on non-standard floor mats. As with the Prius, there were only a few of these incidents, and the news coverage was sensational. That widespread coverage was instrumental in uncovering the truth: a driver (Kevin Haggerty) who had watched an ABC news report showing how to get your car under control in such a situation (“sticky pedal”) was able to follow the guidance and drive his malfunctioning vehicle to the dealership, where the experts had to agree it was not the floor mat. <http://abcnews.go.com/Blotter/toyota-pay-12b-hiding-deadly-unintended-acceleration/story?id=22972214>

There is a reason it is easy to drive fast these days compared to the 1960s and 1970s, and it is electronic control of the accelerator, an innovation akin to power steering, but for the accelerator pedal. A purely mechanical throttle suffers from unintended speed variation when the vehicle goes over a bump: the bump jostles your foot off the accelerator pedal, the car abruptly slows, then jerks forward again as you recover. To smooth the ride, electronic sensors were placed around the accelerator pedal to measure the amount of pressure the driver used to propel the vehicle. If the pressure varies very slightly for just a moment, a smoothing algorithm prevents that change in input from becoming a change in the velocity of the car. To do this, the pedal has a sensor associated with downward pressure and a second sensor associated with the spring lift that returns the accelerator, regulating the lessening of velocity. The two sensors send signals to an electronic board that uses the combined inputs to figure out what to do. It is straightforward to imagine that if a driver wants to jump off the starting line, the algorithm would be tuned to apply non-skid, efficient acceleration to the wheels when the pedal is depressed hard and fast.
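A minimal sketch of that kind of smoothing and cross-checking, purely illustrative (the class, window size, and disagreement threshold here are invented; real throttle-by-wire firmware is far more elaborate and runs in C on the engine control unit):

```python
# Toy pedal-smoothing sketch: average the two pedal sensors, cross-check
# them against each other, and damp momentary blips with a moving average.
from collections import deque

class PedalSmoother:
    def __init__(self, window: int = 5, disagree_limit: float = 0.10):
        self.samples = deque(maxlen=window)   # recent pedal readings
        self.disagree_limit = disagree_limit

    def update(self, sensor_a: float, sensor_b: float) -> float:
        """Return a smoothed throttle command in [0, 1]."""
        if abs(sensor_a - sensor_b) > self.disagree_limit:
            # The two sensors disagree: fail safe, command no throttle.
            self.samples.clear()
            return 0.0
        self.samples.append((sensor_a + sensor_b) / 2)
        return sum(self.samples) / len(self.samples)

smoother = PedalSmoother()
for reading in [0.30, 0.31, 0.05, 0.30, 0.29]:   # 0.05 is a bump-induced blip
    print(round(smoother.update(reading, reading), 3))
# The blip is damped toward the running average instead of being passed
# straight through as an abrupt drop in commanded throttle.
```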

A problem with the Lexus was that a logic buffer recording the bits of input from the accelerator pedal sensors was not properly shielded from electromagnetic interference. A solar storm could provide enough electromagnetic energy to flip a bit in the array and cause unstoppable, unintended acceleration. A piece of electronics vulnerable to solar electromagnetic energy would also be vulnerable to the energy from a microwave oven, and perhaps from a cell phone or from high-power lines overhead. <http://spectrum.ieee.org/riskfactor/computing/it/toyotas-sudden-unintended-acceleration-caused-in-part-by-electronic-interference> The presence of this fault was known at Toyota but evidently regarded as no threat, or as low-probability, so millions and millions of Toyota-manufactured cars were caught under the recall, including a few Priuses.
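To make the bit-flip hazard concrete, here is a toy illustration (the values are invented; nothing here reflects Toyota’s actual memory layout): a single flipped bit in an unprotected byte can turn a light pedal reading into more than half throttle.

```python
# Illustrative only: one EMI-induced bit flip in an unprotected buffer.
throttle = 0b00001010             # 10 out of 255: light pedal pressure
corrupted = throttle ^ (1 << 7)   # the high-order bit flips
print(throttle, corrupted)        # 10 -> 138, more than half throttle
```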

Software experts Michael Barr and Philip Koopman testified, in a case involving a fatality stemming from unintended acceleration in a Camry, that the software and engineering involved presented multiple opportunities for bit flips from EMI, memory corruption, and even unintended bit changes caused by the software itself. It has proven difficult to identify the exact culprit, meaning the particular line of code or the particular buffer or stack that caused a specific accident and fatality. The Haggerty vehicle was not accelerating due to a “sticky pedal” defect but due to something else; still, the news story on “sticky pedal” provided a solution.
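Public accounts of that testimony fault the firmware for, among other things, failing to mirror its critical variables. A minimal sketch of that defensive pattern, in Python for readability (real implementations live in C on the ECU; the class name is mine):

```python
# A "mirrored" critical variable: store the value alongside its bitwise
# complement, so corruption of either copy is detected on read instead of
# silently becoming a throttle command.
class MirroredByte:
    def __init__(self, value: int = 0):
        self.set(value)

    def set(self, value: int) -> None:
        self._value = value & 0xFF
        self._mirror = ~value & 0xFF   # complement copy

    def get(self) -> int:
        if (self._value ^ self._mirror) != 0xFF:
            raise RuntimeError("memory corruption detected")
        return self._value

pedal = MirroredByte(10)
pedal._value ^= 1 << 7       # simulate an EMI bit flip
try:
    pedal.get()
except RuntimeError as err:
    print(err)               # corruption caught rather than acted upon
```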

So the summary of this article is this: it used to be easy to get an indicator of falsehood in accident claims merely by looking at the statistical incidence of the claimed cause across the population of vulnerable vehicles. However, as we move into autonomous and semi-autonomous cars, particularly where the manufacturer obfuscates the logic, it will be increasingly difficult to prevent low-probability errors in code, in buffers, and in the shielding of electronics from EMI. Thus it will be increasingly difficult to determine whether a failure in a new product is the result of a less-than-upfront testimonial or the product of a ghost in the code.