2019年10月31日 星期四

The Deadly Consequences of Rounding Errors


Photo illustration by Lisa Larson-Walker. Photo by Getty Images.

“At 7-Eleven, you’d take a penny from the tray, right? Well those are whole pennies! I’m just talking about fractions of a penny here, OK? But we do it from a much bigger tray, and we do it a couple million times.” That’s how Peter Gibbons (Ron Livingstone) explains his penny-shaving scheme in Office Space. Penny-shaving is a plot point in Superman III, too—in both movies, characters make huge amounts of money by taking advantage of the fact that numbers need to be shortened to two decimal places in order to function as currency, in dollars and cents.

Computer code can handle this problem in many ways, including the command round(), which rounds to the nearest cent, or floor(), which just cuts off all the extra decimal places. For example, an after-tax salary computation of $145.459 would normally be rounded to $145.46; in a penny-shaving scheme, it would be floored to $145.45 instead. Such a small change is generally imperceptible to people in their paycheck, but when millions or billions of such fractional cents are stolen, it adds up.

Sometimes those fractional cents aren’t stolen—they simply vanish. In the early 1980s, a new stock index at the Vancouver Stock Exchange tracked a steady and mysterious loss in value. An investigation revealed that floor() was being used instead of round(), with the lost fractions of cents accumulating to almost a 50 percent loss of value in 22 months. The programming mistake was finally fixed; the index closed around 500 on a Friday and reopened the following Monday at over 1,000, the lost value restored.

The impact of rounding errors extends far beyond cash transactions and penny-shaving theft. Since nearly every modern computer is digital (as opposed to analog), numbers must have discrete representations. This requires quantization: a mapping of numbers from infinite precision to finite precision, as in the example above of calculating after-tax salary. It can be a command like round(); floor(); ceiling(), which pushes to the next integer; or something even fancier as in audio coding that scales sound volume logarithmically before rounding, to better match human perception.

All of these quantization adjustments take place in the miniscule thousandths and ten-thousandths—and deeper—decimal places, but the cumulative scrapings of penny-shavings are just a hint of the repercussions. These seemingly insignificant adjustments can have huge effects in areas such as missile defense, political elections, and spaceflight, as Pete Stewart, now a distinguished university professor emeritus of the University of Maryland’s Institute for Advanced Computer Studies, compiled in a 1999 email to a listserv.

For instance, in late February 1991 during the Gulf War, an Iraqi scud missile hit American barracks in Dhahran, Saudi Arabia, killing 28 soldiers and wounding 260. This one missile accounted for more than one-third of all U.S. fatalities during the war. Although the U.S. military had deployed a Patriot defense system, it failed to fire and let the scud through. The basic problem was in quantizing the factor used to convert the timing variable of an internal clock (represented as an integer, in tenths of a second). When a tenth of a second is represented in binary numbers, it is repeating (like 1/6 is repeating in decimal as 0.16666… with the sixes going on forever), specifically it is 0.00011001100110011 and so on, with the 0011 occurring over and over. When this is truncated to 24 bits—that is, the number of places that can be used in the representation—there is an error of 0.000000095. (That’s the quantization error written as a decimal number rather than as a binary number, which would repeat forever.)

It’s a vanishingly small number, unless you are trying to lock onto a moving object in the sky. The Patriot tracking algorithm estimated the position of the target using this slightly-off timing variable and the velocity of the incoming missile. The end result was an error of 573 meters. The Patriot system thought the scud was far from the barracks and did not fire.

Another fateful rounding error occurred in June 1996, when the European Space Agency’s uncrewed Ariane 5 rocket exploded just 39 seconds after liftoff. After a decade of development costing $7 billion, this maiden voyage scattered its destroyed payload of four uninsured scientific satellites across mangrove swamps, resulting in the loss of $500 million worth of equipment. The rocket had disintegrated when it made an abrupt course correction to compensate for a wrong turn—a turn it had not, in fact, taken. This confusion was caused by a rounding error in the inertial reference code that was being reused from the Ariane 4 rocket. A 64-bit floating point number representing the horizontal velocity of the rocket with respect to the platform was converted to a 16-bit signed integer, so there were 48 fewer number places. But that was a mistake because Ariane 5 was much faster than Ariane 4. The conversion yielded a number that was larger than the largest possible number in that quantized representation from the slower Ariane 4. This caused an error message to be transmitted, which was then interpreted as a nonsensical number by the on-board computer—and disaster.

A nonexplosive example comes from Germany, where parliamentary elections have complicated rules such that parties must surpass a minimum 5-percent threshold in order to receive seats in the legislature. In the April 1992 elections in the state of Schleswig-Holstein, it seemed like the Green Party had received exactly 5 percent of the votes. After the election results were already published, however, someone discovered that the Greens had received only 4.97 percent of the vote. The software that printed out the percentages had used round(), which bumped the count up to 5 percent, rather than truncating with floor(), which would have been in line with the rule. The software had been used for years and no one had previously noticed, though it’s unclear whether that would have made a difference in earlier elections. After the correction, the 4.97 percent of votes corresponding to the Greens were thrown out according to the rule, the allocation of seats was recalculated, and the Social Democrats received one more spot, thereby gaining a one-seat majority in the Bundestag.

Representing and using numbers in a computer may seem like a trivial programming task, but confusing round() and floor() can fundamentally change political, scientific, and financial outcomes. When different kinds of number representations are used together in complex engineering systems such as rockets and missiles, they can fail catastrophically and even cost lives. As digital computers are integrated into larger and larger sociotechnical and physical systems, we must be even more careful. Imperceptible quantization errors could amplify and cause cholera epidemics due to improper wastewater treatment, massive blackouts due to imprecise control of the power grid, or even large-scale war due to misperception of enemy actions. When almost everything in society is represented digitally, almost everything in society is vulnerable to seemingly tiny errors.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.



from Slate Magazine https://ift.tt/2q4a1TJ
via IFTTT

沒有留言:

張貼留言