Abhilash Meesala

The Vanishing Precision of Floating Point Numbers

Floating-point numbers are fun. Due to the trade-offs they make between range and precision, they're full of odd quirks.

One of my favourite quirks that's (relatively) less well known is how the gaps between representable values grow as the numbers get larger. For example, between 1 and 2 there are about 4.5 quadrillion (2^52) representable 64-bit values, but between 1,073,741,824 (2^30) and 1,073,741,825 there are only about 4.2 million (2^22).
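You can measure these gaps directly with math.Nextafter, which returns the next representable float64 after a value. A quick sketch (the gapAt helper is just for illustration):

package main

import (
	"fmt"
	"math"
)

func main() {
	// Distance from x to the next representable float64 above it.
	gapAt := func(x float64) float64 {
		return math.Nextafter(x, math.Inf(1)) - x
	}

	fmt.Printf("Gap at 1: %g\n", gapAt(1))
	fmt.Printf("Gap at 1,073,741,824: %g\n", gapAt(1_073_741_824))
}

// Result:
// Gap at 1: 2.220446049250313e-16
// Gap at 1,073,741,824: 2.384185791015625e-07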

This leads to some rather surprising results. Here are some of my favorites:

Adding 1 (and in some cases even a million) past a certain point doesn't change the number at all

package main

import "fmt"

func main() {
	var largeNumber float64 = 10_000_000_000_000_000.0

	fmt.Printf("A Number: %.5f\n", largeNumber)
	fmt.Printf("After adding 1: %.5f\n", largeNumber+1)
	fmt.Printf("Is itself: %t\n", largeNumber == largeNumber+1)
}

// Result:
// A Number: 10000000000000000.00000
// After adding 1: 10000000000000000.00000
// Is itself: true
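As for the "even a million" part of the claim: around 10^22 the gap between adjacent float64 values is 2^21 (roughly 2.1 million), so a million falls below half the gap and rounds away entirely. A quick sketch:

package main

import "fmt"

func main() {
	// Around 10^22 adjacent float64 values are 2^21 (~2.1 million) apart,
	// so an addend of one million is below half the gap and rounds away.
	var huge float64 = 1e22

	fmt.Printf("After adding a million: %t\n", huge == huge+1_000_000)
}

// Result:
// After adding a million: true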

Beyond 9,007,199,254,740,992 (2^53), a float64 cannot represent odd integers anymore

package main

import "fmt"

func main() {
	var largeNumber float64 = 9_007_199_254_740_992.0

	fmt.Printf("A large even number: %.5f\n", largeNumber)
	fmt.Printf("next odd number: %.5f\n", largeNumber+1)
	fmt.Printf("next even number: %.5f\n", largeNumber+2)
	fmt.Printf("next next odd number: %.5f\n", largeNumber+5)
}

// Result:
// A large even number: 9007199254740992.00000
// next odd number: 9007199254740992.00000
// next even number: 9007199254740994.00000
// next next odd number: 9007199254740996.00000
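The +5 case hides a bonus quirk: the exact result, 9,007,199,254,740,997, sits precisely halfway between the two nearest representable values, and IEEE 754's default round-half-to-even rule breaks the tie towards 9,007,199,254,740,996. And math.Nextafter confirms the gap really is 2 up here:

package main

import (
	"fmt"
	"math"
)

func main() {
	var twoTo53 float64 = 9_007_199_254_740_992.0

	// Beyond 2^53 the distance between adjacent float64 values is exactly 2,
	// so every other integer (all the odd ones) is unrepresentable.
	gap := math.Nextafter(twoTo53, math.Inf(1)) - twoTo53
	fmt.Printf("Gap just above 2^53: %g\n", gap)
}

// Result:
// Gap just above 2^53: 2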

The gap between adjacent float64 values at one quadrillion (10^15) is 0.125. Any addend smaller than half that gets swallowed whole.

package main

import "fmt"

func main() {
	var largeNumber float64 = 1_000_000_000_000_000.0
	var smallNumber float64 = 0.000061

	fmt.Printf("Large number: %.8f\n", largeNumber)
	fmt.Printf("Small number: %.8f\n", smallNumber)

	fmt.Printf("Sum: %.8f\n", largeNumber+smallNumber)
}

// Result:
// Large number: 1000000000000000.00000000
// Small number: 0.00006100
// Sum: 1000000000000000.00000000
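math.Nextafter confirms the gap, and an addend just over half of it (0.07, say) survives by snapping the sum up to the next representable value:

package main

import (
	"fmt"
	"math"
)

func main() {
	var largeNumber float64 = 1_000_000_000_000_000.0

	gap := math.Nextafter(largeNumber, math.Inf(1)) - largeNumber
	fmt.Printf("Gap at one quadrillion: %g\n", gap)

	// 0.07 clears half the gap (0.0625), so the sum rounds up to the
	// next representable value rather than being swallowed.
	fmt.Printf("Sum: %.8f\n", largeNumber+0.07)
}

// Result:
// Gap at one quadrillion: 0.125
// Sum: 1000000000000000.12500000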

Here’s a very brief refresher on why this happens

IEEE 754 double-precision floating-point numbers allocate:

  • 1 bit for sign
  • 11 bits for exponent
  • 52 bits for mantissa (the significant digits)

The value is calculated as: $(-1)^{s} \times (1 + \text{mantissa}) \times 2^{(\text{exponent} - 1023)}$

Since the mantissa offers a fixed 52 bits of precision while the exponent scales the whole value, the gap between adjacent representable numbers is $2^{\text{exponent} - 52}$ (using the unbiased exponent): every time the exponent ticks up by one, the gap doubles.
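To see those three fields for yourself, math.Float64bits exposes the raw bit pattern of a float64, and a little shifting and masking pulls them apart; a minimal sketch:

package main

import (
	"fmt"
	"math"
)

func main() {
	bits := math.Float64bits(1_073_741_824.0) // 2^30

	sign := bits >> 63               // 1 bit
	exponent := (bits >> 52) & 0x7FF // 11 bits, biased by 1023
	mantissa := bits & (1<<52 - 1)   // 52 bits

	fmt.Printf("sign: %d, exponent: %d (unbiased %d), mantissa: %d\n",
		sign, exponent, int(exponent)-1023, mantissa)
}

// Result:
// sign: 0, exponent: 1053 (unbiased 30), mantissa: 0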