
Rounding Numbers: Are you doing it wrong?

Navigating Rounding Bias in Computer Science: Choosing the Optimal Rounding Method for Your Needs

Rounding numbers is a common practice that helps us simplify and make sense of numerical values. However, have you ever considered the potential bias introduced by traditional rounding methods? In this blog post, we will explore the concept of rounding bias and delve into Python 3's approach, known as “round half to even” or “banker's rounding,” which seeks to address this bias.

Rounding numbers is not as straightforward as it may seem. The traditional approach most of us are familiar with is “round half up”: any fractional part greater than or equal to 0.5 is rounded up to the next whole number, and anything smaller is rounded down. While this method may seem intuitive, values that fall exactly halfway are always pushed upward, which introduces a bias towards higher values and can lead to inaccuracies and skewed results in certain scenarios.

Consider a situation where you are tasked with calculating the average income of a group of individuals. If you always round up when dealing with decimal values, you may inadvertently inflate the average income, giving a distorted representation of the actual data. Rounding bias can have far-reaching consequences in various applications, from financial forecasting to scientific research.

Rounding numbers in computing

Another important application of rounding arises from the fact that computers cannot precisely represent all real numbers. We consider computers to be incredibly powerful and more capable of difficult calculations than humans, yet we often forget their limitations. Most computers use a finite representation system called floating-point arithmetic, which can only approximate real numbers within a limited range and precision. Humans, by contrast, can comprehend and reason about real numbers in a more abstract and intuitive manner: we can perform mental calculations, estimate values, and compare real numbers based on our innate sense of quantity. Floating-point arithmetic represents a real number approximately as an integer with a fixed precision, called the significand, scaled by an integer power of a fixed base (usually 2). This format allows a wide range of numbers to be represented, but only with limited precision.

Here is an example: let's consider the number π (pi). It is an irrational number, which means its decimal expansion never terminates or repeats and it cannot be expressed as a fraction of two integers. Computers use a finite number of bits to store and manipulate numbers, so they can only represent an approximation of π. Rounding or truncation is necessary to limit the number of decimal places or bits used to store that approximation, and this introduces some degree of error. The number of decimal places or bits used to represent π determines the level of accuracy of the approximation.

Ok, let's illustrate with the binary single-precision (32-bit) floating-point format, where p = 24, indicating that the significand consists of a sequence of 24 bits; you can think of p = 24 as saying: I want the number with a `precision` of 24 bits. To see what has to be rounded away, look at the first 35 bits of the number π (2 bits for the integer part, followed by 33 fractional bits): 11001001000011111101101010100010001

You can get this output in Python by converting the decimal representation of π into a binary string:

# Python
        import math

        pi = math.pi
        print(pi) 
        > 3.141592653589793

        binary_pi = format(int(pi * (2**33)), '033b')
        print(binary_pi)
        > 11001001000011111101101010100010001
                        

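You multiply π by 2 raised to the power of 33 to scale it up, and int() then truncates the result to an integer that keeps 33 fractional bits of π. Then, using the format() function, we convert the integer into a binary string. The 033b format specifier sets a minimum width of 33 digits, padded with leading zeros if needed; since π also has two bits before the binary point, the output is 35 bits long.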
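To connect this bit string with p = 24, here is a minimal sketch that lets Python round math.pi to single precision and prints the resulting 24-bit significand. It uses the standard struct module; the variable names are only illustrative.

```python
import math
import struct

# Pack math.pi as a 32-bit IEEE 754 float, then read the raw bits back as an integer
bits = struct.unpack(">I", struct.pack(">f", math.pi))[0]

fraction = bits & 0x7FFFFF           # the 23 fraction bits that are actually stored
significand = (1 << 23) | fraction   # restore the implicit leading 1 -> 24 bits

significand_bits = format(significand, "024b")
print(significand_bits)
# > 110010010000111111011011
```

The first 24 bits of π end in ...11011010, but the stored significand ends in ...11011011: the discarded tail (10100010001...) is more than half a unit in the last place, so the value is rounded up.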

Let's summarise everything so far. Due to the finite number of bits available, many real numbers cannot be represented exactly in binary floating-point format. This includes irrational numbers such as π, but also fractions whose binary expansion never terminates, such as 1/3 (0.3333…) and even 0.1. The computer can only store an approximation of these numbers. As a result, computations involving real numbers can suffer from round-off errors and precision limitations. It's important to be aware of these limitations when performing numerical calculations and to consider appropriate techniques, such as using higher precision libraries or implementing error analysis and mitigation strategies, when necessary. In the following sections, we will explore an alternative approach to the most common rounding method.
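The round-off errors described above are easy to reproduce. The sketch below uses only the standard library (math, decimal, fractions) and shows both the problem and two possible ways around it; which one is appropriate depends on the application.

```python
import math
from decimal import Decimal
from fractions import Fraction

# 0.1, 0.2 and 0.3 have no exact binary representation, so tiny errors appear
float_equal = (0.1 + 0.2 == 0.3)
print(float_equal)                    # False
print(math.isclose(0.1 + 0.2, 0.3))   # True: compare with a tolerance instead

# Decimal(float) reveals the value that is actually stored for the literal 0.1
print(Decimal(0.1))

# Exact alternatives: decimal arithmetic on strings, or rational numbers
decimal_equal = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
fraction_equal = (Fraction(1, 3) * 3 == 1)
print(decimal_equal)                  # True
print(fraction_equal)                 # True
```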

Round Half to Even (Banker's Rounding)

“Round half to even”, also called “banker’s rounding”, is an approach that aims to mitigate the bias introduced by always rounding ties up, providing a fairer and more balanced representation of the data. It is the method used by Python 3’s built-in round() function, and it is also the default rounding mode of the IEEE 754 floating-point standard, so it is long established rather than specific to Python. So let’s see the rationale behind this method, its advantages, and how it handles cases where the number to be rounded falls exactly between two integers.

The rationale behind “round half to even” is to minimize the accumulated rounding error and maintain fairness in rounding. Values are rounded to the nearest integer as usual; only when the fractional part is exactly 0.5, so that both neighbours are equally close, does the rule pick the even neighbour. Because even and odd integers alternate, roughly half of these ties are rounded up and the other half are rounded down, so the errors tend to cancel out instead of accumulating in one direction. This helps maintain statistical fairness and minimizes the bias introduced by always rounding ties up.

To better understand Python 3’s rounding behaviour, let’s consider some code examples. In traditional rounding, both 3.5 and 4.5 would be rounded up, to 4 and 5 respectively. In Python 3’s “round half to even” method, 3.5 is still rounded up to 4, because 4 is its even neighbour, but 4.5 is rounded down to 4, since 4 and 5 are equidistant from it and the rule favours the even number. Here’s an example demonstrating Python 3’s rounding method:

# Python
        # Rounding using Python 3's "round half to even" method
        num1 = 3.5
        num2 = 4.5

        rounded1 = round(num1)
        rounded2 = round(num2)

        print(rounded1)  # Output: 4
        print(rounded2)  # Output: 4

        > 4
        > 4
                        

In this example, 3.5 is rounded up to 4 and 4.5 is rounded down to 4 according to Python 3’s rounding rules. Because ties alternate between rounding up and rounding down, the rounding errors tend to cancel out, which reduces bias. Now, let’s compare this with the traditional rounding method:

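# Python
        # Traditional rounding ("round half up")
        import math

        num1 = 3.5
        num2 = 4.5

        # floor(x + 0.5) rounds ties up; math.ceil alone would also turn 3.1 into 4
        rounded1 = math.floor(num1 + 0.5)
        rounded2 = math.floor(num2 + 0.5)

        print(rounded1)  # Output: 4
        print(rounded2)  # Output: 5

        > 4
        > 5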

In the traditional rounding approach, the number 3.5 is rounded up to 4, while 4.5 is rounded up to 5. This highlights the bias introduced by always rounding up and the difference in behaviour compared to Python 3’s “round half to even” method. By adopting Python 3’s rounding approach, we can mitigate bias, maintain statistical fairness, and achieve more accurate results in rounding calculations.
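When the rounding rule has to be chosen explicitly, for example when rounding to two decimal places, the standard decimal module is one option. The sketch below is illustrative only: round_decimal is a hypothetical helper, not part of any library. Note that decimal's ROUND_HALF_UP rounds ties away from zero, which matches “round half up” for positive numbers.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_decimal(text, mode, places="0.01"):
    """Round a decimal string to the given precision using the chosen mode."""
    return Decimal(text).quantize(Decimal(places), rounding=mode)

# 2.665 lies exactly between 2.66 and 2.67
print(round_decimal("2.665", ROUND_HALF_EVEN))  # 2.66 (6 is even)
print(round_decimal("2.665", ROUND_HALF_UP))    # 2.67

# 2.675 lies exactly between 2.67 and 2.68
print(round_decimal("2.675", ROUND_HALF_EVEN))  # 2.68 (8 is even)
print(round_decimal("2.675", ROUND_HALF_UP))    # 2.68
```

Passing the values as strings keeps them exact; a float literal such as 2.675 is already an approximation before any rounding rule is applied.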

Banker’s rounding plays a crucial role in statistical calculations and data aggregation. When dealing with large datasets, rounding bias can have a significant impact on the summary statistics and overall analysis. By employing banker’s rounding, we can ensure fairness in aggregating data and avoid skewed results. For instance, when calculating the mean or average of a dataset, banker’s rounding minimizes the distortion caused by rounding bias, resulting in a more representative and reliable estimate. However, it is important to note that banker’s rounding may not be suitable for every situation. In certain contexts, alternative rounding methods may be more appropriate. For example, when dealing with financial transactions or pricing, rounding methods that always round up may align better with industry standards or legal requirements. Similarly, specific analytical techniques or mathematical models may necessitate the use of alternative rounding methods tailored to their unique requirements.
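The effect on a mean can be checked with a small experiment. The data below is hypothetical and deliberately extreme: every value ends in exactly .5, so every value is a tie. Real datasets contain far fewer ties, so the bias there would be smaller.

```python
import math
from statistics import mean

# Hypothetical data: 0.5, 1.5, 2.5, ..., 999.5 (every value is a tie)
values = [k + 0.5 for k in range(1000)]

true_mean = mean(values)
half_up_mean = mean(math.floor(v + 0.5) for v in values)   # round half up
half_even_mean = mean(round(v) for v in values)            # round half to even

print(half_up_mean - true_mean)    # 0.5: every tie was pushed upward
print(half_even_mean - true_mean)  # 0.0: ups and downs cancel out
```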

The “Round Randomly” Method

In addition to banker’s rounding, another alternative rounding method worth considering is the “round randomly” approach. As the name suggests, this method involves rounding numbers randomly when they fall exactly between two integers.

  • Pros
    • Balanced Distribution: Similar to banker’s rounding, the “round randomly” method achieves a balanced distribution of rounded values. It avoids the systematic bias introduced by always rounding up and provides a fair representation of the data.
    • Statistical Neutrality: By rounding numbers randomly, this method does not favour rounding up or down, ensuring statistical neutrality in calculations and data analysis.
    • Minimized Impact of Bias: Rounding numbers randomly can help mitigate the potential impact of rounding bias and its consequences on various calculations.
  • Cons
    • Lack of Determinism: The random nature of this method means that the rounding results will vary each time the calculation is performed. This lack of determinism can make it challenging to replicate and verify results consistently.
    • Unpredictability: Rounding numbers randomly can introduce unpredictability into calculations, making it harder to anticipate and understand the rounding outcomes.
    • Potential Loss of Precision: In some cases, rounding numbers randomly may lead to a loss of precision compared to other rounding methods. This can affect the accuracy of calculations, particularly when high precision is required.

In Python, we can implement the “round randomly” method by utilizing the random module. The random.uniform(a,b) function generates a random floating-point number between a and b. We can round our original number by adding a random offset within the range [-0.5, 0.5] to introduce randomness. Here’s an example of implementing the “round randomly” method in Python:

# Python
        import random

        def round_randomly(number):
            return round(number + random.uniform(-0.5, 0.5))

        # Example usage
        num = 3.5
        rounded = round_randomly(num)

        print(rounded)  # Output: Either 3 or 4 (randomly rounded)
                                        

Each time the function is called, it generates a random offset within the range [-0.5, 0.5] and adds it to the number before rounding, so 3.5 becomes either 3 or 4 with roughly equal probability. Note that this implementation randomizes more than exact ties: a value such as 3.2 is rounded up to 4 about 20% of the time, because the chance of rounding up equals the fractional part. By understanding the pros and cons of both “banker’s rounding” and the “round randomly” method, we can choose the most appropriate rounding approach for our specific needs, considering factors such as fairness, accuracy, and the nature of the data being rounded.
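If only exact ties should be decided at random, as in the description at the start of this section, one possible variant is sketched below. The name round_half_randomly and the optional rng parameter are assumptions made for illustration.

```python
import math
import random

def round_half_randomly(number, rng=random):
    """Round to the nearest integer; break exact .5 ties with a coin flip."""
    lower = math.floor(number)
    if number - lower == 0.5:
        # Exactly halfway: pick the lower or upper neighbour at random
        return lower + rng.choice((0, 1))
    # Not a tie, so the usual nearest-integer rounding applies
    return round(number)

print(round_half_randomly(3.2))  # always 3
print(round_half_randomly(3.7))  # always 4
print(round_half_randomly(3.5))  # either 3 or 4
```

Passing a seeded random.Random instance as rng makes the results reproducible, which softens the lack of determinism listed among the cons.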

List of Python functions for rounding numbers

  1. round(x, ndigits=None): Rounds the value of x to the specified number of decimal places (ndigits) using round half to even. If ndigits is omitted or None, it rounds to the nearest integer and returns an int; otherwise the result has the same type as x.
  2. math.ceil(x): Rounds the value of x up to the nearest integer greater than or equal to x.
  3. math.floor(x): Rounds the value of x down to the nearest integer less than or equal to x.
  4. math.trunc(x): Truncates (cuts off) the decimal places from the value of x, returning the integer part without rounding.
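The differences between these four functions show up most clearly with ties and negative numbers. A short comparison, with sample values chosen purely for illustration:

```python
import math

values = (2.5, -2.5, 2.7, -2.7)

# Columns: x, round(x), math.ceil(x), math.floor(x), math.trunc(x)
table = [(x, round(x), math.ceil(x), math.floor(x), math.trunc(x)) for x in values]
for row in table:
    print(*row)
# > 2.5 2 3 2 2
# > -2.5 -2 -2 -3 -2
# > 2.7 3 3 2 2
# > -2.7 -3 -2 -3 -2
```

Note that math.floor and math.trunc agree for positive numbers but differ for negative ones, since math.trunc always moves towards zero.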

Conclusion

Rounding numbers is a common practice that simplifies numerical values, but the traditional “round half up” method always rounds ties up, which introduces a bias towards higher values. Rounding is also a common task in many programming applications, and understanding the rounding methods available in Python can lead to more accurate and fair calculations. Python 3’s “round half to even” method, also known as banker’s rounding, rounds roughly half of the ties up and half down. This minimizes bias when numbers fall exactly between two integers. Banker’s rounding is particularly valuable in scenarios where precision and fairness are important, such as financial calculations, scientific experiments, or surveys. However, it’s essential to consider alternative rounding methods based on specific needs and contexts. The “round randomly” method offers a balanced distribution and statistical neutrality by rounding numbers randomly when they fall between two integers. While it mitigates bias, it introduces unpredictability and lacks determinism. The choice of rounding method should consider factors such as fairness, accuracy, the nature of the data being rounded, and industry standards or legal requirements.

• • •

📰 Subscribe for more posts like this: Medium | Clemens Jarnach ⚡️