The C programming language is a gift that keeps giving. Today, we are going to see how a seemingly banal and common operation can hide unfathomable depths of unmentionable horrors.

## Woes with Integer Coercion

What is the problem of the code below?

```
// returns the closest integer to f
int float2int(float f) {
    return (int) f;
}
```

It’s written in C.

A Rust enthusiast

That’s… a good point.

But let’s assume we actually want to write C, for some forsaken reason. And we do not care whether we convert up, down or sideways.

There is still a fundamental issue with this code. What happens when we call `float2int` with `INFINITY`?

```
#include <math.h>
#include <stdio.h>

int float2int(float f) {
    return (int) f;
}

int main(void) {
    printf("%d\n", float2int(INFINITY));
    printf("%d\n", float2int(-INFINITY));
    printf("%d\n", float2int(NAN));
}
```

When we compile this code with `gcc` and run the result, we get:

```
$ gcc -O3 test.c && ./a.out
2147483647
-2147483648
0
```

Makes sense! The largest integer is `2147483647`; the smallest is `-2147483648`, and what would you choose for `NAN` but `0`?

Someone still young and innocent

Now, let’s just remove that `-O3` option:

```
$ gcc test.c && ./a.out
-2147483648
-2147483648
-2147483648
```

What?

The innocent who got their world shattered

Wait, it gets better:

```
$ clang -O3 test.c && ./a.out
1464089272
1488257696
1488257696
$ ./a.out
1259459480
-1806736736
-1806736736
$ ./a.out
-2098630344
1664811680
1664811680
```

But… why?

All hope is gone

Because it can.

Yup, that’s undefined behavior. Converting a non-finite value (i.e. an infinity or a NaN) to an integer is undefined behavior.

But that’s not all! What should the largest finite floating-point value (`3.4028234664e+38`) convert to? It’s much bigger than `INT_MAX`, the largest value that `int` can represent (it does not matter whether you are on a 32, 64 or 128 bit architecture, really).

Maybe we could just say that all floating-point numbers larger than `INT_MAX` should be converted to `INT_MAX`? But alas, that makes for unwelcome surprises down the line, when you realize that the converted value is not always within `1` of the original value.

Thankfully, the C standard always has the solution whenever there is a hint of ambiguity:

When a value of integer type is converted to a real floating type, if the value being converted can be represented exactly in the new type, it is unchanged. If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower representable value, chosen in an implementation-defined manner. If the value being converted is outside the range of values that can be represented, the behavior is undefined. Results of some implicit conversions may be represented in greater range and precision than that required by the new type (see 6.3.1.8 and 6.8.6.4).

Part 6.3.1.4 paragraph 2 of the C standard

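In other words, we’re just going to say that all the values that are not quite that convenient trigger undefined behavior. Strictly speaking, the paragraph above covers the integer-to-floating direction; the floating-to-integer direction is governed by paragraph 1 of the same section, which says that the fractional part is discarded and that the behavior is undefined if the resulting integral part cannot be represented by the integer type. The values that `int` can represent form the interval `[INT_MIN, INT_MAX]`. Attempting to convert any `float` value whose integral part lies outside this interval to the `int` type is undefined behavior.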

We’ll just have to check for that!

Someone has not learned their lesson

## Woes with Range Checking

We are just getting to the actually tricky part. Let’s have a look at a seemingly robust implementation:

```
#include <limits.h>
#include <math.h>

int float2int(float f) {
    if (isnan(f)) {
        return 0;
    }
    if (f < INT_MIN) { // also filters out -INFINITY
        return INT_MIN;
    }
    if (f > INT_MAX) { // also filters out +INFINITY
        return INT_MAX;
    }
    return (int) f;
}
```

For the sake of simplicity, we just return arbitrary results for the inputs that we cannot convert. Other implementations might choose to return a status indicating whether the conversion is possible or not.

But, of course, there is a bug lurking in there. And it is extremely easy to miss.

And, this time, the root cause is not undefined behavior itself, but another surprising “feature” of C: implicit type conversions.

To understand what I mean by that, we need to look at what the conditions in the code above really mean. The first one is fine, so let’s look at the other two: `f < INT_MIN` and `f > INT_MAX`. In both, the left operand is a `float`, and the right operand is an `int`. Processors rarely have built-in operations for such mixed comparisons, so a conversion must be taking place. The way it happens is described in the “usual arithmetic conversions”:

[…] if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float. […]

Part 6.3.1.8 paragraph 1 of the C standard

Thankfully, we only need the part I have quoted here. In short, both operands get converted to `float`. Let’s look at our conditions again.

```
if (f < (float) INT_MIN) { // also filters -INFINITY
    return INT_MIN;
}
if (f > (float) INT_MAX) { // also filters +INFINITY
    return INT_MAX;
}
```

Let’s look at the first condition. What is the value of `(float) INT_MIN`? Let’s assume a 32-bit two’s complement `int` type. Then `INT_MIN` is -2³¹. A `float` can represent this value exactly. You can check it with an online converter. So `(float) INT_MIN` is -2³¹ as well. No problem here: this conversion does not change the value, and this code does exactly what we want.

With the same assumption, `INT_MAX` is 2³¹ – 1. And there comes the catch: a float *cannot* represent 2³¹ – 1 exactly. If you put “2147483647” in the online converter, you will see that it is rounded to a float whose value is actually “2147483648”. It makes sense: a float gives up the ability to represent every integer in [-2³¹, 2³¹ – 1] in exchange for covering a much wider range of magnitudes. In other words, the actual value of `(float) INT_MAX` is `INT_MAX + 1`. The condition is actually doing:

```
if (f > INT_MAX + 1) { // also filters +INFINITY
    return INT_MAX;
}
```

**Note:** if this were real C, this code would contain undefined behavior because of the signed integer overflow that happens when evaluating `INT_MAX + 1`. This is just pseudocode for illustration purposes.
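The rounding can also be observed without an online converter. Here is a minimal sketch, assuming a 32-bit `int`, an IEEE 754 single-precision `float` with round-to-nearest, and a compiler that evaluates `float` comparisons in `float` precision (as typical x86-64 builds do). The function names are made up for this illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* The exact value of INT_MAX after conversion to float, widened to double
   so that it can be printed or compared without further rounding. */
double int_max_as_float(void) {
    return (double) (float) INT_MAX;
}

/* The upper-bound test from the "robust" version: true if f is rejected. */
bool rejected_by_naive_check(float f) {
    return f > INT_MAX; /* INT_MAX is implicitly converted to float here */
}
```

Under these assumptions, `int_max_as_float()` returns 2147483648.0, not 2147483647.0, and `rejected_by_naive_check(2147483648.0f)` returns `false`: the value 2³¹ is not caught by the check and goes on to reach the cast.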

Since the upper bound is offset by one, the code fails to filter out the half-open interval `(INT_MAX, INT_MAX + 1]`. If the input parameter `f` were to take its value from this range, converting it to `int` would be undefined behavior. Now, is there any `float` value in this interval? Yes! `INT_MAX + 1`, of course (it’s the only one).

## Conclusion

The “robust” implementation does not handle 2³¹ properly. This is not just a theoretical oddity. I encountered this issue in critical code in production while working at TrustInSoft. Such bugs will easily stay under the radar and avoid even the most intensive test campaigns. Only formal methods help in detecting and avoiding such errors.

The worst thing is that, since this is undefined behavior, you might encounter it many times and never know, while it corrupts data in unknown ways. Things that could have been a runtime error become the entry door for silent bugs and security issues.

C likes to keep you on your toes.