In the previous article, I showed you that "" + i is the fastest way to convert an int into a String in Java, all the way from Java 7 to Java 14.
Today you will learn what to consider in the opposite direction, i.e., when parsing a String to an int. You can find the source code for this article in my GitHub repository.
Parsing decimal numbers
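Let's first look at the options to parse a String into an int (or Integer). Up to Java 7, we have these two methods: Integer.parseInt(String), which returns an int primitive, and Integer.valueOf(String), which returns an Integer object.
The second method internally calls the first method and converts the result to an Integer object.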
The String to convert must contain only digits, optionally with a plus or minus sign in front. These are allowed:
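A few inputs that satisfy these rules, as a minimal sketch. The class name `AllowedInputs` and the helper `parseAll` are illustrative assumptions, not part of the article's repository.

```java
final class AllowedInputs {

    // Parses every input with Integer.parseInt() and returns the results in order.
    static int[] parseAll(String... inputs) {
        int[] results = new int[inputs.length];
        for (int k = 0; k < inputs.length; k++) {
            results[k] = Integer.parseInt(inputs[k]);
        }
        return results;
    }

    public static void main(String[] args) {
        // Digits only, optionally preceded by a single plus or minus sign.
        // Leading zeros are accepted as well.
        for (int value : parseAll("0", "42", "+42", "-42", "007")) {
            System.out.println(value);
        }
        // valueOf() accepts the same inputs but returns an Integer object.
        Integer boxed = Integer.valueOf("2147483647");
        System.out.println(boxed);
    }
}
```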
The following Strings are not allowed and result in NumberFormatExceptions:
Integer.parseInt("") // Empty string not allowed
Integer.parseInt(" 1") // Space not allowed
Integer.parseInt("3.14") // Decimal point not allowed
Integer.parseInt("1,000") // Thousands separator not allowed
Parsing hexadecimal and binary numbers
The above methods parse decimal numbers. To parse other number systems, the following overloaded methods are available: Integer.parseInt(String s, int radix) and Integer.valueOf(String s, int radix).
radix specifies the base of the number system. A hexadecimal number can be parsed as follows:
And a binary number like this:
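As a minimal sketch of both cases (the class and helper names are illustrative assumptions):

```java
final class RadixExamples {

    // Parses a hexadecimal string (radix 16); upper and lower case are both accepted.
    static int fromHex(String s) {
        return Integer.parseInt(s, 16);
    }

    // Parses a binary string (radix 2).
    static int fromBinary(String s) {
        return Integer.parseInt(s, 2);
    }

    public static void main(String[] args) {
        System.out.println(fromHex("ff"));        // 255
        System.out.println(fromHex("-7F"));       // -127
        System.out.println(fromBinary("101010")); // 42
        // valueOf() has the same radix overload and returns an Integer object.
        System.out.println(Integer.valueOf("zz", 36)); // 1295
    }
}
```

Note that a prefix such as "0x" is not part of the number for these methods: `Integer.parseInt("0xff", 16)` throws a NumberFormatException because 'x' is not a valid digit in radix 16.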
Signed vs. unsigned ints
In all the above cases, the number to be parsed must be within the range Integer.MIN_VALUE (= -2^31 = -2,147,483,648) to Integer.MAX_VALUE (= 2^31 - 1 = 2,147,483,647).
It becomes interesting (not to say: confusing) if, for example, we convert the valid int value 0xCAFEBABE into a hex String and then back into an int:
int hex = 0xCAFEBABE;
String s = Integer.toHexString(hex);
int i = Integer.parseInt(s, 16);
This attempt results in the following error:
Exception in thread "main" java.lang.NumberFormatException: For input string: "cafebabe"
Why is that?
First of all: the String s contains "cafebabe" as expected. Why can't this String be converted back to an int?
The reason is that the parseInt() method assumes the given number to be positive unless a minus sign precedes it. If you convert "cafebabe" to the decimal system, you get 3,405,691,582. This number is higher than Integer.MAX_VALUE and therefore cannot be represented as an int.
Then why can we assign the number to the int variable hex? The binary representation of the numbers plays an (intended) trick on us. 0xCAFEBABE corresponds to binary 11001010,11111110,10111010,10111110, a 32-digit binary number with the first bit being 1. In an int, which is always signed in Java, the first bit stands for the sign. If it is 1, the number is negative (for details on negative numbers, see this Wikipedia article). Let's add some debug output to the code above:
int hex = 0xCAFEBABE;
System.out.println("hex = " + hex);
System.out.println("hex binary = " + Integer.toBinaryString(hex));
String s = Integer.toHexString(hex);
System.out.println("s = " + s);
int i = Integer.parseInt(s, 16);
System.out.println("i = " + i);
We see that hex contains the negative value -889,275,714 (I've inserted the thousands separators here for the sake of clarity). In hexadecimal, this negative number is represented as the positive value "cafebabe", which in turn cannot be converted back by the parseInt() method.
To make this work, after all, the language creators added the following methods to Java 8:
int i = Integer.parseUnsignedInt(s);
int i = Integer.parseUnsignedInt(s, radix);
These methods allow us to parse numbers in the range 0 to 4,294,967,295 (= 0xffffffff hexadecimal or 32 ones in the binary system). In Java 8, we can adjust the penultimate line of the above example as follows:
int i = Integer.parseUnsignedInt(s, 16);
The output is not 3,405,691,582. Because a Java int is always signed, we see -889,275,714 instead, which is the same value we get when we assign 0xCAFEBABE to an int.
And how do we get to 3,405,691,582? For that, we have to parse "cafebabe" (or "CAFEBABE"; case does not matter) into a long:
long l = Long.parseLong(s, 16);
Finally, what does 3,405,691,582 look like in binary and hexadecimal notation?
System.out.println("l binary = " + Long.toBinaryString(l));
System.out.println("l hex = " + Long.toHexString(l));
Again, we get the same representations as for the int value -889,275,714, i.e. 11001010,11111110,10111010,10111110 and "cafebabe". The same binary or hexadecimal number thus leads – depending on whether it is stored in an int or in a long – to a different decimal number (if it is larger than
Integer.MAX_VALUE). In the following section, we'll take a look at some more examples.
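The whole round trip can be put together in one small program. This is a minimal sketch; the class and helper names are illustrative assumptions. It also uses Integer.toUnsignedLong() and Integer.toUnsignedString(), two further Java 8 methods not discussed above, which recover the unsigned value directly from the int.

```java
final class UnsignedRoundTrip {

    // Converts an int to its hex string and back, treating the 32 bits as unsigned.
    static int roundTrip(int value) {
        String s = Integer.toHexString(value);
        return Integer.parseUnsignedInt(s, 16);
    }

    // Interprets the 32 bits of an int as an unsigned number, returned as a long.
    static long unsignedValue(int value) {
        return Integer.toUnsignedLong(value);
    }

    public static void main(String[] args) {
        int hex = 0xCAFEBABE;
        String s = Integer.toHexString(hex);

        System.out.println("s              = " + s);                              // cafebabe
        System.out.println("as signed int  = " + roundTrip(hex));                 // -889275714
        System.out.println("as long        = " + Long.parseLong(s, 16));          // 3405691582
        System.out.println("unsigned value = " + unsignedValue(hex));             // 3405691582
        System.out.println("unsigned text  = " + Integer.toUnsignedString(hex));  // 3405691582
    }
}
```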
parseInt() vs. parseUnsignedInt()
To illustrate the difference between parseInt() and parseUnsignedInt() once again, I wrote a small program, which you can find here in my GitHub repository, and which parses different (threshold) values using both methods.
In the following table, you find the result summarized (the dashes stand for NumberFormatExceptions):
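| Input | Equivalent to | parseInt() | as hex | parseUnsignedInt() | as hex |
|---|---|---|---|---|---|
| -2147483649 | Integer.MIN_VALUE - 1 | — | — | — | — |
| 4294967295 | 2 * Integer.MAX_VALUE + 1 | — | — | -1 | ffffffff |
| 4294967296 | 2 * Integer.MAX_VALUE + 2 | — | — | — | — |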
The table shows:
- In the range from 0 to Integer.MAX_VALUE, parseInt() and parseUnsignedInt() return the same results.
- parseInt() also covers the range down to Integer.MIN_VALUE and returns exactly the value passed.
- parseUnsignedInt() covers the range up to 2 * Integer.MAX_VALUE + 1, with the result for inputs above Integer.MAX_VALUE always being a negative number. Its hexadecimal representation, converted to the decimal system, corresponds to the input value.
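These observations can be reproduced with a few threshold values. The following is a minimal sketch, not the program from the repository; the class and helper names are assumptions, and a plain hyphen marks a NumberFormatException.

```java
final class ParseComparison {

    // Returns the parsed value as text, or "-" if parseInt() rejects the input.
    static String tryParseInt(String s) {
        try {
            return String.valueOf(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return "-";
        }
    }

    // Returns the parsed value as text, or "-" if parseUnsignedInt() rejects the input.
    static String tryParseUnsignedInt(String s) {
        try {
            return String.valueOf(Integer.parseUnsignedInt(s));
        } catch (NumberFormatException e) {
            return "-";
        }
    }

    public static void main(String[] args) {
        String[] inputs = {
            "-2147483649", "-2147483648", "-1", "0",
            "2147483647", "2147483648", "4294967295", "4294967296"
        };
        System.out.printf("%-12s %-12s %-12s%n", "input", "parseInt", "parseUnsignedInt");
        for (String input : inputs) {
            System.out.printf("%-12s %-12s %-12s%n",
                    input, tryParseInt(input), tryParseUnsignedInt(input));
        }
    }
}
```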
Auto-boxing and -unboxing the result
We have seen above that there are separate methods to convert a String to an int primitive or an Integer object. But what happens if we use the wrong method?
Integer i = Integer.parseInt("42");
int i = Integer.valueOf("555");
The first case is not particularly elegant but does not pose a problem either: Integer.parseInt() works internally with primitive values, and the result is, just as with Integer.valueOf(), eventually converted into an Integer object by auto-boxing.
The second case is different: here, the result is converted to an Integer object inside Integer.valueOf() and then back to an int primitive when it is assigned to i. IntelliJ recognizes this (Eclipse doesn't) and displays a warning with the recommendation to replace Integer.valueOf() with Integer.parseInt().
We will examine the extent to which the compiler or HotSpot forgives us for this error in the next chapter, "performance".
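For reference, the two pairings that avoid any extra conversion look like this. It is a minimal sketch; the class and method names are illustrative assumptions.

```java
final class BoxingExamples {

    // int target: parseInt() returns the primitive directly, no boxing involved.
    static int toPrimitive(String s) {
        return Integer.parseInt(s);
    }

    // Integer target: valueOf() returns the object directly, no unboxing involved.
    static Integer toObject(String s) {
        return Integer.valueOf(s);
    }

    public static void main(String[] args) {
        int primitive = toPrimitive("42");
        Integer object = toObject("555");
        System.out.println(primitive);
        System.out.println(object);
    }
}
```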
Performance of the String-to-int conversion
Similar to the last article, I did the following comparison measurements with the Java Microbenchmark Harness (JMH):
- Speed of various String-to-int conversion methods with Java 8:
  - parseInt() with positive numbers, positive numbers with a preceding plus sign, and negative numbers,
  - parseUnsignedInt() with positive numbers and positive numbers with a preceding plus sign,
  - valueOf() with positive numbers,
  - parseInt() with subsequent conversion into an Integer object,
  - valueOf() with subsequent conversion into an int primitive,
- Comparison of the parseInt() method across all Java versions from Java 7 to Java 14.
Performance of various String-to-int conversion methods
You can find the source code of this test in my GitHub repository. The test results are in the results/ directory. The following table shows the performance of the various method calls using Java 8:
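| Method | Operations per second | Confidence interval (99.9%) |
|---|---|---|
| parseInt(), positive number | 25,157,289 | 24,959,166 – 25,355,412 |
| parseInt(), positive number with plus sign | 25,056,427 | 24,974,885 – 25,137,970 |
| parseInt(), negative number | 25,143,740 | 25,039,972 – 25,247,508 |
| parseUnsignedInt(), positive number | 25,124,027 | 25,060,833 – 25,187,221 |
| parseUnsignedInt(), positive number with plus sign | 25,015,082 | 24,914,320 – 25,115,843 |
| valueOf(), positive number | 24,594,336 | 24,421,316 – 24,767,355 |
| parseInt() with subsequent boxing | 24,531,187 | 24,413,040 – 24,649,334 |
| valueOf() with subsequent unboxing | 24,325,347 | 24,183,155 – 24,467,538 |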
As you can see, the first five measurements are almost identical. This can be explained quickly: the executed code is the same in all cases.
valueOf() and parseInt() with subsequent boxing are about 2% slower. This should correspond to the overhead for converting into an Integer object.
valueOf() with subsequent unboxing is about another 1% slower, which means that neither the compiler nor HotSpot has forgiven us for the "boxing with subsequent unboxing" error.
In theory, parsing negative numbers should be slightly faster: internally, the digits are accumulated as a negative number, and for a positive input, the result is negated at the end. However, the benchmarks show no difference. The negation is apparently so cheap that it is of no significance even at 25 million operations per second.
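The idea of accumulating the digits as a negative number can be sketched as follows. This is a simplified, hypothetical re-implementation for radix 10 only, not the JDK source; the class and method names are assumptions. Accumulating negatively has the advantage that Integer.MIN_VALUE, whose absolute value does not fit into an int, can be parsed without overflow.

```java
final class NegativeAccumulationSketch {

    // Simplified illustration (decimal only): digits are subtracted from the
    // running result, and the sign is flipped at the end for positive input.
    static int parseDecimal(String s) {
        if (s == null || s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        boolean negative = false;
        int index = 0;
        int limit = -Integer.MAX_VALUE;      // smallest allowed accumulated value

        char first = s.charAt(0);
        if (first == '-' || first == '+') {
            if (s.length() == 1) {
                throw new NumberFormatException("sign without digits: " + s);
            }
            if (first == '-') {
                negative = true;
                limit = Integer.MIN_VALUE;   // negative range is one larger
            }
            index = 1;
        }

        int multMin = limit / 10;            // below this, multiplying by 10 would overflow
        int result = 0;
        while (index < s.length()) {
            int digit = Character.digit(s.charAt(index++), 10);
            if (digit < 0 || result < multMin) {
                throw new NumberFormatException("not a valid int: " + s);
            }
            result *= 10;
            if (result < limit + digit) {
                throw new NumberFormatException("out of range: " + s);
            }
            result -= digit;                 // accumulate as a negative number
        }
        return negative ? result : -result;  // single negation for positive input
    }

    public static void main(String[] args) {
        System.out.println(parseDecimal("2147483647"));
        System.out.println(parseDecimal("-2147483648"));
    }
}
```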
Performance of String-to-int conversion across Java versions
Since, in the end, all variants of the String-to-int conversion call Integer.parseInt(), I have restricted myself to measuring the performance of calling this particular method across different Java versions. I used the same test class as for the previous test and commented out all methods except integerParsePositiveInt(). I compiled and ran the code with the respective Java versions. You will also find the results of these tests in the results/ folder. Here is a summary of the results:
| Java version | Operations per second | Confidence interval (99.9%) |
|---|---|---|
| Java 7 | 25,223,117 | 25,069,748 – 25,376,488 |
| Java 8 | 25,157,289 | 24,959,166 – 25,355,412 |
| Java 9 | 22,580,117 | 22,471,102 – 22,689,132 |
| Java 10 | 22,129,425 | 21,889,153 – 22,369,698 |
| Java 11 | 23,657,228 | 23,494,292 – 23,820,165 |
| Java 12 | 23,604,657 | 23,385,208 – 23,824,106 |
| Java 13 | 23,626,048 | 23,473,823 – 23,778,273 |
| Java 14 | 23,599,658 | 23,440,825 – 23,758,490 |
The Integer.parseInt() method became significantly slower in Java 9 (about 10%), lost another 2% in Java 10, and became faster again in Java 11, but has since stayed about 6% behind the performance of Java 7 and 8. To confirm this measurement result, I ran all benchmark tests (which are repeated 25 times anyway) again, with similar results.
In search of the cause, I first compared the source code of Integer.parseInt() in all Java versions. Versions 7 and 8 are identical. In Java 9, the code was slightly restructured, e.g., variables were declared elsewhere. The algorithm itself was not changed. The minimal code changes should not affect performance. From Java 9 to the Early Access Release of Java 14, there was no further change, except that in Java 12, the radix was included in the error message for non-parseable numbers.
To check whether the changes in Java 9 affected performance, I copied the Integer.parseInt() source code from Java 8 and 9 and tested these copies with JMH. Both ran at the same speed. (This test is not in the GitHub repository because I don't know to what extent I can publish Java source code.)
In another experiment, I compiled the Integer.parseInt() source code with Java 8 and ran the resulting class file with Java 9 to 14. This led to a similar result as the initial performance test, i.e., Java 9 and 10 were slower, and Java 11 was a bit faster again. The reason for the different speeds must, therefore, lie within the JVM. If any of you know the exact cause, I would be happy to receive an enlightening comment.
In this article, I have shown how to parse numbers in decimal and other number systems and what the difference is between parseInt() and parseUnsignedInt(). Be careful not to box unnecessarily from int to Integer, or vice versa, or, worst of all, both in a row.