Don’t use += loops on Strings for duke’s sake…!
It’s always kinda shocking to see that such easy things, which I’ve always thought to be “common knowledge”, are not that common among some students… One such thing is using += on strings in Java… C’mon, everybody knows that + and += are horribly slow when you build a long string piece by piece. Yeah, I know that when used outside of loops, the javac compiler is able to optimize it, but most of the time I see people using it inside loops, to print some collection etc. And then they come to me complaining about “how slow Java is”. Oh the horror, the madness…
Anyway, this post will illustrate, very painfully, how much slower += is… Let’s do some (idiotic, yet good enough for this case) tests. First, the version everybody who _knows_ Java uses:
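public class BuilderTest {
    public static void main(String[] args) {
        // One mutable buffer, appended to in place
        StringBuilder s = new StringBuilder();
        for (int i = 0; i < 99999; i++) {
            s.append(i);
        }
    }
}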
Let’s run it…
[ktoso@homunculus ~]$ time `java BuilderTest`
real 0m0.209s
user 0m0.137s
sys 0m0.039s
Hmm… Quite all right I guess… And now back to the monstrous version some of my colleagues keep writing:
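public class StringTest {
    public static void main(String[] args) {
        String s = new String();
        for (int i = 0; i < 99999; i++) {
            // Each += builds a brand new String
            s += i;
        }
    }
}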
Compile and run…
[ktoso@homunculus ~]$ time `java StringTest`
real 2m14.694s
user 2m10.379s
sys 0m2.254s
I even managed to make myself some coffee before it finished… Why is that? For any real Java programmer it’s obvious, but for some students it’s not… String is immutable. Every “string modification” creates a new String and copies all the existing characters into it, so with all the fuss of object creation/allocation, the loop does more and more work the longer the string gets. It’s a “feature” of the language, so how does Java fight this kind of problem? Yup, StringBuilder is mutable, thus it is really supreme when it comes down to modifying strings.
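If you want to see the difference on your own machine, here is a minimal sketch of both loops side by side. The class name ConcatComparison, the helper method names and the loop size of 20000 are my assumptions for illustration, not part of the tests above, and the timing is a crude System.nanoTime measurement rather than a proper benchmark.

```java
public class ConcatComparison {

    // Builds "0123...(n-1)" using a single mutable StringBuilder.
    public static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    // Builds the same text with +=; every iteration allocates a new String
    // and copies everything accumulated so far.
    public static String withPlusEquals(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    public static void main(String[] args) {
        int n = 20_000; // assumed size, smaller than the 99999 used in the post

        long t0 = System.nanoTime();
        String a = withBuilder(n);
        long t1 = System.nanoTime();
        String b = withPlusEquals(n);
        long t2 = System.nanoTime();

        System.out.println("same result: " + a.equals(b));
        System.out.println("StringBuilder: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("+= on String:  " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

The absolute numbers will depend on your JVM and hardware, so treat them as a rough indication only. What should hold is that both methods return exactly the same string.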
So, next time before you come to me panicking about java/speed or some other nonsense, please RTFM first ;-)
Update: Let’s look into the generated bytecode!
Yeah, why not? Since we’re talking about a really important feature here, let’s get more into it… You can use javap (with the -c option) to display the bytecode from a class file. I’ve checked both examples above with it and the main difference is, of course, in the lines used for appending those strings… First the StringBuilder version:
3: dup
4: invokespecial #3; //Method java/lang/StringBuilder."<init>":()V
7: astore_1
8: iconst_0
9: istore_2
10: iload_2
11: ldc #4; //int 99999
13: if_icmpge 28
16: aload_1
17: iload_2
18: invokevirtual #5; //Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
21: pop
22: iinc 2, 1
25: goto 10
28: return
Nothing weird here, right? We expected StringBuilder to be used and simply the append method to be called, quite “normal”. By the way, it’s quite interesting to see how Java actually does things inside :-) Ok, now for the String += version:
3: dup
4: invokespecial #3; //Method java/lang/String."<init>":()V
7: astore_1
8: iconst_0
9: istore_2
10: iload_2
11: ldc #4; //int 99999
13: if_icmpge 41
16: new #5; //class java/lang/StringBuilder
19: dup
20: invokespecial #6; //Method java/lang/StringBuilder."<init>":()V
23: aload_1
24: invokevirtual #7; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
27: iload_2
28: invokevirtual #8; //Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
31: invokevirtual #9; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
34: astore_1
35: iinc 2, 1
38: goto 10
41: return
Oh! First thing you’d notice: the code now ends at offset 41 and not 28 as it did when we used StringBuilder directly… Let’s go on and examine the rest… What do we have here… StringBuilder?! That’s a little less expected, as I thought that the compiler wasn’t able to guess that we need a StringBuilder in such a situation. Well, it guessed right. Even though we use +=, StringBuilder’s append is being called… And then the result is converted back to a String… And each time through the loop there’s a new StringBuilder (offset 16).
To summarize: even though the javac compiler does use StringBuilder if it thinks that’s a good idea, it does it quite stupidly: inside a loop it creates a fresh StringBuilder and copies the whole string into it on every iteration. And in the end, we still end up with super slow code! Bottom line is that you should use StringBuilder “by hand” in your code. Note that even though you might use + in your code, javac will still generate this StringBuilder code :-) Sometimes better, sometimes worse, as shown in the above examples.
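To make the bytecode a bit easier to read, here is a rough source-level sketch of what offsets 16 to 34 of the += listing do on each pass through the loop. The class and method names are made up for illustration. This mirrors the javac output shown in this post; as far as I know, newer JDKs (9 and later) compile string concatenation differently, via invokedynamic, so your own javap output may not look like this.

```java
public class DesugaredConcat {

    // What the programmer writes.
    public static String plusEquals(int n) {
        String s = new String();
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // Hand-written equivalent of the bytecode listing above: a new
    // StringBuilder per iteration, the old string copied into it,
    // the int appended, and the result converted back to a String.
    public static String desugared(int n) {
        String s = new String();
        for (int i = 0; i < n; i++) {
            s = new StringBuilder().append(s).append(i).toString();
        }
        return s;
    }

    public static void main(String[] args) {
        // Both variants build the same text.
        System.out.println(plusEquals(10).equals(desugared(10))); // prints true
    }
}
```

Written out like this, the problem is hard to miss: the builder never survives an iteration, so nothing is gained from it being mutable.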
Hope this post was interesting even for the more advanced users out there, that’s what the second part was for :-) If you have any comments, feel free to post them. It really makes me happy to see people commenting on my stuff ;-)
PS: Hmmm… I wonder how GStrings would do in such a dumb test… Guess I’ll have to check someday.


May 4th, 2010 at 22:14
Today I was checking whether the compiler would optimize the second version of the code (somebody was telling me that it works that way in “new” Java). I also ran a test comparing StringBuffer to StringBuilder: the difference is also visible there, not as clear, but visible.