Monday, January 21, 2008

How slow is Java? Not as slow as you may think.

I was reading an article on the D programming language (more on that in an upcoming post). At the end of it, the guy claimed it was close in performance to C++. And he used this benchmarking site as evidence. It measures a number of different programming languages on a couple of x86 platforms at performing various algorithms. Most of the algorithms are pretty intense, so it's a good measure of the raw compute power of their runtimes.

Now any such measure is easy to dispute. Did the programmers have sufficient knowledge of the languages to build efficient implementations? The site does let you play with different weightings for the factors that may be important to you. But the results are interesting nonetheless.

So, a couple of things caught my eye with the default results. First, C++ is faster than C. I can see that happening if you use C++ inlining a lot, which is not always easy to do in C. But it is only slightly faster, so I'd call it a draw.

Second, if you preallocate 64MB of heap (-Xms), which we often do, Java is only 1.7 times slower than C++. I think that is a very important result. We often wondered if the CDT's parsers were slow because they were written in Java. The IBM J9 guys said that was crazy talk, and these numbers somewhat support their point. Well-written Java that really benefits from JIT should be less than 2 times slower than C++. We were looking for a 10 times performance improvement and would probably have been disappointed if we had rewritten everything in C++ (and I mean disappointed in the career-limiting aspects of that decision ;).
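
To make the -Xms and JIT points a bit more concrete, here is a minimal sketch in Java. It is not taken from the benchmarking site: the class name WarmupBench and the work() loop are made up for illustration, and the loop is only a stand-in for a compute-heavy algorithm. It prints the heap the JVM currently has reserved, then times the same loop once cold and once after a warm-up, so the JIT has had a chance to compile it.

```java
public class WarmupBench {
    // Hypothetical workload: sum of squares modulo a prime.
    // It stands in for a compute-heavy benchmark kernel.
    static long work(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc = (acc + (long) i * i) % 1_000_000_007L;
        }
        return acc;
    }

    // Times a single call to work(n) and returns elapsed nanoseconds.
    static long timeOnce(int n) {
        long start = System.nanoTime();
        long result = work(n);
        long elapsed = System.nanoTime() - start;
        // Use the result so the call cannot be optimized away entirely.
        if (result < 0) {
            throw new IllegalStateException("unexpected negative result");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is the heap currently reserved (affected by -Xms),
        // maxMemory() is the ceiling the heap may grow to (affected by -Xmx).
        System.out.printf("heap reserved: %d MB, heap max: %d MB%n",
                rt.totalMemory() >> 20, rt.maxMemory() >> 20);

        int n = 5_000_000; // assumed problem size, chosen arbitrarily

        long cold = timeOnce(n);
        for (int i = 0; i < 20; i++) {
            timeOnce(n); // warm-up runs give the JIT time to compile work()
        }
        long warm = timeOnce(n);

        System.out.printf("cold run: %d us, warmed-up run: %d us%n",
                cold / 1000, warm / 1000);
    }
}
```

Run it as `java -Xms64m WarmupBench` to start with a 64MB heap. The timings depend heavily on the machine and the VM, so treat them as a rough indication rather than a benchmark result.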

I don't know whether to fully trust the numbers on this site, but it does reaffirm my belief that Java isn't that much slower than C++. I still don't like Java, though. Show me something as powerful as the Standard Template Library in Java and I might change my mind. Or the 'foreach' from D...

14 comments:

  1. One of the things I like the most about D is that, unlike C++, it's designed to be easy to implement while keeping all the best features of C++. Considering how ridiculously difficult it is to do good C++ tooling, I think they are totally correct in this approach. And they killed the preprocessor, which is the best part of the entire language design.

    D is C++ without the insanity.

    I wonder if the D community would be interested in giving the CDT ILanguage extension point a whirl. Doug, do you think D would be a good fit for another language under the CDT umbrella?

  2. Absolutely. In fact when I first heard of D, I suggested CDT support and someone in the D community thought I was already working on it :).

    Anyway, more thoughts on D in an upcoming entry. It is very interesting...

  3. Doug,

    Have you looked at Scala at all? It might be a bit functional heavy for you, but it does have foreach along with map and other such features. Plus, it's JVM based as well.

    I've got a write-up on my blog (under the 'scala' label) for those that are interested.

    http://alblue.blogspot.com/feeds/posts/default/-/scala

  4. Doug,

    This is kind of a trick question: If you need Java developers, based on your previous post, but you don't like Java, would you like me or what I do if I became one of your Java developers? Just teasing. This is a good posting...

  5. Benchmarking Java is not that simple. The garbage collector may kick in at any time and can affect performance in nondeterministic ways. If your program has complex graphs of throw-away objects (abstract syntax trees?), you can't get a reliable benchmark easily.

    In C/C++, memory allocation times are much more stable.

  6. I am a Java developer, does that mean I hate myself ;). It is an irony I live every day...

    As for memory allocation times, I have proof that there are cases where Java memory allocation is faster than C++, namely for small objects. Good Java VMs have a small object heap that is very fast to allocate from and has very fast GC. It's the big objects, like Strings, that get you.

  7. I am not teasing: this is a bad posting.

    The post's title is patronizing and judgmental in asking "How slow is Java?"

    The title implies: "Well, quite okay, but still sucks in comparison to C++."
    And surprise... that's exactly what you want to force-feed the reader in your post.

    The benchmarks at:
    http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&lang=all

    are about the most unfair comparison I have seen so far. Java especially should do much better in a fair comparison. Even -Xms64M, which you mention on the plus side, is in truth a performance killer.

    I agree, all benchmarks are somehow biased and unfair. Still, if I were asked to devise a benchmark to be most unfair to Java, I would probably end up with exactly what they use.

    Read all of: http://www.javalobby.org/java/forums/m92082440.html

    and you will see what I am talking about. These benchmarks are not just a little unfair but unfair by design!!!

  8. It's good to see that people are passionate about their Java performance. Of course blog titles are meant to raise an eyebrow; it helps encourage readers to read the darned thing. It seems to have worked perfectly I would say!

    I spent quite a bit of time a while back dealing with folks who were asserting that JAXB was faster than EMF. It turned out to be very much a case of benchmarks designed to reach the desired conclusion. Not surprisingly, indexing into an array-based accessor's result is faster than indexing into a list. If you throw in the fact that the array is typed so you can skip the cast, not to mention the fact that the list implementation you're comparing it against, array list, is a wrapper around an array, the finding is not a big shocker. But it isn't actually a comparison between JAXB and EMF. Change the benchmark to one that needs to add and remove items intensively and the results are entirely reversed. So the question becomes how a "fair" benchmark would balance the need for speedy indexing versus speedy adds and removes.

    Reliable performance statistics for Java are notoriously difficult to collect at the best of times. One little change in the overall application and suddenly some important method isn't inlined anymore for the most mysterious of reasons. Suddenly casting becomes much more expensive with the addition of just one derived class.

    I firmly believe in the old saying that there are lies, damned lies, and statistics. Of course without statistics we'd only have lies and damned lies...

    Doesn't it seem to everyone that English has gotten a bit old and tired? Perhaps Frenglermanese might be a lot more expressive? We could make the spelling phonetic and try to get by with fewer letters. Most importantly, we could allow it to be written left to right or right to left, depending on the reader's or writer's preference.

  9. All I know is that the new Lotus Notes 8 is now Eclipse-based and it's gone from the worst application I've ever used to the worst and slowest application I've ever used.

  10. Thanks for backing me up Ed :). I had thought I was pretty clear that I wasn't sure whether to trust the numbers.

    But as Mike points out, anecdotally, Java seems slower than C++ equivalents. And, especially in the embedded software industry, Java has a bad name because of that.

    And that was the point of my post. To show the embedded guys that it isn't "that" slow, and to show the rest that it is a concern for us.

  11. """
    It's good to see that people are passionate about their Java performance.
    """

    Well I am not too passionate about "my" Java and use C++ more often anyway at the moment. Still, compared to Java and D, C++ is just broken.

    Anyway, when citing http://shootout.alioth.debian.org it isn't enough to subtly raise the possibility that the data may or may not be wrong. This link needs a fat warning sign shouting "biased-by-design".

  12. Embedded software is different. You just can't do all those fancy optimizations on the fly. GC is slow as hell on those devices.

  13. """
    Embedded software is different. You just can't do all those fancy optimizations on the fly. GC is slow as hell on those devices.
    """

    Very true. Still, it all depends on the purpose for which you optimize your VM (not necessarily Java) and the GC process.

    Just have a look at Android, it is open source.

  14. I love reading posts about C++ vs. Java performance. It seems the Java guys are always trying to prove that Java is as fast as (or in some cases faster than) C/C++. What I've found 9 times out of 10 is that the benchmark code they use for C++ is less than optimal. When you modify the C++ code so that it is really equal (or close) to the Java, the results are vastly different.
