Monday, June 8, 2009

IO Speedup, More Dramatic Still

Two days ago I talked about how the speedups in the io_rewiring branch were approaching 4x by some benchmarks. I posted an email to the list and got back a few replies suggesting the speedups were not quite that impressive. Specifically, partcl didn't appear to benefit from the work we've been doing, even if Rakudo did. No big deal: a speedup for any users is a win in my book, even if not every compiler on Parrot benefits from it. Some people winning is better than nobody winning. Luckily, this isn't the end of the story.

When I logged on to IRC this morning, Infinoid had a commit coming through that would make everything better still. Here are some very impressive results that Infinoid measured with his patch in place:

Infinoid> before: make coretest 84.97s user 61.91s system 32% cpu 7:35.52 total
Infinoid> after : make coretest 75.22s user 43.93s system 67% cpu 2:57.62 total
Whiteknight> so that's a factor-of-2 speedup on top of the 4x speedup that some
benchmarks are showing?
Whiteknight> Infinoid++ # Holy shit

That's right. In the branch, on top of the previous speedups we had, Infinoid has more than halved the running time of make coretest. Combined with the earlier 4x, that is a total speedup of roughly 8x on the one benchmark that we had been using. However, this isn't even the most dramatic result. pmichaud had a doozy for us that I wouldn't have believed if I hadn't seen it (and I still might not believe it):

Infinoid> here's stats for pmichaud's benchmark:
Infinoid> trunk : ./parrot x.pir 12.37s user 1.67s system 56% cpu 24.747 total
Infinoid> branch: ./parrot x.pir 0.42s user 0.06s system 64% cpu 0.741 total

You read this correctly: at least one benchmark is running roughly 30 times faster in the branch now (24.7 seconds of wall-clock time down to 0.74 seconds). As they say on the weight-loss commercials, "results not typical", but impressive nonetheless. More results:

Infinoid> trunk : make coretest 63.07s user 36.61s system 26% cpu 6:15.27 total
Infinoid> branch: make coretest 61.79s user 31.34s system 63% cpu 2:26.49 total
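For anyone who wants to check the arithmetic, here is a small sketch that turns the "total" column from the logs above into wall-clock speedup ratios. The helper names (to_seconds, speedup) are made up for this illustration; only the timings come from the IRC excerpts.

```python
def to_seconds(stamp: str) -> float:
    """Convert a 'total' value such as '7:35.52' or '24.747' to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        # Each colon-separated field is worth 60x the one after it.
        seconds = seconds * 60 + float(part)
    return seconds


def speedup(before: str, after: str) -> float:
    """Wall-clock ratio: how many times faster 'after' is than 'before'."""
    return to_seconds(before) / to_seconds(after)


# (before, after) wall-clock totals copied from the logs quoted above.
runs = {
    "make coretest, first run": ("7:35.52", "2:57.62"),
    "pmichaud's x.pir": ("24.747", "0.741"),
    "make coretest, second run": ("6:15.27", "2:26.49"),
}

if __name__ == "__main__":
    for name, (before, after) in runs.items():
        print(f"{name}: {speedup(before, after):.2f}x")
```

By this measure both coretest runs come out at about 2.6x, and pmichaud's benchmark at about 33x. These are wall-clock ratios only; the user and system columns tell a slightly different story.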

So what exactly was the culprit, the change that Infinoid was able to make that more than doubled Parrot's speed? The problem, in a nutshell, was fsync. A number of IO API calls were using Parrot_io_flush, which internally would push the data to the OS and then call fsync to ensure that the data was completely written to disk. That guarantee is unnecessary for ordinary IO, and fsync was a huge slowdown. So what Infinoid did, brilliant in its simplicity, was replace calls to Parrot_io_flush with calls to Parrot_io_flush_buffer, which still writes data out to the OS but doesn't call fsync anymore.
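To make the distinction concrete, here is a minimal sketch in Python rather than Parrot's C. It is not Parrot's code: flush_buffer and flush_and_sync are hypothetical stand-ins that only mimic the behavior described above for Parrot_io_flush_buffer and Parrot_io_flush, using Python's file.flush() and os.fsync().

```python
import os
import tempfile
import time


def flush_buffer(f):
    """Hand buffered data to the OS and return; the OS writes it out later."""
    f.flush()


def flush_and_sync(f):
    """Hand buffered data to the OS, then block until it reports the data on disk."""
    f.flush()
    os.fsync(f.fileno())


def write_lines(path, n, flush):
    """Write n short lines, calling the given flush strategy after each one."""
    with open(path, "w") as f:
        for i in range(n):
            f.write(f"line {i}\n")
            flush(f)


def timed(flush, n=200):
    """Return (elapsed seconds, lines found in the file) for one strategy."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        start = time.perf_counter()
        write_lines(path, n, flush)
        elapsed = time.perf_counter() - start
        with open(path) as f:
            count = sum(1 for _ in f)
        return elapsed, count
    finally:
        os.remove(path)


if __name__ == "__main__":
    for strategy in (flush_buffer, flush_and_sync):
        elapsed, count = timed(strategy)
        print(f"{strategy.__name__}: {count} lines in {elapsed:.4f}s")
```

Both strategies leave the same bytes in the file; the only difference is whether each call waits for the disk. How large the gap is depends heavily on the filesystem and hardware, so treat any numbers from this script as local observations, not as a reproduction of the results above.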

So what exactly is fsync used for anyway?

szbalint> fsync only makes sense after accumulating a largish chunk of data anyways
Infinoid> fsync only makes sense for databases and mail servers ensuring atomicity

...and neither of these things is really part of the guarantee that Parrot's IO system makes to its users. Parrot does not specify that its basic IO will be atomic in any way, nor does it specify that we will force a disk write after a certain amount of data has been pushed through. We certainly could add an opt-in mode to guarantee that (especially if we can get some of the buffering code factored out like we are planning), but it isn't a requirement for all cases right now.
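As a rough idea of what such an opt-in mode could look like, here is a sketch in Python. Everything in it is an assumption made for illustration: the BufferedWriter class, its durable flag, and the buffer size are hypothetical and do not correspond to any existing or planned Parrot API.

```python
import os


class BufferedWriter:
    """Hypothetical buffered writer where fsync is opt-in, not the default."""

    def __init__(self, fd, buffer_size=4096, durable=False):
        self.fd = fd                    # an open OS-level file descriptor
        self.buffer_size = buffer_size  # flush once this many bytes are pending
        self.durable = durable          # opt-in: fsync after every flush
        self.pending = bytearray()

    def write(self, data: bytes):
        """Queue data, flushing automatically once the buffer fills up."""
        self.pending += data
        if len(self.pending) >= self.buffer_size:
            self.flush()

    def flush(self):
        """Push pending bytes to the OS; fsync only if the caller opted in."""
        data = bytes(self.pending)
        self.pending.clear()
        offset = 0
        while offset < len(data):
            # os.write may accept fewer bytes than offered, so loop.
            offset += os.write(self.fd, data[offset:])
        if self.durable:
            os.fsync(self.fd)
```

A caller that needs the data on disk before continuing would construct the writer with durable=True; everyone else keeps the cheap default and never pays for fsync.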

So that's my little note this morning about our branch. Assuming testing keeps coming back positive, I'd like to get it merged into trunk tonight so we can get started on the next round sometime after #parrotsketch tomorrow.

1 comment:

  1. My understanding is that the poor performance of fsync is an implementation issue, not an inherent one. The recommendation not to call fsync is largely an ext3 legacy. It's hard to argue with current realities, of course, but fsync() should be more useful and performant in the future.
    Solving the ext3 latency problem
    Linux Storage and Filesystem workshop
    That massive filesystem thread

