When I logged on to IRC this morning, Infinoid had a commit coming through that would make everything better still. Here are some very impressive results that Infinoid has measured with his patch in place:
Infinoid> before: make coretest 84.97s user 61.91s system 32% cpu 7:35.52 total
Infinoid> after : make coretest 75.22s user 43.93s system 67% cpu 2:57.62 total
Whiteknight> so that's a factor-of-2 speedup on top of the 4x speedup that some benchmarks are showing?
Whiteknight> Infinoid++ # Holy shit
That's right. In the branch, on top of the previous speedups we had, Infinoid has more than halved the running time of make coretest. That's a total of an 8x speedup on the one benchmark we had been using. However, this isn't even the most dramatic result. pmichaud had a doozie for us that I wouldn't have believed if I hadn't seen it (and I still might not believe it):
Infinoid> here's stats for pmichaud's benchmark:
Infinoid> trunk : ./parrot x.pir 12.37s user 1.67s system 56% cpu 24.747 total
Infinoid> branch: ./parrot x.pir 0.42s user 0.06s system 64% cpu 0.741 total
You read that correctly: at least one benchmark is now running 50 times faster in the branch. As they say on the weight-loss commercials, "results not typical", but impressive nonetheless. More results:
Infinoid> trunk : make coretest 63.07s user 36.61s system 26% cpu 6:15.27 total
Infinoid> branch: make coretest 61.79s user 31.34s system 63% cpu 2:26.49 total
So what exactly was the culprit, the change that Infinoid was able to make that sped Parrot up by more than double? The problem, in a nutshell, was fsync. A number of IO API calls were using Parrot_io_flush, which internally would push the data to the OS and then call fsync to ensure that the data was completely written to disk. This is unnecessary, and fsync was a huge slowdown. So what Infinoid did, brilliant in its simplicity, was replace calls to Parrot_io_flush with calls to Parrot_io_flush_buffer, which still writes the data out to the OS but doesn't call fsync anymore.

So what exactly is fsync used for anyway?
szbalint> fsync only makes sense after accumulating a largish chunk of data anyways
Infinoid> fsync only makes sense for databases and mail servers ensuring atomicity
...and neither of those things is really part of the guarantee that Parrot's IO system makes to its users. Parrot does not specify that its basic IO will be atomic in any way, nor does it specify that it will force a disk write after so much data has been pushed through. We could certainly add an opt-in mode to guarantee that (especially if we can get some of the buffering code factored out, as we are planning), but it isn't a requirement for all cases right now.
So that's my little note this morning about our branch. Assuming testing keeps coming back positive, I'd like to get it merged into trunk tonight so we can get started on the next round sometime after #parrotsketch tomorrow.
My understanding is that the poor performance of fsync is a matter of implementation, not inherent. The recommendation not to call fsync is largely an ext3 legacy. It's hard to argue with current realities, of course, but fsync() should be more useful and performant in the future.
Solving the ext3 latency problem: http://lwn.net/Articles/328363/
Linux Storage and Filesystem workshop: http://lwn.net/Articles/327601/
That massive filesystem thread: http://lwn.net/Articles/326471/