Blog Closed

This blog has moved to Github. This page will not be updated and is not open for comments. Please go to the new site for updated content.

Thursday, March 18, 2010

Lean and Mean Parrot.

I got a comment from old-time Parrot contributor Melvin the other day in response to my criticisms of the PDD23 Exceptions system. He had a few things to say that really echo points that I've been trying to make in the past few months. I would like to reprint his comment here, in full (with some minor copyedits), and address it directly.

If IMCC needs to look into the op lib during lexical analysis, there is probably a design flaw. The opcode should probably only be checked during the parse / semantic check phase. If you want to tweak that much more performance out of IMCC you'd probably be better autogenerating the short op-list into imcc.l before calling lex so the token list is built directly into the scanner. As to your die vs. exit, I agree. This is where there is too much Perl influence in the VM. Things like 'die' semantics are better implemented in high level. Give the VM a simple opcode for termination, let all other things (printing messages, etc.) be implemented on top of the core op. I was arguing this in 2001/2002. If I had it my way, the core ops would be lean, and all the PMC grunge would have been implemented on top of those lean ops, not in the wacky .pmc/VTABLE mechanism that exists now. 99% of the Parrot lib would be written in a single, consistent HLL, and the JIT would be manageable. Right now, even as an original Parrot developer, I cannot wade in. Parrot requires a mix of Perl, PIR, C and custom build macros. PMCs are not Objects and Objects are not PMCs. There are still multiple cores for the fun of it. etc. etc. If Parrot is to ever really be done, seriously, someone needs to take a scalpel to about 50% of the "sacred" parts of the codebase and cut the amount of busy work and maintenance that has to be done, and remove some of the "multiple options" for accomplishing the same thing. Examples of projects that succeeded much faster than us are Mono. They built a canonical compiler (C#) and then proceeded in implementing the platform using the C# compiler. I was arguing that this would have been Cola at the time, and was my intent when I wrote PIR and the simple Cola compiler. But 8 years later, there is no JIT and Parrot is looking at using projects that started well after itself. I cringe when I look at all the hand-written .pir files in the distro. 
This was not the intent of PIR. Anyway, good luck. Don't take my comments the wrong way, hopefully they'll spark some debate. I'll check in in 2012. :)
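To make the "simple opcode for termination" suggestion concrete, here is a minimal sketch in Python. It is not Parrot code: the two-op instruction set, the `run` function, and the `die` helper are all hypothetical names invented for illustration. The point is only that `die` can be a sequence of core ops rather than an op of its own.

```python
import io

def run(program, out):
    """Interpret a list of (op, arg) tuples and return the exit status."""
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "print":
            out.write(str(arg) + "\n")
        elif op == "exit":
            # The single core termination op: stop and report a status.
            return arg
        else:
            raise ValueError(f"unknown op: {op}")
        pc += 1
    return 0

def die(message, status=1):
    """'die' built on top of the core ops instead of being an op itself."""
    return [("print", message), ("exit", status)]

out = io.StringIO()
program = [("print", "starting")] + die("fatal: giving up") + [("print", "unreachable")]
status = run(program, out)
```

Here `status` ends up as 1 and the final `print` never runs, without the interpreter loop knowing anything about messages or error semantics.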

I don't know how actively Melvin has been following IRC discussion, the mailing list, or this blog. I assume he hasn't been following too closely, and we really can't expect anyone to keep up with discussion on a project he isn't actively involved in. One of the big points we've been pushing for in recent months is Lorito, the lean set of low-level ops that we are going to use to rewrite large swaths of the codebase. In fact, seeing Melvin discuss this very idea is encouraging and validating: if other people come to the same conclusions that we come to internally, that means they are at least common ideas and maybe even good ones.

Lorito gives us the opportunity to move the abstraction layer down, and reduce context switches that occur by moving from one side of the line to the other. You don't have to look far in the PCC system, for instance, to see the machinations we have to go through to manage method calls from C. There are complicated structures of function pointers to access parameters from variadic argument lists. There are all sorts of complicated serialization paths to marshal arguments into the CallContext PMC, make the call, and then marshal the arguments back out of it again.

If we didn't have to worry about making calls from C, or if we at least didn't have to worry about making them regularly, we could avoid all this nonsense and delete several hundred lines of code, and the code we had left would be faster.
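The shape of that marshalling overhead can be sketched in a few lines of Python. This is a toy model, not the PCC implementation: the `CallContext` class, the one-letter type codes, and the `"II->I"` signature format are assumptions made up for the example.

```python
class CallContext:
    """Hypothetical stand-in for a call-context object: holds args and results."""
    def __init__(self):
        self.args = []
        self.results = []

# Hypothetical one-letter type codes used in the signature string.
CONVERTERS = {"I": int, "N": float, "S": str}

def call_from_host(sub, signature, *host_args):
    """Marshal host-side values into a context, invoke, and marshal results out."""
    arg_sig, ret_sig = signature.split("->")
    if len(arg_sig) != len(host_args):
        raise TypeError("argument count does not match signature")
    ctx = CallContext()
    for code, value in zip(arg_sig, host_args):
        ctx.args.append(CONVERTERS[code](value))    # marshal in
    sub(ctx)                                        # the actual call
    # Marshal out: convert each result according to the return signature.
    return tuple(CONVERTERS[code](r) for code, r in zip(ret_sig, ctx.results))

def add_sub(ctx):
    """A callee that only ever sees the context, never the host-side values."""
    a, b = ctx.args
    ctx.results.append(a + b)

result = call_from_host(add_sub, "II->I", 2, "40")
```

Every call that crosses the boundary pays for the signature parse and both conversion passes. Code that stays on one side of the line would pay for none of it.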

Another fact that comes up regularly is that the VTABLE interface is lousy. We've optimized for VTABLE access even though it's an extremely limited interface. There are simultaneously too many functions in the list, too few to support certain PMC types, and not nearly enough flexibility to support the kinds of interoperation mechanisms that we need. Ask this: How many mathematics-related VTABLEs exist? Now ask how many of them are even remotely useful outside the handful of numerical PMC types? Why does every single PMC type need to carry around pointers to all these functions that never get used? Why don't VTABLEs support proper MMD? Why don't all VTABLEs do any kind of MMD, even if it's wrong? Why do we have so much trouble inheriting VTABLEs between core PMC types and user-defined objects?

If we rewrite the whole PMC system in Lorito, we can improve VTABLE dispatch semantics, maybe unify VTABLEs and METHODs, and expand the list of VTABLEs to hold any operations that we need. Suppose we make the distinction that a VTABLE is like a method, but has only a fixed number of positional arguments and can only be called from Lorito code. Then we could use a separate, streamlined dispatch path that would be far faster than VTABLE overrides in PIR are now and only marginally slower than C-level VTABLEs, with the flexibility to plug into Parrot's MMD system for proper dispatch. The result would be significantly more powerful and flexible than what we currently have. Plus, using hashes and named lookups, a type could define a Sub of any name to be a VTABLE, we could look VTABLEs up by whatever name and arity an operation needs, and we would need only one "VTABLE not found" handler to fall back on when no suitable one is found.
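A rough sketch of the named-lookup idea, again in Python and again with made-up names (`PMCType`, `PMC`, `dispatch`, and `vtable_not_found` are illustrative, not Parrot APIs). VTABLEs live in a per-type hash, lookup walks the parent chain so user-defined types inherit from core ones, and a single handler covers every miss.

```python
class VTableNotFound(Exception):
    pass

def vtable_not_found(obj, name, *args):
    """The single shared fallback for any vtable that cannot be found."""
    raise VTableNotFound(f"{obj.type.name} has no vtable '{name}'")

class PMCType:
    """Hypothetical type object whose vtables are named entries in a hash."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.vtables = {}

    def find_vtable(self, name):
        # Walk up the parent chain until some type defines the name.
        t = self
        while t is not None:
            if name in t.vtables:
                return t.vtables[name]
            t = t.parent
        return None

class PMC:
    def __init__(self, pmc_type, value):
        self.type = pmc_type
        self.value = value

def dispatch(obj, name, *args):
    handler = obj.type.find_vtable(name)
    if handler is None:
        return vtable_not_found(obj, name, *args)
    return handler(obj, *args)

integer = PMCType("Integer")
integer.vtables["add"] = lambda self, other: self.value + other.value
my_int = PMCType("MyInt", parent=integer)   # inherits "add" from Integer

total = dispatch(PMC(my_int, 2), "add", PMC(integer, 3))
```

Types that never do arithmetic simply have no `"add"` entry, so they carry no unused function pointers, and adding a new operation does not change the layout of any existing type.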

Runcores that don't build on every platform and are not well-tested by default should disappear. Sure, there is some benefit to having them as an academic exercise. Sure, it's cool to say we have an obscure and complicated dispatch mechanism with fun properties. But the reality is this: We're not using them. When we create a fakecutable binary for HLL projects like Rakudo, it generally hard-codes in the name of the runcore to use. So if our primary users can't select their own runcore, and if the primary way to use Parrot should be through HLLs (because I'll be damned if we want people in general to be writing PIR directly forever), then having multiple runcores is a huge waste.

Slow core? It does do bounds checking, but those bounds checks have never been seen to fail in my tenure as a Parrot coder. Drop it.

Computed-Goto and Predereferenced-Computed-Goto cores? Not all compilers even support these, and they require much more than their fair share of support code to operate at all.

Switch core? It's available everywhere and it's pretty speedy, but performance decreases pretty dramatically when used with dynops.
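For readers who haven't looked inside a runcore, the sketch below shows two dispatch strategies running the same bytecode. It is a toy stack machine in Python with hypothetical opcodes, not Parrot's C implementation: `run_switch` mimics a switch-style core with one branch per op, and `run_table` mimics a function-table core, with an optional bounds check standing in for what the slow core does.

```python
PUSH, ADD, MUL, HALT = range(4)  # hypothetical opcode numbers

def run_switch(code):
    """Switch-style core: a single loop with one branch per opcode."""
    stack, pc = [], 0
    while True:
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])
            pc += 2
        elif op == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
            pc += 1
        elif op == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")

def op_push(code, pc, stack):
    stack.append(code[pc + 1])
    return pc + 2

def op_add(code, pc, stack):
    right, left = stack.pop(), stack.pop()
    stack.append(left + right)
    return pc + 1

def op_mul(code, pc, stack):
    right, left = stack.pop(), stack.pop()
    stack.append(left * right)
    return pc + 1

def op_halt(code, pc, stack):
    return None  # tells the loop to stop

HANDLERS = [op_push, op_add, op_mul, op_halt]  # indexed by opcode number

def run_table(code, check_bounds=False):
    """Function-table core; check_bounds adds a per-op program-counter check."""
    stack, pc = [], 0
    while pc is not None:
        if check_bounds and not 0 <= pc < len(code):
            raise IndexError(f"pc {pc} is outside the bytecode")
        pc = HANDLERS[code[pc]](code, pc, stack)
    return stack.pop()

program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]  # (2 + 3) * 4
results = [run_switch(program), run_table(program), run_table(program, check_bounds=True)]
```

All three runs produce the same answer. Each extra core is another copy of the dispatch logic to build, test, and keep in sync with the op set, which is the maintenance cost at issue here.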

There has been some effort to make runcores pluggable using libraries, so a migration effort to move some of the uncommon ones out of core would be beneficial. I don't think we need them in core, but if other people would like to maintain them elsewhere, I won't complain about it at all.

Let's get one thing straight: including unused or rarely-used code in your program is not free. There are costs associated with having that code in memory. There are costs associated with the branches and decisions that choose to avoid it. There are costs associated with loading and dynamically linking exported symbols. There are costs associated with compiling all this code. These things may not be super-expensive, but they are not free. And keeping code that has no practical benefit for developers or end users without a good reason to have it is not good practice.

Unnecessary opcodes need to disappear as well. We have well over 1300 opcodes now, probably closer to 1400 at this point. All of these add overhead in terms of memory footprint and lookup effort. It's a huge waste, and I suspect that if we did some actual profiling work we would find only about 200 of them were used with any regularity and about half would be almost completely unused except for one or two calls in obscure tests in the Parrot repo. If PGE and NQP don't use particular ops, and if there is no compelling need to keep them for particular HLLs, or if those HLLs can reasonably turn them into dynops, they should be removed.
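The profiling itself would not need to be fancy. As a sketch, assuming we could get a flat list of executed op names out of some trace (the `op_table` and `trace` values below are made-up sample data, and `op_usage` is a hypothetical helper, not an existing Parrot tool), a frequency count plus a list of never-executed ops is a few lines of Python:

```python
from collections import Counter

def op_usage(op_table, executed_ops):
    """Return per-op execution counts and the ops that never ran."""
    counts = Counter(executed_ops)
    # Counter returns 0 for missing keys, so untouched ops fall out naturally.
    unused = sorted(op for op in op_table if counts[op] == 0)
    return counts, unused

# Made-up sample data for illustration only.
op_table = ["add", "sub", "set", "print", "op_x", "op_y"]
trace = ["set", "set", "add", "print", "set", "add"]

counts, unused = op_usage(op_table, trace)
most_used = counts.most_common(2)
```

Run against traces from PGE, NQP, and the HLL test suites, the `unused` list would be the starting point for deciding which ops to remove or move to dynops.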

Adding more ops is not a case of "it's a small added convenience with no downside". There are very real downsides.

Having too many core PMCs is a problem too. Parrot currently has 86 core PMC types by my count. I think we can drop this number to 60 without having to make any cuts that are too painful. If we wanted to be aggressive I think we could go even lower, but we do hit a point where we need to start weighing proportionally-smaller performance gains against the lost utility of removing PMC types that we're dependent on. In any case, I think we can definitely stand to move the following PMC types into dynpmc libraries or even delete them entirely: AddrRegistry, BigInt, BigNum, Boolean, Capture, CPointer, File, 9 of the array types, Key (to be replaced with an array type), Opcode, OpLib, OrderedHash, Pointer, Scalar, and Timer. This would bring us down to 63 PMC types, and several of these that I've listed are already either deprecated or are listed as experimental.
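The arithmetic behind that figure, written out as a quick check in Python. The only inputs are the count of 86 and the list above, with "9 of the array types" counted as nine entries:

```python
core_pmcs = 86  # current count of core PMC types, as stated above

# Candidates to move into dynpmc libraries or delete, with how many types each covers.
candidates = {
    "AddrRegistry": 1, "BigInt": 1, "BigNum": 1, "Boolean": 1, "Capture": 1,
    "CPointer": 1, "File": 1, "array types": 9, "Key": 1, "Opcode": 1,
    "OpLib": 1, "OrderedHash": 1, "Pointer": 1, "Scalar": 1, "Timer": 1,
}

removed = sum(candidates.values())
remaining = core_pmcs - removed
```

That gives 23 types removed and 63 remaining, so reaching 60 would take only three more cuts.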

We as a community have definitely been moving in this direction. Allison especially has been helping to spread a "less is more" philosophy to the design of Parrot and I do appreciate that effort on her part. Making a switch over to Lorito, and then using Lorito to implement Ops, PMCs, and a few other core systems would be a major benefit. chromatic has been a major pusher for the Lorito effort as well, and having somebody of his stature behind the effort lends major credence to it. The effort to slim down the number of ops and PMCs doesn't have as much momentum but there is real evidence that we could be moving in that direction in the near future. I welcome all these changes and hope to see them come to fruition.
