We talked about this issue a lot yesterday, and I forsee that we will do a lot more talking about it all in the future too.
chromatic envisions L1 as being a subset of PIR. That is, L1 opcodes are just regular PASM opcodes, but the subset of fundamental operations in which all other opcodes are defined. Only the L1 opcodes are written in C, and the rest, potentially, are written in terms of L1. I envision L1 as being a completely separate layer, and not being directly usable by the programmer. Specifically, I don't like mixing together opcodes that are written in C, and opcodes that are written in L1, and expecting Parrot to figure out the difference at runtime. I would like to see L1 that is written strictly in C, and PASM ops which are strictly written in L1. L1 not only is a small set of atomic operations, but they are very simple, easy to parse and decode, and extremely fast. I'm thinking about them being three-argument operations, which opens all sorts of doors for optimization.
Here's how I think execution would go using this kind of scheme:
- L1 ops are written in C and are all trivially simple: Individual operations or individual calls to internal functions, or something simple like that. They need to be trivially simple to write and trivially simple to JIT.
- PASM opcodes are written in terms of L1 instructions.
- At build time, PASM opcodes are "compiled" into a sequence of L1 byte code (L1BC) values in an array, instead of being "compiled" into standards-compliant C code and then compiled again into machine code.
- At execution time, user-defined L1 overlays are added so that existing ops and PMCs can be redefined or extended to add specific information if needed.
- At execution time, executing PIR code is converted into PBC, which in turn is used to index the sequences of L1 code that represents the ops. These sequences are strung together to form an execution stream of L1BC which is executed by the core (or JITted, or whatever).
Having all ops represented as frozen sequences of L1BC opens up a few options for us, some of which are very enticing indeed:
- A pseudo-JIT, where we convert PBC on the fly to a straight L1BC stream. This saves us from having to look up L1BC segments for each op repeatedly.
- We still store packfiles using PBC, because it's a more compact format (although we might consider a "super fast" frozen L1BC stream format too)
- Overridable and dynamically extensible ops. If you use a custom PASM op in your code (written in L1, of course), you can save that custom definition in your packfiles and then execute your custom ops on any Parrot. Currently, custom ops are written in C and form a dynamicly linked library which would need to be included with your program in order to execute properly. With L1, this mechanism all disappears. Also, you could override existing op definisions locally, by supplying a custom L1BC stream to use for it instead.
- Overridable and dynamically extendable PMCs. Same as above, but with PMC types instead of ops. PMC types can be defined at compile time and included in the packfile as a custom L1BC stream. No need to compile dynpmc dynamic libraries and shuffle them around, they can be included in the packfile directly.
- Serious optimization potential. If you're willing to spend a few cycles optimizing, a unified L1BC stream makes for quite an attractive optimization target. Right now we can't optimize much because execution is flowing between C and PIR and it's just impossible to make any predictions. As chromatic is quick to point out, we suddenly gain the ability to do escape analysis, code inlining, duplicate and unused code removal, and other optimizations that we can't do now. Plus, having more information about code execution will allow us to make a smarter garbage collector, register allocator, etc.
- Built-in PMCs and user-defined PMCs will have all their VTABLEs written in L1 instead of in C. the VTABLE structure contains pointers to L1 streams instead of C function pointers. Overriding a vtable method becomes transparent, and there is no functional difference at all between built-in and user-defined PMCs. Also, there is the potential that built-in PMC VTABLEs could be locally overridden and then restored later, or copied to another PMC type, or something else entirely. Lots of potential here.
- If PMCs and OPS are ultimately compiled into L1, there's no saying that they all have to be hand written in L1. They could be written in PIR, or NQP, or Perl6, or TCL. Imagine using Perl 6 to write parts of Parrot that run the Perl 6 compiler which itself is written in Perl 6? Bootstrapping indeed, and only the start of the potential uses!
So, how do we get from here to there? This is the million dollar question, and without a solid roadmap we may never be able to implement such radical changes to Parrot. On this point, chromatic has already written a partial plan, although he mostly focuses on rewriting PMCs in something other then C. I'll adapt his basic timeline, but add a few of my specifics (which, as I've mentioned before, may be at odds with what chromatic thinks, and may actually be quite rediculous):
- Create a PCT-based compiler for PMCs and OPs that reads the current mismash of C-like code and emits ordinary C code, which is what our Perl5-based build system does already.
- Develop a compiler that converts L1 to equivalent C code.
- Replace the current Perl5-based build system with the PCT-based one
- Change the PCT-based compiler to emit L1, and use the L1 to create the C code of the PMCs and Ops
- Develop a runcore (nanoparrot) which executes L1 directly, including all the necessary control flow to interact ops with PMC vtables, etc.
- Remove the L1->C step, and have the L1 code generated by the PCT-based OP and PMC compiler(s) be executed by Parrot directly.