Question 1: Why Is L1 Better Then PASM?
Another common rephrasing of the question would be "Why don't we just extend PASM to do what we want L1 to do?". I would turn the tables and ask "What is so good about PASM that we can't even conceive of a replacement?". Popular lore has it that Thomas Edison churned through 10,000 failed designs before he came up with the working lightbulb. We should not be so foolhardy as to think that we can't have made a fundamental mistake with PASM.
The problem with PASM is that it's the wrong level of abstraction. Frankly, it's too high level and is focused more on ease of use by the programmer then ease of execution by the VM. Don't believe me? I suggest you look at all the hundreds of files that are currently written in PIR: Main programs for HLLs, runtime libraries in HLLs, NCI wrappers for libraries, the entire test suite, etc. If PIR wasn't so great for people to be writing, it would long ago have been identified as a pain point and replaced with something better. Writing PIR is currently a required part of building an HLL compiler, and very few people are complaining about that, or complaining enough to have it changed! This in and of itself is proof that PIR (which is just PASM in disguise) is more suited for use by humans then by the machine.
And it's not like we don't have the tools or the skills necessary to write a replacement language compiler. In Parrot world you can't shake a stick without hitting somebody who has written one. If PIR was too low-level, we would have a compiler for a replacement language prototyped in less then 5 hours. Seriously.
We can focus on execution performance to the exclusion of human readability because people shouldn't be writing it directly. Let me repeat: we have the tools, the expertise, and the experience as a community to write a higher-level language compiler to do the hard work for us. Regardless of what intermediate language we end up with on Parrot, I don't expect humans will have to be manually writing any of it.
Question 2: Won't another layer make things slower?
Here's what we have right now:
People assume that I'm planning about just adding another compiler step and then another target interchange format for L1 at the end of this list. Not so. PCT has the potential (though it hasn't been implemented yet) to output code in any format. Why don't we do this instead:
And if you see that image and ask "Why do we need PIR/PASM in this case at all?", now you're thinking along the right lines. Imagine a world where we never ever need to write anything in PIR or PASM by hand again. Now imagine that same world where our tools like PCT don't output PIR as an intermediate form but, unbeknownst to you the programmer, directly outputs executable bytecode like L1. In such a world as we are imagining here, do we need to keep PIR or PASM around? Do they serve any purpose? Maybe as a very small bootstrapping layer to build PCT in the first place, in which case there is a lot of bloat that we could throw away, and we would only need a PIR compiler for the miniparrot build step and wouldn't need to build IMCC into libparrot or the Parrot executable.
It's not that we're adding a new layer of complexity, we're starting to realize that PASM is the wrong level of abstraction for our needs in Parrot, so we are replacing it wholesale with L1.
Now obviously there are intermediate steps betwen here and there where things are going to be slower and more complicated: We will need to use PIR as an intermediate step until PCT is capable of outputting L1 directly. But let me ask this: Do we want PCT to output something that executes slowly, or something that executes more quickly, when all the necessary work is done?
Question 3: L1 is going to be lower-level then programmers like
No question that PASM/PIR are far more friendly to the programmer then any conception of L1 is going to be. PIR and PASM are supposed to be sufficiently low-level that people don't want to program in them, but at the same time they aren't enough of a pain point that we made a replacement language into a priority. If L1 is such a pain point for programmers as I think it will be, we will write a compiler for a better language so we never need to write L1 directly.
We already have NQP which is a thin Perl6-like language that works well with Parrot's execution model. We also have Close in development which is going to be a C-like language that does almost the same thing. Assuming both these two languages have all the capabilities, why wouldn't we use these to do all our coding and never use PIR, PASM, or L1 again? Remember, I keep saying this: We have PCT, one of the most powerful compiler construction tools ever conceived. We should be using that to make languages that insulate us from whatever intermediate language Parrot uses. We should be reaching a point where we never ever ever have to write or even read another line of PIR or PASM ever again. Ever.
And when we do reach that point where we are so completely insulated from it, it won't matter to the programmer what intermediate language Parrot uses, because we won't be seeing it.
Question 4: A lot of development effort is going to be wasted
This is a good concern, that a lot of things people have spent a lot of effort to develop are basically going to become obsolete. But I have to also ask the question: "Why don't we all just write in Fortran, considering that so many people have spent so much energy on that compiler?". The effort isn't all wasted, we did learn a lot of important lessons about Parrot and the right way forward, and we've used all the things we've developed to bootstrap the creation of better tools and better ideas.
PIR and PASM got us to a place where we have PCT and several HLLs in active development. That doesn't mean we need to be trapped underneath these languages forever. We use them as a bootstrapping layer to build better tools, and use those instead.
Question 5: How does L1 compare to C?
Alternatively, why are we rewriting all our C code in L1? Let me ask what is the difference between C code and Assembly code? Everything you can do in assembly you can do in C. You will probably need to call a library to do some things, but if you can do it in ASM, you can do it in C. A compiler converts both down to the same machine code that is executed by the same hardware. You can say that one is "faster" then the other, because your average compiler isn't always smart enough to generate "good" machine code, but given the proper optimizations, there should be no speed difference. In other words, given the exact same machine code, it doesnt matter to the processor whether that code was written by a human in assembly and assembled to machine code, or written in C and compiled into machine code.
All that information out of the way, let's conceive of L1 as being a portable assembly language with all the same exact capabilities as C. Given the same exact capabilities and a good compiler (for our purposes, a JIT engine is just a "compiler"), both sets will be able to do the same stuff, and can be converted down into the same machine code for execution by the hardware at the same speed.
So given that the two are equivalent, why have L1 at all, and why not keep everything in C? The difference is semantics and algorithmic complexity. It's a difference of where the primary control flow happens. Right now we have a register-based system (PIR) running on top of a stack-based system (C). Switching between the two is slow, and unfortunately we switch pretty often because neither one supports all the semantics we need. L1 gives us an opportunity to access the low-level details that C has (calling native functions, accessing memory pointers, etc) but to use the high-level semantics that PIR requires (register-based operation, etc). We gain the ability to keep control flow executing in only a single context, and to eliminate all the algorthmic complexity of switching between semantic contexts.
An L1 dispatcher handles all control flow, able to call C functions directly as needed and redirect to offsets in bytecode. It can handle function calls, exception throwing, vtable overrides, and other CPS magic without having to recurse down the C system stack. And we gain all sorts of potential for optimizations because we have a unified environment like this.
In short, L1 allows us to design a language with the power and flexibility of C and the semantics of PIR, which in turn will let us reduce algorithmic complexity at the global level.
Question 6: Why not just fix PASM/PIR?
We can't really fix PASM in place because we have conflicting needs: We simultaneously need to extend it in order to match all the power of C, and shrink it to make it easier to analyze, optimize and JIT compile it. What we would end up doing is growing PASM to include what I now think of as being "L1", and then we would need to shrink away most of what we now think of as being "PASM", and we would be essentially left with only L1.
Keep in mind that this is not necessarily a bad development path for us to follow, but the end result is L1, not what we currently know as PASM. We can start by creating an op library for the new L1 ops, start transitioning everything over to use them, and then start deprecating away our old PIR ops (or rewriting them in terms of L1). What remains at the end of this is the small selection of L1 ops and various HLLs (I'm including PIR/PASM here as an HLL) which compile to L1, and L1 is executed directly by Parrot.
Question 7: What does L1 Buy Us?
L1 is going to give us a number of benefits:
- Decreased algorithmic complexity, because we're not having to shuffle data between the PIR registers and the C system stack
- Improved, easier, faster JIT. A "fixed" JIT
- Potential to plug more easily into existing JIT engines (LLVM and libJIT)
- Potential for trace-based JIT, where we trace out "hot spots" in the code and JIT them with type-specific information to speed up dispatch.
- Potential for high-level optimizations including subroutine inlining, dead code elimination, etc
- Potential for context threading, where we try to align the VM with the control flow of the underlying machine, to maximize branch prediction and caching at the hardware level
- Potential for improved GC and resource allocation performance because it will be easier to analyze where memory is allocated and where it falls out of scope. This includes "escape analysis", where we determine the lifespan of a reference and are able to deallocate it more quickly and efficiently.
- Potential for easier profiling, because everything will be in L1, we only need one tool to analyze L1 control flow (which we can share with the trace-based JIT and the GC escape analyzer)