Whiteknight's World: Contexts: Path Forward and Problems

I mentioned the ongoing work for the Contexts PMC stuff in a previous post, and wanted to elaborate on that a little more here. Currently, as I've mentioned before elsewhere, context structures in Parrot are just basic data structures that are manually managed using an idiosyncratic reference counting scheme. The goal is to turn Contexts into a normal PMC type so they can be garbage collected, because currently we leak a lot of memory through miscounted contexts.

Coke started a branch to do the conversion, and I quickly realized that we are going to need to do the work in baby steps to get it right. It's really multiple projects: Encapsulating the Context PMC behind a good interface, cleaning up all the associated logic, possibly separating out the context structure based on it's various purposes, and then converting everything over to be a garbage-collectable PMC. It's worth mentioning here that there is a faster way to do this, but faster does note equate to better, or correct in terms of proper design, so I don't bother to discuss it here.

Contexts serve a number of purposes in Parrot:

They manage the register sets, and provide access to the Parrot registers. A new context allocates storage space for the register types, does the pointer arithmetic to access them, marks their contents for garbage collection, and cleans them up when the context goes out of scope.
Contexts hold pointer items to a variety of other data items: the current namespace PMC. The current invocant PMC on a method, the current return continuation PMC, current argument and return signatures, the current lexpad, and the current HLL ID.
Contexts form a type of linked list (actually, more of an acyclic graph) that represents the call chain. Every context also contains a pointer to an enclosing context, such as is the case in a closure.
Contexts keep track of control flow by holding a reference to the current PC, and also hold fields of flags that determine which errors and warnings are enabled currently.
Contexts provide access to the current list of constants in the packfile.

You can see that these uses aren't exactly orthogonal to one another but at the same time it's not strictly necessary that all this information be stored in exactly one data structure either. For instance, we could separate out the organizational aspects (call chain, current PC, current errors/warnings, access to constants, pointers to namespace/lexpad/sub/invocant PMCs) into one PMC type and separate out the register management functions into a different PMC type. This would have the benefit of allowing unrelated activities to be stored in separate PMCs, and would allow all the necessary functions to be hidden behind a reasonable VTABLE interface without too much shoehorning. I'm not saying that this is what we will do or even what we should do, but all possibilities are worth thinking about at this juncture.

It's especially worthwhile to think about when you consider the severe limitations stipulated by the VTABLE interface, which was clearly designed with "data" PMCs (Integer, String, Float) in mind and not "process" types like this. Start divviing up the various VTABLE functions and you run out pretty quickly. We can use the get_integer_keyed_int, get_string_keyed_int, get_number_keyed_int, and get_pmc_keyed_int interfaces, and the set_* equivalents to access the register set. The invoke VTABLE could be used to directly activate the RetContinuation, preventing us from needing a fast accessor for that item. We can still have an accessor for it, but it doesn't need to be a fast or dedicated one. The push and pop vtables can be used to manage the context stack for the common case of simple subroutine calls and returns.

That is, unfortunately, where things start getting messy. We could conceivably use the get_*_keyed_str as delegate accessors for the LexPad PMCs, to access lexical variables by name without needing a fast accessor to retrieve the LexPad PMC itself (again, we can have an accessor, but it doesn't need to be a hugely fast one). This single type of accessor doesn't give us the flexibility to restrict a search for a lexical variable to either the current context or all :outer containing contexts, but it's a start.

And even then, how do we access the current Sub PMC, the current NameSpace, the current invocant object, the Outer context (for closures), the parameter and return signatures, etc? The pcc_rewiring branch may offer some answers here, by encapsulating the current invocant, current parameter and current return signatures, and maybe other things. That's fine and good in the short term, but the long-term plan is to merge the Context PMC and the CallSignature PMC together into a single object that represents a subroutine state (with functionality to call into and call out of that state as well). And then we're back in the same mess with more data then we have interfaces to access it all.

Of course, this is all speculative, and it's a few steps away from where we need to be focusing right now. This kind of thinking does help to inform our short-term decisions, but we cannot be mired in them. Instead of finding creative and perhaps obnoxious ways to shoehorn all this functionality into the limited VTABLE interface, maybe instead we need to think about creating a different, and more appropriate, abstraction layer for Contexts. In short, we need to create an API for them.

Creating the abstraction layer should be very straight-forward since we know the kinds of things that we need to access. Once we have it, we can start migrating parts of Parrot to use the new interface instead of poking into the context structure directly. And once all accesses use the new interface, we can change the mechanics behind that interface to anything we want (such as converting the Contexts to a PMC type).

I posted a quick file to the trac ticket a few days ago that demonstrates some ideas that this API could follow and also hopefully demonstrates how easy one would be to create. It's not complete and it's not pretty, but it shows how it could be done. I will mention one possible problem area being in the .ops files, where several macros (IREG, NREG, PREG, SREG) are used to access registers as l-values and r-values interchangably, which makes a function-based "don't poke into my structures directly" interface kind of troublesome. So in the foreseeable future we will probably have to allow some of these same kinds of accesses, although it really would be nice to fix the OPS compiler to avoid these kinds of issues entirely.

So the first step is to create some kind of API for contexts, and then start the long and arduous process of converting Parrot's core codebase to use that API instead. Hopefully by that time pcc_rewiring will have landed and we can start planning the next steps.

3 comments:

Stephen WeeksAugust 5, 2009 at 10:02 PM
I've never been comfortable with shoehorning functionality into the vtable interface like that. That's what methods are for... is there any reason not to do most of that with methods?
WhiteknightAugust 6, 2009 at 7:59 AM
Ah, great question. Contexts are unique in the sense that they are intimately used for calling methods and subroutines. We need to create and populate a new context in order to call a method on an existing context. How then do we populate the fields of the new context, by calling yet another method in an infinite recursion?

In some way, Contexts are going to need to be treated specially because they are a core part of the method call mechanism. We don't want to create situations where we need to call methods in order to call methods, in the same way that we don't want to throw exceptions when the Exceptions subsystem has a problem, or using Parrot IO to print error messages when the Parrot IO system has a problem.

So there are two alternatives: First, a hackjob where "methods" on the Context PMC are treated differently from all other methods on all other PMC types to prevent recursive method calls. Second, we avoid method calls on Context entirely and instead create our own API (with the understanding that we still have encapsulation, just at a different level of abstraction then most PMCs).

So either way it's going to be a slightly different case for Contexts then we have for other PMC types. The second option is going to be better, I think, in terms of sheer performance because we don't need to add special-case logic to the method calling mechanisms to account for Context PMCs, and we don't need to go though all the effort to pass and process arguments in the way we do for method calls.

So that in a nutshell is why I don't think we can use methods for this application, although I would be very interested to hear if other people have different opinions on it.
AnonymousAugust 7, 2009 at 12:10 AM
Have you considered using the morph vtable interface to define a set of views over the Context PMC?

You'd have one datastructure, but multiple vtables. This would get you as many vtable functions as you need without having to squeeze functionality in awkwardly.

Note: Only a member of this blog may post a comment.

Blog Closed

Pages

Wednesday, August 5, 2009

Contexts: Path Forward and Problems

3 comments:

About Me

My Links of Interest

Followers

Blog Archive

Topics

External Activities