Blog Closed

This blog has moved to Github. This page will not be updated and is not open for comments. Please go to the new site for updated content.

Tuesday, September 21, 2010

Parrot Fundamentals: Design

Several days ago I wrote a sizable blog entry about the Parrot project, many of the ideas therein were discussed at length with fellow Parrot Foundation board member Jim Keenan when he came down to my area for a face-to-face meeting. Today I'm going to expand on some of those ideas, especially with respect to Parrot's design and roadmap.

Earlier this year  I wrote a pretty negative review of PDD23, the exceptions system design document. Actually, that's something I really should repeat with some of the other design documents as well. There are certainly a few that deserve the treatment. Maybe I'll try to do that again soon.

Anyway, there is nothing special about PDD23; it isn't an especially bad example. That's telling, actually. Looking through that directory, I find that the documents generally suffer from four common problems:
  1. Vague or Incomplete. Some PDDs are so incomplete, vague, or filled with holes that they are absolutely unusable for forming the decisions of our developers. PDD25 (concurrency) immediately comes to mind as one that contains very little actual design content. The majority of text in PDD25 consists of descriptions of technology, and reads more like an excerpt for "Threading for Dummies" than a real design document. PDD05, PDD06, PDD10, PDD14, PDD16, and maybe even PDD30 would also get put into this category. Notice that some of these PDDs are labeled as "draft" (and have been for years). I'm not really sure what a "draft" designation means, or how we go about getting them out of draft. I'll expand on that later.
  2. Not Forward-Thinking. There are many cases of PDDs which act as little more than copies of function-level documentation. PDD09 is one that I am eminently familiar with which is in this category, and unfortunately I've written much of the text in that document myself. At the time I rewrote it, I really didn't have much of a concept for what a design document should be, and what it could be. PDD09 describes what the GC was about a year ago, and things have changed significantly since then (and are changing rapidly now). PDDs which are not forward thinking provide absolutely no direction to developers doing new work. PDD08, PDD11, PDD20, PDD24 and PDD27 are good examples of this.
  3. Not In Sync with Reality. Some PDDs do not match the current implementation of things, and never really have (unlike the case above, where the implementation once matched the design, but has surpassed it). We have to ask in these situations why the design does not match the implementation: Is the design a lofty goal which the implementation is approaching by increments? Is the design unsuitable, and practical concerns have dictated a different approach? Did the implementation pre-date the design, and no attempt has ever been made to change it? PDD17 and PDD23 have some of this. PDD18 is also, by virtue of having never really been implemented.
  4. Not Good Design. Some PDDs really just don't represent the kind of good design that Parrot needs in the long term. Think about how often and how loudly people complain about certain topics: The object metamodel (PDD15). The lexicals implementation (PDD20). Parrot's bytecode format (PDD13). Other PDDs may fall into this category too, as the implementation approaches the design and we start to find the flaws.
The first question I suppose we need to ask ourselves as a community is this: What is the purpose of these design documents? The second question we might want to ask is: Do we want to keep these documents at all? A good third question is, depending on the answer to the previous question:  How do we want to go about improving the design documents to be what we want and need them to be?

The Parrot design documents can really be one of two things. The first is a form of summary documentation. Basically, the design documents would be a set of documents that distills what Parrot is currently capable of, and how it can be used. In other words, "interface documentation", or "man pages". A variant of this is to use the design docs as an a posteriori way to justify decisions that are made off-the-cuff by developers after they've already made their commits.  A second possibility is that the design documents should be the forward-thinking technical goals of the project, the lofty goals that every commit and every release strives to reach, even if we never quite attain it perfectly. I think there is far more value in the later option, and I'll explain why.

First off, we have lots of function-level documentation. We have automated tests which read our source code files and verify that we have documented (at least in a minimal way) every single function. We also have lots of tests, although admittedly tests make for pretty lousy documentation in a general sense. They can be used as a kind of reference, of course, to see how something works, but you often need to know what you're looking for before you go out to find it. We have tons of code examples too. We also have an entire book to teach some things, though it could use some work in it's own right. Our documentation for how to use Parrot is not always great, but we do have it in plenty of other places that we don't need to use the PDDs for that purpose. If documentation is lacking, we should improve the documentation, not subvert the design documents to serve as another layer of it.

Forward-thinking, lofty designs are extremely valuable. Consider the example of a coder who finds a missing feature in Parrot and no design for such a feature available. So she takes it upon herself to come up with a design and implement it. Weeks later she emerges and shows the fruits of her labors to the community with a request to merge her work into trunk. We as a community look it over, find a few problems and, with good cause, we reject it as bad design. Not just something that we can tweak to become suitable, but something that is fundamentally wrong for us. Thanks but no thanks. Smell you later, Alligator. That contributor will probably storm off and will never be heard from again. Also, with good cause.

Now let's look at a similar example, except we as a community have done the work ahead of time of writing out our intended designs for this fancy new feature. We describe exactly what we want, and what any implementation is going to need. Our same intrepid developer follows this design, and when she emerges with her labored fruit, it is much more acceptable. With some feedback and small tweaks, it is approved and merged to trunk.

I don't want to say this is super common, but it isn't unheard of either for people to show up to an open source project, unannounced, with a gigantic patch for a new feature. It also isn't completely unheard of for those gigantic patches to be summarily rejected. The more common case is where we have interested and energetic developers showing up to the Parrot project looking for problems to tackle. Saying "We don't have JIT, we could really use one" is far more daunting than "we don't have a JIT, but we have a design, and a list of prior art that we would like to model on." Following a map to your destination is much easier than having to design your own map first and then try to follow it.

There's an issue of motivation too. A person is much more willing to start working on a project when they have certainty that they are doing the correct thing, and that the software they produce will be usable and desirable. It's much easier to follow a path, even a very bare one, than it is to cut your own. Not to mention that the destination is much more certain.

This year, Parrot had 5 GSoC students assigned to it. Of those five, four of them contacted me personally about specific projects I discussed here on this very blog before submitting applications. I don't take any credit for anything, I'm certain Parrot would have had several high-quality applications and projects without me. But I do know that people are more quickly and easily able to latch onto fully-formed ideas more than they can attach to nebulous and vague ideas. Also, people may not even be aware that their interests and skills align with things that Parrot needs until they know exactly what Parrot needs.

If we--as a community-driven open source project--want to increase the size of our developer pool (and I suggest that we should always want that), we need to communicate what we need and help prospective developers align themselves with those needs.

When a new person comes to the Parrot chatroom, or the Parrot mailing list, and says "I would like to get involved in a Parrot development project, what can I do?", we can say something stupid like "Look around and try to determine for yourself what needs to be done", or "We need everything". That's not helpful and not encouraging, even if it's the truth! Instead, we can say "Look, we have a list of projects that we've designed and prepared for, but we haven't been able to implement yet. Want to take a stab at it?" The former usually leads to a confused developer who never comes back. The later can lead to a new active, empowered, permanent member of our development team.

For the sake of this discussion, we'll accept the axiom that more active developers in the project is generally a good thing, and losing existing developers, or raising the barrier to entry so high that new developers do not join is a bad thing. I'll argue the point till I'm blue in the face, if anybody wants to take me up on it.

In this sense, having better designs and plans means a lower barrier to entry for new users since it's easier for them to find a project to work on and they can begin work with more surety that what they are doing will ultimately be desirable and acceptable to the community. It's also a good motivator for existing community members. When I finish a project and have some (rare) spare time on my hands, it's better for me to be able to go right down to the next item on a checklist instead of having to look around blindly trying to find something that needs to be done. Sure, there are ways I could focus my search, but I still have to hope that something obvious appears in my focused search that I can work on.

All that said, I think I can answer the next few questions pretty quickly:
 

Do we need these documents at all? Yes, I think we do. They can serve as an important tool to guide new and old developers alike. They can help inform and populate specific tasklists on the wiki and elsewhere, and serve as an organizational focus for teams of developers looking to improve specific areas of Parrot. Good design documents can also be used as a tool to initiate a bidirectional communications with our consumers: projects that use Parrot as an integral part of their tool chains. I'll expand on this issue in particular in a later blog post.

How do we want to improve these documents? The time when we can all sit back and wait for a design to magically appear from nowhere is over. Good riddance. We have plenty of people in our project who are not only capable developers, but actually pretty great software engineers and software designers. Beyond that, we have lots of people, both in our core project and throughout the ecosystem, who know all the kinds of things that are going to be necessary for projects built on Parrot to be successful. Getting enough smart minds together to tackle a design challenge should be trivial.

My suggestion is this, though it's only one possible suggestion and I am not going to argue at all about details. For each design, we form a team of dedicated developers who could be considered experts on the particular topic. It would be trivial, for almost any design document we have, to put together a team of 3-4 people with some expertise in that subject. With a team, we could go through a regular checklist to produce a decent design document:

  1. Survey existing research, and find papers and prior art that we like.We could do this as a community before even creating the design team.
  2. Get input from the ecosystem, and maybe a special advisory panel, to get an idea for what kinds of features are required, which features would be nice, and what kinds of things to avoid.
  3. Give the design team some time to prepare and present a draft to the community
  4. We all paint the bikeshed for a few days
  5. We accept the design (assuming we actually accept it), and start developing towards it
If we have Parrot developer design meetings (PDS) every 6 months, that would seem like a great demarcation for this kind of process. At one  PDS we identify an area that needs design help and assemble an initial team to pursue it. At the next PDS we check out the findings, maybe approve them, and start the concrete development work. If we need to look things over and push approval off, we can do it at a subsequent weekly #parrotsketch meeting.

In any 6-month stretch, we are working on a set of development priorities which have already been properly designed, and we are preparing designs to work on for the next six month stretch. Our last PDS meeting was April 2010. That means we're due for one coming up in October or November if we want to stick to a 6 month schedule. I definitely think we should try to have a meeting thereabouts so we can re-focus on our current development and current design priorities.

TL;DR: Parrot's PDDs are in a bad state, but they really do serve an important purpose and we need to make sure they get updated. PDDs should not be short summaries of other existing documentation. Instead, they should be forward-looking documents that describe what goals we are trying to reach. We need to get input from developers and also Parrot's consumers and end-users in shaping those PDDs. I'll post more about this later.

    2 comments:

    1. The majority of text in PDD25 consists of descriptions of technology, and reads more like an excerpt for "Threading for Dummies" than a real design document.

      I think a short overview of existing technologies is actually quite relevant: it's the "literature review", which will help readers in understanding the alternatives. What's missing then, is the actual decision about which technology to use, and the rationale for that decisions. Since Parrot's been around for many years now, and people come and go as a natural process, it's very important to document why certain decisions were taken, so that new people can understand and appreciate those decisions. All considerations made by previous designers, architects and others should be stored, otherwise it's done over and over again, people start questioning why things are the way they are, etc. Also, if over time things have changed, such as constraints w.r.t. hardware requirements, etc., then yes, decisions may be changed, and also the rationale for why a decision was changed should be documented.

      ReplyDelete
    2. You are right on both counts. A certain amount of literature review is very important. It provides the reader with the necessary background information to help arrive independently to the same conclusions that the designers did. That's incredibly valuable.

      Also, you are right in pointing out that the historical significance of changes to a design over time are important as well. I could take the easy way out and remind everybody that the PDDs are in the svn repo, and anybody who is interested can browse through the long and convoluted commit histories for that, but that really isn't a decent response.

      When we update and modify our design documents it will be important that we include our justification: Why we rejected the old design (if any), why we chose to do things the way that we did, what the trade-offs are, etc. A good first step in this regard might be to distill down the current PDDs into relevant summaries, and then include those summaries in future design documents.

      ReplyDelete

    Note: Only a member of this blog may post a comment.