Tuesday, July 14, 2009

Parrot4Newbies: Encapsulation

If you were anything like me, you probably rolled your eyes when your professors talked about all sorts of abstract concepts like "abstraction" and "encapsulation". At the time they were just words that didn't seem to have a lot of meaning; big, new vocabulary words that made the lectures more difficult to follow.

Well, now that I've worked on bigger projects than "Implement a merge sort algorithm on an array of 100 integer values in Java", projects like Parrot, I have a better appreciation for what encapsulation is and why it's important.

There are many systems in Parrot which were written hastily and not always to the highest coding standards. I say that with full knowledge that concerns about performance and deadlines often outweigh the need to produce beautiful and maintainable code. In many cases a system is prototyped "quick and dirty" with the intent that it will be redone eventually, but the coder moves on to other projects immediately afterward. This is a normal part of the development process (especially in the world of Perl!), but "eventually" is now.

Several of Parrot's subsystems are poorly encapsulated, if any attempt has been made to encapsulate them at all. For people who are decent with C, and are good at cleaning up code without having to make all sorts of functional changes, I have some jobs for you:

The Strings System

The strings subsystem is a big offender in terms of nonexistent encapsulation. The "guts" of the STRING structure are poked at directly throughout the codebase, and different locations handle them differently. Parrot doesn't even really support read-only strings like it should right now, because strings are accessed in too many places to check for a read-only flag in all of them.

The string system needs to be encapsulated so we can make some much-needed improvements to stability, performance, and capability in the future. Cleaning up the API and hiding the details behind a clean interface are the first steps in any future development effort. The best part is that these fixes can be made incrementally, a perfect task for a new hacker.
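To make the idea concrete, here is a minimal sketch of what an encapsulated string type can look like in C. Every name in it (`demo_string`, `demo_string_append`, and so on) is hypothetical and invented for illustration; none of this is Parrot's actual STRING API. The point is only that when all mutation goes through one function, the read-only flag needs to be checked in exactly one place.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT Parrot's STRING API. */

/* Callers would only ever see this forward declaration (in a header). */
typedef struct demo_string demo_string;

/* The layout would live in a single .c file, hidden from everyone else. */
struct demo_string {
    char  *buf;
    size_t len;
    int    read_only;
};

/* Create a mutable string holding a copy of init; NULL on allocation failure. */
demo_string *demo_string_new(const char *init) {
    demo_string *s;
    assert(init != NULL);
    s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->len = strlen(init);
    s->buf = malloc(s->len + 1);
    if (s->buf == NULL) { free(s); return NULL; }
    memcpy(s->buf, init, s->len + 1);
    s->read_only = 0;
    return s;
}

void demo_string_free(demo_string *s) {
    if (s != NULL) { free(s->buf); free(s); }
}

/* Read accessors: the only way outside code learns about the guts. */
size_t demo_string_length(const demo_string *s) { assert(s != NULL); return s->len; }
const char *demo_string_cstr(const demo_string *s) { assert(s != NULL); return s->buf; }

/* Mark the string read-only from now on. */
void demo_string_freeze(demo_string *s) { assert(s != NULL); s->read_only = 1; }

/* Append suffix. Returns 0 on success, -1 if the string is read-only
 * or memory runs out. This is the single place the flag is checked. */
int demo_string_append(demo_string *s, const char *suffix) {
    size_t extra;
    char *grown;
    assert(s != NULL && suffix != NULL);
    if (s->read_only) return -1;
    extra = strlen(suffix);
    grown = realloc(s->buf, s->len + extra + 1);
    if (grown == NULL) return -1;
    memcpy(grown + s->len, suffix, extra + 1);
    s->buf = grown;
    s->len += extra;
    return 0;
}
```

A caller that does `demo_string_freeze(s)` and then tries `demo_string_append(s, "!")` simply gets -1 back, and no other code has to know the flag exists. The same structure would also let the buffer representation change later without touching any caller.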

Contexts

Contexts are very important things. They represent the current execution environment, containing the current register set, the set of scoped lexical variables, etc. Because Contexts are such a central component of Parrot, and because their API is so woefully unencapsulated, any improvements to the Contexts subsystem are slow or even impossible to make.

And I've discussed on this very blog some of the big projects we have planned for Contexts.

Contexts are very central to Parrot, so working to clean and properly encapsulate them will take the intrepid coder on a whirlwind tour of the Parrot core internals: the calling conventions, JIT, runcores, lexical variables, exceptions, etc. It's a big job, and a great opportunity to explore the Parrot codebase and get your feet wet. It's also another task that can be done in small incremental steps.

The benefits, however, are many, and I'll be happy to talk at length about them to anybody who is interested.
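As a rough illustration of the kind of cleanup involved, here is a small sketch of a context hidden behind accessor functions. The names (`demo_context`, `demo_context_get_int_reg`, and friends) and the layout are assumptions made up for this example, not Parrot's real Context structure or API. Callers ask for a register or the calling context through functions, so the storage behind them could change without every call site changing too.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch; NOT Parrot's actual Context layout or API. */

#define DEMO_NUM_REGS 8

typedef struct demo_context {
    long                 int_regs[DEMO_NUM_REGS]; /* this frame's registers */
    struct demo_context *caller;                  /* NULL for the outermost frame */
} demo_context;

/* Set up a fresh context whose registers are all zero. */
void demo_context_init(demo_context *ctx, demo_context *caller) {
    size_t i;
    assert(ctx != NULL);
    for (i = 0; i < DEMO_NUM_REGS; i++) ctx->int_regs[i] = 0;
    ctx->caller = caller;
}

/* Register access goes through these two functions only. */
long demo_context_get_int_reg(const demo_context *ctx, size_t idx) {
    assert(ctx != NULL && idx < DEMO_NUM_REGS);
    return ctx->int_regs[idx];
}

void demo_context_set_int_reg(demo_context *ctx, size_t idx, long value) {
    assert(ctx != NULL && idx < DEMO_NUM_REGS);
    ctx->int_regs[idx] = value;
}

/* The context that called this one, or NULL at the top. */
demo_context *demo_context_caller(const demo_context *ctx) {
    assert(ctx != NULL);
    return ctx->caller;
}

/* Number of callers above ctx, found by walking the caller chain. */
size_t demo_context_depth(const demo_context *ctx) {
    size_t depth = 0;
    assert(ctx != NULL);
    while ((ctx = demo_context_caller(ctx)) != NULL) depth++;
    return depth;
}
```

Converting direct field pokes into calls like these is exactly the kind of change that can be done one call site at a time.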

GC String Allocator

I've got a dirty little secret to share: Parrot really has two separate GC cores. Or, two facets of a single core (the terminology doesn't really matter). The first, which people talk about the most, is the fixed-size header allocator. The second, which goes oft-unmentioned, is the string buffer allocator.

The GC system itself is pretty well encapsulated, but its individual internal components are not. Specifically, the string allocator relies on intimate internal knowledge of the fixed-size header allocator core, which means replacing one requires massive edits (if not complete replacement) to the other. Separating the two behind a cleaner interface, so that one can be updated without affecting the other, would be a major boon for Parrot. This one task would be a major help in writing new GC cores, which in turn could bring a major performance boost for Parrot.
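One possible shape for that separation, sketched under heavy assumptions: the buffer allocator sees only a tiny interface for obtaining headers, and the fixed-size pool is just one implementation of it. All of the names here (`demo_header_source`, `demo_fixed_pool`, `demo_buffer_alloc`) are hypothetical, and the toy pool does no collection at all; it bears no relation to how Parrot's GC is actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch; NOT how Parrot's GC is actually structured. */

/* A header describing one string buffer. */
typedef struct demo_header {
    char  *buf;
    size_t size;
} demo_header;

/* The ONLY thing the buffer allocator knows about header allocation. */
typedef struct demo_header_source {
    void *(*get_header)(void *state); /* returns a header, or NULL if exhausted */
    void  *state;                     /* private to the implementation */
} demo_header_source;

/* One implementation: a toy fixed-size pool with no freeing or collection. */
#define DEMO_POOL_SIZE 4

typedef struct demo_fixed_pool {
    demo_header slots[DEMO_POOL_SIZE];
    size_t      used;
} demo_fixed_pool;

static void *demo_fixed_pool_get(void *state) {
    demo_fixed_pool *pool = state;
    if (pool->used == DEMO_POOL_SIZE) return NULL;
    return &pool->slots[pool->used++];
}

/* Reset the pool and wrap it in the generic interface. */
demo_header_source demo_fixed_pool_source(demo_fixed_pool *pool) {
    demo_header_source src;
    assert(pool != NULL);
    pool->used = 0;
    src.get_header = demo_fixed_pool_get;
    src.state = pool;
    return src;
}

/* Buffer allocator: gets a header through the interface, then attaches
 * storage. It never touches demo_fixed_pool directly. */
demo_header *demo_buffer_alloc(demo_header_source *src, size_t size) {
    demo_header *h;
    assert(src != NULL && src->get_header != NULL);
    h = src->get_header(src->state);
    if (h == NULL) return NULL;
    h->buf = malloc(size > 0 ? size : 1);
    if (h->buf == NULL) return NULL; /* the header slot is lost in this toy */
    h->size = size;
    return h;
}

/* Give back the storage; the header itself stays with its pool. */
void demo_buffer_release(demo_header *h) {
    assert(h != NULL);
    free(h->buf);
    h->buf = NULL;
    h->size = 0;
}
```

Swapping in a different header allocator would then mean writing a new `get_header` function and nothing more, as far as `demo_buffer_alloc` is concerned.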

Conclusion

These aren't the only systems and subsystems in Parrot that need better encapsulation; almost all of them do. This is just a representative set of the systems in the most dire need. Hackers new to the Parrot community should definitely try their hand at cleaning up the interfaces to some of our systems, because a little bit of cleanup work can go a long way toward helping Parrot grow.
