A seamless integration of VW's tracing collector with WebKit's reference counting.
The two remaining issues with my WebKit/VisualWorks integration are garbage collection and passivation. Garbage collection is solved, which is what this post is about.
The entire structure of WebKit is arranged as a tree, starting with a WebView, which contains a Page containing a hierarchy of Frames, each Frame containing a Document which contains a tree of typed Nodes. Each such object is reified in Smalltalk as a separate class, which luckily can be autogenerated from the IDL, because there are lots of node types. The tree structure is further enforced by the reference counting memory management scheme. Counted references point from parent to child, and while there are references from child to ancestor, particularly from Nodes to their Document, they are not reference counted, and are maintained by the ancestor. The Document and its Nodes can contain EventHandlers, which are also reference counted. Thus a critical restriction is that an EventHandler cannot contain a counted reference to a Node because that would, in the absence of specific lifecycle treatment, create an unrecoverable cycle.
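To make the cycle concrete, here is a toy model in Python. It is only a sketch: the `Counted` class and the `build` helper are hypothetical stand-ins for illustration, not WebKit's actual classes or API. It shows a node that is destroyed normally when its handler holds no counted reference back, and one that can never reach a count of zero when it does.

```python
class Counted:
    """Toy reference-counted object (hypothetical, not a WebKit class)."""

    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.held = []          # objects this one holds counted references to
        self.destroyed = False

    def ref(self):
        self.refs += 1

    def deref(self):
        self.refs -= 1
        if self.refs == 0:
            # Last reference gone: destroy and release everything we hold.
            self.destroyed = True
            held, self.held = self.held, []
            for other in held:
                other.deref()

    def hold(self, other):
        """Take a counted reference to another object."""
        other.ref()
        self.held.append(other)


def build(handler_holds_node):
    node = Counted("node")
    handler = Counted("handler")
    node.ref()                  # the owner's reference, e.g. from the parent
    node.hold(handler)          # node -> handler, counted
    if handler_holds_node:
        handler.hold(node)      # handler -> node, counted: closes the cycle
    node.deref()                # the owner lets go
    return node, handler
```

Without the back reference, releasing the owner's reference destroys both objects. With it, the node's count stays at 1 and the handler's stays at 1, and nothing outside the pair can ever bring either to zero.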
Integrating WebKit into VisualWorks involves both pointers into the tree, i.e. a Smalltalk reification of a WebView/Frame/Document/Node and their subclasses, and pointers out of the tree, i.e. injecting Smalltalk objects as EventHandlers. The obvious technique is to have references from Smalltalk increment/decrement the referent's reference count, and for Smalltalk EventHandlers not to be garbage collected by Smalltalk as long as their reference count is > 0, just as the tree semantics require. Unfortunately this very quickly leads to uncollectable cycles, as previously described: a handler that refers back to the reification of its own Node, for example, closes the loop. Requiring the Smalltalk programmer to be aware of this ugly detail is unacceptable.
There is a solution, and the key is the traceability of the WebKit tree. The idea is to not allow the reference counting on outgoing references to retain the Smalltalk event handler, i.e. the event handler can be garbage even if the tree holds a strong reference to it. Instead the event handler objects are managed using VW's tracing garbage collector. Every Smalltalk Node or WebView reification conceptually has a reference to every reachable event handler in its (sub)tree. If a Smalltalk event handler object can't be reached during a trace, then it will be collected, and the object's finalisation will remove it from the DOM. If the reference count reaches 0 of its own accord before this happens, then the object is removed anyway and becomes garbage.
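The first half of that idea, a tree reference that does not keep the handler alive plus finalisation that detaches it, can be sketched with Python's `weakref` module. This is an analogy under stated assumptions, not the VisualWorks mechanism: `DomNode`, `Handler`, `add_listener` and `dispatch` are hypothetical names, and Python's collector stands in for VW's.

```python
import gc
import weakref


class DomNode:
    """Hypothetical stand-in for a WebKit node; not the real API."""

    def __init__(self):
        self.listeners = {}     # token -> weak reference to a handler

    def add_listener(self, handler):
        token = id(handler)
        # The node's reference does not retain the handler.
        self.listeners[token] = weakref.ref(handler)
        # Finalisation: once the handler is collected, detach it from the node.
        weakref.finalize(handler, self.listeners.pop, token, None)
        return token

    def dispatch(self, event):
        results = []
        for ref in list(self.listeners.values()):
            handler = ref()
            if handler is not None:
                results.append(handler(event))
        return results


class Handler:
    """A handler object living on the collected side."""

    def __init__(self, label):
        self.label = label

    def __call__(self, event):
        return f"{self.label}:{event}"
```

While something on the collected side still refers to the handler, `dispatch` reaches it. Once the last such reference is dropped and the collector has run, the finalizer has already removed the entry from `listeners`. What this sketch does not model is the conceptual reference from each reification to the handlers in its subtree.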
The implementation involves extending the tracing code in the Smalltalk collector. Currently it just follows inst-vars. If the object is a tree Node then it needs to trace through the subtree, treating any Smalltalk objects referenced from the tree as virtual inst-vars, which are then traced in the normal fashion. Obviously the implementation needs to be a bit clever about tracing a subtree only once per collection cycle, probably by maintaining a virtual old/newspace marker for tree roots, but as they say: make it work, then make it fast.
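A minimal sketch of such a mark phase, in Python rather than in the VM, might look like the following. Everything here is hypothetical (`StObject`, `TreeNode`, `NodeProxy`, `virtual_inst_vars`, `mark`), and a plain per-cycle visited set stands in for the old/newspace marker mentioned above, which is a simpler technique than the one proposed.

```python
class StObject:
    """A Smalltalk-side object with ordinary instance variables."""

    def __init__(self, name, *inst_vars):
        self.name = name
        self.inst_vars = list(inst_vars)


class TreeNode:
    """A tree-side node: child nodes plus Smalltalk handlers attached to it."""

    def __init__(self, children=(), handlers=()):
        self.children = list(children)
        self.handlers = list(handlers)


class NodeProxy(StObject):
    """Smalltalk reification of a tree node."""

    def __init__(self, name, node):
        super().__init__(name)
        self.node = node


def virtual_inst_vars(node, visited_nodes):
    """Yield the handlers in a subtree, visiting each node once per cycle."""
    stack = [node]
    while stack:
        n = stack.pop()
        if n in visited_nodes:
            continue
        visited_nodes.add(n)
        yield from n.handlers
        stack.extend(n.children)


def mark(roots):
    """Return the set of Smalltalk objects reachable from the roots."""
    marked, visited_nodes = set(), set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(obj.inst_vars)                 # ordinary inst-vars
        if isinstance(obj, NodeProxy):              # plus the virtual ones
            stack.extend(virtual_inst_vars(obj.node, visited_nodes))
    return marked
```

A handler attached anywhere under a reachable proxy ends up marked. A handler attached only to a subtree that no reachable proxy covers does not, so it would be collected and finalised. A handler that points back at its proxy is harmless here, because a tracing collector is not troubled by cycles.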
Unfortunately there is one non-technical problem with all of this: how do I deploy something which requires such a change to the VM? I'd like this to be generally available, but that means NC users as well, and I'm not sure about the legal and financial issues surrounding that. Even if Cincom were to agree to put this in the VM, it would probably have to be a generic GC extension point, and then there's the issue of how to link in my specific tracing requirements. At the moment I'm not comfortable with the extension jumping back into Smalltalk via a class method because of the subject/object problem, especially during something as critical as GC tracing. Anyway, that's a little way down the track.