One of the improvements I made in the last BottomFeeder release was in memory consumption - it's lower now. Why is that? The answer gets into the configuration of ObjectMemory, and the best place to start is the class side comment in class ObjectMemory. In general, VisualWorks divides memory into 7 "zones" that are managed by the VM. The key to this problem lies in the following definition:
NewSpace is used to house newly created objects. It is composed of three sub-spaces: an object-creation space (Eden) and two SurvivorSpaces. When an object is first created, it is placed in Eden. When Eden starts to fill up (i.e., when the number of used bytes in Eden exceeds the scavenge threshold), the scavenger is invoked and those objects housed in Eden and the occupied SurvivorSpace that are still reachable from the system roots are copied to the unoccupied Survivor Space. Thereafter, those objects that survive each scavenge will be shuffled by the scavenger from the occupied SurvivorSpace to the unoccupied one, until such time that the aggregate size of these survivors threatens to make the scavenge pause excessively long (i.e., when the number of used bytes in SurvivorSpace exceeds the tenure threshold), whereupon the scavenger will attempt to speed up subsequent scavenges by moving some of the older surviving objects from NewSpace to OldSpace. We say that such objects are being 'tenured' to OldSpace.
In short: new objects get created in Eden, and the ones that survive a scavenge get bounced to a survivor space. If the survivor spaces fill up too quickly, objects get tenured into Old Space. Now, Old Space is the only part of Smalltalk memory that grows at runtime - and the scavenger doesn't clean it up, so anything tenured stays there until one of the more expensive collectors gets around to it. So... if you create a lot of objects quickly - and don't have enough New Space for them - they get tenured, even if they are about to become garbage. The result is memory growth that you really don't need. That's exactly what BottomFeeder was doing.
During the update cycle, here's what happens if a site gets queried:
- Fetch the XML source for the page
- Parse the XML source into an XML document
- Convert the XML document into a feed object with items
- Process the items against what we already have for that feed, adding any new ones to it
That's repeated for each feed that gets updated, which adds up to a lot of new objects, particularly if you have threaded updates on. In 3.7 and prior releases, New Space was too small - so lots of the transient objects (the XML document, the raw page source, etc.) were being tenured as New Space filled. With 3.8, I made sure that New Space is bigger. This had a big impact - I subscribe to 295 feeds, and BottomFeeder now consumes 25MB less than it used to - all just from looking at how my application used memory.
There can be big wins here for any application - both in footprint and performance. Have a look at the documentation for class ObjectMemory, and do some experimenting with the #sizesAtStartup: method.
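
As a starting point for that experimenting, here is a minimal workspace sketch. It assumes that #sizesAtStartup: takes an Array of seven floating point multipliers, one per zone, each applied to that zone's default size, and that the first two slots are Eden and SurvivorSpace. That's how I read the class comment, but check the comment in your own image before relying on the order. The unary getter #sizesAtStartup is likewise an assumption on my part, and the multiplier values are purely illustrative, not the ones BottomFeeder ships with.

```smalltalk
"Assumed slot order - verify against the ObjectMemory class comment in your image:
 1 Eden, 2 SurvivorSpace, 3 LargeSpace, 4 StackSpace,
 5 CompiledCodeCache, 6 OldSpace headroom, 7 FixedSpace headroom."

"Inspect the multipliers the image currently starts up with (assumed getter)."
ObjectMemory sizesAtStartup.

"Illustrative values only: make Eden and the survivor spaces ten times
 their default size, and leave the other five zones at their defaults."
ObjectMemory sizesAtStartup: #(10.0 10.0 1.0 1.0 1.0 1.0 1.0).
```

As the name suggests, the new sizes only apply at startup, so save the image and restart it before measuring anything. Then run your application's heaviest allocation path (for BottomFeeder, a full update cycle) and compare the footprint against the old settings.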