Smalltalk

multiple returns (maybe with decent formatting this time)

December 2, 2003 22:37:21.172

Warning: this contains code fragments with the pre tag, which look OK in a web browser but not in BottomFeeder

A general bit of Smalltalk stuff, prompted by a discussion from comp.lang.smalltalk. Usually I avoid threads with 50 messages a day, but I sampled one randomly and it reminded me of a favourite topic. Credit to Anthony Lander for this insight. For the original, and some interesting material on doing 3-d graphics and video games with Smalltalk, see this link (PDF). This part is right at the end, pages 56-58.

Multiple returns is a language feature that lets a method return more than one value. It allows you to avoid having a method return a collection when all you really want is to return two values and get them into variables. Python is probably the most popular current language with this feature.

So, in Smalltalk, suppose that we have two collections, before and after, and we want to find the elements that are in one but not the other, delete the old ones that are gone, and insert the new ones. We could do this as

	beforeButNotAfter := self findElementsOf: before notIn: after.
	afterButNotBefore := self findElementsOf: after notIn: before.
	beforeButNotAfter do: [:each | each delete].
	afterButNotBefore do: [:each | each insert].

But apart from being verbose, that's probably not too efficient, since it makes two separate passes over the same pair of collections. It'd be nice if we could do it in one statement.

	beforeButNotAfter, afterButNotBefore := self findNonOverlappingElementsOf: before and: after.
	beforeButNotAfter do: [:each | each delete].
	afterButNotBefore do: [:each | each insert].

Here the comma on the left of the assignment would indicate that the method returns two values, with the first one going into the first variable and the second into the second variable. But in Smalltalk about the best we can do is to make a temporary collection

	result := self findNonOverlappingElementsOf: before and: after.
	beforeButNotAfter := result first.
	afterButNotBefore := result last.
	beforeButNotAfter do: [:each | each delete].
	afterButNotBefore do: [:each | each insert].

Which isn't nearly as nice. The insight, however, is that blocks give us a lot of the same kind of power. Consider

	self 
		findNonOverlappingElementsOf: before 
		and: after 
		doing: [:beforeButNotAfter :afterButNotBefore |
			beforeButNotAfter do: [:each | each delete].
			afterButNotBefore do: [:each | each insert]].

Here, instead of multiple return into temporaries, we have a block whose arguments take the place of the temporaries. Our values still get into variables with nice names. In the original example that Anthony showed, there was an additional wrinkle. This was in the context of 3-d graphics, computing intersections of collections of planes, and the planes might not intersect. By using the block, the method can encapsulate the test for that condition, which we would otherwise have to put into our own code as, e.g., a nil test.
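
For illustration, here is one way the receiving side might look. This is a minimal sketch, not Anthony's code: the differences are computed naively with reject: and includes:, and the second method is a hypothetical 2-d stand-in for the plane-intersection case (its selector and the slope/intercept representation are made up for this example).

```smalltalk
findNonOverlappingElementsOf: before and: after doing: aBlock
	"Hand both differences to aBlock as its two arguments, instead of
	 answering them packed into a collection. Naive sketch."
	| beforeButNotAfter afterButNotBefore |
	beforeButNotAfter := before reject: [:each | after includes: each].
	afterButNotBefore := after reject: [:each | before includes: each].
	^aBlock value: beforeButNotAfter value: afterButNotBefore

intersectionOfLineWithSlope: m1 intercept: b1 andLineWithSlope: m2 intercept: b2 do: aBlock
	"Hypothetical example. Evaluate aBlock with the x and y of the point
	 where y = m1*x + b1 meets y = m2*x + b2. Lines with equal slope have
	 no single intersection point, so aBlock is not evaluated and we
	 answer nil."
	| x |
	m1 = m2 ifTrue: [^nil].
	x := (b2 - b1) / (m1 - m2).
	^aBlock value: x value: (m1 * x) + b1
```

With the second method, the caller never writes a nil test. Sending it slopes 1 and -1 with intercepts 0 and 2 evaluates the block with x = 1 and y = 1, while two lines of equal slope simply skip the block.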

For me, this was a real eye-opener. I've used blocks an awful lot, for an awfully long time, even using this kind of a pattern, but it had never occurred to me as being an alternative to multiple return.

miscellaneous

Re: Outsourcing to India in Business Week and at MIT...

December 1, 2003 11:45:19.128

As a description of corporate decision-making processes, this is too good....

Spotted in Philip Greenspun's Weblog

Glorp

Glorp in #Smalltalk

November 29, 2003 11:57:16.571

It was just brought to my attention that Glorp is included as one of the examples with #Smalltalk. (#Smalltalk is an open-source ANSI-compliant Smalltalk implementation that runs on the .NET CLR, written by John Brant and Don Roberts.)

This is pretty cool. Apart from the "I wish people would at least drop me a note when they do this kind of stuff" factor, my question is what this implies in terms of interoperability. I haven't looked too deeply into .NET or #Smalltalk yet, but one of the points is supposed to be that you can mix and match classes from different languages, including things like inheriting across languages. So I wonder if it would be possible to use Glorp in #Smalltalk to make non-Smalltalk classes persistent. Even if it's possible, I suspect there are a few small matters of implementation involved, but if it could be made to work, even with some limitations, that would be very cool.

...pondering the implications of putting a doesNotUnderstand: proxy into a C# class....

Glorp

More on Mapping Dictionaries

November 27, 2003 23:46:20.808

So, having defined at least some of our terms, how can we map this reasonably?

I'm speculating, because I haven't actually coded this, but here's what I'm thinking. Warning: this starts getting into deeply technical GLORP internals.

Assume a DictionaryMapping, which holds onto two sub-mappings, one for the key and one for the value. This dictionary mapping is going to build objects of some kind of special association type, call it GlorpAssociation. This holds onto a key, a value, and an owner, but it's not going to have a normal kind of descriptor, because it'll be reused in lots of different contexts. The dictionary mapping might or might not also need to know what the table is. It might just be implied by the sub-mappings.
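
To make the shape of that association object a little more concrete, here is a rough sketch. None of this exists in GLORP. The class definition uses the classic subclass-creation message (dialects differ on the exact form), and the category name, the owner:key:value: constructor, and the setOwner:key:value: initializer are all invented for illustration.

```smalltalk
Object subclass: #GlorpAssociation
	instanceVariableNames: 'key value owner'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Glorp-Sketch'

"Class side: hypothetical instance creation"
owner: anOwner key: aKey value: aValue
	^self new setOwner: anOwner key: aKey value: aValue

"Instance side"
setOwner: anOwner key: aKey value: aValue
	owner := anOwner.
	key := aKey.
	value := aValue

key
	^key

value
	^value

owner
	^owner
```

Rebuilding an owner's dictionary from a collection of these would then be a matter of something like associations do: [:each | dictionary at: each key put: each value].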

So to read one of these, either directly, or as a join, we'd need to gather up all the fields needed for both the key and the value (and possibly the owner id) and we'd need to add in the join criteria, if any, needed for both if they're foreign keys, plus the join criteria to the association table. I guess that means the dictionary mapping would need to know the association table after all, since it doesn't seem like the sub-mappings would.

For writing, recall that GLORP assumes that each row is uniquely owned by a particular object. This is because GLORP builds up a "row map" of rows, indexed by their owner. However, one notable exception is the link table used for a many-to-many relationship. There's no unique owner for those. So what GLORP does is create an object, called a RowMapKey, whose sole job is to own these intermediate rows. Basically, a RowMapKey holds two objects (or more, but let's ignore that for now), computes its hash based on their hashes, and compares equality by comparing both of those for identity (regardless of their order within the RowMapKey).
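
The equality and hashing rules just described could be sketched roughly like this. This is not GLORP's actual RowMapKey source: the instance variable names key1 and key2 and their accessors are assumptions, and bitXor: is just one order-independent way of combining the two hashes.

```smalltalk
"Assumes a class with instance variables key1 and key2"
key1
	^key1

key2
	^key2

= aRowMapKey
	"Equal when both keys hold the identical two objects, in either order."
	(aRowMapKey isKindOf: self class) ifFalse: [^false].
	^(key1 == aRowMapKey key1 and: [key2 == aRowMapKey key2])
		or: [key1 == aRowMapKey key2 and: [key2 == aRowMapKey key1]]

hash
	"bitXor: is commutative, so swapping key1 and key2 gives the same hash."
	^key1 hash bitXor: key2 hash
```

The point of the commutative hash is that two keys holding the same pair of objects in opposite order land in the same bucket, which the order-insensitive = requires.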

This is going to be similar. The association objects we're creating won't have a primary key, so we can't really cache them. That means (or at least suggests) that we can't rely on their identity, and if we can't do that, we can't use them as keys in the row map. So we'll probably need to create RowMapKeys for the key/value/owner trio, or else we'll have to allow these GlorpAssociation objects to act in the same sort of way that RowMapKeys do (which could be tricky, because there's some internal ugliness to make that work, see e.g. the RowMap method isRowMapKey:, which does x class == tests).

Then there's the question of APIs. Are there useful things you can do to query across a mapped dictionary, e.g. anyKeySatisfy:? Presumably, like an in-memory dictionary, anySatisfy: should operate on the values.
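
For comparison, this is how the in-memory equivalents behave in a workspace. It's only a sketch: anySatisfy: is part of the ANSI collection protocol and, on a dictionary, iterates over the values. The keys are queried here by going through keys, since anyKeySatisfy: is only a hypothetical selector at this point, and the sample data is made up.

```smalltalk
| extensions |
extensions := Dictionary new.
extensions at: 'Alice' put: 204.
extensions at: 'Bob' put: 317.

"Operates on the values: is any extension above 300?"
extensions anySatisfy: [:each | each > 300].

"Operates on the keys: does any name start with A?"
extensions keys anySatisfy: [:each | each first = $A].
```

Both expressions answer true for this data. A mapped dictionary would presumably want the same two questions to be expressible in a query block.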

The other API question is how to write the descriptors for this. Can we have a sufficiently nice model that all 9 of these cases can be expressed gracefully? I'm not sure. Separating out into key and value sub-mappings seems to handle a lot of the issues, but there may be others lurking.

So, lots of questions. Not so many answers yet.

Glorp

Mapping Dictionaries

November 27, 2003 23:35:04.167

Today someone was asking about dictionary mapping in GLORP. That's one of those things for which there are lots of good intentions, but no implementation right now. There's been a shell class DictionaryMapping in there since almost the earliest days, but it's missing that critical element of actually *doing something*.

So this made me think a little more about dictionaries, and here's what I'm currently thinking. (Before the Americans start wondering if I have a family at all, and why I spend Thanksgiving thinking about such things, I should point out that while Canada does celebrate Thanksgiving, we celebrate it in October, a time more appropriate to our climate).

Anyway, dictionaries break down into a number of cases, but as far as I can see there are only two axes. So we have an object "the owner", which has an instance variable holding a dictionary. Then there's a table that defines the dictionary entries. We'll think of that as defining an association, and within that table, we'll identify the key and value for each association. For each of these, we need to know

  1. Is something a primitive (that can map into a single field), or is it a mapped object with a descriptor?
  2. If it's a mapped object, is it embedded directly in the table, or is it a foreign key?

We also need to be able to figure out who the owner is for a particular association. I don't think there's any variation there, though. We'll always need a foreign key to the owner.


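Each of the key and the value is therefore one of three kinds (primitive, embedded, or foreign key), and from this we get 9 possible cases.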
  1. primitive key, primitive value
  2. primitive key, embedded value
  3. primitive key, foreign key to value
  4. embedded key, primitive value
  5. embedded key, embedded value
  6. embedded key, foreign key to value
  7. foreign key to key, primitive value
  8. foreign key to key, embedded value
  9. foreign key to key, foreign key to value
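
Those nine combinations can be enumerated mechanically, since the key and the value each independently take one of three kinds. A throwaway workspace snippet along these lines (the label strings are chosen just for this example) writes them to the Transcript in the same order as the list above, with slightly cruder wording:

```smalltalk
| kinds |
kinds := #('primitive' 'embedded' 'foreign key').
kinds do: [:keyKind |
	kinds do: [:valueKind |
		"One line per (key kind, value kind) pair: 3 x 3 = 9 lines"
		Transcript show: keyKind, ' key, ', valueKind, ' value'; cr]]
```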

Let's make this a little more concrete. Case (2) is what I'd consider the simplest case of a dictionary mapping. It's basically a special case of a collection, where the key comes from the same place as the value. Note that GLORP doesn't (or shouldn't) care if the key happens to also be an attribute of the value. So if we have a dictionary of Customers, where each customer has an account number, and we use that as the dictionary key, then that's case 2. It's also case 2 if a Customer object doesn't know its own account number, but it's stored in the CUSTOMER table (if that's really what you want).

Case (9) is more like a pure link table. The keys and values are independent objects in their own right, say Departments and Employees, and the association table just associates them into a dictionary. It's like a many-to-many, except that there are 3 foreign keys in the link table, not just two.

Case (5) is pretty weird. We have full-blown objects for both key and value, but they're embedded into the association, so they can't occur anywhere else. Handling this would be pretty ugly, because it might violate GLORP's internal assumption that any row is uniquely owned by a single object. But that might be OK if one of these objects was treated as an embeddedValue of the other.

OK, that's enough for one post. In the next post, we'll consider how we can deal with this.

Smalltalk

How many Smalltalkers?

November 27, 2003 21:34:11.742

The other day, at our user group meeting, a question that came up (more or less in the vein of asking how healthy the Smalltalk market is and whether I'm crazy for not just getting a real job) was how many Smalltalk developers there are. Specifically, are there 100,000 Smalltalk developers out there?

And I have absolutely no idea. I know there are at least 10, because I've met them in the last week or two. I'm pretty sure there are fewer than 10 million, because that's what Microsoft apparently claims for VB, and there have got to be a lot fewer Smalltalkers than that. I expect that Microsoft's accounting there is a little off, in that it probably counts people who are rather marginal as VB programmers - e.g. me. Still, it's some kind of a number.

So how do you count something like that? To make it easier, let's restrict it to Cincom Smalltalk users. You can't count VisualWorks developer licenses, because Cincom doesn't sell things that way. And even if it did, you'd be leaving out NC users. And Cincom probably wouldn't publish that kind of number if they had it. And that's just trying to count VisualWorks. Other dialects have their own distribution mechanisms, and lots of them don't count their users. Squeak, GNU Smalltalk, Smalltalk/X, #Smalltalk - they're all free downloads, and I think there are a lot of users out there. So I just have no idea how to calculate something like that, even as a wild-guess estimate.

So here's what I'm proposing. All Smalltalkers everywhere should send me an e-mail, and I'll count them.

I'M JUST KIDDING

... but does anyone have a better idea?

Object-Relational Mapping

Mapping going mainstream?

November 26, 2003 16:06:22.445

The other day I heard about Microsoft's technology associated with Longhorn for mapping. Today there's a joint IBM-BEA spec called Service Data Objects that talks about mapping not only to relational sources, but also to XML. IBM page here.

Again, I only glanced at it, but it seems rather more limited, and more XML-oriented than relational-oriented. But it certainly seems like the need for this sort of technology is becoming increasingly mainstream. Maybe we'll start seeing the same sort of Microsoft/Java one-upmanship in features that we've seen in other areas.