Glorp

Public Repository Utilities

May 22, 2004 12:30:27.606

Eric Winger, for his very first blog entry, talks about possible utilities for the public Store repository.

At one point he says "it's just a chunk of unorganized files". No, it's not. It's a bunch of records in a database, which is an entirely different thing, and much easier to manipulate, at least in theory.

Among the possible improvements, he mentions

  • A description field for packages/bundles. Well, there already is one: it's called the comment. The biggest problem for utilities that use it is that the public Store uses PostgreSQL, and the VW PostgreSQL driver stores long strings by base64-encoding them. This makes it rather difficult to run queries against them. I've talked about this with Bruce Badger, and he agrees it was a mistake to do it that way, but changing it, unless done very cleverly, would break older clients, so it's a big deal. One possibly easier approach would be to have something crawl the database (like SmalltalkDoc), write out the comments, and then use the Google API to search among the comments. This would also give you a powerful search without having to discover much structure in the comments. Generating the pages would be easy. I don't know what's involved in hooking up to Google or the like.
  • "Change the Package/Bundle browser to allow packages that are inside other bundles to not be displayed." That's doable, although it does slow down the initial query against a large database somewhat (e.g. getting the otherwise unfiltered list from the public repository takes about 10s instead of 1s; against the internal Cincom DB it's about 13s instead of 5s). It's also not always quite what you want, because if anyone ever puts something inside a bundle, that thing disappears from the filtered list forever. But it certainly helps. If you've used the StoreForGlorpReplication stuff coming in 7.2.1, you'll notice that it's there as a checkbox. It would be relatively simple for anyone good at GUIs to make something more like a published items list/repository browser using the same mechanisms.
  • A couple of comments about splitting the repository based on date updated, type of package, etc. I don't think that's a good idea. Better to be able to filter on fairly arbitrary criteria.
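
To make the base64 point above concrete, here is a minimal sketch in Python (not Smalltalk, and not the VW driver's actual storage code). It assumes only that the comment text ends up in the database as standard base64; the package names, comments, and helper functions are made up for illustration. A plain substring match against the stored value finds nothing, so every comment has to be decoded on the client side before it can be searched.

```python
import base64


def encode_comment(text: str) -> str:
    """Encode a comment the way a base64-storing driver might (assumption)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_store(comments: dict[str, str]) -> dict[str, str]:
    """Hypothetical stand-in for the comment column: name -> base64 text."""
    return {name: encode_comment(text) for name, text in comments.items()}


def naive_search(stored: dict[str, str], term: str) -> list[str]:
    """What a SQL LIKE on the raw column amounts to: match the encoded text."""
    return [name for name, blob in stored.items() if term in blob]


def decoded_search(stored: dict[str, str], term: str) -> list[str]:
    """Decode every comment first, then match. Works, but not in the database."""
    hits = []
    for name, blob in stored.items():
        text = base64.b64decode(blob).decode("utf-8")
        if term in text:
            hits.append(name)
    return hits


if __name__ == "__main__":
    store = build_store({
        "Glorp": "An object-relational mapping layer",
        "SUnit": "Unit testing framework",
    })
    print(naive_search(store, "object-relational"))
    print(decoded_search(store, "object-relational"))
```

Encoding the search term as well does not rescue the naive version in general, because base64 works on three-byte groups and the same word encodes differently depending on where it falls in the comment.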

Glorp

Index of Public Repository

May 22, 2004 20:10:08.116

Code speaks louder than words, so having a bit of time free late on a Saturday afternoon, I implemented the (most basic imaginable) version of the repository crawler and Google search of the public repository that I mentioned earlier. I've put it up here. The Google search part doesn't really work yet, presumably because we have to wait for Google to actually index those pages. But hey, it's free and easy to do.

The only restriction I could figure out for Google was to limit the search to a particular domain. I really wanted to restrict it to just those pages, so I put in PackageDescription as a default entry in the search field, figuring that it's a fairly unusual word that occurs on all of these pages, so it ought to serve as a restriction.
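
As a rough illustration of that trick, the sketch below builds a search URL that combines a domain restriction with the marker word. It is written in Python rather than Smalltalk, it uses Google's `site:` operator as one way to get a domain restriction (the search form on the actual pages may do it differently), and the domain `example.org` and the function name are placeholders, not the real setup.

```python
from urllib.parse import urlencode

# Marker word that appears on every generated page (taken from the post).
MARKER = "PackageDescription"


def search_url(user_terms: str, domain: str, marker: str = MARKER) -> str:
    """Build a Google query limited to one domain and to pages with the marker.

    The domain is a placeholder supplied by the caller; "site:" is the
    standard Google operator for restricting results to a domain.
    """
    query = f"{marker} {user_terms} site:{domain}"
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(search_url("glorp", "example.org"))
```

The marker word does the work the domain restriction cannot: it narrows results to the generated pages even when the same domain hosts plenty of unrelated content.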

It would be easy to make the HTML nicer (it's really, really, really basic), and to add in additional interesting information (e.g. when it was last published, by whom, possibly some of the interesting package properties). The code is published in StoreForGlorp, as the RepositoryCrawler class (all 15 methods of it).
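
For readers who don't want to load StoreForGlorp just to see the shape of the idea, here is a sketch of the page-generation half in Python. It is not the RepositoryCrawler code and makes no attempt to talk to Store: it assumes the package names and comments have already been read out into a dictionary, and every function name here is hypothetical. Each page carries the PackageDescription marker word so that a search can be narrowed to these pages.

```python
import html
import re
from pathlib import Path

MARKER = "PackageDescription"  # marker word placed on every page


def page_filename(package_name: str) -> str:
    """Turn a package name into a safe file name (illustrative scheme)."""
    return "pkg_" + re.sub(r"[^A-Za-z0-9._-]", "_", package_name) + ".html"


def render_page(package_name: str, comment: str) -> str:
    """Render one really, really basic HTML page for a package comment."""
    title = html.escape(package_name)
    body = html.escape(comment)
    return (
        f"<html><head><title>{MARKER}: {title}</title></head>\n"
        f"<body><h1>{MARKER}: {title}</h1>\n<pre>{body}</pre></body></html>\n"
    )


def build_pages(packages: dict[str, str]) -> dict[str, str]:
    """Map file names to page contents, plus an index linking to each page."""
    pages = {page_filename(n): render_page(n, c) for n, c in packages.items()}
    links = "\n".join(
        f'<li><a href="{page_filename(n)}">{html.escape(n)}</a></li>'
        for n in sorted(packages)
    )
    pages["index.html"] = f"<html><body><ul>\n{links}\n</ul></body></html>\n"
    return pages


def write_pages(pages: dict[str, str], out_dir: str) -> None:
    """Write the generated pages to a directory for a web server to publish."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name, content in pages.items():
        (target / name).write_text(content, encoding="utf-8")


if __name__ == "__main__":
    write_pages(build_pages({"Glorp": "O/R mapping <framework>"}), "out")
```

Adding the extra information mentioned above (publish date, author, package properties) would just mean passing more fields into `render_page`.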

This is all quite reminiscent of SmalltalkDoc, which Mark Roberts and Rich Demers have been working on, although that's more targeted at being a sort of PythonDoc equivalent for looking up APIs rather than a repository search mechanism. Having a very limited amount of information there to search can actually be a plus for that.

Object-Relational Mapping

JDO, EJB, etc.

May 25, 2004 16:13:38.016

There seems to be a great deal of fun going on in the Java community over JDO, EJB, and the like. JDO is an OO-database-influenced spec that's been moving in the direction of handling O/R persistence as well. I always had some issues with the way they approached things, but they were in the realm of legitimate technical differences. But it was always EJB that had the real mindshare, even though technically it was a disaster area. Now it appears that Oracle (i.e. TOPLink) and an open source product called Hibernate, which has a similar sort of architecture, are abandoning efforts to work with JDO and instead have managed to influence the direction of EJB 3.0 towards a much lighter-weight persistence mechanism.

It's all very interesting, looking in from the outside, and there's lots of entertaining name-calling. Most of what I saw was in a thread on theserverside.com, but alas that thread seems to have expired or been purged while I was off sick.

Anyway, I think things will get interesting in that world. I don't know why Oracle and Hibernate chose to go with EJB. I know that there were some tensions with the JDO approach, but "tensions" doesn't begin to describe the problems with EJB. On the other hand, EJB's persistence has by now failed spectacularly enough that the committee seems willing to admit the previous two approaches were a bad idea and do a complete rewrite, listening to someone who knows something. So perhaps they'll come out with something reasonable (but of course, also backward compatible with the previous two). Given my experiences with that group, I'm doubtful. Fortunately, it doesn't really affect me one way or the other.

Object-Relational Mapping

Application and Integration Databases

May 27, 2004 22:49:19.010

This is an interesting thought on Database Styles, spotted in Martin Fowler's bliki. One of the perpetual battles in application development is between the "database people", who believe everything should be centralized in the database, and that everything should be manipulated through stored procedures, and the "application people" who consider the database just a store for their objects. This helps explain the difference in attitudes. I don't know that he's right that a move towards decoupled services will make application databases more common, but I don't know that he's wrong.