Glorp

Index of Public Repository

May 22, 2004 20:10:08.116

Code speaks louder than words, so with a bit of free time late on a Saturday afternoon I implemented the most basic imaginable version of the repository crawler and Google search of the public repository that I mentioned earlier. I've put it up here. The Google search part doesn't really work yet, presumably because we have to wait for Google to actually index those pages. But hey, it's free and easy to do.

The only restriction I could figure out how to apply to Google was limiting the search to a particular domain. I really wanted to restrict it to just those pages, so I put PackageDescription in as the default entry in the search field, figuring that it's a fairly unusual word that occurs on all of these pages, so it ought to serve as a restriction.
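To make the trick concrete, here is a minimal sketch in Python (not the Smalltalk code in RepositoryCrawler) of how such a query could be put together. It assumes Google's standard `site:` operator and `q` query parameter; the function name, the `example.com` domain, and the way the terms are joined are hypothetical and only for illustration.

```python
from urllib.parse import urlencode

# The marker word that appears on every generated page (from the post).
MARKER_WORD = "PackageDescription"


def build_search_url(user_terms, domain, marker=MARKER_WORD):
    """Build a Google search URL limited to one domain and to pages
    that contain the marker word. Hypothetical helper for illustration."""
    parts = [marker, user_terms.strip(), f"site:{domain}"]
    # Skip empty pieces so a blank search box still yields a valid query.
    query = " ".join(part for part in parts if part)
    return "https://www.google.com/search?" + urlencode({"q": query})


# Usage: example.com stands in for the real host of the crawled pages.
url = build_search_url("Glorp", "example.com")
```

Pre-filling the marker word in the search field, rather than hiding it, leaves the user free to delete it, which matches the "default entry" approach described above.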

It would be easy to make the HTML nicer (it's really, really, really basic), and to add more interesting information (e.g. when it was last published, by whom, possibly some of the interesting package properties). The code is published in StoreForGlorp, as the RepositoryCrawler class (all 15 methods of it).
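As a rough idea of what a slightly richer page might look like, here is a sketch in Python rather than the actual Smalltalk. Everything in it is an assumption: the function name, the field names, and the page layout are hypothetical, and the only detail taken from the post is that each page contains the word PackageDescription.

```python
from html import escape


def render_package_page(name, description, published_on=None, published_by=None):
    """Render one very basic HTML page for a package.
    Hypothetical layout; optional fields are omitted when unknown."""
    rows = [
        f"<h1>{escape(name)}</h1>",
        # The marker word gives the search something to key on.
        f"<p>PackageDescription: {escape(description)}</p>",
    ]
    if published_on:
        rows.append(f"<p>Last published: {escape(published_on)}</p>")
    if published_by:
        rows.append(f"<p>Published by: {escape(published_by)}</p>")
    head = f"<html><head><title>{escape(name)}</title></head><body>"
    return head + "\n" + "\n".join(rows) + "\n</body></html>"


# Usage with made-up values.
page = render_package_page("SomePackage", "A sample package", published_by="someone")
```

Escaping every value matters here because package names and comments come from whoever published them.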

This is all quite reminiscent of SmalltalkDoc, which Mark Roberts and Rich Demers have been working on, although that's more targeted at being a sort of PythonDoc equivalent for looking up APIs rather than a repository search mechanism. For a repository search, having only a very limited amount of information to search can actually be a plus.