[ANN] LibXSLT for VisualWorks
February 12, 2004, 2:35:02 am
I've just finished publishing LibXSLT to the public store, so I'd like to talk about a few things: (a) what it is, (b) what it can do, with examples, and (c) how it was developed and the trials and tribulations involved with that.
For those of you not familiar with XSLT, take a look at http://www.w3.org and its section on XSLT. For those of you who are, libxslt is an XSLT processor as well as an XML parser, and much much more as a whole.
What is the LibXSLT package that I've just written? It's an interface that uses DLLCC to call the main API of libxslt from VisualWorks Smalltalk. Want to know more about libxslt? Visit here.
There are two ways you can use this package. You can talk directly to the library or you can use my Processor object. I recommend the Processor object 'cause it makes everything nice and simple, so I'll only give examples for that.
XML.XSLT.Processor new stylesheetString: self myXSLT transformXmlString: self myXML
That's the basics of it. You can also use files instead of strings, and for the transformation you can pass in a DOM tree as well (it just turns it back into a string and calls the string version). It'll give you back the transformed XML in string form, which you can then parse yourself again, write to a file, whatever you want.
It's that simple! - and it's fast. If you keep a hold of your instance of Processor, it'll keep its handle on the stylesheet you gave it. If you then let the Processor be garbage collected, it'll make sure all the required objects are freed from memory on the C side.
Right, so how was it made? First of all, I couldn't get the DLLCC builder to work. It tried to parse the .h files that I threw at it, but crashed when it hit some GNU extensions or something-or-other. So I had to give up on that approach quicksmart.
This left me to write the interface manually. It turns out this is really simple. To make a link to a C call in a DLL, you can write code like this:
strcat: str1 with: str2
    <C: strcat(char *str1, char *str2)>
    self error: 'eh? why did that fail?'
Pretty simple really. You can use all the usual C syntax in there for defining the interface. This is effectively a repeat of the .h file's information, except in a dynamic environment that can change it any time it wants!
I only ran into two hiccups in the entire process, so I'll describe those here.
The first one was that for one of the APIs, you give it a string that is a set of key-value pairs separated by ,'s and ='s. I was passing in an empty string '', which it was crashing on. Without any easy way to debug what was going wrong on the C end of things, I couldn't diagnose it. Anthony Boris has had a lot more time doing this sort of thing and recognised it straight away, so now I pass in a nil instead of a '', which turns into a NULL on the C end.
The other one was where the C library would give me a pointer to a string and fill in another pointer to tell me how long it is. How do you model this in Smalltalk? I'll describe more below. I almost had the right answer, but lacked the experience to pull it off solo. Again, Anthony Boris came to the rescue, along with Eliot Miranda (the VM guy at Cincom).
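For comparison, here is roughly what the C side of that binding looks like. This is a minimal sketch, assuming the standard strcat from <string.h>; the helper name append_checked and its size check are made up for illustration and are not part of the package. The point is that strcat writes into its first argument, so whoever calls it, from C or from Smalltalk, has to hand it a buffer with room for both strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to dest using the standard strcat.
   dest_size is the total capacity of dest in bytes.
   Returns dest on success, or NULL if src would not fit. */
char *append_checked(char *dest, size_t dest_size, const char *src) {
    assert(dest != NULL && src != NULL);

    /* Room is needed for both strings plus the terminating NUL. */
    if (strlen(dest) + strlen(src) + 1 > dest_size) {
        return NULL;
    }

    /* Standard prototype: char *strcat(char *dest, const char *src); */
    return strcat(dest, src);
}
```

Calling append_checked(buf, sizeof buf, "Works") on a 16-byte buf holding "Visual" leaves "VisualWorks" in buf; with an 8-byte buffer it returns NULL instead of overflowing.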
Eliot had something interesting to say about this area of Cincom Smalltalk's DLLCC - it's icky.
afunction: buffer len: length
    <C: afunction(char **buffer, int *length)>
    self error: 'something went wrong'
That's the interface to it, and that's the easy bit. Next we need to provide the char** and int* to the function call. This is the hard bit. Then, once the call is done, we need to extract the information. And as a final thought, garbage collection? I have the distinct impression that I need to be deallocating the string the library has given me, but I'm not sure how.
afunctionwrapper
    | rc stringBufferPtr stringLenPtr resultString resultSize |
    "Allocate storage for the two out-parameters."
    stringBufferPtr := CPointerType defaultPointer gcMalloc.
    stringLenPtr := CIntegerType int gcMalloc.
    rc := self afunction: stringBufferPtr len: stringLenPtr.
    rc < 0 ifTrue: [self error: 'call failed'].
    "Copy the bytes out of the C buffer, then decode them."
    resultSize := stringLenPtr contents.
    resultString := ByteArray new: resultSize.
    stringBufferPtr contents copyAt: 0 to: resultString size: resultSize startingAt: 1.
    resultString := resultString asStringEncoding: #'utf-8'.
    ^resultString
And that's how it's done. As Eliot said, this should really be much nicer than it is, but at least it's possible once you know how.
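On the deallocation question raised earlier, it may help to see the same pattern from the C side. The sketch below is a guess at the general shape, not the real library code: afunction here is a hypothetical stand-in that allocates with malloc, and fetch_result is an invented caller. With a library that fills in a char** like this, the caller usually owns the buffer afterwards and has to release it with whatever deallocator the library documents (for code in the libxml2 family that is typically xmlFree, but the exact call is an assumption worth checking against the docs).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library call shaped like
   afunction(char **buffer, int *length): the library allocates the
   string and reports its address and length through the two pointers.
   Returns 0 on success, -1 on failure. */
int afunction(char **buffer, int *length) {
    static const char text[] = "<result/>";
    int len = (int)(sizeof text - 1);

    if (buffer == NULL || length == NULL) {
        return -1;
    }
    char *copy = malloc((size_t)len + 1);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, text, (size_t)len + 1);   /* copies the NUL too */
    *buffer = copy;
    *length = len;
    return 0;
}

/* Caller side: pass the addresses of a char* and an int, copy the
   bytes out, then release the library's buffer with the matching
   deallocator (free here, because the stand-in used malloc).
   Returns the string length, or -1 on failure. */
int fetch_result(char *out, int out_size) {
    char *buffer = NULL;
    int length = 0;

    assert(out != NULL && out_size > 0);

    if (afunction(&buffer, &length) < 0) {
        return -1;
    }
    if (length >= out_size) {
        free(buffer);                      /* still ours to release */
        return -1;
    }
    memcpy(out, buffer, (size_t)length);
    out[length] = '\0';
    free(buffer);
    return length;
}
```

In the Smalltalk wrapper, the equivalent step would be to call the library's free routine on the pointer stored in stringBufferPtr once the bytes have been copied into the ByteArray.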