Dare Obasanjo talks about the specs, the reality, and the gap between the two when it comes to the encoding of XML docs on the web:
All files are sent with a content type of text/xml and no encoding specified in the charset parameter of the Content-Type HTTP header. According to RFC 3023 which Mark Pilgrim quoted in his article that clients should treat them as us-ascii. With the above examples this behavior would be wrong in all four cases.
He then goes on to list the way a client application actually needs to deal with this conundrum - check for:
- the encoding given in the charset parameter of the Content-Type HTTP header, or
- the encoding given in the encoding attribute of the XML declaration within the document, or
Which is what I stumbled on for BottomFeeder a while back. I wish Dare had posted this back when I was stumbling in the dark :)
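
For the curious, the checks listed above can be roughly sketched in Python. This is only an illustration, not what BottomFeeder or Dare's code actually does: the function name `sniff_encoding` and the `fallback` parameter are made up for the example, and the final UTF-8 fallback is my own assumption rather than something taken from the list. The XML declaration check also assumes the document is in an ASCII-compatible encoding, so it will not find a declaration in a UTF-16 document.

```python
import re

# charset parameter of a Content-Type header value, e.g. 'text/xml; charset="utf-8"'
CHARSET_RE = re.compile(r'charset\s*=\s*"?([^\s";]+)', re.IGNORECASE)

# encoding attribute of an XML declaration at the very start of the document
XML_DECL_RE = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

UTF8_BOM = b"\xef\xbb\xbf"


def sniff_encoding(content_type, body, fallback="utf-8"):
    """Guess the encoding of an XML document fetched over HTTP.

    content_type: value of the Content-Type header (str or None)
    body: raw bytes of the response
    fallback: assumed default when neither check finds anything
    """
    # 1. The charset parameter of the Content-Type HTTP header.
    if content_type:
        match = CHARSET_RE.search(content_type)
        if match:
            return match.group(1).lower()

    # 2. The encoding attribute of the XML declaration in the document.
    if body.startswith(UTF8_BOM):
        body = body[len(UTF8_BOM):]
    match = XML_DECL_RE.match(body)
    if match:
        return match.group(1).decode("ascii").lower()

    # 3. Nothing found: use the assumed fallback.
    return fallback
```

With a bare `text/xml` header (the case Dare describes), the function ignores the us-ascii default and reads the declaration instead, so `sniff_encoding("text/xml", b'<?xml version="1.0" encoding="windows-1252"?><feed/>')` gives `"windows-1252"`.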