
development

Why client side strictness is wrong

January 14, 2004 8:49:19.712

Mark Pilgrim explains in complete detail why insisting that clients reject invalid XML out of hand is a bad idea. The best part is that, as I write this, he has five perfect examples:

Norman Walsh (invalid XML), Danny Ayers (invalid XML), Brent Simmons (invalid XML), Nick Bradbury (invalid XML), and Joe Gregorio (invalid XML claiming to be HTML) have all denounced me as a heretic for pointing out that, perhaps, rejecting invalid XML on the client side is a bad idea. The reason I know that they have denounced me is that I read what they had to say, and the reason I was able to read what they had to say is that my browser is very forgiving of all their various XML wellformedness and validity errors.

All of those folks have been extremely insistent that aggregators should reject bad content out of hand. By that standard, all of them are serving invalid feeds right now. They need to read the rest of Mark's post and consider carefully what he has to say. Here's how I checked the validity myself:

I used this code to check whether a feed was valid XML (strictly speaking, well-formed XML, since validation is turned off). It just grabs the source XML and tries to parse it, without handling any errors:

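"Feed URL to test"
source := 'http://inessential.com/xml/rss.xml'.
parser := XMLParser new.
"Turn off validation, so only the parse itself can fail"
parser validate: false.
"Fetch the raw feed source"
doc := (HttpClient new get: source) contents.
"No error handling here: any parse error goes unhandled"
^parser parse: doc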

I used this snippet of BottomFeeder code to confirm that the invalid XML was being handled:

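"Fetch the feed document through BottomFeeder's Constructor"
doc := Constructor
	documentFromURL: 'http://bitworking.org/index.rss' 
	forceUpdate: true 
	useMaskedAgent: false.
"Pick the class that should handle this document, and a target object for the result"
cls := Constructor determineClassToHandle: doc.
target := cls objectForData.
"Process the document into the target"
feed := cls 
	processDocument: doc
	from: 'http://bitworking.org/index.rss' 
	into: target.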

In each case, the simple parse fails with an error, while the BottomFeeder framework deals with the error and produces the content I want to see.
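To make the difference concrete, here is a minimal sketch of the first check with the parse wrapped in an exception handler. It is not BottomFeeder's actual code. It assumes that a bad feed makes the parser signal an Error (or a subclass), and the result variable and the Transcript message are only there for illustration.

```smalltalk
"Sketch only: assumes a malformed feed makes #parse: signal an Error (or a subclass)."
| source parser doc result |
source := 'http://inessential.com/xml/rss.xml'.
parser := XMLParser new.
parser validate: false.
doc := (HttpClient new get: source) contents.
result := [parser parse: doc]
	on: Error
	do: [:ex |
		"Report the problem instead of letting it go unhandled."
		Transcript show: 'Parse failed: ', ex description; cr.
		"Make the protected block answer nil."
		ex return: nil].
^result
```

If the parse fails, this answers nil rather than stopping with an unhandled error. A strict client would give up at that point; a forgiving one would use it as the cue to fall back on its own recovery and still show the reader something.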

Comments

Why client side strictness is wrong

[Danny] January 14, 2004 10:13:28.662

My comment seems to have been eaten - all I said in essence was that I was expressing a view about Atom, not RSS.

What invalid feed?

[Norman Walsh] January 16, 2004 14:35:10.968

I was, when Mark wrote that entry, serving invalid XHTML pages. My bad. That's fixed now. I do not believe that I was ever serving an invalid RSS feed. If you think one of them is invalid, I would definitely like to know, so please tell me. On the subject of the invalid XHTML, I must point out that my server sends the pages with a content type of text/html, so they aren't actually required to be valid XHTML. And finally, none of what Mark says moves me in the slightest.
