Mar 09 2008 at 7:00pm
Is serving xhtml as html really such a crime?
I have a new article up at A Padded cell about choosing a doctype for your site. The article is mainly intended for people who are just starting to work with web standards. It is a complicated problem so I tried to keep it simple. I did think long and hard about whether to recommend an html or xhtml doctype. I decided to go with xhtml mainly because it encourages better coding habits.
The W3C has made it clear that it’s okay to serve xhtml as html. I came upon this several times in my research:
From the xhtml spec:
XHTML Documents which follow the guidelines set forth in Appendix C, “HTML Compatibility Guidelines” may be labeled with the Internet Media Type “text/html” [RFC2854], as they are compatible with most HTML browsers.
From the HTML and XHTML Frequently Answered Questions:
Why is it allowed to send XHTML 1.0 documents as text/html?
… However XHTML 1.0 was carefully designed so that with care it would also work on legacy HTML user agents as well. If you follow some simple guidelines, you can get many XHTML 1.0 documents to work in legacy browsers. However, legacy browsers only understand the media type text/html, so you have to use that media type if you send XHTML 1.0 documents to them. But be well aware, sending XHTML documents to browsers as text/html means that those browsers see the documents as HTML documents, not XHTML documents.
From the xhtml recommended media type usage:
The ‘text/html’ media type [RFC2854] is primarily for HTML, not for XHTML. In general, this media type is NOT suitable for XHTML. However, as [RFC2854] says, "[XHTML1] defines a profile of use of XHTML which is compatible with HTML 4.01 and which may also be labeled as text/html".
Most, if not all, of the warnings against serving xhtml as html only apply if you later go back and switch the MIME type to application/xhtml+xml if/when browsers support it. Who is going to do that?
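For anyone curious what that switch would involve, here is a rough sketch in Python of picking a media type from the browser's Accept header. The function name choose_media_type is hypothetical, and the parsing is deliberately simplified (it ignores wildcards and does not rank by q-value), so treat it as an illustration rather than a recommended setup.

```python
def choose_media_type(accept_header):
    """Pick a media type for an XHTML 1.0 document (illustrative only).

    Returns "application/xhtml+xml" only when the Accept header lists it
    explicitly with a non-zero q-value; otherwise falls back to "text/html",
    which is what legacy browsers understand.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # q defaults to 1 when not given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if media_type == "application/xhtml+xml" and q > 0:
            return "application/xhtml+xml"
    return "text/html"


print(choose_media_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_media_type("text/html, */*"))
```

The first call picks application/xhtml+xml because the header names it; the second falls back to text/html. Once a page really is served as application/xhtml+xml, the browser parses it as XML, which is where those warnings start to matter.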
I was a bit hesitant because it is clear that serving xhtml as html isn’t the ideal situation. However, for me this was outweighed by the extra strictness of xhtml. If new users only used html they wouldn’t learn that you need to write in lower case, close all your tags, quote attributes, etc. Xhtml requires you to develop better coding habits and, even as an experienced user, I like the extra level of detail from the validator.
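To give a feel for that strictness, here is a small sketch that runs two fragments through Python's standard XML parser. The helper is_well_formed and the sample fragments are made up for illustration. Note that this only checks XML well-formedness (closed tags, quoted attributes); it is not the W3C validator, and it does not check the xhtml rules that come from the DTD, such as lower-case element names.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Typical "tag soup": unquoted attribute value and an unclosed <br>.
sloppy = '<p class=intro>First line<br>Second line</p>'

# The same content written the xhtml way.
strict = '<p class="intro">First line<br />Second line</p>'

print(is_well_formed(sloppy))
print(is_well_formed(strict))
```

The sloppy fragment is rejected and the strict one is accepted. An html parser would happily take both, which is exactly why beginners never find out about the difference.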
I really wish they would consider adding these requirements to the html 5 spec. It’s kind of ridiculous that you are permitted to omit the start and/or end tags for such vital elements as <html>, <head>, and <body>.
Can anyone help me find where, on the W3C site, they list the elements where start/end tags are optional? I came across it at one point during my research for that article but now I can’t find it again.


