Thanks for your feedback and useful points, everybody! Sorry, I’ve been busy in the offline world the last couple of days.
First off, I’ve solved my original problem using a plugin that does a bit of string replacement before templates get rendered (probably similar to what smashingred mentioned). That takes care not only of RTEs but also of any other third-party code. Basically it boils down to removing the slash on a few "empty"/self-closing tags (meta, img, br etc.), which is quick and easy, with no noticeable performance hit. The rest consists of making sure I write good HTML in my templates, snippets and so on.
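For anyone curious what that kind of replacement might look like, here is a rough sketch in Python. It is not the actual plugin: the function name, the tag list and the regex are my own assumptions, and a plain regex like this will trip over attribute values that contain ">".

```python
import re

# Void ("empty") elements from HTML 4.01. Which tags a real plugin targets
# is an assumption here; adjust the tuple to taste.
VOID_TAGS = ("area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param")

# Matches e.g. <br/>, <br /> and <img src="a.png" alt="" />.
# Group 1 is the tag name, group 2 the attributes (without trailing space).
_SELF_CLOSING = re.compile(
    r"<(" + "|".join(VOID_TAGS) + r")\b([^<>]*?)\s*/>",
    re.IGNORECASE,
)


def strip_self_closing_slashes(html: str) -> str:
    """Rewrite XHTML-style <br /> as HTML-style <br>, for void elements only."""
    return _SELF_CLOSING.sub(r"<\1\2>", html)


if __name__ == "__main__":
    print(strip_self_closing_slashes('<p>Hi<br />there <img src="/a/b.png" alt="" /></p>'))
```

Non-void tags such as a stray `<div />` are left alone on purpose, since dropping the slash there would change what the markup means.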
Some comments to your replies:
I’m not trying to say that it’s horribly evil to do "fake" XHTML (sent as text/html and with the other slew of compromises mentioned earlier), so of course it’s okay to do that if you prefer. All I’m saying is, be very clear about the fact that you are sending HTML to the browsers. The MIME type is what’s important, not your DOCTYPE. So if you’re not using "real" XHTML because you need its features, why do it at all? You can have the same "web standards benefits" using strict, valid HTML4. But again, I’m not militant about this. I agree that you need to be pragmatic sometimes and go with project requirements beyond your control and so on. Besides, this whole discussion comes up every so often and always brings an "uh oh, here we go again" sigh to many people. And the last thing I wanna do is annoy people with endless nitpicking.
I’m just asking that people at least understand the issues. I know it’s tedious to do the research and there are more thrilling things to do than compare fine points of dry specs, but hey that’s part of the job as a pro.
Bravado, I appreciate your line of thinking, i.e. that XHTML seems to encourage more rigid coding practices, which is a good thing. A while ago I was at exactly that spot, asking myself the things you did. But consider this: "fake" XHTML is still just plain HTML for UAs. Actually, it’s invalid HTML, which is sort of ironic. Strict HTML4 itself is very unforgiving when it comes to validation. So it’s not HTML that’s "loose" or forgiving but the browsers’ handling, which is a different matter. Just remember that browsers will be just as forgiving if you write sloppy fake XHTML. Write sloppy real XHTML and your users are screwed.
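To make the "forgiving vs. screwed" difference concrete, here is a small sketch using Python’s standard library as a stand-in for the two browser code paths. That mapping is my assumption: `html.parser` is not a browser’s tag-soup parser and `xml.etree` is not a browser’s XML parser, but they behave similarly on this point. The helper names are made up for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Sloppy markup: unclosed <p> elements and a bare <br>.
SLOPPY = "<p>First paragraph<br><p>Second paragraph"


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_as_html(markup):
    """Lenient path: roughly what happens when markup is sent as text/html."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def parse_as_xml(markup):
    """Strict path: roughly what happens with application/xhtml+xml."""
    try:
        ET.fromstring(markup)
        return "ok"
    except ET.ParseError as exc:
        # A well-formedness error is fatal; nothing gets rendered.
        return f"fatal: {exc}"


if __name__ == "__main__":
    print(parse_as_html(SLOPPY))
    print(parse_as_xml(SLOPPY))
```

The lenient parser happily reports all three start tags, while the XML parser gives up at the first well-formedness error.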
Your code sample is valid. Closing p tags are indeed optional in HTML4. Many people, including me, prefer to close them anyway, for visual orientation. What I do is basically apply the practices that XHTML requires to my HTML4 as well. I close all tags, even when optional, I write out boolean attributes in full (<option selected="selected"> rather than the short form <option selected>) and a few other things. That way, I have a consistent coding style, which is one of the concerns you touched upon.
Script libraries: It would be unwise if they didn’t work correctly with proper HTML4. I’ve only really used YUI (look at what DOCTYPE they use in all their examples) and jQuery so far and no problems have come up at all. Actually, if I’m not mistaken, innerHTML is not supported by real XHTML, and a lot of JS trickery seems to be based on innerHTML. Real XHTML requires namespace-aware DOM methods like createElementNS. Do current libraries even handle that? Not sure. Do I want to write two versions of all my scripts? Nope.
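As a rough illustration of what "namespace-aware" means, here is a sketch using Python’s `xml.dom.minidom`, which exposes the same DOM Level 2 method names (`createElementNS`, `getElementsByTagNameNS`) that a browser script would call. It is only a stand-in for browser JavaScript, and `build_paragraph` is a made-up helper.

```python
from xml.dom.minidom import getDOMImplementation

XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_paragraph(text):
    """Build <html><body><p>text</p></body></html> with namespaced nodes."""
    impl = getDOMImplementation()
    doc = impl.createDocument(XHTML_NS, "html", None)

    # Every element is created with an explicit namespace URI, instead of
    # being parsed out of a markup string the way innerHTML would do it.
    body = doc.createElementNS(XHTML_NS, "body")
    para = doc.createElementNS(XHTML_NS, "p")
    para.appendChild(doc.createTextNode(text))

    body.appendChild(para)
    doc.documentElement.appendChild(body)
    return doc


if __name__ == "__main__":
    doc = build_paragraph("Hello, real XHTML")
    para = doc.getElementsByTagNameNS(XHTML_NS, "p")[0]
    print(para.namespaceURI, para.firstChild.data)
```

Compare that with a one-line innerHTML assignment and it’s easy to see why most scripts take the shortcut.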
Who knows what the future will hold (and how soon it will be here). (X)HTML5 looks promising, although there’s still work to be done and a controversial thing or two in there at this stage. Personally though, I think I’d rather bet on that than on XHTML 2.0. Making consistent error handling for browsers part of the (X)HTML5 spec is an especially brilliant idea. But let’s wait and see. Nice Maya astrology reference, btw. Hey, perhaps this issue will even be the catalyst for the end of the current Kali Yuga.
Just saw today that James Bennett coincidentally wrote about this stuff in a much clearer way than I’ve managed here, a good read for anyone still confused.
Read in this order:
Finally, to quote from the excellent Sitepoint FAQ that Bravado linked to (emphasis mine):
Should I use XHTML or HTML?
That depends on who you ask. There are a number of technical issues with this question, which preclude a simple and short answer. In reality, the latest W3C recommendation with widespread support is HTML 4.01. Unless you actually need any of the features that XHTML offers over HTML, there is no technical reason to use XHTML.
In order to actually benefit from using XHTML, you really need to understand the fundamental differences between XHTML and HTML. Such a site will only be available to a small minority of the surfing population, however.
Some web designers and developers prefer XHTML’s syntax rules over HTML’s. By following certain guidelines, you can use this syntax without technically using XHTML at all (see below). There are a number of potential pitfalls with this approach, but it is a possible way forward for those who absolutely want to type <br /> instead of <br>.
For ’future-proofing’ your documents, using a Strict doctype is more important than whether you use XHTML or HTML.