  • Hi,

    For various good reasons :), I’d like to keep (most of) my sites HTML 4.01 Strict. Currently I’m using TinyMCE as the RTE in MODx, but it only outputs XHTML. Is there any way of changing that?
    Alternatively, is there an RTE that can do that out of the box?

    Thanks!
    • You could try changing the doctype that TinyMCE or FCKEditor uses within the editor:

      http://wiki.moxiecode.com/index.php/TinyMCE:Configuration/doctype
      http://docs.fckeditor.net/FCKeditor_2.x/Developers_Guide/Configuration/Configuration_Options/DocType

      Not sure what effect that would have. You could also experiment with using an XHTML 1.0 Transitional doctype.

      What good reasons are there for continuing to use an HTML 4.01 Strict doctype? I’m not sure why you’re avoiding an XHTML doctype. Couldn’t the code for these sites be updated to strict, XHTML-compliant code? What’s the nature of these sites that requires an HTML 4.01 Strict doctype? I’m curious.

      The HTML 4.01 standard is over eight years old now. The developers of TinyMCE, FCKEditor and other RTEs use XHTML because that is what most web developers and designers predominantly use now. Short of going backwards and using an older version, I don’t know how much luck you’ll have finding something that adheres to an HTML 4.01 Strict doctype. To me, doing that is taking a bit of a step backwards.

      Here’s a pretty good article that explains some of the debate surrounding HTML vs XHTML doctypes:

      http://www.elementary-group-standards.com/html/why-xhtml.html

      Honestly, eventually you’ll have to start using the XHTML doctype to adhere to current standards and for accessibility reasons. It’s far better to keep up with the times than it is to be behind and have to quickly learn a buttload of stuff just to get up to speed. Just my two cents. :)
        Jeff Whitfield

        "I like my coffee hot and strong, like I like my women, hot and strong... with a spoon in them."
      • Bravado,

        thanks for those hints about configuring the doctype, will try that!

        I’ll try to state the reasons for my preferences regarding (X)HTML without writing a ten-page thesis :) I am not militant about this subject, but I think there’s a lot of ignorance and hype that needs to be addressed.

        Summary:


        • XHTML is not supported at all by any version of IE and won’t be any time soon, according to the IE team
        • serving different MIME types to IE and other UA’s via content negotiation is too much hassle and not bulletproof anyway; also, that’s not enough, as you need to actually alter the document itself (remove non-namespaced classic HTML attributes for XHTML, etc.) and maintain different versions of your CSS and DOM-manipulating scripts (e.g., root element in XHTML is html, in HTML4 it’s body)
        • even if all UA’s supported XHTML perfectly: the tiniest error will give users the Yellow Screen of Death, which I won’t accept for my clients and their audiences
        • these tiny errors can potentially be caused by too many factors that have nothing to do with my diligence or lack thereof as an author/developer
        • currently, I have no need for MathML or SVG
        • the semantics and tags/attributes are identical for HTML 4.01 and XHTML 1.0, the latter being simply a reformulation into XML of the former -- no more, no less (read the specs if you don’t believe me)
        • XHTML is in no way more "accessible" or "standards-compliant" or <insert fashionably abused term here> than a well-structured, semantically marked up, valid HTML 4.01 Strict document
        • forward-compatibility of XHTML 1.0 is a myth; XHTML 2.0 is a whole different animal, so the notion of quickly and painlessly switching over one day is an utter illusion
        • no mainstream user agent supports XHTML 2.0 today
        • XHTML 1.0 is only one year "younger" than HTML 4.01, so the notion that it’s the latest and greatest thing is also a misconception -- not that age is a valid argument, anyway
        • seeing that real XHTML is not a viable option for me, and seeing that serving XHTML as text/html is absolutely redundant as it’s parsed as "classical" HTML tag soup by UA’s, I choose to feed UA’s real, clean HTML to begin with, skipping their overhead of having to handle tag soup
        • mobile devices handle HTML nicely; if they didn’t they’d exclude the majority of content on the Web; XHTML is not a requirement
        • ... the subject is complicated but to wrap up: simply put, there is no need and no valid reason for me to use XHTML in this day and age
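
        To make the content-negotiation bullet above concrete, here is a minimal sketch in Python of picking a MIME type from a request’s Accept header. The function names parse_accept and choose_mime_type are hypothetical, and the policy used (serve application/xhtml+xml only when the client lists it explicitly, with a q-value at least as high as text/html) is just one plausible choice, not something prescribed in this thread. As the list points out, this step alone is not sufficient, since the document, CSS and scripts may also have to differ.

```python
from __future__ import annotations


def parse_accept(header: str) -> dict[str, float]:
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    prefs: dict[str, float] = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_mime_type(accept_header: str) -> str:
    """Hypothetical policy: XHTML only if listed explicitly and not ranked below HTML."""
    prefs = parse_accept(accept_header)
    # A bare "*/*" does not count as XHTML support; old IE versions send it.
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", prefs.get("*/*", 0.0))
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    # Illustrative headers only; real browsers vary by version.
    print(choose_mime_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
    print(choose_mime_type("image/gif, image/jpeg, */*"))
```

        A real deployment would also need to send a "Vary: Accept" response header so caches keep the two variants apart, which is part of the hassle mentioned above.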

        XHTML just happened to be a part of turn-of-the-millennium web standards advocacy by people such as Zeldman (understandably, the strict syntax offered a good way of forcing authors to abandon ’90s-style horror coding). These advocates showed me the light, too, and I’m glad to see standards becoming more and more established. But it’s a grave misconception that web standards are synonymous with XHTML or exclude/forbid HTML 4.01. The W3C expresses no preference for either recommendation.

        HTML 4.01 Strict + CSS + unobtrusive scripting/DOM manipulation + semantic markup + accessibility best practices are fine. Throwing XHTML in the mix does not make it one inch "better".

        Of course, serving XHTML as text/html (as 99% of XHTML sites probably do) according to Appendix C of the XHTML 1.0 spec is not the end of the world. It works, because UA’s are forgiving. So if anyone wants to continue to do so, they should feel free (I might have to, too, now and then in collaborative projects etc.). They should, however, be fully aware that there is no benefit over HTML4 and that it’s in no way more "modern" or "standards-compliant" or whatever. As I said in the beginning, I’m not militant about this subject, because right now, for practical purposes, it doesn’t really matter. I just don’t see why I should do things that make no sense.

        I’d like to see people become at least more informed about the markup language they use. That way, they might be less condescending towards people like me who make informed decisions. Bravado, in this case your reply was obviously well-intentioned and friendly (so no sweat :) ) but I have come across technically clueless people who become very haughty and obnoxious when trying to verbally demonstrate their supposed XHTML-"superiority", which I find very sad.

        Sorry, this response turned out to be slightly longer after all... A few years ago, if I’d read these words of mine I’d probably have thought, "What a pedantic loser!", but the fact is, this is about more than just abstract, minor pedantry.

        Almost forgot: The article you linked to seems to actually reinforce what I said here and contradict your statements, so I’m a bit confused. It concludes by saying:
        This site uses HTML 4.01/Strict utilizing the well-formedness requirements of “XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition): A Reformulation of HTML 4 in XML 1.0.” (excepting those XHTML-specific requirements found Appendix C, i.e., C.1, C.2 and C.3). This decision was based on three unassuming facts.

        * XHTML 1.0 is not forward compatible; XHTML 2.0 will not be backwards compatible.
        * Serving XHTML as application/xhtml+xml doesn’t work in IE.
        * HTML 5 purports backwards compatibility.
        • zendak,
          Well summarized.
          You’re not alone here; I’m +1 for preferring HTML 4.01 (but sometimes, due to time/budget constraints, I have to go with the HTML that’s handed to me, including non-compliant table-based stuff... >:( )
          I don’t have a great answer for you at the moment, but I am currently revisiting the RTE landscape myself. If I come across anything interesting, I’ll post back here.
            Mike Schell
            Lead Developer, MODX Cloud
            Email: mike@modx.com
            GitHub: https://github.com/netProphET/
            Twitter: @mkschell
          • Zendak,

            Great reply! You make some very valid points, enough to make me want to dig in further and learn more about it. I’ve always used the XHTML doctypes simply because I felt they promoted more well-formed code through the subtle rules they enforce. But, like you said, no one is using XHTML as it was intended; people are instead using it more like "extended" HTML than "XML" HTML, two distinctly different things. I’m starting to think that going back to a strict HTML doctype is more standardized and compliant simply because you’re using it as it was intended. HTML 5 will likely come along eventually. In the meantime, we can use things like microformats to fill in some of the gaps that are apparent with the use of semantic code.

            Definitely a topic to explore some more though. Thanks for the insight! :)

            Jeff
              Jeff Whitfield
            • I still think some of the JS libraries out there can have issues when you’re not using an XHTML doctype, if that’s important to you. I’m probably wrong of course, but that does stand out in the back of my mind.
                Ryan Thrash, MODX Co-Founder
                Follow me on Twitter at @rthrash or catch my occasional unofficial thoughts at thrash.me
              • WOW! A kindred spirit! And rightly so. I haven’t gotten into these debates here, but I am a 4.01 advocate and user. Most people use XHTML because it came after 4.01, but that didn’t mean it was intended to replace it. I don’t blame folks for grabbing hold of the "new" XHTML (I did, too), but once I started reading the HTML spec and following the WSG list and learned the facts you point out, I made the switch and have never looked back (okay, a couple of times; read on).

                My own theory as to why XHTML became the de facto "standard" is that Dreamweaver, since version 3, has pushed an XHTML Transitional doctype as its default, and people were too lazy to learn that it was foolish. Zeldman is a businessman and a pragmatist, and while I don’t defend him, I think that he contributed to the XHTML perpetuation nonetheless.

                I too am a pragmatist, and while I want to write the cleanest, most well-written HTML (4.01) I can, on more than one occasion I have opted for XHTML 1.0 because the third-party app would fail if integrated into 4.01 and the client’s budget wasn’t going to pay for the necessary core hacking, so XHTML 1.0 it was.

                I created a plugin that removes the trailing slash from self-closed elements; it runs just before the page is rendered.
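
                The plugin itself isn’t shown in this thread (and a MODx plugin would be written in PHP), so the following is only a rough Python sketch of the idea: rewrite tags such as <br /> to <br> in the final markup. The name strip_trailing_slashes is hypothetical, the element list is the set of empty elements in HTML 4.01 Strict, and the regex assumes attribute values are quoted, as XHTML output normally is. It does not skip scripts, comments or CDATA sections, which a real plugin would probably need to consider.

```python
import re

# Empty (void) elements defined by HTML 4.01 Strict.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param",
}

# A tag name, zero or more attributes with quoted values, then "/>".
# Quoted values are consumed whole, so a "/" or ">" inside them is not
# mistaken for the end of the tag.
SELF_CLOSING = re.compile(
    r"""<(\w+)((?:\s+[\w:-]+(?:\s*=\s*(?:"[^"]*"|'[^']*'))?)*)\s*/>"""
)


def strip_trailing_slashes(markup: str) -> str:
    """Turn XHTML-style <br /> into HTML-style <br> for void elements only."""

    def replace(match: re.Match) -> str:
        name, attrs = match.group(1), match.group(2)
        if name.lower() not in VOID_ELEMENTS:
            # Leave things like <div /> alone; dropping the slash there
            # would change what the markup means to an HTML parser.
            return match.group(0)
        return f"<{name}{attrs}>"

    return SELF_CLOSING.sub(replace, markup)


if __name__ == "__main__":
    print(strip_trailing_slashes('<p>One<br />Two <img src="a/b.png" alt="x" /></p>'))
```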

                As far as RTEs are concerned, I think that with TinyMCE you can pretty much do any customization and code modification you’d need to stop it from making self-closing tags and to fix the other issues that you describe.

                On a side note: @zendak, if you want to help me convince the design/dev world to stop it with Transitional, it’d be fun. Transitional has been around forever, and if you have legacy code that requires Transitional to render, it’s time to nuke or die. :P
                  Author of zero books. Formerly of many strange things. Pairs well with meats. Conversations are magical experiences. He's dangerous around code but a markup magician.
                • One more point from the post you linked to, one I got thinking about recently: the i element as a meaningful device. In writing, all sorts of style guides call for italics to express differentiation rather than emphasis, for things such as the names of sailing vessels and the titles of long fiction. This is not emphasis: I’m not saying it’s important, but that it is a specific thing. I’ve been working with a client in the tall ships industry, where vessel names are normally set in italics or all caps. Caps are visually annoying, so I originally marked them up with em elements, but after reading a number of articles over the last year, I opted to migrate to i because it was more semantic than em.

                  Question everything you do, so you know you are doing it because you should, not because you always have.

                  Cheers,

                  Jay
                  • Did some more digging and have learned quite a bit about this. The main thing I’ve come to find out is that using an XHTML 1.0 Strict doctype isn’t such a bad thing even when you’re using it with a "text/html" MIME type. There are plenty of sites with documentation regarding the use of XHTML, the differences between XHTML and HTML, as well as talk about the upcoming HTML 5 standard. Here’s some of the stuff I’ve been looking at:

                    http://en.wikipedia.org/wiki/XHTML
                    http://www.w3.org/TR/xhtml1/
                    http://www.sitepoint.com/forums/showthread.php?t=393445
                    http://www.alistapart.com/articles/previewofhtml5

                    After reading all of this, I’ve come to the conclusion that I’ll be sticking with XHTML 1.0 Strict. Although the arguments for an HTML 4.01 Strict doctype are very much valid, there are a few reasons why I think an XHTML 1.0 Strict doctype is better:

                    1) XHTML requires that code be well-formed

                    HTML is more forgiving and doesn’t require that you explicitly close tags. Most browsers will render HTML just fine even when everything isn’t perfect with the code. I’m not 100% sure, but I would assume that HTML code that isn’t well-formed could potentially cause issues in scripts that rely on the DOM to function properly.

                    2) HTML is forgiving...almost too forgiving

                    As with most scripting and programming languages, it’s good practice to write clean, elegant code. The main reason is that it helps eliminate the typical mistakes that cause errors. It also makes it easier to validate the code. The problem with HTML doctypes is that they’re almost too forgiving. Most of the validators out there allow for simple mistakes in the code. For example, consider the following code:

                    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
                    <html>
                    	<head>
                    		<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
                    		<title>Test</title>
                    	</head>
                    	<body>
                    <p>This is a test paragraph
                    <p>This is another one
                    	</body>
                    </html>
                    


                    This is perfectly legitimate HTML code, and if you run it through the W3C Markup Validator (http://validator.w3.org/) you’ll see that it passes with flying colors. But what about those paragraph tags? Shouldn’t they be properly closed? Would code like this cause issues with being able to properly parse the DOM for the paragraphs? What about search engines, spiders, document readers for the blind? To me, this is a sloppy way to code and doesn’t promote the kind of standardized code that’s possible with XHTML.
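
                    On the question of whether the paragraphs can still be picked out: HTML 4.01 declares the end tag of P as optional, so a conforming parser closes an open paragraph when the next block-level element starts. Below is a minimal Python sketch of that rule, assuming a hand-rolled collector on top of the standard library’s html.parser (which only tokenizes and does not build a DOM). The names ParagraphCollector and extract_paragraphs are hypothetical, and only the P case is handled; this illustrates the rule and is not how any particular browser implements it.

```python
from html.parser import HTMLParser

# Block-level elements of HTML 4.01; the start of any of them ends an open P.
BLOCK_TAGS = {
    "address", "blockquote", "div", "dl", "fieldset", "form",
    "h1", "h2", "h3", "h4", "h5", "h6", "hr", "noscript",
    "ol", "p", "pre", "table", "ul",
}

SAMPLE_DOC = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Test</title>
</head>
<body>
<p>This is a test paragraph
<p>This is another one
</body>
</html>
"""


class ParagraphCollector(HTMLParser):
    """Collect paragraph text, applying the optional-end-tag rule for P."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: list[str] = []
        self._current: list[str] | None = None  # chunks of the open paragraph

    def _close_paragraph(self) -> None:
        if self._current is not None:
            # Collapse runs of whitespace, as an HTML renderer would.
            self.paragraphs.append(" ".join("".join(self._current).split()))
            self._current = None

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._close_paragraph()  # implied </p>
        if tag == "p":
            self._current = []

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS or tag in ("body", "html"):
            self._close_paragraph()

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def close(self):
        super().close()  # flush any buffered text first
        self._close_paragraph()


def extract_paragraphs(markup: str) -> list[str]:
    parser = ParagraphCollector()
    parser.feed(markup)
    parser.close()
    return parser.paragraphs


if __name__ == "__main__":
    print(extract_paragraphs(SAMPLE_DOC))
```

                    Under this rule the sample document yields the same two paragraphs as a version with explicit closing tags would, so the omitted tags do not by themselves make the structure ambiguous.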

                    3) XHTML is well adopted

                    Although the use of XHTML with a "text/html" MIME type might seem like a bad use of the format, it’s so ingrained now that to go back to HTML 4 would be a bit of a step back. The use of XHTML as HTML isn’t a documented standard per se; it’s more of a standard that came to be out of necessity. No one can dispute that HTML 4 is a well-documented standard. The problem, though, is that it’s also a stale standard. The whole idea behind the "X" in XHTML was that it was "eXtensible" HTML that could be parsed either as strict XML or as HTML. As such, developers flocked to it because of the promise it had. Yes, it’s true that XHTML isn’t supported in Internet Explorer...but that only applies when it’s served with an XML MIME type, not as HTML.

                    4) Slow progress of standards

                    I think XHTML is still necessary because it’s unclear exactly when the HTML 5 standard will be finalized. No one really knows what is going on. There is entrenchment in the web standards community about the direction HTML 5 should take: the W3C is saying one thing and WHATWG another, with the Web Standards Project putting their two cents in as well. The end result is that we probably won’t see HTML 5 reach a release candidate state until 2012.

                    Based on all this, I’m leaning toward the continued use of XHTML. I use script libraries like MooTools and jQuery pretty heavily, and I just don’t like the idea of getting something really screwed up simply because the HTML I’m writing doesn’t get validated properly. For me, it’s all about well-formed code and the ability to properly parse the DOM. Not that that isn’t possible with a strict HTML doctype, but I think most of the tools we use are geared more toward XHTML validation.

                    Well, that’s my two cents. :)
                      Jeff Whitfield
                    • Quote from: Bravado at Jun 27, 2008, 03:38 PM

                      I think XHTML is still necessary because it’s unclear exactly when the HTML 5 standard will be finalized. No one really knows what is going on. There is entrenchment in the web standards community about the direction HTML 5 should take: the W3C is saying one thing and WHATWG another, with the Web Standards Project putting their two cents in as well. The end result is that we probably won’t see HTML 5 reach a release candidate state until 2012.

                      Maybe THAT’S what the Mayans meant long ago when they predicted the end of the world on Dec 21, 2012. :P
                        Ryan Thrash, MODX Co-Founder