I was just searching to see whether this problem had already been reported.
I created a page named "foo & bar" that failed XHTML validation because of the unescaped "&" character. It's fine to store it in the DB as it is, but it has to be escaped before output.
This function might be of use; I used it on my old homepage before moving to MODx:
<?php
/**
 * Escape a string
 * @param string $string the string to escape
 * @param string $type   escape type: html, full_html, quotes, url or javascript
 * @return string the escaped string (unchanged for an unknown type)
 */
function escape_string($string, $type = "html") {
    switch ($type) {
        case 'html':
            // Convert special characters to HTML entities
            return htmlspecialchars($string, ENT_QUOTES);
        case 'full_html':
            // Convert all applicable characters to HTML entities
            return htmlentities($string, ENT_QUOTES);
        case 'quotes':
            // Escape unescaped single quotes
            return preg_replace("%(?<!\\\\)'%", "\\'", $string);
        case 'url':
            // URL-encode according to RFC 1738
            return rawurlencode($string);
        case 'javascript':
            // Escape backslashes, quotes, newlines, etc.
            return strtr($string, array('\\' => '\\\\', "'" => "\\'", '"' => '\\"', "\n" => '\\n', '</' => '<\/'));
        default:
            return $string;
    }
}
?>
It can convert only the special characters or all applicable characters (such as umlauts in German), URL-encode a string, and so on.
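To make the difference between these modes concrete, here is a small sketch that uses only PHP built-ins. The sample strings and the explicit UTF-8 charset argument are my assumptions for illustration, not part of the function above.

```php
<?php
// Sample text with German umlauts; assumed to be UTF-8 encoded.
$text = 'Müller & Söhne';

// Only the markup-significant characters (&, <, >, quotes) are converted.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// Every character that has a named entity is converted, umlauts included.
$full = htmlentities($text, ENT_QUOTES, 'UTF-8');

// Percent-encoding for use in a URL: space becomes %20, & becomes %26.
$url = rawurlencode('foo & bar');

echo $special, "\n"; // Müller &amp; Söhne
echo $full, "\n";    // M&uuml;ller &amp; S&ouml;hne
echo $url, "\n";     // foo%20%26%20bar
```

For plain XHTML validity the first variant is usually enough, as long as the page is served with the same charset the text is stored in.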
The Smarty template engine has a more complete escape function (a plugin) that can also escape characters into hex and handle non-standard characters.
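As a rough idea of what hex escaping means, the sketch below turns every byte of a string into a %XX sequence or a hex character reference. The function name escape_hex and its mode names are hypothetical, and this is not Smarty's actual plugin code. It works byte by byte, so as written it is only suitable for single-byte (ASCII) input.

```php
<?php
/**
 * Hypothetical helper: escape every byte of $string as hex.
 * Mode 'url' gives %XX sequences, mode 'entity' gives &#xXX; references.
 */
function escape_hex($string, $mode = 'url') {
    $out = '';
    for ($i = 0, $len = strlen($string); $i < $len; $i++) {
        $hex = bin2hex($string[$i]); // two lowercase hex digits per byte
        $out .= ($mode === 'entity') ? '&#x' . $hex . ';' : '%' . $hex;
    }
    return $out;
}

echo escape_hex('a&b'), "\n";           // %61%26%62
echo escape_hex('a&b', 'entity'), "\n"; // &#x61;&#x26;&#x62;
```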
htmlspecialchars() is probably the type needed most often:
http://www.php.net/manual/en/function.htmlspecialchars.php
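For the "foo & bar" case from the top of this post, a minimal sketch of escaping at output time could look like this. The variable name $pagetitle, the title element and the UTF-8 charset are assumptions for illustration; the stored value itself stays untouched.

```php
<?php
// Value as stored in the DB, unescaped (hypothetical variable).
$pagetitle = 'foo & bar';

// Escape only at the moment the value is written into the XHTML output.
$html = '<title>' . htmlspecialchars($pagetitle, ENT_QUOTES, 'UTF-8') . '</title>';

echo $html, "\n"; // <title>foo &amp; bar</title>
```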
If you need the function above, you are welcome to use it.
Boby