I’ve been having some trouble with character encodings when reading the MODx site contents directly from the database (rather than through the manager/frontend), and I think I have sorted it out.
There were several causes (for example, my own terminal was not UTF8 capable), but more importantly, I found that the variable $database_connection_charset in manager/includes/config.inc.php is blank after installation.
This leads MODx to use the default system encoding for the connection, and when this encoding is different from the one chosen for the MODx database, the data is inserted into the database with the system encoding rather than the MODx encoding.
For example, my system was latin1 but MODx was UTF8. Everything looked normal from the manager/frontend, but there was no way I could display diacritical characters in the MySQL client (using UTF8 encoding, naturally). What happened was that the data was stored in latin1 encoding! The only reason everything was OK in the manager/frontend was that the data was being sent and retrieved using the same encoding.
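To make the mismatch concrete, here is a small Python sketch of what I believe was going on. It does not touch MODx or MySQL at all. It only imitates the byte handling, under the simplifying assumption that UTF8 bytes sent over a latin1 connection are read one byte per character and then stored as-is. All variable names are made up for the illustration.

```python
# Illustration only: imitates a UTF8 application talking over a latin1
# connection. No database is involved; names are hypothetical.

original = "joão"

# The application (MODx) sends UTF8 bytes.
sent_bytes = original.encode("utf-8")

# The connection is treated as latin1, so every byte becomes one character.
# This is what a correctly configured UTF8 client later sees in the table.
seen_by_utf8_client = sent_bytes.decode("latin-1")
print(seen_by_utf8_client)  # joÃ£o

# Reading back over the same latin1 connection reverses the mistake,
# which is why the manager/frontend looked fine.
seen_by_modx = seen_by_utf8_client.encode("latin-1").decode("utf-8")
print(seen_by_modx)  # joão
```

The same reverse step (encode as latin1, decode as UTF8) is what a repair of already stored text would have to do, but I would only try that on a backup copy first.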
After toying around with several installations, I traced the culprit to the $database_connection_charset variable. I have seen this topic (http://modxcms.com/forums/index.php/topic,10312.0.html) and this topic (http://modxcms.com/forums/index.php/topic,15170.msg99529.html#msg99529), which led me to think that this problem should not exist anymore. If so, this seems like a bug in the installation process. By the way, during installation I was never asked for the charset, only the collation. Maybe that is related to this.
Should I file a bug?
regards,
joão