-
- 9 Posts
Hello,
I tried to build a website in the Slovak language on MODx. I have a problem with non-ASCII characters, which are automatically converted to HTML entities before being stored in the database. The frontend shows them correctly, but the backend shows them as HTML entities. For example, in the site menu tree or in Document settings / General / Title, Long title, etc., it shows &#318; instead of "ľ", so the text is unreadable for the editor. In the database they are also stored as entities: &uacute; instead of "ú", etc. This happens whether or not the WYSIWYG editor is used.
I use Character Encoding: Unicode - unicode.
I tried to set up many different encodings, but the result was the same.
Is there any way to turn this feature off for non-English languages?
I use
MODx 0.9.1 rev 646
PHP Version 5.0.2
Mozilla Firefox 1.0.7
MySQL 4.1.5-gamma
Mandrakelinux 10.1
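For reference, the two forms quoted in this post are a numeric character reference and a named entity. A minimal PHP sketch, assuming the stored text contains plain HTML entities and the target encoding is UTF-8, shows how such values map back to the real characters. The helper name decodeStoredEntities is hypothetical and not part of MODx:

```php
<?php
// Hypothetical helper, not a MODx function: turns stored HTML entities
// back into UTF-8 characters.
function decodeStoredEntities($stored)
{
    // ENT_QUOTES also decodes quote entities; 'UTF-8' selects the output encoding.
    return html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
}

// "&#318;" is a numeric reference (U+013E), "&uacute;" a named entity (U+00FA).
echo decodeStoredEntities('&#318;&uacute;'), "\n"; // prints "ľú" on a UTF-8 terminal
```

This only illustrates the mapping; it does not address why the entities are written to the database in the first place.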
If you don’t get a satisfactory answer soon, please file this as a bug report... thanks!
Ryan Thrash, MODX Co-Founder
Did you change the encoding in MODx (see Site Administration --> Site Settings) to match the DB encoding you have set? Did you set the DB encoding properly? The default encoding in MySQL varies depending on which version you have and how you installed it. Check these to make sure they match each other, and double-check that your templates include the appropriate HEAD tag to indicate the encoding you are trying to use.
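One way to do the check described above is to run SHOW VARIABLES LIKE 'character_set%' in MySQL and compare the values with the charset configured in MODx. The sketch below assumes the result has already been read into a name => value array; the function name findCharsetMismatches and the example values are hypothetical and not taken from this thread:

```php
<?php
// Hypothetical helper: given the rows of SHOW VARIABLES LIKE 'character_set%'
// as name => value, list the settings that differ from the expected charset.
function findCharsetMismatches(array $variables, $expected)
{
    // The settings that affect how text travels between PHP and MySQL.
    $relevant = array(
        'character_set_client',
        'character_set_connection',
        'character_set_results',
        'character_set_database',
    );
    $mismatches = array();
    foreach ($relevant as $name) {
        if (isset($variables[$name]) && strcasecmp($variables[$name], $expected) !== 0) {
            $mismatches[$name] = $variables[$name];
        }
    }
    return $mismatches;
}

// Example values only (assumed, not taken from the thread).
$example = array(
    'character_set_client'     => 'latin1',
    'character_set_connection' => 'latin1',
    'character_set_results'    => 'latin1',
    'character_set_database'   => 'utf8',
);
print_r(findCharsetMismatches($example, 'utf8'));
```

An empty result would suggest the connection and the database agree on the charset; any entries listed are candidates for the mismatch.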
Your head tags should use [(etomite_charset)] to specify the character set to use so it matches the site setting. The parser will override some head tags, which can cause validation problems.
The HTTP response header is also set to the value of the [(etomite_charset)] configuration setting, so keep that in mind as well.
That’s what I meant. Sorry if I was unclear.
Actually Susan, there are two different things -- one is a META tag describing the content-type and encoding which you can control, the other is an HTTP header (set using the PHP command header()) that is set before the page is delivered to the client by the parser, and this is using the configuration setting to describe the charset.
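To make the distinction concrete, here is a minimal sketch that derives both declarations from a single value, so the META tag and the HTTP header cannot disagree. The function charsetDeclarations is hypothetical (it is not the MODx parser's code), and the $charset argument stands in for the [(etomite_charset)] setting:

```php
<?php
// Hypothetical helper: derive both charset declarations from one value,
// standing in for the [(etomite_charset)] site setting.
function charsetDeclarations($charset)
{
    return array(
        'header' => 'Content-Type: text/html; charset=' . $charset,
        'meta'   => '<meta http-equiv="Content-Type" content="text/html; charset='
                    . htmlspecialchars($charset, ENT_QUOTES) . '" />',
    );
}

$decl = charsetDeclarations('UTF-8');

// The HTTP header has to be sent before any output.
if (!headers_sent()) {
    header($decl['header']);
}
// The META tag belongs in the template's HEAD.
echo $decl['meta'], "\n";
```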
-
- 7,075 Posts
Yeah, I’ve found that out recently.
HTTP headers always override meta tags. They can be set by editing php.ini and httpd.conf though.
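Where php.ini cannot be edited, PHP's default_charset directive can usually be set at runtime as well. A small sketch, assuming UTF-8 is the desired charset (on the Apache side, the corresponding httpd.conf directive is AddDefaultCharset):

```php
<?php
// default_charset is the php.ini directive PHP uses for the charset part of
// its default Content-Type header; ini_set() changes it for this request only.
ini_set('default_charset', 'UTF-8');

echo ini_get('default_charset'), "\n"; // prints "UTF-8"
```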
I also noticed accented or special characters were turned into HTML entities, no matter how the charset is set in MODx.
As you said, the culprit might be the MySQL charset: when the record is inserted, special characters get converted into HTML entities, but most of the time (on shared hosting) you’ll have no control over this. I had the same issue with Textpattern, e107 and Drupal... I am not sure anything can be done. I’ll try to toy with the MySQL character encoding on my local server, if I have some time...
-
- 9 Posts
I tried this:
1. I set up a correct utf8 collation in MySQL for the modx_site_content table (utf8_slovak_ci; it could also be utf8_general_ci or any other utf8_*, it doesn’t matter)
2. I set System configuration > character encoding to Unicode-UTF8
3. I replaced "iso-8859-1" with "[(etomite_charset)]" in the site template
4. I inserted this line:
mysql_query("SET CHARACTER SET utf8");
on line 63, after "$this->isConnected = true;", in manager/includes/extenders/dbapi.mysql.class.inc.php
5. I edited the text in modx_site_content directly in phpMyAdmin
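One detail in step 4 that may matter: per the MySQL documentation, SET CHARACTER SET sets character_set_client and character_set_results but leaves the connection character set at the database default, whereas SET NAMES sets all three. Whether that explains the behaviour here is not certain. A minimal sketch of building the statement, where the function name connectionCharsetSql is hypothetical and not part of MODx:

```php
<?php
// Hypothetical helper: build the statement to run right after connecting.
function connectionCharsetSql($charset)
{
    // Accept only plain charset names such as "utf8" or "latin2",
    // since the value is placed into SQL text.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException('Unexpected charset name: ' . $charset);
    }
    // SET NAMES covers client, connection and results in one statement.
    return 'SET NAMES ' . $charset;
}

echo connectionCharsetSql('utf8'), "\n"; // prints "SET NAMES utf8"
```

In the legacy code from step 4 this string would be passed to mysql_query() in place of the SET CHARACTER SET statement; with the mysqli extension, $mysqli->set_charset('utf8') serves the same purpose.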
With this hack, text correctly inserted via phpMyAdmin is displayed correctly on the frontend, but in the manager it is still scrambled. When I edit it via the backend manager, it is stored scrambled in the database, and the frontend shows it scrambled the same way as it is in the database. So IMHO the problem is a wrong character set interpretation in both directions (MODx to MySQL and MySQL to MODx).
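Scrambling in both directions is consistent with UTF-8 bytes being treated as a single-byte charset somewhere between MODx and MySQL, though nothing in this thread confirms that. The sketch below reproduces that kind of misreading for "ľ", using ISO-8859-1 purely as an assumed example of the wrong charset, and shows that it is reversible:

```php
<?php
// Requires the mbstring extension.
$original = "\xC4\xBE"; // "ľ" (U+013E) as UTF-8 bytes

// Misreading: treat those UTF-8 bytes as ISO-8859-1 and convert to UTF-8 again.
$scrambled = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Reversing the same conversion restores the original bytes.
$restored = mb_convert_encoding($scrambled, 'ISO-8859-1', 'UTF-8');

echo bin2hex($original), ' -> ', bin2hex($scrambled), ' -> ', bin2hex($restored), "\n";
// prints "c4be -> c384c2be -> c4be"
```

If the scrambled text in the database matches this pattern (two odd characters per accented letter), the data is double-encoded rather than lost.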