  • Quote from: olafmol at Feb 11, 2009, 02:04 PM

    Our developers looked into it some more. It seems that when installing MODx with the installer, you can choose the character set encoding, and it’s set correctly in the $database_connection_charset variable in config.php. However, the table creation statements don’t force the chosen collation on the tables, so they always get the default MySQL collation (latin1_swedish_ci). So it seems the table create statements in the installation script have to be extended with a character set and collation setting (http://dev.mysql.com/doc/refman/5.0/en/charset-table.html). If this is done correctly, the MODx tables will have a correct UTF-8 collation, the config.php setting will also correspond to UTF-8, and everything will work OK.
    It is done correctly, and exactly as designed. If you let MODx create the database container, it creates it with the correct collation and the tables get the same. If you create it manually, you are responsible for making sure its collation matches what you want the tables to get, because the tables will always match the database container. If we forced them to utf8_general_ci, no one would be able to use them with another charset or collation. It just takes a little diligence to make sure your database is not created with MySQL’s wonderful default, latin1_swedish_ci.
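For anyone creating the database by hand, the statements involved look roughly like the sketch below. It is written in Python only to build the SQL strings; it does not connect to a server. The database name `modx` and both helper functions are placeholders for illustration, not part of MODx, and the names are interpolated without escaping, so this is only suitable for trusted values.

```python
def create_database_sql(name, charset="utf8", collation="utf8_general_ci"):
    """Build a CREATE DATABASE statement with an explicit charset and collation."""
    return (
        f"CREATE DATABASE `{name}` "
        f"CHARACTER SET {charset} COLLATE {collation};"
    )


def check_database_sql(name):
    """Build a query reporting the database's default charset and collation."""
    return (
        "SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME "
        "FROM information_schema.SCHEMATA "
        f"WHERE SCHEMA_NAME = '{name}';"
    )


if __name__ == "__main__":
    # "modx" is a hypothetical database name used only for this example.
    print(create_database_sql("modx"))
    print(check_database_sql("modx"))
```

Running the check query before installing shows what the new tables would inherit if the installer does not specify a collation itself.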
      olafmol
      Quote from: OpenGeek at Feb 11, 2009, 02:32 PM

      It is done correctly, and exactly as designed. If you let MODx create the database container, it creates it with the correct collation and the tables get the same. If you create it manually, you are responsible for making sure its collation matches what you want the tables to get, because the tables will always match the database container. If we forced them to utf8_general_ci, no one would be able to use them with another charset or collation. It just takes a little diligence to make sure your database is not created with MySQL’s wonderful default, latin1_swedish_ci.

      You’re right about the installation, I will fire our developer ;)
      • LOL

          Ryan Thrash, MODX Co-Founder
          Follow me on Twitter at @rthrash or catch my occasional unofficial thoughts at thrash.me
          • 36416
          • 589 Posts
          Quote from: olafmol at Feb 11, 2009, 03:07 PM

          You’re right about the installation, I will fire our developer ;)

          Yeah, I fired myself on at least two sites... :P
            computersolutions.cn Reply #15, 15 years, 3 months ago
            I was having this issue myself today, although phpMyAdmin says that the collation _is_ a utf8 collation.

            I tried following the instructions here - http://hexmen.com/blog/2008/07/mysql-latin1-utf8-wordpress-upgrade/ - then exported and imported to another db name for testing, and amended MODx to use the new db. That seemed to work for 0.9.6.1, but in 0.9.6.3 I just get garbage out.

            If I go back to 0.9.6.1 it’s OK.

            Before I did the export to another db, adding connection_charset='utf8' would give me garbage results (utf8 mangled to death).
            After exporting the db to another db I can use this and it’s OK:
            $database_connection_charset = 'utf8';
            $database_connection_method = 'SET NAMES';
            ...but only in 0.9.6.1.

            However, 0.9.6.3 goes back to showing garbage, even if I comment out the connection_method.

            Any ideas?

            I know it’s *got* to be encoding related, but I’m at a loss to see where. phpMyAdmin claims it’s all utf8, etc!
              olafmol
              Quote from: computersolutions.cn at Feb 12, 2009, 06:51 AM

              I know it’s *got* to be encoding related, but I’m at a loss to see where. phpMyAdmin claims it’s all utf8, etc!

              Did you also check whether the *table* collations themselves are utf8?
                computersolutions.cn Reply #17, 15 years, 3 months ago
                Yes, I have, and they look correct - eg -

                Database:
                wpi2  utf8_general_ci

                All tables show utf8_general_ci, for example:

                Table              Records  Type    Collation        Size     Overhead
                modx_site_content  5,196    MyISAM  utf8_general_ci  2.0 MiB  -

                Inside, all columns look utf8_general_ci:

                Field        Type          Collation        Attributes  Null  Default    Extra
                id           int(10)                                    No               auto_increment
                type         varchar(20)   utf8_general_ci              No    document
                contentType  varchar(50)   utf8_general_ci              No    text/html
                pagetitle    varchar(255)  utf8_general_ci              No

                All looks good, but when I try with 0.9.6.3, I get garbage. With 0.9.6.1 it’s OK.
                • Try changing your config to
                  $database_connection_method = 'SET CHARACTER SET';
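For context on why this setting can change what comes back: according to the MySQL manual, `SET NAMES x` sets the client, results and connection character sets to `x`, while `SET CHARACTER SET x` sets the client and results character sets to `x` but sets the connection character set to the default database's character set. The sketch below only models that documented difference in Python; it does not talk to a server, the function name is made up for illustration, and whether this difference explains the 0.9.6.3 behaviour described in this thread is not established.

```python
def session_charsets(statement, charset, database_charset):
    """Model which session variables each statement sets (per the MySQL manual).

    statement        -- "SET NAMES" or "SET CHARACTER SET"
    charset          -- the charset named in the statement, e.g. "utf8"
    database_charset -- the default database's charset, e.g. "latin1"
    """
    if statement == "SET NAMES":
        connection = charset
    elif statement == "SET CHARACTER SET":
        # The connection charset follows the database, not the argument.
        connection = database_charset
    else:
        raise ValueError(f"unsupported statement: {statement}")
    return {
        "character_set_client": charset,
        "character_set_results": charset,
        "character_set_connection": connection,
    }


if __name__ == "__main__":
    for stmt in ("SET NAMES", "SET CHARACTER SET"):
        print(stmt, session_charsets(stmt, "utf8", "latin1"))
```

When the database default is already utf8, both statements end up with the same three values in this model.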



                    Ryan Thrash, MODX Co-Founder
                  • Most likely it was not working properly in 0.9.6.1, and the data was being converted before being stored in the tables. Unfortunately, if the data was stored wrong and is now being handled properly by the MySQL client API, the only thing you can do is find a way to convert the data.
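To illustrate the kind of conversion being described, here is a minimal Python sketch of one common failure mode: UTF-8 bytes are misread as latin1 and encoded to UTF-8 a second time before being stored. This is an assumption about what may have happened, not a diagnosis of this particular database. The function names are hypothetical, and Python's `latin-1` codec is used as a simplification; MySQL's `latin1` is closer to cp1252, so a few byte values may behave differently on real data.

```python
def double_encode(text):
    """Simulate UTF-8 bytes misread as latin1 and re-encoded as UTF-8."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


def repair(stored_bytes):
    """Reverse the double encoding: undo the extra UTF-8 layer via latin1."""
    return stored_bytes.decode("utf-8").encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    original = "中文"                  # sample text, e.g. a Chinese alias
    stored = double_encode(original)   # what would end up in the table
    garbled = stored.decode("utf-8")   # what a correct utf8 client would show
    print(repr(garbled))
    print(repair(stored) == original)
```

Always test a conversion like this on a copy of the data first, since applying it to rows that were stored correctly will raise an error or corrupt them.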
                      computersolutions.cn Reply #20, 15 years, 3 months ago
                      I’ve just tried this again with a brand new install (rather than an upgrade) and a new database, and noticed that the database collation appears to default to latin1 if the database is created in phpMyAdmin, despite the default being set to utf8.

                      I changed the database collation in phpMyAdmin, re-ran the install on the newly created db, and it seems to be happy now.
                      However, the original reason I was doing this was that I wanted aliases to work in Chinese.

                      This still doesn’t appear to be working in 0.9.6.3.

                      The collation is utf8_general_ci; the db and the tables are the same.

                      This was allegedly fixed, according to what I’ve read in the forums. Any clues?