  • Would need to know a lot more details about your MySQL version, PHP version and webserver. It works here exactly as expected.
      Ryan Thrash, MODX Co-Founder
      Follow me on Twitter at @rthrash or catch my occasional unofficial thoughts at thrash.me
      • 5699
      • 46 Posts
      computersolutions.cn Reply #22, 15 years, 3 months ago
      [tt]MODx version 0.9.6.3
      Version codename rev 4565
      phpInfo() View
      Access permissions Enabled
      Server Time 01:30:54
      Local Time 01:30:54
      Server Offset 0 h
      Database name citrix
      Database server localhost
      Database Version: 5.0.32-Debian_7etch8-log
      Database Charset utf8
      Database Collation Charset utf8_general_ci
      Table prefix modx_[/tt]

      PHP Version 5.2.0-8+etch13
      Apache2.

      I can give you a login to check if need be.

        • 3749
        • 24,544 Posts
        One more thing to check: On the PhpMyAdmin home screen, click on Show MySQL system variables. It will give you more detail about the character set(s) being used.
          Did I help you? Buy me a beer
          Get my Book: MODX:The Official Guide
          MODX info for everyone: http://bobsguides.com/modx.html
          My MODX Extras
          Bob's Guides is now hosted at A2 MODX Hosting
          • 5699
          • 46 Posts
          computersolutions.cn Reply #24, 15 years, 3 months ago
          phpMyAdmin shows this for the server variables (relevant bits):

          [tt]character set client utf8
          (Global value) latin1
          character set connection utf8
          (Global value) latin1
          character set database latin1
          character set filesystem binary
          character set results utf8
          (Global value) latin1
          character set server latin1
          character set system utf8
          character sets dir /usr/share/mysql/charsets/
          collation connection utf8_unicode_ci
          (Global value) latin1_swedish_ci
          collation database latin1_swedish_ci
          collation server latin1_swedish_ci[/tt]

          mysql -u root -p shows:
          show variables like "%character%";
          +--------------------------+----------------------------+
          | Variable_name            | Value                      |
          +--------------------------+----------------------------+
          | character_set_client     | latin1                     |
          | character_set_connection | latin1                     |
          | character_set_database   | latin1                     |
          | character_set_filesystem | binary                     |
          | character_set_results    | latin1                     |
          | character_set_server     | latin1                     |
          | character_set_system     | utf8                       |
          | character_sets_dir       | /usr/share/mysql/charsets/ |
          +--------------------------+----------------------------+


          Getting somewhere, I’ll edit the my.cnf and retry. "Perfect" UTF-8 seems to be an impossible goal hehe
            • 3749
            • 24,544 Posts
            Tell me about it. Sometimes I think we should recommend doing everything in Latin1 unless language requirements absolutely dictate otherwise, especially for people who have limited access to their servers. Virtually every person I’ve seen struggling is trying to use utf-8.
              • 5699
              • 46 Posts
              computersolutions.cn Reply #26, 15 years, 3 months ago
              I actually *need* UTF-8 (most of our sites are in simplified Chinese)

              Going to try this - http://www.saiweb.co.uk/mysql/mysql-forcing-utf-8-compliance-for-all-connections

              [tt]
              [mysqld]
              init_connect='SET collation_connection = utf8_general_ci'
              init_connect='SET NAMES utf8'
              default-character-set=utf8
              character-set-server=utf8
              collation-server=utf8_general_ci [/tt]

              and see if any of the data gets borked (will try without the init_connect lines first)

              PhpMyAdmin post edit (sans init_connects) shows:

              character set client utf8
              character set connection utf8
              character set database utf8
              character set filesystem binary
              character set results utf8
              character set server utf8
              character set system utf8
              character sets dir /usr/share/mysql/charsets/
              collation connection utf8_unicode_ci
              (Global value) utf8_general_ci
              collation database utf8_general_ci
              collation server utf8_general_ci

              So that looks good, and the data still looks ok.

              Unfortunately setting an Alias still doesn’t work for utf8 characters.

              Any other ideas?
                • 3749
                • 24,544 Posts
                You definitely need utf-8. :)

                Just remember that if the *data* in the tables is not in utf-8, it probably won’t work. Also, I’ve read that

                ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name;

                is not completely reliable.
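
For anyone who wants to check what is really in a column before converting anything, here is a rough Python sketch (not MODx code). The function name `classify` and its three labels are made up for illustration. It assumes you have pulled the raw bytes of a value out of the database, and it treats "Latin-1" as Python's `latin-1` codec; MySQL's latin1 behaves more like Windows-1252, so real data may need `cp1252` instead.

```python
def classify(raw: bytes) -> str:
    """Roughly classify raw bytes taken from a text column (hypothetical helper)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes that are not valid UTF-8 at all, e.g. plain Latin-1 text.
        return "not-utf8"
    try:
        # If the text fits in Latin-1 and those bytes are themselves valid
        # UTF-8, the value was probably UTF-8 that got encoded a second time.
        inner = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return "utf8"
    return "double-encoded" if inner != text else "utf8"


good = "北京".encode("utf-8")
# Simulate UTF-8 bytes being read as Latin-1 and stored as UTF-8 again.
doubled = good.decode("latin-1").encode("utf-8")

print(classify(good))
print(classify(doubled))
```

This is only a heuristic: short Latin-1 strings can look like valid UTF-8 by accident, so spot-check the results by eye before trusting them.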

                I used to have a standalone Windows program that was supposed to convert the data in the SQL dump file, which you could then import. I’m not sure if it was reliable either.

                I’ve also read horror stories about people who printed out the whole database on paper, marked the bad characters, and changed them by hand in the DB. I wouldn’t want to try that with Chinese, though.

                Good luck. :)
                  • 5699
                  • 46 Posts
                  computersolutions.cn Reply #28, 15 years, 3 months ago
                  Data in tables is in UTF-8.
                  field collation is UTF-8
                  Table collation is UTF-8
                  Database collation is UTF-8

                  Connection is UTF-8

                  (UTF-8 above means the utf8 character set with the utf8_general_ci collation.)

                  This is a new install of 0.9.6.3 (not an upgrade).
                  The install was done specifically to test aliases in utf8 (Simplified Chinese).

                  Ideally I’d like to get the Chinese aliases working, as we have an SEO-related project that I *need* utf-8 aliases for.

                  Is everyone 100% sure that utf8 works for aliases?

                  Can someone try adding 北京 as an alias on a page on a working system, and see whether it saves or comes back with the alias empty after saving?
                  Thanks!

                  I’m willing to troubleshoot more, and provide logins for Ryan or similar to see what the issue is.
                  It’s our own server, so I can configure it as I want (within reason, we also have some live stuff on that server too!)

                  Cheers!

                  Lawrence
                  • Don’t you have to use utf8_unicode_ci to store those Chinese characters in MySQL?  I could be wrong, but I don’t think general_ci is the appropriate collation to have chosen.
                      • 5699
                      • 46 Posts
                      computersolutions.cn Reply #30, 15 years, 3 months ago
                      @OpenGeek - I think you’re wrong on that.

                      Collations are used for sort ordering and comparison.
                      Unless there is a specific need for a different sort order, e.g. in Hebrew or Vietnamese with diacritics, or Swedish, it doesn’t matter which collation you use (utf8_general_ci is faster, though).

                      That’s what I can infer from this:

                      http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html
                      and this
                      http://forums.mysql.com/read.php?103,187048,188748#msg-188748
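
To illustrate the charset-versus-collation point without touching MySQL, here is a small Python analogy. It is only a stand-in: sorting by raw bytes plays the role of a binary collation, and sorting by case-folded text plays the role of a case-insensitive one. Neither is a real MySQL collation, and the variable names are made up.

```python
words = ["b", "A", "a", "B"]

# Stand-in for a binary collation: order by the raw UTF-8 bytes.
binary_order = sorted(words, key=lambda s: s.encode("utf-8"))

# Stand-in for a case-insensitive collation: order by case-folded text.
ci_order = sorted(words, key=str.casefold)

# The "character set" side: the stored bytes are the same whichever
# ordering rule is used, and they decode back to the original text.
stored = "北京".encode("utf-8")

print(binary_order)            # ['A', 'B', 'a', 'b']
print(ci_order)                # ['A', 'a', 'b', 'B']
print(stored.decode("utf-8"))  # 北京
```

The ordering changes with the rule, while the stored bytes for 北京 do not, which is why the choice between utf8_general_ci and utf8_unicode_ci should not decide whether the characters can be saved.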

                      I guess I’ll have to go through the code myself and see where the alias handling breaks. As far as I can see, it’s all good at the database and connection level.

                      [Edit]

                      I’ve found where the utf8 disappears - it’s in the StripAlias() function in manager/processors/save_content.processor.php

                      Now that I’ve checked the code, it makes sense that it doesn’t work, as it’s broken!

                      [tt]$alias = preg_replace('/[^\.%A-Za-z0-9 _-]/', '', $alias); // strip non-alphanumeric characters[/tt]

                      That line strips out everything except ASCII letters, digits, dots, percent signs, spaces, underscores and hyphens, which in my case means everything I’m actually using, doh! So I think it’s safe to say that 0.9.6.3 *doesn’t* support UTF-8 in aliases (unless the alias is just a-z, A-Z and 0-9).
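
Here is a quick Python sketch of what that pattern does to a Chinese alias. It is not the MODx code: the function names are hypothetical, and the byte-level regex is my stand-in for PHP's `preg_replace` running without the `u` modifier, where the pattern is matched byte by byte. The second function is just one possible Unicode-aware variant for comparison, not a proposed patch.

```python
import re

# Byte-level stand-in for the PHP pattern: every byte outside the listed
# ASCII set is removed, which includes every byte of a multi-byte character.
ASCII_ONLY = re.compile(rb"[^.%A-Za-z0-9 _-]")

# Hypothetical alternative: keep any Unicode letter, digit or underscore
# (\w on a str pattern) plus dot, percent, space and hyphen.
UNICODE_AWARE = re.compile(r"[^\w.% -]")


def strip_alias_ascii(alias: str) -> str:
    """Mimic the current behaviour on the alias's UTF-8 bytes."""
    return ASCII_ONLY.sub(b"", alias.encode("utf-8")).decode("ascii")


def strip_alias_unicode(alias: str) -> str:
    """Strip punctuation but keep non-ASCII letters and digits."""
    return UNICODE_AWARE.sub("", alias)


print(repr(strip_alias_ascii("北京")))          # ''
print(repr(strip_alias_ascii("beijing-北京")))  # 'beijing-'
print(repr(strip_alias_unicode("北京")))        # '北京'
```

In PHP itself the rough equivalent would be a pattern with the `u` modifier and Unicode property classes such as `\p{L}` and `\p{N}`; whether that is acceptable for MODx URLs is a separate question.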

                      Should I open up a bug ticket? Or is this intentional?