    • 6726
    • 7,075 Posts
    I upgraded nodeo.net and everything went fine, except for this character set issue in the backend.
    The value for $database_connection_charset has always been utf8 in config.inc.php.

    Edit: my bad. I checked in phpMyAdmin, and the collation of my database is latin1_swedish_ci!
    I’ll re-run the installer in advanced mode. Or can I change the DB collation manually?

    Edit 2: I re-ran the installer, since I don’t know where I can reset the database collation (not in config.inc.php, it seems).
    This time I chose the right collation (latin1_swedish_ci) and kept what I took to be the appropriate connection charset, utf8.

    The only problem is that the installer wrote config.inc.php this way:

    $database_connection_charset = '';
    $database_connection_method = 'SET CHARACTER SET';


    Thus, when I logged into MODx after running the upgrade, it looked like nothing had been fixed.
    I just changed $database_connection_charset = ''; to $database_connection_charset = 'utf8'; and now everything is fine :)
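A quick sanity check along these lines could catch the empty value before logging in. This is only a sketch in Python, not part of MODx: the helper names `read_settings` and `charset_problems` are hypothetical, and the regex assumes simple quoted assignments like the two shown above.

```python
import re

# Hypothetical helper, not part of MODx: matches simple PHP string
# assignments such as  $database_connection_charset = 'utf8';
ASSIGNMENT = re.compile(r"""\$(\w+)\s*=\s*(['"])(.*?)\2\s*;""")


def read_settings(config_text: str) -> dict:
    """Collect variable names and their string values from config text."""
    return {name: value for name, _quote, value in ASSIGNMENT.findall(config_text)}


def charset_problems(settings: dict) -> list:
    """Report an empty or whitespace-padded connection charset."""
    problems = []
    charset = settings.get("database_connection_charset", "")
    if charset == "":
        problems.append("database_connection_charset is empty")
    elif charset != charset.strip():
        problems.append("database_connection_charset has stray whitespace")
    return problems


# The two lines the installer wrote in this thread.
broken = """
$database_connection_charset = '';
$database_connection_method = 'SET CHARACTER SET';
"""

print(charset_problems(read_settings(broken)))
```

Run against the installer output quoted in this post, it reports the empty charset; a value such as 'utf8' passes cleanly.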

    Would there be a problem when the connection charset is not "aligned" with the collation (asking a dumb question, since I understand nothing at all about this!)?


      .: COO - Commerce Guys - Community Driven Innovation :.


      MODx est l'outil id
    • First, this is incorrect. If your database collation is latin1_swedish_ci, then your charset needs to match, i.e. latin1, which means you need to present data in a latin1-compatible charset on the website (front-end and back-end). What is happening is this: since you specify utf8 in the connection_charset value, the database connections are initialized using SET CHARACTER SET utf8. This forces MySQL to convert result sets to utf8 when they come out of the latin1 database, and to convert string values in queries back to latin1 when data is saved. This conversion can be lossy, depending on the characters being used. It works because SET CHARACTER SET uses the actual database collation (latin1_swedish_ci) to set the connection charset to latin1...
      A SET CHARACTER SET x statement is equivalent to these three statements:

      SET character_set_client = x;
      SET character_set_results = x;
      SET collation_connection = @@collation_database;

      Setting collation_connection also sets character_set_connection to the character set associated with the collation (equivalent to executing SET character_set_connection = @@character_set_database). It is not necessary to set character_set_connection explicitly.
      So regardless of what you pass, the collation_connection is always going to match the actual database. This is a good thing.
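To make the "lossy" point concrete, here is a small Python sketch. It rests on two assumptions: Python's cp1252 codec stands in for MySQL's latin1, and `errors="replace"` stands in for the server substituting `?` for characters it cannot convert. The function name is illustrative only.

```python
# Assumption: cp1252 approximates MySQL's latin1; errors="replace"
# approximates the '?' substitution for unconvertible characters.
def to_latin1_and_back(text: str) -> str:
    """Convert text into a latin1-style column and read it back."""
    stored = text.encode("cp1252", errors="replace")
    return stored.decode("cp1252")


for sample in ["café", "naïve", "日本語", "Ελλάδα"]:
    # Western European text survives; anything outside latin1 does not.
    print(sample, "->", to_latin1_and_back(sample))
```

Accented Western European text round-trips unchanged, while characters that have no latin1 equivalent come back as question marks. That is the kind of loss to expect when UTF-8 content passes through a latin1 database.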

      But this is where the database connection method can change this behavior; by using SET NAMES instead of SET CHARACTER SET, you can force the character_set_connection to utf8...
      A SET NAMES 'x' statement is equivalent to these three statements:

      SET character_set_client = x;
      SET character_set_results = x;
      SET character_set_connection = x;

      Setting character_set_connection to x also sets collation_connection to the default collation for x. It is not necessary to set that collation explicitly. To specify a particular collation for the character sets, use the optional COLLATE clause:

      SET NAMES 'charset_name' COLLATE 'collation_name'
      Here you can control the character_set_connection explicitly; MySQL will now treat the connection as using the charset you forced. IMHO, this is not at all desirable, and gets extremely confusing.
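The difference between the two statements can be modeled in a few lines of Python. This is only a sketch of the session variables as described in the manual excerpts quoted above, not a MySQL client: the function names are made up for illustration, and the default-collation table is an assumption covering just the two charsets discussed in this thread.

```python
# Assumption: default collations for the two charsets in this thread.
DEFAULT_COLLATION = {"utf8": "utf8_general_ci", "latin1": "latin1_swedish_ci"}


def set_character_set(x, database_charset, database_collation):
    """Model of SET CHARACTER SET x: the connection follows the database."""
    return {
        "character_set_client": x,
        "character_set_results": x,
        "character_set_connection": database_charset,
        "collation_connection": database_collation,
    }


def set_names(x, collation=None):
    """Model of SET NAMES 'x' [COLLATE 'c']: the connection follows x."""
    return {
        "character_set_client": x,
        "character_set_results": x,
        "character_set_connection": x,
        "collation_connection": collation or DEFAULT_COLLATION[x],
    }


# A latin1 database with utf8 requested, as in this thread.
print(set_character_set("utf8", "latin1", "latin1_swedish_ci"))
print(set_names("utf8"))
print(set_names("utf8", "utf8_unicode_ci"))
```

With SET CHARACTER SET the connection charset stays latin1 whatever you pass; with SET NAMES it becomes utf8, and the collation follows either the default or the COLLATE clause.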

      My rule of thumb: always use the UTF-8 charset on the website and always use utf8/utf8_unicode_ci when creating a MODx database (the installer will do this automatically if you have privileges to create databases). It avoids all the confusion over what is being converted where, which has led many people down the wrong path trying to solve these issues.

      BTW, you cannot change the collation on existing tables without corrupting your data. That’s why it is important to always get this right from the beginning, or you will end up recreating all of your content, or at least fixing a lot of it either manually or via a bunch of complex and scary steps in MySQL.
      • Shouldn’t a plain upgrade "just work" though?
          Ryan Thrash, MODX Co-Founder
          Follow me on Twitter at @rthrash or catch my occasional unofficial thoughts at thrash.me
          • 6726
          • 7,075 Posts
          Thanks Jason for the explanation.
          I wasn’t questioning MODx, only the problem I was facing.

          I should have added that the database was created back when I was hosted on a shared server, with MODx 0.9.1, and has been upgraded to each new MODx revision since. This might explain why my DB collation is not set to utf8_general_ci, which is the default collation on my current dedicated server.

          I now understand that I have to set the database connection to latin1 for this website (and for other old shared-server installs which have been migrated to my dedicated server), and that I can’t change that and convert to utf8 for older installs which were set to latin1 upon creation.

          Other websites I installed later have a utf8_general_ci collation and a utf8 connection charset, and no such issues occur.
          • This is a long-standing issue for a lot of less technical users, and even many developers don’t fully grok the MySQL client connection stuff (I know I don’t, which is why I try to avoid anything but utf8_/UTF-8 all the way, even if I have to convince someone to convert their content separately). I for one am at a loss as to how to provide the flexibility to set all these values along with all the knowledge you need to cope with using other charsets.

            But, as for the real problem, I’m getting confused. Is the problem that the regular or advanced upgrade set the database_connection_charset variable to an empty string in the config file?
            • Quote from: OpenGeek at Oct 15, 2008, 05:50 PM

              ...by using SET NAMES instead of SET CHARACTER SET, you can force the character_set_connection to utf8...
              A SET NAMES 'x' statement is equivalent to these three statements:

              SET character_set_client = x;
              SET character_set_results = x;
              SET character_set_connection = x;

              Setting character_set_connection to x also sets collation_connection to the default collation for x. It is not necessary to set that collation explicitly. To specify a particular collation for the character sets, use the optional COLLATE clause:

              SET NAMES 'charset_name' COLLATE 'collation_name'
              Here you can control the character_set_connection explicitly; MySQL will now treat the connection as using the charset you forced. IMHO, this is not at all desirable, and gets extremely confusing.
              That said, using SET NAMES and changing the charset to utf8 may actually let databases with latin1 (or other charsets) present data to PHP in UTF-8 (i.e. if I wanted to use UTF-8 on the backend to utilize a UTF-8 language file), to some degree anyway, but I believe they would have had to have stored data to the tables in that manner in the first place. But I’ll stand by my advice to use UTF-8/utf8_ all around, to avoid the inevitable confusion...
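The caveat about how the data was stored in the first place can be illustrated with a rough Python model. Everything here is an assumption made for illustration: `write` and `read` are hypothetical stand-ins for the server's conversion on INSERT and SELECT, and cp1252 stands in for MySQL's latin1.

```python
def write(payload: bytes, declared: str, column: str) -> bytes:
    """INSERT model: interpret client bytes as `declared`, store as `column`."""
    return payload.decode(declared).encode(column, errors="replace")


def read(stored: bytes, column: str, results: str) -> bytes:
    """SELECT model: interpret stored bytes as `column`, send as `results`."""
    return stored.decode(column).encode(results, errors="replace")


# The site always sends UTF-8 bytes; only the declared charset differs.
page_bytes = "café".encode("utf-8")

# Stored through a connection that correctly declared utf8.
clean = write(page_bytes, declared="utf-8", column="cp1252")

# Stored through a connection that declared latin1 while sending UTF-8.
mislabeled = write(page_bytes, declared="cp1252", column="cp1252")

# Both rows are now read through a utf8 connection.
print(read(clean, "cp1252", "utf-8").decode("utf-8"))
print(read(mislabeled, "cp1252", "utf-8").decode("utf-8"))
```

The cleanly stored row comes back as café, while the mislabeled row comes back as cafÃ©. Switching an old install to a utf8 connection therefore only helps if its rows were written through a correctly declared connection.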
                • 6726
                • 7,075 Posts
                Quote from: OpenGeek at Oct 16, 2008, 12:09 AM
                But, as for the real problem, I’m getting confused.  Is the problem that the regular or advanced upgrade set the database_connection_charset variable to an empty string in the config file?

                The initial problem was strangely encoded characters in the backend, both in the tree and when editing content after the upgrade. This happened during the first upgrade because I forgot that this particular website had a latin1 collation and manually set it, wrongly, to utf8 (I have always used the advanced install since it became available).

                Then the problem of the empty config value happened when I re-ran the 0.9.6.3 RC1 installer and chose the proper DB collation (latin1) but the wrong DB connection charset (utf8) in the advanced install. That’s why I said:

                Would there be a problem when the connection charset is not "aligned" with the collation (asking a dumb question, since I understand nothing at all about this!)?

                I didn’t mean for the topic to imply that there was a problem with the installer in every case (topic titles are too short, so I couldn’t add that), just that if you set things wrong (DB collation different from DB connection charset), you might end up with an empty database_connection_charset in config.inc.php.
                • Why didn’t you just do a standard upgrade?
                    Ryan Thrash, MODX Co-Founder
                    Follow me on Twitter at @rthrash or catch my occasional unofficial thoughts at thrash.me
                    • 6726
                    • 7,075 Posts
                    I didn’t, precisely because I had had character issues with the standard upgrade before...
                    • Quote from: davidm at Oct 16, 2008, 07:17 AM

                      Then the problem of the empty config value happened when I re-ran the 0.9.6.3 RC1 installer and chose the proper DB collation (latin1) but the wrong DB connection charset (utf8) in the advanced install. That’s why I said:

                      Would there be a problem when the connection charset is not "aligned" with the collation (asking a dumb question, since I understand nothing at all about this!)?

                      I didn’t mean for the topic to imply that there was a problem with the installer in every case (topic titles are too short, so I couldn’t add that), just that if you set things wrong (DB collation different from DB connection charset), you might end up with an empty database_connection_charset in config.inc.php.
                      Hmmm; we’re getting closer to clearing up the confusion. Now to figure out how selecting a charset/collation different from your existing database collation would leave it blank...