    I've got an XML/RSS feed generated from resources using pdoResources:

    [[+longtitle:htmlent]]


    However, some of the page titles and text contain non-UTF-8 characters such as Ç and ğ, which break the RSS feed in validation and in apps. Does anyone know of an output modifier that could convert such characters into a character set that is acceptable for XML?
      There is an htmlent filter, but I don't think it does exactly what you want.

      I think you would need a custom modifier. This might get you close (requires PHP 5.4 or higher):

      Create a snippet called RemoveCrap with this code:

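      /* RemoveCrap snippet: replaces invalid UTF-8 byte sequences in $input */
      if (!function_exists('replace_invalid_byte_sequence')) {
          // Guarded so the snippet can run several times per request without a "cannot redeclare" error.
          function replace_invalid_byte_sequence($str) {
              // ENT_SUBSTITUTE swaps each invalid sequence for U+FFFD instead of returning an empty string.
              return htmlspecialchars_decode(htmlspecialchars($str, ENT_SUBSTITUTE, 'UTF-8'));
          }
      }

      return replace_invalid_byte_sequence($input);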
      


      Then use this tag:

      [[+longtitle:RemoveCrap]]
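      To see what the modifier does to a given string, it may help to run the same function outside MODX. This is only a rough sketch: the two sample strings are made up for illustration, and the invalid one assumes a stray ISO-8859-1 byte.

```php
<?php
// Standalone check of the function used in the snippet; nothing here is MODX-specific.
function replace_invalid_byte_sequence($str) {
    return htmlspecialchars_decode(htmlspecialchars($str, ENT_SUBSTITUTE, 'UTF-8'));
}

// Sample inputs (made up for illustration).
$valid   = "\xC3\x87a\xC4\x9F"; // "Çağ" as valid UTF-8 bytes
$invalid = "Caf\xE9";           // 0xE9 is "é" in ISO-8859-1 but an invalid lone byte in UTF-8

// Valid UTF-8 passes through unchanged.
var_dump(replace_invalid_byte_sequence($valid) === $valid);

// The invalid byte becomes U+FFFD (bytes EF BF BD), the Unicode replacement character.
var_dump(replace_invalid_byte_sequence($invalid) === "Caf\xEF\xBF\xBD");
```

      If the first check holds for your titles, the text was already valid UTF-8 and the feed problem probably lies somewhere else, such as the declared encoding of the feed.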



      There is a solution here (the one with the most votes) that will actually replace the bad characters instead of deleting them, but you have to download a file to make it work:

      http://stackoverflow.com/questions/1401317/remove-non-utf8-characters-from-string
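      If you would rather not download anything, a simpler conversion might be enough when you know which encoding the bad bytes came from. The sketch below is not the Stack Overflow solution. It is a different technique that relies on PHP's mbstring extension. The function name convert_if_not_utf8 is hypothetical, and ISO-8859-9 (Turkish) is only a guess at the source encoding based on the characters you mentioned.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not already valid UTF-8.
// ASSUMPTION: the stray bytes are ISO-8859-9 (Turkish); change $from to match your data.
// Requires the mbstring extension.
function convert_if_not_utf8($str, $from = 'ISO-8859-9') {
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($str, 'UTF-8', $from);
}

// "Çağ" encoded as ISO-8859-9: Ç = 0xC7, ğ = 0xF0.
$legacy = "\xC7a\xF0";

// The result should be the UTF-8 bytes for "Çağ".
var_dump(convert_if_not_utf8($legacy) === "\xC3\x87a\xC4\x9F");
```

      In a snippet you would return convert_if_not_utf8($input) and wrap the function definition in a function_exists() check, since the modifier runs once per resource in the feed. If the guess about the source encoding is wrong, the output will be valid UTF-8 but show the wrong characters.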