    • 17412
    • 270 Posts
    Hi there, has anyone got a custom modifier snippet in their toolset that will strip out HTML tags by class or id?
      • 3749
      • 24,544 Posts
      Not me. You could do it with preg_replace(). It might also be possible to do it with jQuery if you just want to remove those tags from the output.
        Did I help you? Buy me a beer
        Get my Book: MODX:The Official Guide
        MODX info for everyone: http://bobsguides.com/modx.html
        My MODX Extras
        Bob's Guides is now hosted at A2 MODX Hosting
        • 17412
        • 270 Posts
        Hi Bob, this is how I'm doing it at the moment: parsing a Google+ feed and using CSS to hide some content in the <description> field. It works, though I'd rather just zap the unwanted content at source. I'm not entirely sure how to approach this, as I'd need to target both the start of the tag (with its class or id) and its closing tag, rather than just stripping something out or replacing it with something else.

        So something along these lines I guess:

        Snippet: attrizap

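        <?php
        // $input is the value being filtered; $options is the pattern passed after "=".
        // "~" is used as the regex delimiter because the pattern itself contains "/".
        $zap = preg_replace('~' . $options . '~', '', $input);
        return $zap;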


        Usage:

        [[+content:attrizap=`<div class="someClass">.+</div>`]]
          • 3749
          • 24,544 Posts
          One of the many, many great features of the PhpStorm editor is a regex find and replace that highlights what it's going to do as you type the regex. It shows all the matches it finds and what they will be replaced by. Once you get it right, you just paste the pattern into your preg_replace() call. It's really a comfort to see that you're not going to trash stuff you don't want altered, and it beats the heck out of trial and error.

          I would do it with two separate preg_replaces, one for classes and one for IDs.

          [[+content:attrizap=`idName,className`]]


          Something like this (assuming that they're all divs):

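          <?php
          $replace = '';
          // Options arrive as "idName,className".
          $options = explode(',', $options);

          // preg_quote() keeps special characters in the names from changing the pattern.
          $idName = preg_quote(trim($options[0]), '~');
          $className = preg_quote(trim($options[1]), '~');
          // "~" as delimiter avoids escaping the "/" in </div>; the "s" modifier lets "."
          // span newlines, and ".*?" stops at the first </div> instead of the last.
          $pattern1 = '~<div\s[^>]*class\s*=\s*"[^"]*' . $className . '[^"]*"[^>]*>.*?</div>~s';
          $pattern2 = '~<div\s[^>]*id\s*=\s*"[^"]*' . $idName . '[^"]*"[^>]*>.*?</div>~s';

          $output = preg_replace($pattern1, $replace, $input);
          $output = preg_replace($pattern2, $replace, $output);

          return $output;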
          


          The pattern could be simpler if you're absolutely sure of the format (e.g., that there will be no spaces around the equals sign and no cases of multiple classes or extra spaces in the class or id definitions).
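
A rough sketch of that simpler case, assuming the opening tag is always written exactly as `<div class="name">` or `<div id="name">` with no extra attributes, no extra spaces, and no nested divs. The helper name `zapDiv` and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper: removes <div ATTR="VALUE">...</div> when the opening
// tag is written exactly that way (no extra attributes, spaces, or nested divs).
function zapDiv(string $html, string $attr, string $value): string
{
    // preg_quote() escapes regex metacharacters and the "~" delimiter.
    $pattern = '~<div ' . preg_quote($attr, '~') . '="' . preg_quote($value, '~') . '">.*?</div>~s';
    return preg_replace($pattern, '', $html);
}

// Made-up sample markup.
$html = '<p>Keep</p><div class="someClass">Zap</div><div id="someId">Zap too</div>';
$html = zapDiv($html, 'class', 'someClass');
$html = zapDiv($html, 'id', 'someId');
echo $html; // <p>Keep</p>
```

If the feed ever adds a second class or an extra attribute, this stricter pattern simply stops matching, so the looser patterns above are the safer default.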

          [update] I should also mention that if this is a displayed page, you could also just hide or empty the divs with jQuery.
            • 40045
            • 534 Posts
            Maybe overkill, but I have had very good experiences with phpQuery https://github.com/punkave/phpQuery (check the original Google Code repo for documentation!), which does exactly what you're trying to achieve... but as said before, including a whole library is maybe overkill for a "little" output filter =D
            • Might it be easier to use preg_match to grab the contents of the div (or whatever tag) that you want to keep?
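
One way the "keep what you want" idea might look, as a sketch only. It assumes the wanted content sits in a single div whose class attribute is exactly the given name and that this div has no nested divs inside it. The helper name `keepDiv` and the sample feed markup are hypothetical.

```php
<?php
// Hypothetical helper: returns the inner HTML of the first <div> whose class
// attribute is exactly $className, or '' if there is none.
// Assumes the wanted div contains no nested <div>.
function keepDiv(string $html, string $className): string
{
    $pattern = '~<div\s+class="' . preg_quote($className, '~') . '"[^>]*>(.*?)</div>~s';
    return preg_match($pattern, $html, $matches) === 1 ? $matches[1] : '';
}

// Made-up sample markup.
$feed = '<div class="junk">Shared publicly</div><div class="post">Hello <b>world</b></div>';
echo keepDiv($feed, 'post'); // Hello <b>world</b>
```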
              • There's some great functionality in PHP for DOM manipulation. It'll save you a lot of headaches over trying to write your own regexes for everything. http://www.php.net/manual/en/book.dom.php

                As an example, here's a little snippet I wrote to clean up the links in a feed of Facebook wall posts. It's called as an output filter, like [[+summary:cleanFBlinks]].

                <?php
                $dom = new DOMDocument;
                $dom->loadHTML('<?xml encoding="UTF-8">' . $input);  // UTF-8 or some characters make problems
                foreach ($dom->getElementsByTagName('a') as $node) {
                	$node->removeAttribute('onclick');
                	$node->removeAttribute('onmouseover');
                	$node->removeAttribute('rel');
                	$node->removeAttribute('id');
                	$node->removeAttribute('title');
                	$node->removeAttribute('target');
                	$node->removeAttribute('style');
                	$href = $node->getAttribute('href');
                	$node->setAttribute('href', urldecode(substr($href, strpos($href, '=')+1, strrpos($href, '&h=') - strlen($href) )));
                }
                foreach ($dom->getElementsByTagName('img') as $node) {
                	$node->removeAttribute('class');
                }
                $output = substr($dom->saveXML($dom->getElementsByTagName('body')->item(0)), 6, -7); // Strip off <body></body> tags
                return $output;
                


                Makes life pretty easy!
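
The same DOM approach could be pointed at the original question (dropping elements by class or id). The following is a sketch, not tested against a real Google+ feed. The helper name `zapByClassOrId` is made up, and it assumes the class and id names contain no double quotes, since they are placed straight into the XPath query.

```php
<?php
// Hypothetical helper: removes every element whose id equals $id or whose
// class list contains $class, together with everything nested inside it.
function zapByClassOrId(string $html, string $class, string $id): string
{
    $dom = new DOMDocument;
    // Same UTF-8 hint as the snippet above; "@" silences warnings about sloppy markup.
    @$dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    $xpath = new DOMXPath($dom);
    // Padding the class list with spaces makes " someClass " match whole names only.
    $query = sprintf(
        '//*[@id="%s" or contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $id,
        $class
    );
    // iterator_to_array() takes a snapshot so removing nodes does not disturb the loop.
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        if ($node->parentNode) {
            $node->parentNode->removeChild($node);
        }
    }
    $body = $dom->getElementsByTagName('body')->item(0);
    if (!$body) {
        return '';
    }
    // Serialize the children of <body> so the wrapper tags are not included.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// Made-up sample markup: the outer div and its nested div are both removed.
echo zapByClassOrId('<p>Keep</p><div class="a someClass"><div>nested</div></div>', 'someClass', 'someId');
```

Because the parser knows where each element ends, nested divs are removed along with their parent, which is the case that is awkward to handle with a regex.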
                  Extras :: pThumb, Resizer, imageSlim, setPlaceholders
                  • 17412
                  • 270 Posts
                  Thanks for all the input - fantastic!
                  I've gone with Bob's code although I might see if I can adapt to Everett's suggestion which cuts to the nub of what I'm looking to do for this case.
                    • 3749
                    • 24,544 Posts
                    preg_match() would let you grab just what you want to keep, but it gets tricky if there are nested divs.