    • 19153
    • 53 Posts
    Hi guys

    In all my MODx sites (which have friendly URLs turned on), the original www.mysite.com is indexed in Google, but www.mysite.com/mysite gets indexed too... >:(

    This second page is not wanted and will cause duplicate-content problems in Google, devaluing my SEO efforts. Is this a major problem with MODx?

    I tried a 301 redirect, but all I get then is:

    http://www.mysite.com/?q=mysite

    Any ideas?
      • 19339
      • 6 Posts
      Maybe I can help you out;
      This incorrect indexing of ’duplicate content’ stems from being able to get to the same content in
      different ways:
      - via www.yourdomain.com/index.php AND
      - via www.yourdomain.com/index.html (!)
      - by entering http://yourdomain.com
      - and by entering yourdomain.com

      Because there are four ways to get to the same content, the content is indexed as duplicate.
      I understand you know this, but others might read this too.

      Adding rewrite conditions and rules in .htaccess prevents this from happening, but the
      correct syntax is crucial to get it to work.
      I do not know how you wrote your redirects, but what I show in this post works for me.
      I tested the extra code you find below; it is in my own .htaccess file and in my case it does not
      conflict with the other rules that are ’active’.

      At the end of the post I’ll give my full .htaccess file so you can compare it to your own.
      [I haven’t tried what happens when other settings are active like for instance the SEO Strict URLs plugin ]

      Sowww... having said all that, on to the work ahead lol
      First backup your root .htaccess file;
      in case of calamities you have a spare to save the day.

      Now open your root .htaccess and find this line:

      # Exclude /assets and /manager directories and images from rewrite rules


      Before that line put this code:

      
      #Redirect http://www.domain.com/index.html to http://www.domain.com/
      RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
      RewriteRule ^index\.html$ http://www.domain.com/ [R=301,L]
      
      #Redirect http://www.domain.com/index.php to http://www.domain.com/
      RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
      RewriteRule ^index\.php$ http://www.domain.com/ [R=301,L]
      
      

      Leave an empty line before and after this inserted code.
      The code blocks strip index.html and index.php from URLs
      (there is probably a way to cover both files in one code block, but I am no expert at this, so
      I wrote a block for each file extension).
      Also: things did not work when I placed the code at the end of the file,
      so I experimented with its position until I reached a place where things worked out okay.
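
      On covering both files in one block: a minimal sketch, assuming the same
      domain.com placeholder and the same position in the file (before the
      "Exclude /assets and /manager" line). It only swaps the file extension for a
      regex alternation; it is untested against other plugins or rules.

      ```apache
      # Redirect http://www.domain.com/index.html and /index.php to http://www.domain.com/
      # Sketch only: domain.com is a placeholder, replace it with your own host.
      # THE_REQUEST is the raw request line, so internal rewrites to index.php?q=... do not match.
      RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.(html|php)\ HTTP/
      RewriteRule ^index\.(html|php)$ http://www.domain.com/ [R=301,L]
      ```

      If you use this, it would replace the two separate blocks rather than sit next to them.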

      Last step:
      at the bottom of your file, place this code (remember to leave an empty line above the code block)
      
      #Redirect http://domain.com/ to http://www.domain.com/
      RewriteCond %{HTTP_HOST} ^domain\.com [NC]
      RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
      
      


      Final notes:
      - The rules are for .htaccess on apache.
      - Either leave the comment lines as they are (to serve as reminders of what the rules do)
        or remove them completely, but do not strip just the leading # from them: Apache would then try to read them as directives.

      Hope this helps, as promised my complete htaccess so you can compare settings;
      my code blocks contain the words domain.com so you can see where they are:
      # For full documentation and other suggested options, please see
      # http://svn.modxcms.com/docs/display/MODx096/Friendly+URL+Solutions
      # including for unexpected logouts in multi-server/cloud environments
      # and especially for the first three commented out rules
      
      #php_flag register_globals Off
      #AddDefaultCharset utf-8
      #php_value date.timezone Europe/Moscow
      
      Options +FollowSymlinks
      RewriteEngine On
      RewriteBase /
      
      # Fix Apache internal dummy connections from breaking [(site_url)] cache
      RewriteCond %{HTTP_USER_AGENT} ^.*internal\ dummy\ connection.*$ [NC]
      RewriteRule .* - [F,L]
      
      # Rewrite domain.com -> www.domain.com -- used with SEO Strict URLs plugin
      #RewriteCond %{HTTP_HOST} .
      #RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
      #RewriteRule (.*) http://www.example.com/$1 [R=301,L]
      
      #Redirect http://www.domain.com/index.html to http://www.domain.com/
      RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
      RewriteRule ^index\.html$ http://www.domain.com/ [R=301,L]
      
      #Redirect http://www.domain.com/index.php to http://www.domain.com/
      RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
      RewriteRule ^index\.php$ http://www.domain.com/ [R=301,L]
      
      # Exclude /assets and /manager directories and images from rewrite rules
      RewriteRule ^(manager|assets)/*$ - [L]
      RewriteRule \.(jpg|jpeg|png|gif|ico)$ - [L]
      
      # For Friendly URLs
      RewriteCond %{REQUEST_FILENAME} !-f
      RewriteCond %{REQUEST_FILENAME} !-d
      RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
      
      # Reduce server overhead by enabling output compression if supported.
      #php_flag zlib.output_compression On
      #php_value zlib.output_compression_level 5
      
      #Redirect http://domain.com/ to http://www.domain.com/
      RewriteCond %{HTTP_HOST} ^domain\.com [NC]
      RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
      


      Thx for reading and Good Luck!
        • 22448
        • 241 Posts
        Hi Michiel,

        Thanks a lot for the detailed explanation.
        Unfortunately, even though I copied your .htaccess (with the domain changes, of course),
        I’m getting into an infinite loop. Firefox says that the redirect will never complete.

        Have you seen this kind of issue?
        Thanks
        • I would guess that the infinite loop comes from trying to redirect friendly URL requests to index.php?q=xx and then trying to 301 redirect these back to root. It has to be possible to request index.php because that’s what is used to generate all of your pages.
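
          One arrangement that may avoid this, as a sketch only: put the non-www to www redirect ahead of the Friendly URLs block, which is where the commented-out "Rewrite domain.com -> www.domain.com" rule sits in the file above. The assumption here is that the external redirect then fires on the original request, before the internal rewrite to index.php?q=..., so the ?q= form is never sent back to the browser. domain.com is a placeholder, and this has not been tested against the setup in this thread.

          ```apache
          # Sketch: canonical host redirect placed BEFORE the Friendly URLs rules.
          # Runs on the first pass, so the internal index.php?q=... rewrite is never redirected.
          RewriteCond %{HTTP_HOST} ^domain\.com [NC]
          RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

          # For Friendly URLs (unchanged from the file above)
          RewriteCond %{REQUEST_FILENAME} !-f
          RewriteCond %{REQUEST_FILENAME} !-d
          RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
          ```

          If the host itself already redirects www to non-www (or the reverse) in its own configuration, a rule like this could still loop, so that is worth checking first.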

          There are plugins you can use to attempt to tackle this problem, but my advice would be to simply use a canonical link. It works well and has the advantage that it also resolves duplicate content issues that you haven’t thought of.

          There’s a canonical URL snippet for Revo.
            • 22448
            • 241 Posts
            Thanks for the help.
            After doing a site that required multiple languages and using YAMS for the purpose, I found that it handles SEO-friendly URLs very well. So now I use YAMS even for single-language websites.