<![CDATA[ robots.txt on modxcloud????? - MODX Community Forums]]> https://forums.modx.com/thread/?thread=82805 <![CDATA[robots.txt on modxcloud?????]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-456920
MODX generates the directory structure of contexts dynamically, so I wonder how I can achieve a "mostly" invisible context (invisible to spiders and search bots) for crazy testing purposes in the cloud.

Maybe I have to add some Nginx rules... or maybe I can try to serve the robots.txt from MODX itself?

What is the right way to make a context invisible to search bots?


Thank you,
with best regards,
Chris


]]>
theoretiker Feb 28, 2013, 06:51 AM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-456920
<![CDATA[Re: robots.txt on modxcloud?????]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-544395

  • Internal URLs will always serve deny: all to robots. These are the ones that start with a cXXXX number.
  • You can enable robots.txt for your cloud urls. These will look like site-name.user-account.modxcloud.com.
  • Robots.txt should be enabled by default for any custom domains/URLs you add to a project in MODX Cloud.
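
For anyone unsure what "deny all" means in practice, here is a minimal sketch in Python. It assumes the default file uses the conventional "User-agent: * / Disallow: /" form (the exact text MODX Cloud serves is not shown in this thread), and the helper name allows() and the example.com URLs are made up for illustration. It only uses the standard library's urllib.robotparser to show how a crawler would read such a file.

```python
from urllib.robotparser import RobotFileParser

# Assumed form of a "deny all" robots.txt (not copied from MODX Cloud).
DENY_ALL = "User-agent: *\nDisallow: /\n"

# An empty Disallow value permits everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def allows(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URL, used only to exercise the parser.
    print(allows(DENY_ALL, "https://example.com/some/page"))
    print(allows(ALLOW_ALL, "https://example.com/some/page"))
```

To check a live site, you could download the robots.txt from each of your URLs and pass the text to allows(); that makes it easy to see which address serves which policy.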
]]>
rethrash Aug 22, 2016, 10:31 PM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-544395
<![CDATA[Re: robots.txt on modxcloud?????]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-544289 Quote from: netProphET at Mar 28, 2013, 02:35 AM
I'm late to the party here, but wanted to inform you all that we did implement a new feature last week, where new Developer Clouds automatically have a robots.txt (virtual) file that directs robots not to index any of the site. If you want to override this default behavior, all you have to do is put an actual file in place.

Nice to see different ideas for solving the problem of excluding just parts of a site.

I'm even later to the party - but thought it worthwhile to clarify something in case other people stumble across this thread as I did.

So to clarify, the robots.txt that's returned for your Cloud URL, e.g. cxxxx.paas1.tx.modxcloud.com, is always the default "deny all". That doesn't change even if you enable the "Allow Search Engines to Index this site" option on the cloud. However, that setting does change the robots.txt file returned for the "custom" URL assigned to that cloud, e.g. dev.yourcompany.modxcloud.com.

This caught me out because I had implemented a robots.txt and was testing it using the cloud address, and it was refusing to serve my custom robots.txt file. Hope this helps someone else, and thanks to Mike Schell at Modx Cloud for clarifying this for me!

Cheers

Mark]]>
endacemark Aug 17, 2016, 08:55 PM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-544289
<![CDATA[Re: robots.txt on modxcloud?????]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-460714 Quote from: eighthday at Mar 28, 2013, 09:00 AM
Great, if we inject the development cloud into a production cloud is the robots file removed?

Yes it is. Not that it matters, but we're doing this via server configuration and not by putting an actual file in place. Gives us a lot more flexibility, and I wouldn't be surprised to see other cool little features in the future based on this idea.
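
The idea of a virtual robots.txt that a real file overrides can be sketched roughly like this in Python. This is only an illustration of the fallback logic, not MODX Cloud's actual server configuration (which is not shown in this thread); the function name robots_response(), the DEFAULT_ROBOTS text, and the web_root parameter are all assumptions.

```python
from pathlib import Path

# Assumed default "deny all" content; the real default is not quoted in the thread.
DEFAULT_ROBOTS = "User-agent: *\nDisallow: /\n"


def robots_response(web_root: Path) -> str:
    """Return the robots.txt body to serve for a site rooted at `web_root`.

    If an actual robots.txt file exists, it wins; otherwise the virtual
    default is returned without any file being written to disk.
    """
    real_file = web_root / "robots.txt"
    if real_file.is_file():
        return real_file.read_text(encoding="utf-8")
    return DEFAULT_ROBOTS
```

Because the default lives in the server logic rather than on disk, switching it off does not require deleting anything from the site.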
]]>
netProphET Mar 28, 2013, 09:27 AM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-460714
<![CDATA[Re: robots.txt on modxcloud?????]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-460667 eighthday Mar 28, 2013, 04:00 AM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-460667 <![CDATA[Re: robots.txt on modxcloud?????]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-460610
Nice to see different ideas for solving the problem of excluding just parts of a site.
]]>
netProphET Mar 27, 2013, 09:35 PM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud?page=2#dis-post-460610
<![CDATA[Re: robots.txt on modxcloud?????]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud#dis-post-458578
Have a nice time!!!! Enjoy MODX and the MODX Community!!!! It's the greatest place I know...

]]>
theoretiker Mar 13, 2013, 11:54 AM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud#dis-post-458578
<![CDATA[Re: robots.txt on modxcloud?????]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud#dis-post-458576 sottwell Mar 13, 2013, 11:48 AM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud#dis-post-458576 <![CDATA[Re: robots.txt on modxcloud????? (Best Answer)]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud#dis-post-458575 Quote from: theoretiker at Mar 13, 2013, 04:38 PM

Can i use site_status in a context setting?

There shouldn't be a problem using it in a context setting.

Another idea could be to remove the "anonymous" group in the access permissions of the context. That should do the same: only users that are logged in to the manager (and with the right permissions) would be able to see the context.]]>
MathiasD Mar 13, 2013, 11:48 AM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud#dis-post-458575
<![CDATA[Re: robots.txt on modxcloud?????]]> https://forums.modx.com/thread/82805/robots-txt-on-modxcloud#dis-post-458569 Can i use site_status in a context setting?

I will try that ... great! Maybe a safe solution...]]>
theoretiker Mar 13, 2013, 11:38 AM https://forums.modx.com/thread/82805/robots-txt-on-modxcloud#dis-post-458569