OK Bob, I can't say exactly what the cause of the issue was, or if it has completely gone, but I made several changes, as follows:
I switched the website to run on PHP-FPM served by nginx. That's in the Plesk > Hosting Settings page for the domain as "FPM application served by nginx" (as opposed to CGI or FastCGI with Apache).
I suspected my setup wasn't working correctly, so I reinstalled PHP-FPM and the version(s) of PHP that came with it. I first tried this via Plesk but couldn't get PHP-FPM to start, so I then did it from the command line.
I had trouble trying to start PHP-FPM. First an imagick error:
Starting php-fpm: [06-Nov-2016 11:45:44] NOTICE: PHP message: PHP Warning: PHP Startup: Unable to load dynamic library '/usr/lib64/php/modules/imagick.so' - libMagickWand.so.2: cannot open shared object file: No such file or directory in Unknown on line 0
So I reinstalled imagick:
Installed:
php-pecl-imagick.x86_64 0:2.2.2-5.el6
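For anyone hitting a similar "cannot open shared object file" warning, ldd can show which libraries an extension fails to resolve before you start reinstalling things. This is only a rough sketch: the function name is made up for illustration, and the path in the usage comment is the one from the PHP warning above. For that warning, the missing library reported would be expected to be libMagickWand.so.2.

```shell
# Hypothetical helper: list shared libraries that a module cannot resolve.
# Prints the "not found" lines from ldd; prints nothing if everything resolves.
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found'
}

# Usage (path taken from the PHP warning):
#   missing_libs /usr/lib64/php/modules/imagick.so
```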
Then there was a pool issue:
[06-Nov-2016 11:45:44] ERROR: No pool defined. at least one pool section must be specified in config file
[06-Nov-2016 11:45:44] ERROR: failed to post process the configuration
[06-Nov-2016 11:45:44] ERROR: FPM initialization failed
That turned out to be because PHP-FPM just hadn't been selected for any of the domains in Plesk, so no pool had been created. As soon as I selected it for a domain, FPM started running.
I suspect that I could have avoided the whole command-line effort if I'd just installed PHP-FPM in Plesk, then selected it for at least one domain.
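The "No pool defined" error means PHP-FPM found no [pool] section in any of the config files it includes. A small sketch for counting pools is below. The function name is hypothetical, and the /etc/php-fpm.d directory in the usage comment is an assumption based on the stock CentOS package layout; Plesk-managed PHP versions may keep their pool files elsewhere.

```shell
# Hypothetical helper: count pool sections in a directory of FPM .conf files.
# A pool section is a line like "[name]"; "[global]" is not a pool.
count_fpm_pools() {
    cat "$1"/*.conf 2>/dev/null |
        awk '/^\[.+\]/ && !/^\[global\]/ { n++ } END { print n + 0 }'
}

# Usage (directory is an assumption, adjust for your install):
#   count_fpm_pools /etc/php-fpm.d
```

If this prints 0, FPM has nothing to start, which matches the error above.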
I also suspected that slow page loading might be an underlying cause of the 502 error, so I made various changes to nginx to help it deliver pages faster:
I enabled gzip so that the server could compress files before sending:
References:
https://developers.google.com/speed/docs/insights/EnableCompression
http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_comp_level
https://mattstauffer.co/blog/enabling-gzip-on-nginx-servers-including-laravel-forge
https://kb.plesk.com/en/122628
Using the last two refs there, I created a new config file:
/etc/nginx/conf.d/gzip.conf
containing:
gzip on;
gzip_disable "MSIE [1-6]\.(?!.*SV1)";
gzip_proxied any;
gzip_types
    application/x-javascript
    application/xml
    application/xml+rss
    image/x-icon
    image/bmp
    image/svg+xml
    text/plain
    text/css
    text/xml
    text/javascript
    application/javascript;
gzip_vary on;
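To confirm that gzip is actually being applied, you can request a file with an Accept-Encoding header and look for Content-Encoding: gzip in the response headers. A minimal sketch, assuming curl is installed; the function name and the style.css path are hypothetical, and mydomain.com is a placeholder.

```shell
# Hypothetical helper: read HTTP response headers on stdin and succeed
# (exit 0) only if the response was gzip-encoded.
is_gzipped() {
    grep -qi '^content-encoding:.*gzip'
}

# Usage (URL is a placeholder):
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' \
#       http://mydomain.com/style.css | is_gzipped && echo "gzip is on"
```

Test with a file type that is in the gzip_types list, and bear in mind that very small responses may not be compressed at all.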
I then enabled cache control (first via my .htaccess file, till d'oh! I realized I was dealing with nginx and not Apache).
I added the following into the additional nginx directives box for my domain in Plesk (Apache & nginx Settings page):
location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
    expires 30d;
    add_header Pragma public;
    add_header Cache-Control "public";
    try_files $uri @fallback;
}
(Careful with cut and paste! I wasted a lot of time figuring out that curly quotes were breaking this).
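A quick way to catch curly quotes is to grep for them in whatever you are about to paste, or in the generated config file afterwards. This is just a sketch; the function name is made up, and the path in the usage comment is a placeholder.

```shell
# Hypothetical helper: print (with line numbers) any lines that contain
# typographic quotes. Exits 0 if any are found, 1 if the file is clean.
find_curly_quotes() {
    grep -n -F -e '“' -e '”' -e '‘' -e '’' "$1"
}

# Usage (path is a placeholder):
#   find_curly_quotes /var/www/vhosts/system/mydomain.com/conf/vhost_nginx.conf
```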
Also on the Apache & nginx Settings page, ensure that you select both "Smart static files processing" and "Serve static files directly by nginx". In the list of file extensions for the latter, make sure you remove any that are listed in the additional nginx directives, e.g. (js|css|png|jpg|jpeg|gif|ico).
Refs:
http://florianjensen.com/2013/08/03/setting-expires-headers-with-plesk-11-5-and-nginx/
http://kbeezie.com/nginx-configuration-examples/
https://talk.plesk.com/threads/browser-caching-with-nginx-not-working.337315/
I checked and these additional nginx directives appeared in the following file:
/var/www/vhosts/system/mydomain.com/conf/vhost_nginx.conf
I then restarted nginx.
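It is worth having nginx check the configuration before restarting, so that a typo (or a stray curly quote) doesn't take the site down. A minimal sketch, assuming a SysV-style "service" command as on CentOS 6; the function name is hypothetical.

```shell
# Hypothetical helper: restart nginx only if the config test passes.
# "nginx -t" parses the full configuration and reports any syntax errors.
safe_nginx_restart() {
    nginx -t && service nginx restart
}

# Usage (run as root):
#   safe_nginx_restart
```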
I then moved on to the structure of the page itself, testing with Google's PageSpeed Insights tool and following its recommendations:
https://developers.google.com/speed/pagespeed/insights
My starting scores were around:
Mobile 35 / 100
Desktop 44 / 100
I compressed images further, moved JS files to the end of the page <body>, merged CSS files, minified JS and CSS files, and split out the bits of CSS and JS that were needed for above-the-fold rendering and moved them into the <head>.
This involved a LOT of page breaking and fixing.
I also installed two caching extras on MODX: getCache and microcache. I have not configured these and have no idea at the moment of their full effect, if any. I also need to research which caching extras will work well together and which will conflict.
In combination with the server changes, all this resulted in the following scores:
Mobile 65 / 100
Desktop 75 / 100
Since making all these changes, I've had only one report of the 502 error.
What caused it? I don't know! But I learned a hell of a lot in around 4 days (and nights) of doing all this. There is still a lot of research, learning and fine-tuning to do, and testing tools to find.
If anyone needs any further details, please ask. If I have it written down or can remember it I'll be happy to pass it on.