Quote from: tomtom at Jun 28, 2006, 08:18 AM
Document caching (of MODx) won’t help too much if you look at the figures. Just the time PHP takes is already too much.
I’m totally interested in discussing this further with you.
Can you explain a little bit more about how you got all those numbers? I’m assuming that the first time you load the page it requires a lot more work than the second time, if you use the page caching system in MODx.
I’ll help you understand the big picture of how MODx works, and if you can verify the numbers again for us, that would be great.
Basically, the MODx parser always loads the entire cache file that shows up in your benchmark results, which is why I can understand that the amount of memory PHP requires grows rapidly with the number of pages you’ve loaded into the database. The reason is that it needs to load every page’s ID, alias, and so on into the cache, so if you have 20,000 pages, the cache file will contain more than 40,000 lines of code. We might need to start thinking about a better way to cache this document information without sacrificing system performance on every new page request.
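To illustrate what I mean (a rough sketch only, not the actual core code; the array name is hypothetical), the generated cache file essentially boils down to one huge PHP array literal that gets included on every request:

Code:
<?php
// Hypothetical sketch of a generated document cache file.
// With 20,000 documents this array literal alone runs to tens of
// thousands of lines, and PHP has to parse it and hold it all in
// memory on every single request.
$documentListing = array(
    1 => array('id' => 1, 'alias' => 'home',    'parent' => 0),
    2 => array('id' => 2, 'alias' => 'about',   'parent' => 0),
    3 => array('id' => 3, 'alias' => 'contact', 'parent' => 0),
    // ... one entry per document, around 20,000 in the benchmark ...
);
?>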
In the second phase, MODx needs a way to determine the output for the current page request. During this step, MODx uses the cached document array above to determine which document ID needs to be loaded for the front page. I believe this is one of the performance penalties in MODx, considering the amount of array data that needs to be processed over and over again.
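As a rough sketch of that lookup (the function name is hypothetical, and I’m reusing the cache array from the example above), resolving the requested alias can mean walking the whole array on every hit:

Code:
<?php
// Hypothetical sketch: resolving a request alias to a document id
// using the cached listing. With 20,000 entries this linear scan
// is O(n) on every request; keying the cached array by alias
// instead would turn it into a single O(1) hash lookup.
function getDocumentId($documentListing, $requestAlias)
{
    foreach ($documentListing as $doc) {
        if ($doc['alias'] === $requestAlias) {
            return $doc['id'];
        }
    }
    return null; // not found, so fall through to 404 handling
}
?>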
In the third phase, the system checks the cache directory. If it finds a cache file for the requested document, it loads the page from that file; if it doesn’t, it loads the document from the database and parses the whole page again. The bottleneck I can see in this approach is the number of cache files that keep accumulating inside one directory, and Linux is known for having problems with large numbers of files in a single directory. So from your benchmark, we would end up with approximately 20,000 files. This will also be a pain in the neck when clearing the site cache, because the system has to remove all 20,000 files at once. I’m not sure about the PHP functions for reading those files, but I hope it won’t be too much of a problem.
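Here is a simplified sketch of that check and of what clearing the cache implies (the file naming and the parseDocumentFromDatabase() helper are hypothetical, not the exact core code):

Code:
<?php
// Hypothetical sketch of the per-document page cache check.
$cacheFile = 'assets/cache/docid_' . $docId . '.pageCache.php';

if (is_file($cacheFile)) {
    // Cache hit: serve the previously parsed page.
    $output = file_get_contents($cacheFile);
} else {
    // Cache miss: load the document from the database, parse the
    // whole page again, and write the result back for next time.
    $output = parseDocumentFromDatabase($docId); // hypothetical helper
    file_put_contents($cacheFile, $output);
}

// Clearing the site cache means unlinking every one of these files,
// roughly 20,000 of them in the benchmark scenario.
foreach (glob('assets/cache/docid_*.pageCache.php') as $file) {
    unlink($file);
}
?>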
The last phase parses the cached data (or the data just parsed in the previous phase) one more time; basically, this is to execute the uncached snippets that are deliberately not cached. I believe this won’t be a problem.
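For illustration (a hedged sketch only; the real parser is more involved, and runSnippet() is a stand-in for however the core actually dispatches a snippet by name), that final pass just scans the otherwise-cached output for uncached snippet tags and executes those:

Code:
<?php
// Hypothetical sketch of the final pass: the cached page can still
// contain uncached snippet calls (the [!SnippetName!] syntax),
// which are located and executed on every request.
function parseUncachedSnippets($output)
{
    return preg_replace_callback('/\[!([^!\]]+)!\]/', 'runSnippetTag', $output);
}

function runSnippetTag($matches)
{
    // $matches[1] holds the snippet name; runSnippet() is a
    // hypothetical stand-in for the core's snippet dispatcher.
    return runSnippet($matches[1]);
}
?>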
So my conclusion: it is no wonder that the PHP processing time is so high, and that it increases quite drastically with the number of pages the site has. The amount of memory needed is also reasonable. The only thing that is quite unexpected is the MySQL processing time; I have no idea why it increases so drastically compared to the PHP processing time and the memory allocation needed.
Could you clarify this for me: when you loaded the page, did you benchmark an uncached page or a cached page? It should make a difference in the MySQL processing time. I believe a cached page only generates 3-5 MySQL queries, but I’m not 100% sure.
PS: I might be wrong, but I’m open to any suggestions, so we can improve the current core code. Do you have any experience in optimizing code, tomtom?