-
- 1,611 Posts
The problem with sites with very large numbers of documents is not really related to the number of rows that you can effectively store in a MySQL table. Other bottlenecks in the process will come into play long before you reach most MySQL limits. And MySQL is usually the database backend for both MODx and Drupal (and many other systems), so there probably wouldn’t be any major difference in table optimization or anything else significant enough to be a deciding factor (they’re both relatively efficient, given the number of features that they include).
In MODx the first bottleneck that you hit for very large sites is the cache files. There are ways to reduce the size of these, but sites with many thousands of pages will necessarily have very large site cache files. As I understand it, these files are loaded and parsed on every MODx request, so this can take a whole lot of processing power (so if you’re on your own fast dedicated server, you might be fine, but if you’re on shared hosting you will be constantly overloading the CPU).
This is a built-in limitation of MODx, and it will not be resolved until the 0.9.7 version is released with a totally different caching system. I don’t know how Drupal handles caching, so I can’t comment on that. But my understanding is that if you’re talking about a site that needs to handle 100,000+ pages, no standard full-featured CMS is going to be able to deal with that efficiently, and most people in your situation end up writing their own bare-bones code that is optimized for their needs.
The MODx caching system should be very helpful for things like Ditto pages, however. In addition to the main site cache file individual pages (or parts of them) can be cached, which saves a lot of database calls and PHP parsing for any content that’s static. This system is a speed booster and resource saver for most sites, but for huge sites like yours it becomes a virtual throttle.
So my general advice would be to radically reconsider the tools that you’re using for this project. I don’t know if you can reasonably expect a complex script like Ditto to index hundreds of thousands of pages dynamically and generate output on the fly for multiple viewers at the same time and with no long delays (and the same is true for AjaxSearch, etc.). I would be looking more at systems that are optimized on the database end for very large sites (with prefetch and search index tables, etc.), and I don’t know that you’re going to find such a system that works out-of-the-box and provides you with all of the features of a CMS system (although for your sake I hope that I’m wrong).
You may want to consider using the upcoming MODx 0.9.7 release to build your site (since it may be more capable of dealing with what you’d throw at it), but keep in mind that it is still under development at the moment.
-
- 7,075 Posts
It’s never been meant to handle that many documents, but like you, I sometimes try anyway.
On top of what Jason said about the new caching mechanism for 0.9.7, it might be of interest to you to check out the “how to enable caching with memcached?” thread, as it is an example of what the new core will allow.
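The idea behind a memcached-backed page cache is the cache-aside pattern: check the cache first, and only render (and store) a page on a miss. Here is a minimal sketch of that pattern in Python — a plain dict stands in for a real memcached client, and the names (`PageCache`, `get_or_render`) are illustrative, not part of the MODx API:

```python
# Cache-aside sketch: a dict stands in for a memcached client.
# On a hit we skip rendering entirely; on a miss we render once and store.

class PageCache:
    def __init__(self):
        self._store = {}   # stand-in for memcached
        self.misses = 0

    def get_or_render(self, doc_id, render):
        key = f"page:{doc_id}"
        cached = self._store.get(key)
        if cached is not None:
            return cached              # cache hit: no DB calls, no parsing
        self.misses += 1
        html = render(doc_id)          # expensive: queries + template parsing
        self._store[key] = html
        return html

cache = PageCache()
html1 = cache.get_or_render(42, lambda i: f"<h1>Doc {i}</h1>")
html2 = cache.get_or_render(42, lambda i: f"<h1>Doc {i}</h1>")  # served from cache
```

The second call never invokes the render function, which is the whole point: with memcached the stored copy would also be shared across PHP processes instead of living in one request.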
.: COO - Commerce Guys - Community Driven Innovation :.
-
- 2,542 Posts
Hi,
For testing, I have modified the files used by the cache system in 0.9.6.
I could post them with instructions if you wish to test them.
:-)
-
- 1,611 Posts
@Stefan: Have you looked into using the new 0.9.7 version?
-
- 1,611 Posts
Quote from: Ricjustsaid at Apr 25, 2008, 03:59 AM
This kind of scared me... the current site I’m working on (well, getting ready to start working on) is basically an article site with a few hundred pages ranging in size from a few KB to 35KB or so, and they probably add up to a dozen MB at least... I haven’t started importing the content yet, but am I going to have issues with the cache file when I do this? I’m on a dedicated Opteron 246 with 2GB of memory, and the last thing I need is to end up with a crashed site after importing the content.
No need to be scared. The example that I was responding to in that message isn’t really relevant to what it sounds like you want to do. MODx is perfectly happy with sites of hundreds of pages (I’ve had some over 3,500 pages with no noticeable decrease in performance).
MODx has a site cache file that includes a document index and whatever resources you have installed. And in addition it can create individual cache files for each page that you set to be cached. The latter files are only parsed when you load those specific pages, so they don’t cumulatively slow down your site (in fact, they should speed it up).
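To illustrate why per-page cache files don’t cumulatively slow a site down, here is a rough sketch (not MODx source — file names and helpers are invented for the example) of the scheme: each cached page lives in its own file, so a request only ever reads the one file for the page it wants:

```python
# Illustrative per-page file cache: one file per document, so only the
# requested page's cache file is ever read -- other pages cost nothing.

import tempfile
from pathlib import Path

cache_dir = Path(tempfile.mkdtemp())   # stand-in for the site's cache folder

def load_page(doc_id, render):
    f = cache_dir / f"doc{doc_id}.html"
    if f.exists():
        return f.read_text()           # only this one small file is read
    html = render(doc_id)              # first request pays the render cost
    f.write_text(html)
    return html

out = load_page(7, lambda i: f"<p>page {i}</p>")
again = load_page(7, lambda i: "SHOULD NOT RENDER")  # hit: render skipped
```

Contrast this with the single site-wide cache file, which grows with the whole document tree and (in 0.9.6) is parsed on every request regardless of which page was asked for.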
There’s no reason to think that you’d have any trouble creating the site you describe in MODx.