    • 43374
    • 39 Posts
    I am building a large-scale multilingual website (at least 30,000,000 pages) that gets its content from an external API. I constantly run cron jobs to push the data into a database on my web server (adding new content, updating existing content, and deleting expired content).

    In a first attempt I pushed the data for 1,000,000 pages directly into site_content, which generally works but is, I guess, far from ideal.

    It would be great to hear your thoughts on the best way to deal with that. In particular, I am looking for a good way to create and update the resources. Before digging deeper into topics like xPDO, it would be nice to get some rough general recommendations on things like:

    • the best way to create resources via script
    • not bloating the cache unnecessarily
    • still being able to clear the cache via the manager


    Cheers


    • Just create a single view for dynamically showing your content from the existing tables? That would be infinitely more scalable, and you could control your own caching in whatever way made the most sense for the content.
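A rough sketch of the single-view idea, written in Python with an in-memory SQLite database so it runs standalone; it is not MODX code. The `items` table, its columns, and the function names are assumptions made up for illustration.

```python
import sqlite3
from functools import lru_cache
from html import escape

# Hypothetical table standing in for "the existing tables" filled by the import job.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (slug TEXT PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        ("abc123", "First item", "Body of the first item"),
        ("def456", "Second item", "Body of the second item"),
    ],
)


@lru_cache(maxsize=10_000)  # bounded cache: only recently requested items are kept
def render_item(slug):
    """Render one item through the single shared view; None if the slug is unknown."""
    row = conn.execute(
        "SELECT title, body FROM items WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return None
    title, body = row
    return f"<h1>{escape(title)}</h1>\n<p>{escape(body)}</p>"


def update_item(slug, title, body):
    """What an import job might call: replace the row, then drop cached pages."""
    conn.execute("INSERT OR REPLACE INTO items VALUES (?, ?, ?)", (slug, title, body))
    # Simplest possible invalidation: clears the whole cache, not just this slug.
    render_item.cache_clear()
```

Because the pages are rendered on demand from one table, the import job never has to create or delete resources; it only touches rows and decides when cached output becomes stale.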
        • 43374
        • 39 Posts
        Unfortunately, I need a single page for each of the 30,000,000 items, under a specific URL that I can list in the sitemap.xml for SEO reasons.

        Edit: But yes, after thinking about it,
        "Just create a single view for dynamically showing your content from the existing tables"
        sounds like a good idea. It should be possible to find a way to add the URI for each item so that it looks like a single page to Google.
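On the sitemap side, the sitemaps.org protocol limits a single sitemap file to 50,000 URLs, so 30,000,000 items would need to be split across many files that are tied together by a sitemap index. A minimal Python sketch of that arithmetic follows; the `sitemap-N.xml` file naming and the function names are assumptions, not anything from MODX.

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit from the sitemaps.org protocol


def sitemap_file_count(total_urls):
    """Number of sitemap files needed, rounding up with integer arithmetic."""
    return (total_urls + MAX_URLS_PER_SITEMAP - 1) // MAX_URLS_PER_SITEMAP


def sitemap_index(base_url, total_urls):
    """Build a sitemap index that points at sitemap-1.xml ... sitemap-N.xml."""
    entries = [
        f"  <sitemap><loc>{base_url}/sitemap-{n}.xml</loc></sitemap>"
        for n in range(1, sitemap_file_count(total_urls) + 1)
    ]
    return "\n".join(
        [
            '<?xml version="1.0" encoding="UTF-8"?>',
            '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
            *entries,
            "</sitemapindex>",
        ]
    )
```

With these numbers, `sitemap_file_count(30_000_000)` gives 600 files, which fits comfortably in one index.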
          Quote from: sh0ck23 at Jun 24, 2013, 09:03 PM
          Unfortunately, I need a single page for each of the 30,000,000 items, under a specific URL that I can list in the sitemap.xml for SEO reasons.

          Edit: But yes, after thinking about it,
          "Just create a single view for dynamically showing your content from the existing tables"
          sounds like a good idea. It should be possible to find a way to add the URI for each item so that it looks like a single page to Google.
          You can route custom URLs using a Plugin attached to the OnPageNotFound event, simply forwarding to your view Resource with the appropriate GET parameters set to identify the specific item to be viewed.
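The routing decision such a plugin makes can be sketched as below. This is plain Python to show the logic only, not MODX's plugin API; the `/catalogue/<item>` URL scheme, the resource ID `2`, and the function name are all assumptions for illustration.

```python
import re

VIEW_RESOURCE_ID = 2  # hypothetical ID of the single "view" resource
# Hypothetical URL scheme: /catalogue/<item>, with an optional trailing slash.
ROUTE = re.compile(r"^/?catalogue/(?P<item>[A-Za-z0-9_-]+)/?$")


def on_page_not_found(path):
    """Return (resource_id, get_params) to forward to, or None to keep the 404."""
    match = ROUTE.match(path)
    if match is None:
        return None  # not one of our item URLs: let the normal error page show
    return VIEW_RESOURCE_ID, {"catalogue": match.group("item")}
```

Only requests that match no existing resource reach this handler, so regular pages are unaffected; a real plugin would then forward to the view resource with those parameters set.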
            • 20371
            • 58 Posts
            Quote from: opengeek at Jun 24, 2013, 08:59 PM
            Just create a single view for dynamically showing your content from the existing tables?

            This is definitely the best idea. Having 30m resources in your left nav would be pointless / unusable.

            Re URLs: wouldn't it be easier to use an .htaccess approach? Rewrite URLs to a query string, à la:
            http://foo.com/catalogue/abc123
            rewrites to
            http://foo.com/index.php?q=2&catalogue=abc123
            


            Then a simple snippet could pull the "abc123" from the query string and return the appropriate resource.
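To make the two steps concrete, here is a small Python sketch of the rewrite and of the snippet's lookup. It mimics what the rewrite rule and snippet would do rather than being .htaccess or MODX code, and the function names and the allowed ID characters are assumptions.

```python
import re
from urllib.parse import parse_qs, urlsplit


def rewrite(url):
    """Rewrite /catalogue/<id> to /index.php?q=2&catalogue=<id>; leave other URLs alone."""
    parts = urlsplit(url)
    match = re.fullmatch(r"/catalogue/([A-Za-z0-9_-]+)", parts.path)
    if match is None:
        return url
    return f"{parts.scheme}://{parts.netloc}/index.php?q=2&catalogue={match.group(1)}"


def catalogue_id(url):
    """What the snippet would do: pull the catalogue value out of the query string."""
    values = parse_qs(urlsplit(url).query).get("catalogue")
    return values[0] if values else None
```

For the example above, `rewrite("http://foo.com/catalogue/abc123")` yields `http://foo.com/index.php?q=2&catalogue=abc123`, and `catalogue_id` of that result is `"abc123"`.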

            Having said that, I've never attached anything to events like "page not found" so that may well be a better solution, just that I don't know what I'm doing.

            • Quote from: Mr5o1 at Jun 25, 2013, 01:47 AM
              Re URLs: wouldn't it be easier to use an .htaccess approach? Rewrite URLs to a query string, à la:
              http://foo.com/catalogue/abc123
              rewrites to
              http://foo.com/index.php?q=2&catalogue=abc123
              


              Then a simple snippet could pull the "abc123" from the query string and return the appropriate resource.

              Having said that, I've never attached anything to events like "page not found" so that may well be a better solution, just that I don't know what I'm doing.
              You can certainly use rewrites to accomplish that. Perhaps it's easier to implement, but certainly not as portable as a plugin. Consider moving from Apache to nginx, where the rewrites would have to be modified for the platform.
                Here is my favorite example of such a plugin; it's written to handle multi-language sites using Babel, but the basic idea should apply. https://gist.github.com/gadamiak/3812853

                You can also use resources that are set not to show in the Resource Tree, and manage them with MIGXdb and a CMP: http://rtfm.modx.com/display/ADDON/MIGXdb.Manage+Events-Resources+in+a+CMP+with+help+of+MIGXdb
                  • 4172
                  • 5,888 Posts
                    • 43374
                    • 39 Posts
                    Thank you all for your recommendations! I will report back on how I finally solved it.

                    Cheers