    I want to create a sitemap for a site with more than 30,000,000 pages. The site is updated daily, with pages being removed and added.

    I found this PHP script, which I would like to run with a cron job.

    It seems pretty cool, since it offers the following extras:

    • Automatically gzip all of your sitemaps (if desired)
    • Create (and update) a sitemap index file
    • Allow you to separate your urls into different files
    • Limit the number of urls in each file to 50,000
    • Properly escape and format all of the required and optional fields
    • Ping the search engines when your sitemap(s) have been updated
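
    To put the 50,000-URL limit in perspective for a site of this size, here is a rough sketch of the arithmetic. The function name sitemapFileCount is made up for illustration and is not part of the script.

```php
<?php
// Limit quoted in the feature list above: at most 50,000 URLs per sitemap file.
const URLS_PER_FILE = 50000;

// Hypothetical helper: number of sitemap files needed for a given URL count.
function sitemapFileCount(int $totalUrls, int $perFile = URLS_PER_FILE): int
{
    // Ceiling division without floating point.
    return intdiv($totalUrls + $perFile - 1, $perFile);
}

echo sitemapFileCount(30000000), "\n"; // 600
```

    So 30,000,000 URLs come to about 600 sitemap files, all listed in one index file.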

    I have all URIs in the table "myuri" in the column "uri"; the entries are written like "/this-is-a-page.html".
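
    One thing to watch with entries stored this way: the script builds each address as BASE_URL . $url, so a base ending in "/" plus a URI starting with "/" gives a double slash. A small sketch of a join that avoids it; absoluteUrl is a hypothetical helper, not part of the script.

```php
<?php
// Hypothetical helper: join a base URL and a stored URI with exactly one slash.
function absoluteUrl(string $base, string $uri): string
{
    return rtrim($base, '/') . '/' . ltrim($uri, '/');
}

echo absoluteUrl('http://example.com/', '/this-is-a-page.html'), "\n";
// http://example.com/this-is-a-page.html
```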

    Unfortunately, I haven't got it working yet with my very limited PHP coding skills.

    define("BASE_URL", "http://example.com/");
    define ('BASE_URI', $_SERVER['DOCUMENT_ROOT'] . '/');
     //FILL THIS IN//
    $host = 'hostname';
    $username = 'user';
    $password = 'password';
    $dbname = 'database';
    $port = 3306;
    $charset = 'utf-8';
    class Sitemap {
      private $compress;
      private $page = 'index';
      private $index = 1;
      private $count = 1;
      private $urls = array();
      public function __construct ($compress=true) {
        ini_set('memory_limit', '75M'); // 50M required per tests
        $this->compress = ($compress) ? '.gz' : '';
      }
      public function page ($name) {
        $this->page = $name;
        $this->index = 1;
      }
      public function url ($url, $lastmod='', $changefreq='', $priority='') {
        $url = htmlspecialchars(BASE_URL . $url);
        $lastmod = (!empty($lastmod)) ? date('Y-m-d', strtotime($lastmod)) : false;
        $changefreq = (!empty($changefreq) && in_array(strtolower($changefreq), array('always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never'))) ? strtolower($changefreq) : false;
        $priority = (!empty($priority) && is_numeric($priority) && abs($priority) <= 1) ? round(abs($priority), 1) : false;
        if (!$lastmod && !$changefreq && !$priority) {
          $this->urls[] = $url;
        } else {
          $url = array('loc'=>$url);
          if ($lastmod !== false) $url['lastmod'] = $lastmod;
          if ($changefreq !== false) $url['changefreq'] = $changefreq;
          if ($priority !== false) $url['priority'] = ($priority < 1) ? $priority : '1.0';
          $this->urls[] = $url;
        }
        if ($this->count == 50000) {
          $this->save(); // file is full: write it out and start the next one
        } else {
          $this->count++;
        }
      }
      public function close() {
        $this->save(); // assumed: body was lost in the post; flush any URLs still buffered
      }
      private function save () {
        if (empty($this->urls)) return;
        $file = "sitemap-{$this->page}-{$this->index}.xml{$this->compress}";
        $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
        $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
        foreach ($this->urls as $url) {
          $xml .= '  <url>' . "\n";
          if (is_array($url)) {
            foreach ($url as $key => $value) $xml .= "    <{$key}>{$value}</{$key}>\n";
          } else {
            $xml .= "    <loc>{$url}</loc>\n";
          }
          $xml .= '  </url>' . "\n";
        }
        $xml .= '</urlset>' . "\n";
        $this->urls = array();
        if (!empty($this->compress)) $xml = gzencode($xml, 9);
        $fp = fopen(BASE_URI . $file, 'wb');
        fwrite($fp, $xml);
        fclose($fp);
        $this->index++;
        $this->count = 1;
        $num = $this->index; // should have already been incremented
        // remove stale, higher-numbered files left over from a previous run
        while (file_exists(BASE_URI . "sitemap-{$this->page}-{$num}.xml{$this->compress}")) {
          unlink(BASE_URI . "sitemap-{$this->page}-{$num}.xml{$this->compress}");
          $num++;
        }
        $this->index($file); // register the new file in the sitemap index
      }
      private function index ($file) {
        $sitemaps = array();
        $index = "sitemap-index.xml{$this->compress}";
        if (file_exists(BASE_URI . $index)) {
          $xml = (!empty($this->compress)) ? gzfile(BASE_URI . $index) : file(BASE_URI . $index);
          $tags = $this->xml_tag(implode('', $xml), array('sitemap'));
          foreach ($tags as $xml) {
            $loc = str_replace(BASE_URL, '', $this->xml_tag($xml, 'loc'));
            $lastmod = $this->xml_tag($xml, 'lastmod');
            $lastmod = ($lastmod) ? date('Y-m-d', strtotime($lastmod)) : date('Y-m-d');
            if (file_exists(BASE_URI . $loc)) $sitemaps[$loc] = $lastmod;
          }
        }
        $sitemaps[$file] = date('Y-m-d');
        $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
        $xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
        foreach ($sitemaps as $loc => $lastmod) {
          $xml .= '  <sitemap>' . "\n";
          $xml .= '    <loc>' . BASE_URL . $loc . '</loc>' . "\n";
          $xml .= '    <lastmod>' . $lastmod . '</lastmod>' . "\n";
          $xml .= '  </sitemap>' . "\n";
        }
        $xml .= '</sitemapindex>' . "\n";
        if (!empty($this->compress)) $xml = gzencode($xml, 9);
        $fp = fopen(BASE_URI . $index, 'wb');
        fwrite($fp, $xml);
        fclose($fp);
      }
      private function xml_tag ($xml, $tag, &$end='') {
        if (is_array($tag)) {
          $tags = array();
          while ($value = $this->xml_tag($xml, $tag[0], $end)) {
            $tags[] = $value;
            $xml = substr($xml, $end);
          }
          return $tags;
        }
        $pos = strpos($xml, "<{$tag}>");
        if ($pos === false) return false;
        $start = strpos($xml, '>', $pos) + 1;
        $length = strpos($xml, "</{$tag}>", $start) - $start;
        $end = strpos($xml, '>', $start + $length) + 1;
        return ($end !== false) ? substr($xml, $start, $length) : false;
      }
      public function ping_search_engines () {
        $sitemap = BASE_URL . 'sitemap-index.xml' . $this->compress;
        $engines = array();
        $engines['www.google.com'] = '/webmasters/tools/ping?sitemap=' . urlencode($sitemap);
        $engines['www.bing.com'] = '/webmaster/ping.aspx?siteMap=' . urlencode($sitemap);
        $engines['submissions.ask.com'] = '/ping?sitemap=' . urlencode($sitemap);
        foreach ($engines as $host => $path) {
          if ($fp = fsockopen($host, 80)) {
            $send = "HEAD $path HTTP/1.1\r\n";
            $send .= "HOST: $host\r\n";
            $send .= "CONNECTION: Close\r\n\r\n";
            fwrite($fp, $send);
            $http_response = fgets($fp, 128);
            fclose($fp);
            list($response, $code) = explode (' ', $http_response);
            if ($code != 200) trigger_error ("{$host} ping was unsuccessful.<br />Code: {$code}<br />Response: {$response}");
          }
        }
      }
      public function __destruct () {
        $this->save(); // assumed: body was lost in the post; flush remaining URLs on unset()
      }
    }
    // start part 2
    $sitemap = new Sitemap;
    if (get('pages')) {
      $result = mysql_query("SELECT uri FROM myuri"); // 20 pages
      while (list($url, $created) = $result->fetch_row()) {
        $sitemap->url($url, $created, 'yearly');
      }
    }
    if (get('posts')) {
      $result = mysql_query("SELECT uri FROM myuri"); // 70,000 posts
      while (list($url, $updated) = $result->fetch_row()) {
        $sitemap->url($url, $updated, 'monthly');
      }
    }
    unset ($sitemap);
    function get ($name) {
      return (isset($_GET['update']) && strpos($_GET['update'], $name) !== false) ? true : false;
    }

    I called the php file with

    Now I get the following error message

    Fatal error: Call to a member function fetch_row() on a non-object in ... on line 188

    The corresponding line is in this part:
    if (get('pages')) {
      $result = mysql_query("SELECT uri FROM myuri"); // 20 pages
      while (list($url, $created) = $result->fetch_row()) {
        $sitemap->url($url, $created, 'yearly');
      }
    }

    How do I correct this error? Or does anyone know of a script that does the same and works?

    This question has been answered by sh0ck23. See the first response.

    [ed. note: sh0ck23 last edited this post 10 years, 10 months ago.]
        Thanks Susan. Unfortunately, I cannot use a solution based on the MODX site_content table, because most of my pages are not stored there: MODX can only handle such a large table with difficulty. I went for a solution that fires SQL SELECTs on an OnPageNotFound (error_page) event.

        That is why I needed a non-MODX solution.

        Luckily, I managed to get the above script running by changing the code as follows:
        if (get('pages')) {
          $result = mysql_query("SELECT uri FROM myuri"); // 20 pages
          while (list($url, $created) = mysql_fetch_row($result)) {
            $sitemap->url($url, $created, 'yearly');
          }
        }

        The script is awesome; I can recommend it to anyone dealing with large-scale websites.
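
        For anyone adapting this: mysql_query() returns a resource rather than an object, so rows are read with the procedural mysql_fetch_row($result), and the whole mysql_* extension was removed in PHP 7, where mysqli or PDO take its place. Below is a rough sketch, independent of any database, of streaming URIs into files of a fixed size so that millions of rows never sit in memory at once. The names writeSitemaps and demoUris are hypothetical, the generator stands in for the real result set, and the chunk size of 2 is only for the demo.

```php
<?php
// Hypothetical helper (not part of the script above): stream URIs into
// sitemap files of at most $perFile entries each and return the file names.
function writeSitemaps(iterable $uris, string $dir, string $baseUrl, int $perFile = 50000): array
{
    $files = [];
    $chunk = [];
    // Write the buffered <url> entries to the next numbered file.
    $flush = function () use (&$files, &$chunk, $dir) {
        if (!$chunk) {
            return;
        }
        $name = sprintf('sitemap-pages-%d.xml', count($files) + 1);
        $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
              . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n"
              . implode('', $chunk)
              . '</urlset>' . "\n";
        file_put_contents($dir . '/' . $name, $xml);
        $files[] = $name;
        $chunk   = [];
    };
    foreach ($uris as $uri) {
        $loc = rtrim($baseUrl, '/') . '/' . ltrim($uri, '/');
        $loc = htmlspecialchars($loc, ENT_QUOTES | ENT_XML1); // XML-escape the URL
        $chunk[] = "  <url><loc>{$loc}</loc></url>\n";
        if (count($chunk) === $perFile) {
            $flush();
        }
    }
    $flush(); // write the final, partially filled file
    return $files;
}

// Stand-in for the database rows. With mysqli this could instead be a
// generator that yields $row[0] for each row of $result->fetch_row().
function demoUris(int $n): Generator
{
    for ($i = 1; $i <= $n; $i++) {
        yield "/page-{$i}.html";
    }
}

$dir = sys_get_temp_dir() . '/sitemap-demo-' . uniqid();
mkdir($dir);
$files = writeSitemaps(demoUris(5), $dir, 'http://example.com/', 2);
print_r($files); // three files: sitemap-pages-1.xml through sitemap-pages-3.xml
```

        Only one chunk is held in memory at a time, which is the same idea as the script's 50,000-URL buffer.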