    • 25819
    • 10 Posts
    My coworker asked a similar question before but got no replies. I hope posting in a different place will help.

    After exporting our site, all the links on the exported pages have strings appended, like "index.html?SN43ea93ba38ac8=f43463c9e913600d2...". I guess those are for the statistics, but it would be nice if they could be eliminated. Is there a flag to control that?

    If there is not, I can hack the code and replace the string with nothing. But I need to know exactly what the strings look like. So, can someone tell me the correct format for them? Something like "SN followed by a 46-digit string"?

    Thanks.
    • Unfortunately, the export site functionality has been (for the most part) untouched since our fork from Etomite, and I don’t think anyone has really taken ownership of this code yet. I had planned to do some work on it, but alas, other priorities have taken my attention away from that task. I’m not even sure why the strings would be there when exported. But I am aware there are many other issues with the export process, such as no export of external references and no respect for friendly URL settings.

      If you do hack the export to make it more usable for your purposes, and you feel it would be appropriate for all MODx users, we’d love for you to contribute back any progress you make to the project. Feel free to contact me directly or continue the discussion here on the forums if you need specific help beyond the issue with the strings being appended to the URL, and sorry I can’t be of more help on that issue.

      Cheers
        • 32241
        • 1,495 Posts
        ryanc, is that a static page that you’re trying to import, or a dynamic page?

        Thanks
          Wendy Novianto
          PT DJAMOER Technology Media
          Xituz Media
        • I’ve just run the export on both my local machine and my live server and didn’t get any strings appended to my filenames.

          So, I’ve had a cursory glance through the export_site.static.php (actions/static/ folder in the manager) and I can’t see anything that would append the logging cookie values to the filename.

          Is it possible that this is a server setting rather than MODx being the culprit? I know this probably doesn’t help much, but I don’t really have an answer.
            Garry Nutting
            Senior Developer
            MODX, LLC

            Email: [email protected]
            Twitter: @garryn
            Web: modx.com
            • 25819
            • 10 Posts
            Djamoer, I am talking about the static, exported page.

            We have installed MODx in three different places, all with the same problem. It’s true that no string is appended in the export PHP script, so I guess it’s hidden somewhere in the main script, for computing the statistics. It might be a settings issue, as garryn pointed out, since he didn’t have this problem.

            I actually wrote a Perl script to export the site (so I can run it from cron). Whether the page is exported by MODx or by Perl, the problem occurs. That’s to be expected, because I use the same mechanism in both cases.

            And as I mentioned, getting rid of the string is easy, as long as I know the format. Currently I filter out any string that starts with "SN", is 48 characters long, and follows a ? or & sign. It seems to work on my test pages, but I need to confirm that format. There are hundreds of pages on the production server.
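A filter like the one described could be sketched in PHP as follows. This is only a sketch under an assumed format ("SN" plus lowercase hex digits, "=", and a lowercase hex value), which matches the visible part of the sample link but is not confirmed anywhere in this thread. The function name stripSessionParam is made up for illustration.

```php
<?php
// Hypothetical helper: removes a session-style parameter such as
// "SN43ea93ba38ac8=f434..." from every URL in a block of HTML.
// Assumed format (not confirmed): "SN" + hex digits, "=", hex digits.
function stripSessionParam($html) {
    $param = 'SN[0-9a-f]+=[0-9a-f]+';
    // Parameter comes first and others follow: keep the "?", drop the parameter and its "&".
    $html = preg_replace('/\?' . $param . '&(?:amp;)?/', '?', $html);
    // Parameter is the only one, or not the first: drop it together with its "?" or "&".
    $html = preg_replace('/(?:\?|&amp;|&)' . $param . '/', '', $html);
    return $html;
}
```

The pattern deliberately does not fix the lengths, so it keeps working if the name or value turns out to be longer or shorter than on the test pages.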

              • 32241
              • 1,495 Posts
              Hi ryanc, I just found your friend’s post while I was getting ready to reply to this one.
              http://modxcms.com/forums/index.php/topic,2987.0.html

              Anyway, since I misunderstood you, I think I owe you a solution to make up for it.
              Let’s do this.

              Replace all the code in your export_site.static.action.php, in the manager/actions/static folder, with the following:
              <?php
              if(IN_MANAGER_MODE!="true") die("<b>INCLUDE_ORDERING_ERROR</b><br /><br />Please use the MODx Content Manager instead of accessing this file directly.");
              if(!$modx->hasPermission('edit_document')) {	
              	$e->setError(3);
              	$e->dumpError();	
              }
              
              // figure out the base of the server, so we know where to get the documents in order to export them
              $base = 'http://'.$_SERVER['SERVER_NAME'].str_replace("/manager/index.php", "", $_SERVER["PHP_SELF"]);
              
              // Parser function to parse document page into html string
              function fetchDocument($docid) {
              	global $modx; // $modx is not visible inside a function unless declared global
              	$url = $modx->config['site_url'].ltrim($modx->makeUrl($docid), '/');
              	$ch = curl_init();
              	curl_setopt($ch, CURLOPT_URL, $url);
              	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
              	// fetch the page once; curl_exec() returns the HTML as a string, or false on failure
              	$html = curl_exec($ch);
              	curl_close($ch);
              	return $html;
              }
              ?>
              
              <script>
              function reloadTree() {
              	// redirect to welcome
              	document.location.href = "index.php?r=1&a=7";
              }
              </script>
              
              <div class="subTitle">
              <span class="right"><img src="media/images/_tx_.gif" width="1" height="5"><br /><?php echo $_lang['export_site_html']; ?></span>
              </div>
              
              <div class="sectionHeader"><img src='media/images/misc/dot.gif' alt="." /> <?php echo $_lang['export_site_html']; ?></div><div class="sectionBody">
              <?php 
              
              if(!isset($_POST['export'])) {
              echo $_lang['export_site_message']; 
              ?>
              <fieldset style="padding:10px"><legend><?php echo $_lang['export_site']; ?></legend>
              <form action="index.php" method="post" name="exportFrm">
              <input type="hidden" name="export" value="export" />
              <input type="hidden" name="a" value="83" />
              <table border="0" cellspacing="0" cellpadding="2" width="400">
                <tr>
                  <td valign="top"><b><?php echo $_lang['export_site_cacheable']; ?></b></td>
                  <td width="30"> </td>
                  <td><input type="radio" name="includenoncache" value="1" checked="checked"><?php echo $_lang['yes'];?><br />
              		<input type="radio" name="includenoncache" value="0"><?php echo $_lang['no'];?></td>
                </tr>
                <tr>
                  <td><b><?php echo $_lang['export_site_prefix']; ?></b></td>
                  <td> </td>
                  <td><input type="text" name="prefix" value="<?php echo $friendly_url_prefix; ?>" /></td>
                </tr>
                <tr>
                  <td><b><?php echo $_lang['export_site_suffix']; ?></b></td>
                  <td> </td>
                  <td><input type="text" name="suffix" value="<?php echo $friendly_url_suffix; ?>" /></td>
                </tr>
                <tr>
                  <td valign="top"><b><?php echo $_lang['export_site_maxtime']; ?></b></td>
                  <td> </td>
                  <td><input type="text" name="maxtime" value="30" />
              		<br />
              		<small><?php echo $_lang['export_site_maxtime_message']; ?></small>
              	</td>
                </tr>
              </table>
              <p />
              <table cellpadding="0" cellspacing="0">
              	<td id="Button1" onclick="document.exportFrm.submit();"><img src="media/images/icons/save.gif" align="absmiddle"> <?php echo $_lang["export_site_start"]; ?></td>
              		<script>createButton(document.getElementById("Button1"));</script>
              </table>
              </form>
              </fieldset>
              
              <?php
              } else {
              
              	$maxtime = $_POST['maxtime'];
              	if(!is_numeric($maxtime)) {
              		$maxtime = 30;
              	}
              	
              	@set_time_limit($maxtime);
              	$mtime = microtime(); $mtime = explode(" ",$mtime); $mtime = $mtime[1] + $mtime[0]; $exportstart = $mtime; 
              	
              	$filepath = "../assets/export/";
              	if(!is_writable($filepath)) {
              		echo $_lang['export_site_target_unwritable'];
              		include "footer.inc.php";
              		exit;
              	}
              	
              	$prefix = $_POST['prefix'];
              	$suffix = $_POST['suffix'];
              
              	$noncache = $_POST['includenoncache']==1 ? "" : "AND $dbase.".$table_prefix."site_content.cacheable=1";
              	
              	$sql = "SELECT id, alias, pagetitle FROM $dbase.".$table_prefix."site_content WHERE $dbase.".$table_prefix."site_content.deleted=0 AND $dbase.".$table_prefix."site_content.published=1 AND $dbase.".$table_prefix."site_content.type='document' $noncache";
              	$rs = mysql_query($sql);
              	$limit = mysql_num_rows($rs);
              	printf($_lang['export_site_numberdocs'], $limit);
              	
              	for($i=0; $i<$limit; $i++) {
              		
              		$row=mysql_fetch_assoc($rs);
              		
              		$id = $row['id'];
              		printf($_lang['export_site_exporting_document'], $i, $limit, $row['pagetitle'], $id);
              		$alias = $row['alias'];
              		
              		$filename = !empty($alias) ? $prefix.$alias.$suffix : $prefix.$id.$suffix ;
              		
              		// get the page; fetchDocument() returns the HTML as a string (or false on failure), not a file handle
              		$buffer = fetchDocument($id);
              		if($buffer !== false) {
              		
              			// save it
              			$filename = "$filepath$filename";
              			$somecontent = $buffer;
              			
              			if(!$handle = fopen($filename, 'w')) {
              				 echo $_lang['export_site_failed']." Cannot open file ($filename)<br />";
              				 exit;
              			} else {
              				// Write $somecontent to our opened file.
              				if(fwrite($handle, $somecontent) === FALSE) {
              				   echo $_lang['export_site_failed']." Cannot write file.<br />";
              				   exit;
              				}
              				fclose($handle);
              			echo $_lang['export_site_success']."<br />";	
              			}
              		} else {
              			echo $_lang['export_site_failed']." Could not retrieve document.<br />";
              		}
              	}
              
              	$mtime = microtime(); $mtime = explode(" ",$mtime); $mtime = $mtime[1] + $mtime[0]; $exportend = $mtime; 
              	$totaltime = ($exportend - $exportstart);
              	printf ("<p />".$_lang['export_site_time'], round($totaltime, 3));
              ?>
              <p />
              <table cellpadding="0" cellspacing="0">
              	<td id="Button2" onclick="reloadTree();"><img src="media/images/icons/cancel.gif" align="absmiddle"> <?php echo $_lang["close"]; ?></td>
              		<script>createButton(document.getElementById("Button2"));</script>
              </table>
              <?php
              }
              ?>
              


              What I did is really simple: I added my own handy function to fetch a page, which I got from php.net and modified accordingly, and I use it to fetch each page instead of fopen.
              I hope it will get rid of the auto-appended query string on your links.

              PS: I haven’t tested it yet, so it would be awesome if you could test it for me. If it works, I might take over this code and make it a little more advanced and usable.

              Sincerely,
                Wendy Novianto
                • 25819
                • 10 Posts
                Thanks, Djamoer.

                The script did not work on my servers. I think it’s because none of them was compiled with cURL. But the idea is great. Actually, since I have to write a Perl script to export the MODx site, can you shed some light on how to do it in Perl?




                  • 32241
                  • 1,495 Posts
                  Quote from: ryanc at Feb 24, 2006, 05:36 PM

                  Thanks, Djamoer.

                  The script did not work on my servers. I think it’s because none of them was compiled with cURL. But the idea is great. Actually, since I have to write a Perl script to export the MODx site, can you shed some light on how to do it in Perl?

                  I thought so. That’s one of the weaknesses of this approach: not all servers support cURL.
                  ryanc, sorry I can’t help you with Perl. I code in Java, C, C#, and C++, and I have only been learning PHP for a few months.
                  I believe Perl has access to cURL functionality as well. The whole logic is very simple; all you need is this code right here:
                  <?php
                  $noncache = $_POST['includenoncache']==1 ? "" : "AND $dbase.".$table_prefix."site_content.cacheable=1";	
                  $sql = "SELECT id, alias, pagetitle FROM $dbase.".$table_prefix."site_content WHERE $dbase.".$table_prefix."site_content.deleted=0 AND $dbase.".$table_prefix."site_content.published=1 AND $dbase.".$table_prefix."site_content.type='document' $noncache";
                  ?>
                  


                  To access the documents, you can use addresses such as http://domain.com/1.html, http://domain.com/2.html, and so on, where the number is the document id. Don’t forget to define a base href in your header and turn on all the SEF options, so that the right links are generated on the page.

                  Now you need to find a function similar to cURL that will browse the site just like a regular visitor, fetch the whole output, and write it to an HTML file in the right folder with the right name. The files are named by alias, and you need to define the prefix and suffix for them as well.
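The steps above can be sketched in PHP roughly like this. It is a minimal sketch, not the MODx implementation: the function names and the $fetch parameter are hypothetical, and the URL pattern (base URL + id + ".html") is simply taken from the example addresses in this post.

```php
<?php
// Hypothetical sketch of the export loop described above; all names are illustrative.

// Builds the output filename: the alias if the document has one, otherwise its id.
function exportFilename(array $row, $prefix, $suffix) {
    $name = !empty($row['alias']) ? $row['alias'] : $row['id'];
    return $prefix . $name . $suffix;
}

// Fetches each document by id and writes it into $targetDir.
// $fetch is any callable that takes a URL and returns the page HTML, or false on failure.
// Returns the ids that could not be fetched or written.
function exportDocuments(array $rows, $baseUrl, $targetDir, $prefix, $suffix, $fetch) {
    $failed = array();
    foreach ($rows as $row) {
        // Assumed URL pattern from the post: http://domain.com/<id>.html
        $url = rtrim($baseUrl, '/') . '/' . $row['id'] . '.html';
        $html = call_user_func($fetch, $url);
        $path = rtrim($targetDir, '/') . '/' . exportFilename($row, $prefix, $suffix);
        if ($html === false || file_put_contents($path, $html) === false) {
            $failed[] = $row['id'];
        }
    }
    return $failed;
}
```

On a server without cURL, passing 'file_get_contents' as $fetch would be one option, provided allow_url_fopen is enabled. $rows stands for the result of the SELECT query shown earlier.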

                  PS: My assumptions might be wrong, but hopefully this gives you a better picture of how to start coding your Perl script.
                    Wendy Novianto
                    • 25819
                    • 10 Posts
                    Thanks, Djamoer/Wendy. I got the Perl scripts working, but the links still have the parameters. I use an LWP object, but I don’t know if there is anything equivalent to cURL. It’s Google time again...

                    • Just found this on a google search: http://drupal.org/node/21170

                      I know it’s Drupal, but it does seem to be the same problem you’re having. You could try adding the line mentioned in that post at the start of the export script in MODx and see if that helps (not sure where or how you’d add it for Perl):

                      ini_set('url_rewriter.tags', '');
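For context, here is a slightly fuller sketch of what that line would be doing. It rests on an assumption nobody in this thread has confirmed: that the "SN...=..." suffix is PHP's transparent session id, which PHP can append to links when the client (here, the export script) sends no session cookie.

```php
<?php
// Assumption (unconfirmed): the "SN...=..." suffix is PHP's transparent session id.
// Both settings need to be applied before the session is started.

// Ask PHP not to append the session id to links. Depending on the PHP version,
// this setting may only be changeable in php.ini or .htaccess, not at runtime.
ini_set('session.use_trans_sid', '0');

// The line from the Drupal post: with an empty tag list, no HTML tags get rewritten.
ini_set('url_rewriter.tags', '');
```

Since this changes what the PHP side generates, it would apply to the MODx site itself rather than to the Perl script that fetches the pages.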


                      Hope that helps, Garry