How to scrape text from another site and have it appear on mine?

8 Posts

thumbsucker Reply #1, 18 years, 2 months ago

Hi,

I’d like to scrape some text off a government web site (which is legal) and display it in a page on my site. Their data is dynamic so it would have to be some sort of real-time scrape.

I was curious if you guys would know the easiest way to get this done?

1 Posts

Zombie Reply #2, 18 years, 2 months ago

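Here, use this snippet to grab a URL.
# Usage:
# [ [Grab?wurl=full_URL_no_quotes] ]

# Fetch the raw contents of the given URL (needs allow_url_fopen enabled).
function GetPageData($pageurl) { return file_get_contents($pageurl); }

# Wrapper around GetPageData(); edit $pagecontent here before returning it.
function GrabPage($pageurl) {
    $pagecontent = GetPageData($pageurl);
    return $pagecontent;
}

# $wurl comes from the snippet call shown under Usage.
$html = GrabPage($wurl);
return $html;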

3,250 Posts

MARKSVIRTUALDESK Reply #3, 18 years, 2 months ago

@chanh: Hey! That’s my snippet! ;)

Thanks for digging that up, Chanh.

@thumbsucker: If you don’t plan on editing the data that comes in (that’s why it’s in functions at the moment), you could just use return(file_get_contents($pageurl)); as the snippet, with $pageurl being the URL of the page you want, in double quotes.
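
If you do want to edit the data before showing it, here is a rough sketch of pulling the text out of one element instead of returning the whole page. It assumes PHP's DOM extension is available. ExtractTextById, the "notice" id, and the sample markup are made up for illustration and are not part of the original snippet.

```php
<?php
// Hypothetical helper: return the trimmed text of the element with the given
// id, or null if the input is unusable or no such element exists.
// $id is assumed to be a trusted literal (it is placed directly in the XPath).
function ExtractTextById($html, $id) {
    // file_get_contents() returns false on failure; loadHTML() rejects ''.
    if (!is_string($html) || $html === '') {
        return null;
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world markup is rarely clean.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Stand-in markup; in the snippet this would be file_get_contents($pageurl).
$sample = '<html><body><div id="notice"> Office closed Monday </div></body></html>';
echo htmlspecialchars(ExtractTextById($sample, 'notice'));
```

Inside a snippet you would return the extracted string rather than echo it.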

1 Posts

Zombie Reply #4, 18 years, 2 months ago

I found it in my sandbox and shared it, but wondered where I got it!

Mark, thanks for claiming it!

356 Posts

bugsmi0 Reply #5, 18 years, 1 month ago

I tried to use the Grab snippet to pull in phplist. It pulled in the subscribe page, but when you click Subscribe it goes back to the default home document instead of continuing with the subscription.
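
One possible cause, not confirmed here: once the fetched markup is embedded in a local page, relative links and form actions resolve against the local site instead of the remote one. Below is a minimal sketch that rewrites them to absolute URLs. AbsolutizeUrls and the example.com base URL are hypothetical. It only handles double-quoted attributes, does not normalize "../" segments, and leaves empty attributes untouched.

```php
<?php
// Hypothetical helper: make href, src and action attributes absolute so they
// keep pointing at the remote site after the markup is embedded locally.
// $base is the remote directory URL, e.g. "http://example.com/lists/".
function AbsolutizeUrls($html, $base) {
    // Scheme, host and optional port of $base, used for root-relative URLs.
    $parts = parse_url($base);
    $root = $parts['scheme'] . '://' . $parts['host']
          . (isset($parts['port']) ? ':' . $parts['port'] : '');

    return preg_replace_callback(
        '/\b(href|src|action)\s*=\s*"([^"]*)"/i',
        function ($m) use ($base, $root) {
            $url = $m[2];
            // Leave empty values, URLs with a scheme, protocol-relative
            // URLs and in-page anchors unchanged.
            if ($url === '' || preg_match('~^([a-z][a-z0-9+.-]*:|//|#)~i', $url)) {
                return $m[0];
            }
            // Root-relative ("/path"): prepend scheme and host only.
            if ($url[0] === '/') {
                return $m[1] . '="' . $root . $url . '"';
            }
            // Plain relative ("page.php", "?p=x"): prepend the base directory.
            return $m[1] . '="' . rtrim($base, '/') . '/' . $url . '"';
        },
        $html
    );
}

// Stand-in markup; in the snippet this would be the fetched page.
$page = '<form action="?p=subscribe"><a href="/help">Help</a></form>';
echo AbsolutizeUrls($page, 'http://example.com/lists/');
```

Even with absolute URLs, submitting the form takes the visitor to the remote site, so this does not keep them inside the local page.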

It would be cool to have an effective function that can pull in things like phplist, forums, etc. without an IFRAME; it seems like a server-side include sort of function.

One challenge in modifying another site to match the look of the MODx side is that you’ll have to manually create text menu links, since the MODx menus are usually pulled in via a snippet or two.

SMF Bookmark Mod - check it out
http://mods.simplemachines.org/index.php?mod=350