Grabbing content (a.k.a. page scraping) from a website
As an alternative to using an XML feed (for example, if a website doesn't offer any feeds), you can use the following method to load a website into PHP and then grab certain content:
// This snippet uses the PHP Simple HTML DOM Parser library,
// which provides the file_get_html() function
require_once('simple_html_dom.php');

// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');

// Find all images and print their src attributes
foreach ($html->find('img') as $element) {
    echo $element->src . '<br>';
}

// Find all links and print their href attributes
foreach ($html->find('a') as $element) {
    echo $element->href . '<br>';
}
Note that this is not my original code; I sourced it through Google searches. It's definitely handy, though, so I wanted to share it with you all.
Thank you. To extend this: I see many people use regular expressions to scrape content from websites, when XPath is really the way to go, especially since PHP's built-in DOM extension provides a DOMXPath class alongside DOMDocument.
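For anyone curious what that looks like, here's a minimal sketch of the same scrape done with DOMDocument and DOMXPath instead of a third-party library. It assumes allow_url_fopen is enabled so loadHTMLFile() can fetch the URL; the Google URL is just the same example as above:

// Suppress warnings from malformed HTML, which is common on real pages
libxml_use_internal_errors(true);

// Load the page into the built-in DOM parser
$doc = new DOMDocument();
$doc->loadHTMLFile('http://www.google.com/');
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Find all image src attributes
foreach ($xpath->query('//img/@src') as $src) {
    echo $src->nodeValue . '<br>';
}

// Find all link href attributes
foreach ($xpath->query('//a/@href') as $href) {
    echo $href->nodeValue . '<br>';
}

The nice part is that XPath expressions like //a/@href pull out exactly the nodes you want in one query, which stays readable as your scraping rules get more specific, whereas the equivalent regular expressions tend to break on minor markup changes.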