PHP Crawler Script

This PHP crawler script follows the links on a website you specify and lists the URLs it finds.

Features

  • Crawls the links found on a given website.
  • Follows links recursively, up to a set depth, and prints every URL it finds.
  • Only the start URL of the website needs to be specified.
  • Easy to customize and use.

Downloads


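<?php
error_reporting(0);
//set_time_limit(0);

// Turn an href found on page $base into an absolute URL.
// Returns null for links that cannot be crawled (fragments, mailto:, javascript:, ...).
function resolve_url($href, $base) {
    $href = trim($href);
    if (preg_match('/^https?:\/\//i', $href)) {
        return $href; // already absolute
    }
    if ($href === '' || $href[0] === '#' || preg_match('/^[a-z][a-z0-9+.\-]*:/i', $href)) {
        return null;
    }
    $parts = parse_url($base);
    if (!isset($parts['scheme']) || !isset($parts['host'])) {
        return null;
    }
    if (0 === strpos($href, '//')) {
        return $parts['scheme'] . ':' . $href; // protocol-relative link
    }
    $root = $parts['scheme'] . '://';
    if (isset($parts['user']) && isset($parts['pass'])) {
        $root .= $parts['user'] . ':' . $parts['pass'] . '@';
    }
    $root .= $parts['host'];
    if (isset($parts['port'])) {
        $root .= ':' . $parts['port'];
    }
    if (0 === strpos($href, '/')) {
        return $root . $href; // root-relative link
    }
    // Path-relative link: append to the directory of the base page.
    $dir = isset($parts['path']) ? preg_replace('/[^\/]*$/', '', $parts['path']) : '/';
    return $root . $dir . $href;
}

function crawl_page_info($url, $depth = 5) {
    static $seen = array(); // shared across recursive calls so pages are fetched once
    if ($depth == 0 || isset($seen[$url])) {
        return;
    }
    $seen[$url] = true;
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $result = curl_exec($ch);
    curl_close($ch);
    if (!$result) {
        return;
    }
    // Keep only the <a> tags, then pull the href out of each one.
    $stripped_file = strip_tags($result, "<a>");
    preg_match_all("/<a[\s]+[^>]*?href[\s]?=[\s\"\']+"."(.*?)[\"\']+.*?>"."([^<]+|.*?)?<\/a>/", $stripped_file, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        $href = resolve_url($match[1], $url);
        if ($href === null) {
            continue;
        }
        crawl_page_info($href, $depth - 1);
        echo htmlspecialchars($href);
        echo "<br>";
        echo "<br>";
    }
}

$check_url = "http://statsmonkey.com/";
if (preg_match('/^(http|https):\\/\\/[a-z0-9]+([\\-\\.]{1}[a-z0-9]+)*\\.[a-z]{2,5}'.'((:[0-9]{1,5})?\\/.*)?$/i', $check_url)) {
    crawl_page_info($check_url, 3);
} else {
    echo("$check_url is not a valid URL");
}
?>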
  • Release Date - 31-07-2015
  • A version without the ©copyright link is available for $10.
  • For customization of this script, or for any other script development, mail support@hscripts.com
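
The script above checks $check_url against a regular expression before crawling. As a rough alternative, PHP's built-in filter_var() with FILTER_VALIDATE_URL can do a similar check. The sketch below is only an illustration: is_crawlable_url() is a hypothetical helper name that is not part of the original script, and it accepts some URLs the original pattern rejects (for example, hosts without a top-level domain).

```php
<?php
// Hypothetical helper, not part of the original script.
// Accepts only absolute http/https URLs that include a host.
function is_crawlable_url($url) {
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false; // not a syntactically valid absolute URL
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    $host = parse_url($url, PHP_URL_HOST);
    return in_array($scheme, array('http', 'https'), true) && !empty($host);
}
```

In the script, the preg_match() test around crawl_page_info($check_url, 3) could be swapped for is_crawlable_url($check_url) if this looser check is acceptable.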

Usage

  • Copy the code above into a PHP file on a server with the cURL extension enabled.
  • Call crawl_page_info($url, $depth) to crawl the given URL; $depth limits how many levels of links are followed.
  • Set $check_url to the website whose links you want to list.
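
The crawl described above comes down to two steps: pull the links out of a page, then follow each one until the depth runs out. A minimal sketch of those steps follows. It is not the script's own code: extract_links() and crawl_links() are hypothetical names, the links are read with PHP's DOMDocument rather than the script's regular expression, and the page fetcher is passed in as a callable so the logic can be tried without network access.

```php
<?php
// Hypothetical helper: returns the href of every <a> element in $html.
function extract_links($html) {
    $links = array();
    if (trim($html) === '') {
        return $links; // nothing to parse
    }
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on messy HTML
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return $links;
}

// Hypothetical depth-limited crawl. $fetch is any callable that takes a URL
// and returns the page HTML (or false on failure). $seen collects the URLs
// that were visited, so each page is fetched at most once.
function crawl_links($url, $depth, $fetch, array &$seen) {
    if ($depth <= 0 || isset($seen[$url])) {
        return;
    }
    $seen[$url] = true;
    $html = call_user_func($fetch, $url);
    if (!is_string($html)) {
        return;
    }
    foreach (extract_links($html) as $href) {
        // Only absolute http(s) links are followed in this sketch.
        if (preg_match('/^https?:\/\//i', $href)) {
            crawl_links($href, $depth - 1, $fetch, $seen);
        }
    }
}
```

For a real crawl, $fetch would wrap the same curl_init(), curl_setopt() and curl_exec() calls the script uses. Passing $seen by reference, instead of keeping it inside the function, lets the caller inspect the visited URLs afterwards and start a fresh crawl with an empty array.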
