PHP Web Crawler

Released 7 years ago, last updated 7 years ago

A simple, fast crawler that collects URLs from HTML pages.

For customized crawling and scraping services, check out Crawley Cloud.

PHP Web Crawler is software that searches the web for links. It stores the links and some extra data in a database and displays them as HTML output.
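As a rough illustration of the "shows them as HTML output" step, the sketch below renders stored rows as an HTML table. It is not the crawler's actual code: the function name renderLinksTable and the row keys (source, destination, anchor_text) are assumptions chosen for the example, and the rows are expected to have been read from the database already.

```php
<?php
// Hypothetical sketch: renders crawl rows as an HTML table.
// The function name and the row keys are assumptions, not the crawler's real API.
function renderLinksTable(array $rows)
{
    $html = "<table>\n<tr><th>Source</th><th>Destination</th><th>Anchor text</th></tr>\n";
    foreach ($rows as $row) {
        // Escape every value so crawled text cannot inject markup into the page.
        $html .= sprintf(
            "<tr><td>%s</td><td>%s</td><td>%s</td></tr>\n",
            htmlspecialchars($row['source'], ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($row['destination'], ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($row['anchor_text'], ENT_QUOTES, 'UTF-8')
        );
    }
    return $html . "</table>\n";
}
```

Escaping matters here because anchor text comes from arbitrary pages and may itself contain HTML.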

Features:

  • The crawler can be run as multiple instances.
  • It can be run by a cron job.
  • Crawl results are saved in a MySQL database. The crawler creates the table "urls" to store them.
  • For each URL, it saves the source URL, the destination URL, and the anchor text.
  • It validates URLs with a regular expression and skips links to static content within the site, including unnecessary media files. Even so, I can't guarantee that the crawler avoids all media files; that would be more complex to validate.
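The last two features can be sketched roughly as follows. This is a minimal illustration, not the crawler's real implementation: the names isCrawlableUrl and extractLinks, the array keys, and the list of skipped file extensions are all assumptions. The sketch only accepts absolute http(s) links, so relative links are ignored, and it needs PHP's DOM extension.

```php
<?php
// Assumed skip list: file extensions treated as static or media content.
const SKIP_EXTENSIONS = '/\.(?:jpe?g|png|gif|ico|css|js|pdf|zip|mp3|mp4|avi)$/i';

// Hypothetical validator: true for absolute http(s) URLs whose path
// does not end in one of the skipped extensions.
function isCrawlableUrl($url)
{
    if (!preg_match('~^https?://[^\s/?#]+~i', $url)) {
        return false;
    }
    $path = parse_url($url, PHP_URL_PATH);
    if (is_string($path) && preg_match(SKIP_EXTENSIONS, $path)) {
        return false;
    }
    return true;
}

// Hypothetical extractor: returns one row per accepted link with the
// source URL, the destination URL, and the anchor text.
function extractLinks($html, $sourceUrl)
{
    if (trim($html) === '') {
        return array();
    }
    $doc = new DOMDocument();
    // Real-world HTML is often malformed; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array();
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if (isCrawlableUrl($href)) {
            $links[] = array(
                'source' => $sourceUrl,
                'destination' => $href,
                'anchor_text' => trim($anchor->textContent),
            );
        }
    }
    return $links;
}
```

Matching on the file extension is only a heuristic, which fits the caveat above: a media file served from a URL without a recognizable extension would still pass this filter.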

Here's a tutorial about PHP Web Crawler

There's also a Python Web Crawler available.

Pricing

$9.99

Personal License

  • Perpetual license

  • 1 site, unlimited servers

  • No distribution (hosted use only)

  • Commercial use

PHP Web Crawler

See the following tutorial to get started with the PHP web crawler:

http://codescience.wordpress.com/2011/02/15/php-web-crawler/

Installation / UnPacking

PHP Web Crawler can run from any directory. However, if you want to use the web UI, place it in a directory that is served by the Apache web server.

Dependencies:

  • Apache2
  • php5
  • php5-mysql 

Warning: Ensure that Apache has permission to write to the config.ini file. Otherwise, on a Unix-like system, you can run ~$ chmod 777 config.ini (this grants all permissions to every user).
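Before loosening permissions, it can help to check what is actually wrong. The sketch below is a hypothetical preflight check, not part of the crawler: the function names are made up for illustration, and which MySQL extension the crawler uses is an assumption (the php5-mysql package provides mysql, mysqli, and pdo_mysql). Run it through the web server so the check applies to the Apache user.

```php
<?php
// Hypothetical check: returns null when the config file exists and is
// writable by the current user, otherwise a short description of the problem.
function configProblem($configPath)
{
    if (!file_exists($configPath)) {
        return "Config file not found: $configPath";
    }
    if (!is_writable($configPath)) {
        return "Config file is not writable by the current user: $configPath";
    }
    return null;
}

// Hypothetical check: true when at least one MySQL extension is loaded.
// Which of these the crawler relies on is an assumption.
function mysqlExtensionLoaded()
{
    return extension_loaded('mysqli')
        || extension_loaded('pdo_mysql')
        || extension_loaded('mysql');
}
```

A page that prints the result of configProblem('config.ini') when requested through Apache would show whether a permission change is needed at all.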

Related Projects

Recently I started a huge project: a crawling and scraping framework written in Python.

It's fully open source and was released under the GPL v3 license.

The repository is on GitHub.

There's also a project website.

Check it out for free!

  • N Neo 9 months ago
    Do you still support this product?
  • PR Prasanna Rao 1 year ago
    I want to build a search engine. How do I start with the crawler?
  • KC Kyle Choi 4 years ago
    You mentioned the following: "software that searches for links in the web. It stores the links and some extra data in a database and shows them as HTML outpu". What is the "some extra data"? Keywords and description of the page? Please let me know.