Python Web Crawler

A Python reimplementation of the PHP Web Crawler: cleaner code, more efficient and faster.

  • Language: Python
  • Released: Mar 28, 2011
  • Last Update: Jun 25, 2011

For customized crawling and scraping services check out Crawley Cloud

Python Web Crawler is a program that searches for links on the web and saves them in a MySQL database.


  • Multi-process crawling to improve speed
  • MySQL database to store the links
  • Easy to extend
  • Clean, readable, Pythonic code
  • URL validation via regular expressions
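The crawler's own regular expression is not shown here, so the following is only a minimal sketch of what validating a URL with a regular expression can look like. The pattern and the name is_valid_url are assumptions made for illustration. The sketch is written for Python 3 and accepts only absolute http:// or https:// URLs.

```python
import re

# Hypothetical pattern, not the crawler's actual expression.
URL_PATTERN = re.compile(
    r"https?://"                           # scheme: http or https
    r"(?:[A-Za-z0-9-]+\.)*[A-Za-z0-9-]+"   # host: one or more dot-separated labels
    r"(?::\d{1,5})?"                       # optional port
    r"(?:/\S*)?",                          # optional path and query, no whitespace
    re.IGNORECASE,
)


def is_valid_url(url):
    """Return True if the whole string looks like an absolute http(s) URL."""
    return URL_PATTERN.fullmatch(url) is not None
```

A pattern this simple is deliberately strict: it rejects relative links and other schemes such as ftp://, which suits a crawler that only follows complete web URLs.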

Here's more information about it:

Here's the original PHP web crawler this is based on.


Getting Started

Tested on Ubuntu 10.10.


apt-get install python-mysqldb


To configure the crawler, edit the config.ini file. For example:

[connection]
host = localhost
user = root
pass = root
db = testDB

[params]
start_urls =,,
max_depth = 1
log = 1

The connection section holds the usual connection settings for a MySQL database.
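As a rough illustration of how these settings could be read, here is a sketch using Python 3's standard configparser module (the crawler itself dates from the Python 2 era, where the module was named ConfigParser). The section name [connection] is assumed from the wording above, and load_connection_settings is a hypothetical helper, not part of the crawler.

```python
import configparser

# Sample text standing in for config.ini; the [connection] header is an assumption.
SAMPLE_CONFIG = """
[connection]
host = localhost
user = root
pass = root
db = testDB
"""


def load_connection_settings(config_text):
    """Parse the connection section into a plain dict of settings."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["connection"]
    return {
        "host": section["host"],
        "user": section["user"],
        "passwd": section["pass"],  # renamed: MySQLdb's connect() calls it passwd
        "db": section["db"],
    }


settings = load_connection_settings(SAMPLE_CONFIG)
```

With the MySQLdb package installed, a dict of this shape could be passed on as MySQLdb.connect(**settings). To read the real file, parser.read("config.ini") would replace read_string.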

The params section contains:

  • START_URLS: A comma-separated list of urls at which to start the crawl. Each must be a complete url, including http:// or https:// as applicable.

  • MAX_DEPTH: The depth to crawl. 0 crawls only the start urls. 1 crawls the start urls and every url found in them. 2 also crawls every url found in those pages, and so on. Warning: a depth of 3 or greater can take hours, days, months or years!

  • LOG: Indicates whether the application prints the crawled urls to the console.
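The depth rule described above can be sketched as a breadth-first traversal. This is not the crawler's actual code: parse_start_urls, crawl and fetch_links are hypothetical names, and a small in-memory link graph stands in for real pages so the example runs without network access.

```python
def parse_start_urls(raw):
    """Split the comma-separated start_urls value into a clean list."""
    return [url.strip() for url in raw.split(",") if url.strip()]


def crawl(start_urls, max_depth, fetch_links, log=False):
    """Visit urls breadth-first down to max_depth; return them in visit order."""
    seen = set()
    visited = []
    frontier = list(start_urls)
    for depth in range(max_depth + 1):
        next_frontier = []
        for url in frontier:
            if url in seen:
                continue  # never crawl the same url twice
            seen.add(url)
            visited.append(url)
            if log:
                print("[depth %d] %s" % (depth, url))
            if depth < max_depth:
                # Only follow links while there is depth left to crawl.
                next_frontier.extend(fetch_links(url))
        frontier = next_frontier
    return visited


# Made-up link graph used in place of real HTTP requests.
FAKE_WEB = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://d.example/"],
    "http://c.example/": ["http://a.example/"],
    "http://d.example/": [],
}


def fake_fetch_links(url):
    """Return the links 'found' on a page of the fake web."""
    return FAKE_WEB.get(url, [])
```

Calling crawl(["http://a.example/"], 0, fake_fetch_links) visits only the start url, while a depth of 1 also visits the two pages it links to. Each extra level multiplies the number of pages by the average number of links per page, which is why depths of 3 or more grow so quickly on the real web.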


~$ python