ZSPIDER

A distributed spider (web crawler) system.

Components

  • dispatcher
    dispatch center: automatically detects work to be done and dispatches it.
  • crawler
    crawler daemon: processes the crawl tasks.
  • web
    a web site: used to manage the system.
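
The division of labor between the dispatcher and the crawler daemons can be sketched roughly as below. This is only an illustration of the general pattern, not ZSPIDER's actual code: an in-process `queue.Queue` stands in for the message broker, and the names `dispatcher`, `crawler`, `run`, and the `fetch` callback are all hypothetical.

```python
import queue
import threading


def dispatcher(tasks, urls, n_crawlers):
    """Put one crawl task per URL on the queue, then one stop marker per crawler."""
    for url in urls:
        tasks.put(url)
    for _ in range(n_crawlers):
        tasks.put(None)  # None tells a crawler to shut down


def crawler(tasks, results, fetch):
    """Take tasks off the queue until a stop marker arrives."""
    while True:
        url = tasks.get()
        if url is None:
            break
        results.put((url, fetch(url)))


def run(urls, n_crawlers=2, fetch=len):
    """Start the crawlers, dispatch the tasks, and collect results by URL.

    `fetch` is a placeholder for real page fetching; the default just
    returns the length of the URL string so the sketch needs no network.
    """
    tasks, results = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=crawler, args=(tasks, results, fetch))
        for _ in range(n_crawlers)
    ]
    for worker in workers:
        worker.start()
    dispatcher(tasks, urls, n_crawlers)
    for worker in workers:
        worker.join()
    collected = {}
    while not results.empty():
        url, result = results.get()
        collected[url] = result
    return collected
```

In a real deployment the queue would be a broker shared across machines, and the crawlers would run as separate long-lived processes rather than threads.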

Resource Dependencies

RabbitMQ, MongoDB, memcached

Notice

Docs are being written, but slowly.

The system is ready for use, but several resources must be prepared and configured first.

Pay attention to the source files with conf in their filenames, mainly: conf.py, crawl_conf.py, dispatcher_conf.py, web_conf.py
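
As a rough idea of what settings for the three resources might look like, here is a minimal sketch. The variable names and structure are assumptions made for illustration and are not taken from ZSPIDER's conf.py; only the port numbers are real, being the standard defaults for each service.

```python
# Hypothetical settings layout; check the real conf files for the actual names.
RABBITMQ = {"host": "localhost", "port": 5672}    # default AMQP port
MONGODB = {"host": "localhost", "port": 27017}    # default MongoDB port
MEMCACHED = {"host": "localhost", "port": 11211}  # default memcached port


def endpoints():
    """Return each resource's address as a "host:port" string."""
    settings = {
        "rabbitmq": RABBITMQ,
        "mongodb": MONGODB,
        "memcached": MEMCACHED,
    }
    return {
        name: "%s:%d" % (conf["host"], conf["port"])
        for name, conf in settings.items()
    }
```

Printing `endpoints()` before starting the components is a quick way to confirm which hosts and ports a given machine is configured to use.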

The web user feature isn’t finished yet; see www/handlers/__init__.py