scrapy-proxynova

Use Scrapy with a list of proxies generated from proxynova.com.

The first run will generate the list of proxies from http://proxynova.com and store it in the cache.

It checks each proxy individually and removes the ones that time out or refuse the connection.
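The generate, check, and cache steps described above can be sketched roughly as follows. This is a minimal illustration and not the code in proxies.py: the function names (proxy_works, filter_working, load_or_build), the test URL, the timeout value, and the one-proxy-per-line cache format are all assumptions made for the example.

```python
import http.client
import os
import urllib.request


def proxy_works(proxy, test_url="http://example.com", timeout=5.0):
    """Return True if test_url can be fetched through proxy within timeout.

    test_url and timeout are illustrative defaults, not project settings.
    """
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout):
            return True
    except (OSError, http.client.HTTPException):
        # Timeouts, refused connections, and HTTP errors all count as failure.
        return False


def filter_working(proxies, check=proxy_works):
    """Keep only the proxies for which check(proxy) is true."""
    return [p for p in proxies if check(p)]


def load_or_build(cache_file, fetch, check=proxy_works):
    """Read proxies from cache_file, or build and cache the list on first run.

    fetch is a hypothetical callable that returns candidate proxy URLs.
    """
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            return [line.strip() for line in f if line.strip()]
    proxies = filter_working(fetch(), check)
    with open(cache_file, "w") as f:
        f.writelines(p + "\n" for p in proxies)
    return proxies
```

Passing the check as a parameter keeps the filtering logic testable without network access; deleting the cache file forces the list to be rebuilt on the next call.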

Example:

./run_example.sh

To regenerate the proxy list, run: python proxies.py

In settings.py add the following line: DOWNLOADER_MIDDLEWARES = { 'scrapy_proxynova.middleware.HttpProxyMiddleware': 543 }

Options

Set these options in settings.py.

PROXY_SERVER_LIST_CACHE_FILE: file in which the proxy list is cached. Default: proxies.txt.
PROXY_BYPASS_PERCENT: probability that a request bypasses the proxy and uses a direct connection.
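Put together, a settings.py fragment might look like the sketch below. The middleware path, its priority, and the cache file default come from this README; the PROXY_BYPASS_PERCENT value is only a placeholder, and reading it on a 0-100 scale is an assumption based on the option's name, so check the middleware source before relying on it.

```python
# settings.py (illustrative fragment)

# Enable the proxy middleware, as described above.
DOWNLOADER_MIDDLEWARES = {
    'scrapy_proxynova.middleware.HttpProxyMiddleware': 543,
}

# File in which the proxy list is cached (this is the documented default).
PROXY_SERVER_LIST_CACHE_FILE = 'proxies.txt'

# Chance that a request skips the proxy and connects directly.
# Assumption: the value is a percentage (0-100); the scale is not documented here.
PROXY_BYPASS_PERCENT = 10
```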
