8000 GitHub - yi-refs/collector-http: Norconex HTTP Collector is a flexible web crawler for collecting, parsing, and manipulating data from the Internet (or Intranet) to various data repositories such as search engines.
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Norconex HTTP Collector is a flexible web crawler for collecting, parsing, and manipulating data from the Internet (or Intranet) to various data repositories such as search engines.

License

Notifications You must be signed in to change notification settings

yi-refs/collector-http

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Norconex HTTP Collector

Norconex HTTP Collector

Norconex HTTP Collector is a full-featured web crawler (or spider) that can manipulate and store collected data into a repositoriy of your choice (e.g. a search engine). It very flexible, powerful, easy to extend, and portable. Can be used command-line with file-based configuration on any OS, or can be embedded into Java applications using well documented APIs.

Visit the web site for binary downloads and documentation:

About

Norconex HTTP Collector is a flexible web crawler for collecting, parsing, and manipulating data from the Internet (or Intranet) to various data repositories such as search engines.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Java 97.7%
  • HTML 1.7%
  • Other 0.6%
0