Folcky/reuters: python web scraper for http://feeds.reuters.com/reuters/topNews

Reuters scraper

Requirements

  • Docker version: 18.06.1-ce
  • HDD free space: XGB
  • Internet connection

Project files

File                        Description
run.sh                      Builds the Docker Compose file and starts interactive mode
Dockerfile                  Dockerfile for an Ubuntu server with PostgreSQL 10 and Python 2.7
docker-compose.yml          Docker Compose file for the app
scripts                     Directory with scripts that work with Reuters, PostgreSQL, and the data
scripts/create_schema.sql   SQL file that creates the schema
scripts/export_data.py      Python script that exports data to a CSV file
scripts/get_data.py         Python script that scrapes RSS data from Reuters (http://feeds.reuters.com/reuters/topNews)
scripts/lib.py              Python script that checks the PostgreSQL schema
scripts/save_data.py        Python script that loads data into PostgreSQL
scripts/crontask.sh         Schedules the scrape and the save to PostgreSQL
scripts/menu.sh             Bash menu for the user; the entry point
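
To illustrate the kind of work scripts/get_data.py does, here is a minimal sketch of pulling items out of an RSS document using only the Python standard library. It is not the code from this repository: the function name `parse_items`, the sample feed, and the chosen fields are assumptions made for illustration, and it is written for Python 3 even though the Dockerfile installs Python 2.7.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the real Reuters feed has more fields.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
      <pubDate>Mon, 01 Oct 2018 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/2</link>
      <pubDate>Mon, 01 Oct 2018 11:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return one dict per <item> element found in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the child element is missing
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_items(SAMPLE_RSS):
        print(entry["published"], entry["title"], entry["link"])
```

To run this against the live feed, the text would first have to be downloaded, for example with `urllib.request.urlopen`, and then passed to `parse_items`.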

Install & Run

Change to the directory that contains the project files. The commands below assume Ubuntu:

$ cd reuters

Build and run the app with run.sh:

$ bash run.sh

Interact with reuters app menu

After run.sh starts, a Bash menu appears in the terminal:

$
1) Run PostgreSQL
2) Stop PostgreSQL
3) Create schema
4) Get some data from Reuters
5) Save data to PostgreSQL
6) Export data to CSV
7) Cron hourly
8) Quit
Please enter your choice: 

For example, type "1" (without quotes) to start the PostgreSQL server, then wait for it to finish starting.
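
One way to tell when the server has finished starting is to poll its TCP port until it accepts connections. The sketch below is an illustration, not part of this project: `wait_for_port` is a hypothetical helper, and the host `localhost` and port `5432` (PostgreSQL's default) are assumptions, since this README does not say which port the container publishes.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll host:port until a TCP connection succeeds or timeout expires.

    Returns True once a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again as soon as it succeeds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Assumed host and port; adjust to match docker-compose.yml.
    if wait_for_port("localhost", 5432, timeout=10.0):
        print("PostgreSQL port is accepting connections")
    else:
        print("PostgreSQL port did not open in time")
```

An open port only shows that something is listening; the server may still need a moment before it accepts queries.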

Note that options 3 and 5 work only if PostgreSQL is running.

Note that option 6 works only after you have scraped and saved data with options 4 and 5.
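
The export step turns saved rows into a CSV file. As a rough sketch of that idea using the standard `csv` module (not the actual scripts/export_data.py; the function name `rows_to_csv` and the column names are assumptions, and the rows are plain dicts rather than query results):

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical rows; in the real app these would come from PostgreSQL.
    rows = [
        {"title": "First headline", "link": "http://example.com/1"},
        {"title": "Markets, rates and oil", "link": "http://example.com/2"},
    ]
    print(rows_to_csv(rows, ["title", "link"]))
```

The `csv` module quotes fields that contain commas, so headlines with punctuation stay in a single column.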

Web interface for PostgreSQL

Open http://localhost/browser/ in your browser.

Web interface default credentials are:

user: example@example.com
password: example

PostgreSQL server default credentials are:

host: dev
user: postgres
password: postgres
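
Many PostgreSQL clients accept these settings as a single connection URI. The sketch below assembles one from the defaults above. The helper name `build_postgres_uri` is hypothetical, and the port `5432` and database name `postgres` are assumptions (PostgreSQL's usual defaults) that this README does not state.

```python
from urllib.parse import quote


def build_postgres_uri(host, user, password, port=5432, dbname="postgres"):
    """Build a postgresql:// connection URI, percent-encoding the credentials."""
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote(user, safe=""),
        quote(password, safe=""),
        host,
        port,
        quote(dbname, safe=""),
    )


if __name__ == "__main__":
    # Host, user, and password are the defaults listed in this README.
    print(build_postgres_uri("dev", "postgres", "postgres"))
```

The host name `dev` is presumably resolvable only inside the Docker Compose network, so a client on the host machine may need a different address.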

Todos

License

For fun
