ropfoo/clobbopus: Create mock data for web scrapers.

STILL WIP. A tool for downloading and serving HTML files, mainly useful as a mock server for scraping services.

Motivation

Web scraping can be harmful when things go wrong. When developing a web scraper, it is not uncommon to send tons of requests to a server that is not under your control. This might be because of a bug or just the nature of development, especially in a team. Instead of flooding external servers with requests during development, we should flood our own.

Features

  • download external pages by params
  • serve them on a local server

🐙 Usage

Run the following commands in order:

download and install

curl -L https://github.com/ropfoo/clobbopus/releases/download/Latest/install.sh -o install.sh && bash install.sh

create a config yml file called clobbopus.yml

port: 3000 # default
dist: pages # default
domains:
  sample:
    url: "www.sample.com/results/"
    params:
      - result/with?query=test&page=~1-7~ # range from page=1 to page=7
      - result/without
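
To illustrate what the `~1-7~` range marker in the config means, here is a minimal Python sketch that expands such a marker into one entry per page. It is not clobbopus's own code: `expand_param` and `RANGE_MARKER` are hypothetical names, and the behavior is only assumed from the config comment above (an inclusive range from the first to the last number).

```python
import re

# Hypothetical pattern for a "~start-end~" range marker, as used in the config above.
RANGE_MARKER = re.compile(r"~(\d+)-(\d+)~")


def expand_param(param: str) -> list[str]:
    """Expand the first range marker in a param into one string per value.

    Assumption: the range is inclusive on both ends, as the config
    comment ("range from page=1 to page=7") suggests.
    """
    match = RANGE_MARKER.search(param)
    if match is None:
        # No range marker: the param is used as-is.
        return [param]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = param[: match.start()], param[match.end():]
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1)]


params = ["result/with?query=test&page=~1-7~", "result/without"]
expanded = [entry for param in params for entry in expand_param(param)]

for entry in expanded:
    print(entry)
```

Under these assumptions, the sample config describes eight pages in total: seven for the ranged param and one for `result/without`.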

get the initial page data

./clobbopus_data

run the server

./clobbopus_server

🐳 Docker

Dockerfile

FROM alpine:latest
# install bash and curl in a single layer without keeping the package index
RUN apk add --no-cache bash curl
WORKDIR /app
RUN curl -L https://github.com/ropfoo/clobbopus/releases/download/Latest/install.sh -o install.sh && bash install.sh
# update with your path
COPY ./clobbopus/clobbopus.yml .
CMD ./clobbopus_data; ./clobbopus_server

docker-compose.yml

version: "3.8"
services:
  clobbopus:
    build:
      context: .
      dockerfile: ./clobbopus/Dockerfile
    ports:
      - 3000:3000
    volumes:
      - ./clobbopus:/app/clobbopus
