
szczeles/shop-inventory-analysis


Shop Inventory Analysis

Quickstart

  1. Download the input files coding_challenge_inventory.csv, coding_challenge_meta.csv, and coding_challenge_prices.csv into etl/inputs/.
  2. Run docker compose up -d.
  3. Open http://127.0.0.1:8000/docs to browse the API docs and execute requests.
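
For orientation, the compose file presumably wires up the three pieces described in this README along these lines (a hedged sketch only: service names, build contexts, and everything except the Postgres 17 image and port 8000 are assumptions, not the repo's actual docker-compose.yml):

```yaml
# Hypothetical shape of docker-compose.yml for this project.
services:
  db:
    image: postgres:17      # the README states Postgres 17
  etl:
    build: ./etl            # runs the papermill-driven notebook
    depends_on: [db]
  api:
    build: ./api            # FastAPI app
    ports:
      - "8000:8000"         # matches the docs URL above
    depends_on: [db]
```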

ETL

The ETL process is implemented as a Jupyter Notebook. While notebooks are not the usual vehicle for production-grade software, they are a perfect tool for R&D work on data. The notebook is executed headlessly with papermill against the input data stored in etl/inputs/. The resulting notebook is converted to HTML and saved as etl/outputs/output.html, alongside the result CSVs in the same directory. The generated notebook documents the processing approach and the assumptions made about the inputs.
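
The execute-then-convert step can be sketched as two commands, here built as argv lists (the notebook and output paths are assumptions for illustration; the repo's actual file names may differ):

```python
# Sketch of the headless ETL run: papermill executes the notebook, then
# jupyter nbconvert renders the executed copy to HTML. Paths are assumed.
def etl_commands(notebook="etl/notebook.ipynb", out_dir="etl/outputs"):
    """Return the two commands the ETL step would run, as argv lists
    (suitable for subprocess.run or a shell)."""
    executed = f"{out_dir}/output.ipynb"
    return [
        ["papermill", notebook, executed],
        ["jupyter", "nbconvert", "--to", "html", executed,
         "--output-dir", out_dir],
    ]
```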

Loading into the database (Postgres 17) is handled by the etl/db-init.sql script, which creates the tables and then loads the generated data into them.
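
The create-then-load pattern in such an init script typically pairs CREATE TABLE with Postgres COPY. A hypothetical illustration (the table name, columns, and CSV path below are invented; the real etl/db-init.sql defines its own schema):

```python
# Illustration of the db-init.sql pattern: create a table, then bulk-load
# a generated CSV with Postgres COPY. All names here are assumptions.
def init_statements(csv_path="/etl/outputs/products.csv"):
    """Return the SQL statements, in execution order, as strings."""
    create = (
        "CREATE TABLE products (\n"
        "    upc TEXT PRIMARY KEY,\n"
        "    name TEXT NOT NULL,\n"
        "    price NUMERIC\n"
        ");"
    )
    load = f"COPY products FROM '{csv_path}' WITH (FORMAT csv, HEADER true);"
    return [create, load]
```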

To browse the loaded data, run a psql shell directly on the container:

$ docker compose exec db psql -U postgres

ETL tests

There is a single test, 01_end_to_end. Its sample inputs cover all product types (regular, with a variant, with a case, and with both alternates) as well as the data issues found in the real input files (UPC duplicates and a malformed supplier), in order to exercise the logic and guard against regressions. The test executes the notebook on these inputs and compares the outputs with the expected ones (manually curated). To run the test:
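
The comparison step above can be sketched as a small helper that diffs a generated CSV against its curated counterpart (a minimal sketch under assumptions; the real test may compare differently, e.g. with pandas):

```python
# Minimal sketch of comparing an ETL output CSV with a manually curated
# expected file, row by row. File layout is an assumption.
import csv

def assert_csvs_equal(actual_path, expected_path):
    """Raise AssertionError if the two CSV files differ."""
    with open(actual_path, newline="") as fa, open(expected_path, newline="") as fe:
        actual = list(csv.reader(fa))
        expected = list(csv.reader(fe))
    assert actual == expected, f"{actual_path} differs from {expected_path}"
```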

  1. Create a virtualenv and activate it.
  2. Install runtime and test dependencies: pip install -r requirements.txt -r requirements-dev.txt
  3. Run tests: pytest tests/

API

The API is implemented with FastAPI and SQLAlchemy. The code resides in the api/ directory, inside the products_api module. There are only 3 files:

  • main.py -> REST API implementation, REST models and documentation
  • models.py -> database models, autogenerated from Postgres using sqlacodegen
  • repository.py -> products repository with the business logic for fetching products
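
The layering can be sketched as follows: repository.py keeps query logic away from the REST layer. This is a hypothetical sketch, not the repo's real code; the Product model and the session interface are assumptions (in the real API the session would be a SQLAlchemy Session querying the Postgres tables):

```python
# Hypothetical sketch of the repository layer. In the real code the
# session is a SQLAlchemy Session; any mapping-like object with .get()
# works for this sketch, which keeps the logic testable with a fake.
from dataclasses import dataclass

@dataclass
class Product:
    upc: str
    name: str

class ProductRepository:
    def __init__(self, session):
        self._session = session

    def get_by_upc(self, upc):
        # Real code would run a SQLAlchemy query here.
        return self._session.get(upc)
```

main.py would then depend only on ProductRepository, so the REST handlers stay free of query details and the business logic can be tested without a database.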

API Tests

  1. Create a virtualenv and activate it.
  2. Install runtime and test dependencies: pip install -r requirements.txt -r requirements-dev.txt
  3. Run tests: pytest tests/

All tests run on every push to GitHub via GitHub Actions.
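
Such an on-push run could be wired up with a workflow along these lines (a hedged sketch only: the actual workflow file, action versions, Python version, and paths in this repo may differ):

```yaml
# Hypothetical .github/workflows/tests.yml for the on-push test run.
name: tests
on: [push]
jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt -r requirements-dev.txt
      - run: pytest tests/
```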
