A web-based tool for matching and labeling building footprints from two datasets. Runs locally as a Flask app, displaying Folium-generated maps of potential matching pairs.
pip install git+https://github.com/ai4up/eubucco-conflation@main
Step 1: Create a dataset of potential building pairs from two datasets:
gmatch create-labeling-dataset dataset1.parquet dataset2.parquet
Step 2: Start browser-based labeling of building pairs:
gmatch label
Using the demo data in the repository, create a dataset of potential matches between government buildings and Microsoft buildings for a small region in France that require manual labeling. Include only buildings with a small relative overlap (10-20%).
# --min-intersection: minimum relative overlap for new buildings to be included in the labeling dataset, range [0,1)
# --max-intersection: maximum relative overlap for new buildings to be included in the labeling dataset, range (0,1]
gmatch create-labeling-dataset \
    --min-intersection=0.1 \
    --max-intersection=0.2 \
    data/demo-gov.parquet data/demo-microsoft.parquet
The resulting dataset is stored locally as candidate-pairs.pickle. To initiate the browser-based labeling, run:
gmatch label
Prerequisites
- Ensure a dataset of candidate pairs (candidate-pairs.pickle) is present in the data directory.
- For production deployment, set a Flask session SECRET_KEY environment variable.
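The two prerequisites above can be checked before starting the app. The following is a minimal sketch, not part of gmatch: check_prereqs is a hypothetical helper name, and it assumes the candidate pairs file sits directly inside the data directory passed as the first argument (default: data).

```shell
# Hypothetical helper (not shipped with gmatch): verify the prerequisites.
# Usage: check_prereqs [data_dir]
check_prereqs() {
    data_dir="${1:-data}"

    # The labeling app needs the candidate pairs dataset to exist.
    if [ ! -f "$data_dir/candidate-pairs.pickle" ]; then
        echo "missing: $data_dir/candidate-pairs.pickle" >&2
        return 1
    fi

    # SECRET_KEY is only required for production, so warn instead of failing.
    if [ -z "${SECRET_KEY:-}" ]; then
        echo "warning: SECRET_KEY is not set (required for production deployment)" >&2
    fi

    return 0
}
```

For example, `check_prereqs data && gmatch label` starts labeling only if the dataset is present. One common way to generate a random key, assuming python3 is available, is `export SECRET_KEY="$(python3 -c 'import secrets; print(secrets.token_hex(32))')"`.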
Serve the dockerized Flask app with an Nginx proxy at http://localhost:80:
docker-compose up
Install dev dependencies using poetry:
poetry install --only dev
Install git pre-commit hooks:
pre-commit install
Build from source:
poetry build
pip install dist/geo_matcher-*.whl