A comprehensive toolkit for analyzing the quality of open data metadata, based on the European Data Portal's Metadata Quality Assessment (MQA) methodology.
This tool helps data publishers and consumers evaluate and improve the quality of metadata in open data catalogs. It analyzes metadata against the FAIR+C principles (Findability, Accessibility, Interoperability, Reusability, and Contextuality) and provides detailed reports on quality metrics.
- Quality Assessment: Evaluate metadata according to the MQA methodology
- API Integration: REST API for programmatic access to validation services
- Web Interface: User-friendly interface for non-technical users
- Historical Tracking: Store and visualize quality evolution over time
- SHACL Validation: Check compliance with DCAT-AP, DCAT-AP-ES, and NTI-RISP standards
The project consists of two main components:
- API: FastAPI-based backend that validates metadata and generates reports
- Frontend: Streamlit-based web interface for visualizing reports
1. Clone the repository:

   ```bash
   git clone https://github.com/your-organization/metadata-quality-stack.git
   cd metadata-quality-stack
   ```

2. Start the services using Docker Compose:

   ```bash
   docker-compose up
   ```
Alternatively, to run the services locally:

1. Clone the repository:

   ```bash
   git clone https://github.com/your-organization/metadata-quality-stack.git
   cd metadata-quality-stack
   ```

2. Install dependencies:

   ```bash
   pip install -e .
   ```

3. Start the API:

   ```bash
   uvicorn src.api.main:app --host 0.0.0.0 --port 8000
   ```

4. Start the frontend (in a separate terminal):

   ```bash
   streamlit run src/frontend/app.py
   ```
For development, we recommend using VS Code with the Dev Container configuration provided:
- Install the VS Code Remote - Containers extension
- Open the project in VS Code
- Click on "Reopen in Container" when prompted
- Wait for the container to build and configure
After updating a translation file (`mqa.po`), remember to compile it to generate the corresponding `.mo` file. For example, for Spanish:

```bash
cd metadata-quality-stack
msgfmt -o locale/es/LC_MESSAGES/mqa.mo locale/es/LC_MESSAGES/mqa.po
```
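If several languages are maintained, a small loop over the same `locale/<lang>/LC_MESSAGES` layout recompiles every translation at once (a sketch assuming GNU gettext's `msgfmt` is installed):

```shell
# Compile each locale's mqa.po into a mqa.mo file beside it.
for po in locale/*/LC_MESSAGES/mqa.po; do
  [ -f "$po" ] || continue          # skip if the glob matched nothing
  msgfmt -o "${po%.po}.mo" "$po"    # locale/es/.../mqa.po -> locale/es/.../mqa.mo
done
```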
The API provides the following endpoints:

- `POST /validate`: Validate metadata from a URL
- `POST /validate-content`: Validate metadata provided directly as content
- `GET /report/{url}`: Get the latest report for a URL
- `GET /history/{url}`: Get the history of reports for a URL
- `GET /reports/by-date`: Get reports within a specified date range
- `GET /reports/by-rating/{rating}`: Get reports with a specific quality rating

All API endpoints are documented with OpenAPI and can be explored at `/docs` when the API is running.
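The `/docs` UI is the easiest way to explore these interactively. For scripted access, the endpoints that take a URL as a path parameter (`/report/{url}`, `/history/{url}`) need the catalog URL encoded so it travels as a single path segment. The helper below is a sketch assuming a local deployment on port 8000 and that the API expects the URL percent-encoded this way; confirm the exact encoding against `/docs`:

```python
from urllib.parse import quote

API_BASE = "http://localhost:8000"  # assumed local deployment

def report_endpoint(catalog_url: str) -> str:
    """Build the GET /report/{url} path, percent-encoding the catalog
    URL (including '/' and ':') so it fits in one path segment."""
    return f"{API_BASE}/report/{quote(catalog_url, safe='')}"

print(report_endpoint("https://datos.gob.es/catalogo.rdf"))
# → http://localhost:8000/report/https%3A%2F%2Fdatos.gob.es%2Fcatalogo.rdf
```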
The web interface offers:
- Validation Options:
  - Enter a URL to a catalog (`RDF/XML`, `TTL`, `JSON-LD`, and `N3` formats)
  - Paste RDF content directly for validation
  - Select different compliance profiles (DCAT-AP, DCAT-AP-ES, NTI-RISP)
- Visualization Features:
  - Hierarchical chart showing dimension and metric relationships
  - Radar chart displaying performance across FAIR+C dimensions
  - Detailed metrics breakdown with counts and percentages
- Report Management:
  - View historical reports and track quality evolution over time
  - Export reports in both JSON and JSON-LD (DQV vocabulary) formats
  - Score evolution charts for long-term quality tracking
- Analytics Dashboard:
  - Overview statistics of catalogs evaluated
  - Distribution of quality ratings
  - Comparison of dimension averages
  - Top and bottom performing catalogs
  - Dimension correlation analysis
- Multilingual Support:
  - Toggle between English and Spanish interfaces
  - Localized metric descriptions and labels
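To illustrate the JSON-LD (DQV) export mentioned under Report Management, a single quality measurement in the W3C Data Quality Vocabulary looks roughly like the sketch below. The `@context` terms come from DQV itself; the metric IRI, catalog URL, and score are hypothetical placeholders — the real export shape is defined by the tool:

```python
import json

# Minimal, illustrative DQV quality measurement in JSON-LD.
# dqv:QualityMeasurement, dqv:computedOn, dqv:isMeasurementOf, and
# dqv:value are W3C DQV terms; the IRIs and score are made up.
measurement = {
    "@context": {"dqv": "http://www.w3.org/ns/dqv#"},
    "@type": "dqv:QualityMeasurement",
    "dqv:computedOn": {"@id": "https://example.org/catalog.rdf"},
    "dqv:isMeasurementOf": {"@id": "https://example.org/metrics/accessUrlStatusCode"},
    "dqv:value": 50,
}

print(json.dumps(measurement, indent=2))
```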