This repository contains a setup script that transforms existing JSON datasets into an RDF representation, following the QADO process of automatic RML-driven data transformation. The result of the process is stored as a release of this repository. The release does not include the validation of the SPARQL queries.
For the transformation process from JSON to RDF, the following services originating from the QADO initiative are used:
- QADO Question Answering RDFizer (basic JSON benchmark to RDF mapping)
- QADO SPARQL Query Analyser (extending SPARQL query objects with statistics)
- QADO SPARQL Query Validator (validation of the results of the included SPARQL queries)
Additionally, a GraphDB instance is used temporarily to store the QADO data. It is accessible during the build process at http://localhost:7200.
sequenceDiagram
participant Host
participant RDFizer
participant SPARQLQueryAnalyser
participant SPARQLProcessor
participant SPARQLQueryValidator
Host ->> RDFizer: Convert JSON to RDF
RDFizer -->> Host: Response with RDF triples
Host ->> GraphDB: Store triples to DB
GraphDB -->> Host: Storing finished
Host -) SPARQLQueryAnalyser: Request SPARQL query statistics generation
SPARQLQueryAnalyser ->> GraphDB: Fetch SPARQL queries
GraphDB -->> SPARQLQueryAnalyser: Response with all SPARQL queries
SPARQLQueryAnalyser ->> SPARQLProcessor: Process SPARQL queries
SPARQLProcessor ->> GraphDB: Upload additional properties
GraphDB -->> SPARQLProcessor: Upload finished
SPARQLProcessor -->> SPARQLQueryAnalyser: Processing finished
SPARQLQueryAnalyser --) Host: SPARQL query statistics generation finished
Host ->> SPARQLQueryValidator: Validating SPARQL queries
SPARQLQueryValidator ->> GraphDB: Store validation timestamps
GraphDB -->> SPARQLQueryValidator: Timestamps stored
SPARQLQueryValidator -->> Host: SPARQL query validation finished
Host ->> GraphDB: Export full dataset
GraphDB -->> Host: QADO dataset as RDF triples
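The "Store triples to DB" and "Export full dataset" steps of the diagram can be tried by hand against the temporary GraphDB instance. The following is a minimal sketch, not the code used by deploy.sh: it assumes GraphDB's RDF4J-compatible REST endpoint for repository statements, and the function names and the repository ID passed to them are hypothetical.

```shell
# Base URL of the temporary GraphDB instance; override via the environment if needed.
GRAPHDB_URL="${GRAPHDB_URL:-http://localhost:7200}"

# Build the RDF4J statements endpoint for a repository.
# $1: repository ID (a placeholder; the ID used by deploy.sh is not stated here).
statements_url() {
    printf '%s/repositories/%s/statements\n' "$GRAPHDB_URL" "$1"
}

# "Store triples to DB": add the triples of a Turtle file to the repository.
# $1: repository ID, $2: path to a Turtle file.
store_triples() {
    curl --fail --silent --show-error -X POST \
        -H 'Content-Type: text/turtle' \
        --data-binary "@$2" \
        "$(statements_url "$1")"
}

# "Export full dataset": download all statements of the repository as Turtle.
# $1: repository ID, $2: output file.
export_dataset() {
    curl --fail --silent --show-error \
        -H 'Accept: text/turtle' \
        -o "$2" \
        "$(statements_url "$1")"
}
```

With a repository named `qado` (an assumed name), `store_triples qado benchmark.ttl` would upload a file and `export_dataset qado full-qado.ttl` would write the complete dataset to disk.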
The datasets directory contains all tested benchmarks that can be integrated into the QADO dataset. If you want to add additional benchmarks, provide a valid RDFizer payload as a new JSON file.
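Before running the build, it can help to confirm that every file in the datasets directory is at least well-formed JSON. The sketch below assumes that the payload files use the .json extension and that python3 is on the PATH; the function name is hypothetical. It checks JSON syntax only, not whether a file is a valid RDFizer payload.

```shell
# Report every *.json file in a directory that is not well-formed JSON.
# Assumption: python3 is available; its standard json.tool module does the parsing.
# $1: directory to check (for example, datasets).
check_json_dir() {
    dir="$1"
    status=0
    for file in "$dir"/*.json; do
        [ -e "$file" ] || continue   # skip the unexpanded pattern if nothing matches
        if ! python3 -m json.tool "$file" > /dev/null 2>&1; then
            echo "invalid JSON: $file" >&2
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_json_dir datasets` returns a non-zero status and names the offending files if any of them fails to parse.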
git clone https://github.com/WSE-research/QADO-deploy-prefilled-triplestore.git
cd QADO-deploy-prefilled-triplestore
./deploy.sh
The script generates a ZIP file qado-benchmark.zip containing the full dataset (full-qado.ttl) and all supported benchmarks as separate files in a subdirectory named datasets. If you run the script with the parameter --validate, the results of all included SPARQL queries are validated as well. This behaviour is disabled by default to keep the build time short.
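After a build, the archive layout described above can be checked from its file listing. This is a minimal sketch with a hypothetical function name. It assumes only the entry names given in this README (full-qado.ttl and a datasets subdirectory) and does not inspect the contents of any file.

```shell
# Check a ZIP listing (one entry name per line on stdin) for the expected layout:
# a full-qado.ttl file and at least one entry inside a datasets/ directory.
has_expected_layout() {
    listing=$(cat)
    printf '%s\n' "$listing" | grep -Eq '(^|/)full-qado\.ttl$' || return 1
    printf '%s\n' "$listing" | grep -Eq '(^|/)datasets/.+' || return 1
}
```

One way to use it is `unzip -Z1 qado-benchmark.zip | has_expected_layout && echo "archive looks complete"`, where `unzip -Z1` prints the entry names one per line.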
Feel free to contribute via a fork and a pull request to this repository. You can also open an issue.