
This repository is used only for deploying a triplestore to our demo server using data from the RDFizer web service.


WSE-research/QADO-datasets


QADO datasets

This repository contains a setup script for transforming existing JSON datasets following the QADO process of automatic RML-driven data transformation to an RDF representation. The result of the process is stored as a release of this repository. It doesn’t include the validation of the SPARQL queries.


Process

For the transformation process from JSON to RDF, services originating from the QADO initiative are used; the sequence diagram below names them (RDFizer, SPARQLQueryAnalyser, SPARQLProcessor, SPARQLQueryValidator).

Additionally, a GraphDB instance is used temporarily to store the QADO data. It is accessible during the build process at http://localhost:7200.

sequenceDiagram
    participant Host
    participant RDFizer
    participant SPARQLQueryAnalyser
    participant SPARQLProcessor
    participant SPARQLQueryValidator

    Host ->> RDFizer: Convert JSON to RDF
    RDFizer -->> Host: Response with RDF triples
    Host ->> GraphDB: Store triples to DB
    GraphDB -->> Host: Storing finished
    Host -) SPARQLQueryAnalyser: Request SPARQL query statistics generation
    SPARQLQueryAnalyser ->> GraphDB: Fetch SPARQL queries
    GraphDB -->> SPARQLQueryAnalyser: Response with all SPARQL queries
    SPARQLQueryAnalyser ->> SPARQLProcessor: Process SPARQL queries
    SPARQLProcessor ->> GraphDB: Upload additional properties
    GraphDB -->> SPARQLProcessor: Upload finished
    SPARQLProcessor -->> SPARQLQueryAnalyser: Processing finished
    SPARQLQueryAnalyser --) Host: SPARQL query statistics generation finished
    Host ->> SPARQLQueryValidator: Validating SPARQL queries
    SPARQLQueryValidator ->> GraphDB: store validation timestamps
    GraphDB -->> SPARQLQueryValidator: timestamps stored
    SPARQLQueryValidator -->> Host: SPARQL query validation finished
    Host ->> GraphDB: Export full dataset
    GraphDB -->> Host: QADO dataset as RDF triples

Configurations of the datasets

The datasets directory contains all tested benchmarks that can be integrated into the QADO dataset. To add a further benchmark, provide a valid RDFizer payload as a new JSON file in that directory.
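Before adding a new benchmark, it can help to confirm that every file in the datasets directory is at least well-formed JSON. The following is a minimal sketch and not part of this repository: the function name `check_json_files` is hypothetical, and it assumes `python3` is available on the host. It checks JSON syntax only; it does not verify that a file is a valid RDFizer payload.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): report files in a
# directory that are not well-formed JSON. Assumes python3 is installed.
check_json_files() {
    local dir="${1:-datasets}"
    local status=0
    local file
    for file in "$dir"/*.json; do
        # Skip the literal pattern when the directory has no JSON files.
        [ -e "$file" ] || continue
        if python3 -m json.tool "$file" > /dev/null 2>&1; then
            echo "ok: $file"
        else
            echo "invalid JSON: $file"
            status=1
        fi
    done
    # Non-zero exit status if at least one file failed to parse.
    return "$status"
}
```

Under these assumptions, `check_json_files datasets` could be run from the repository root before starting the deployment script.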

Run deployer

Step 1: Clone the repository

git clone https://github.com/WSE-research/QADO-deploy-prefilled-triplestore.git

Step 2: Run the deployment script

./deploy.sh

The script generates a ZIP file qado-benchmark.zip containing the full dataset (full-qado.ttl) and all supported benchmarks as separate files in a subdirectory named datasets. If you run the script with the parameter --validate, the results of all included SPARQL queries are validated as well. This behaviour is disabled by default to keep the build time short.
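As an illustration of the optional validation run and a quick sanity check of the generated archive, the sketch below defines a small helper that tests whether an archive lists full-qado.ttl. The helper name `archive_has_full_dataset` is hypothetical and not part of this repository; it assumes `python3` is installed and relies only on the file names stated above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): check that an archive
# lists the full dataset file. Assumes python3 is installed.
archive_has_full_dataset() {
    local archive="${1:-qado-benchmark.zip}"
    # "python3 -m zipfile -l" prints the names of the archive members;
    # the function succeeds only if full-qado.ttl is among them.
    python3 -m zipfile -l "$archive" 2>/dev/null | grep -q 'full-qado\.ttl'
}

# Possible use after a build with SPARQL query validation enabled:
#   ./deploy.sh --validate
#   archive_has_full_dataset qado-benchmark.zip && echo "full dataset present"
```

The check only looks at member names, so it says nothing about whether the contained triples are complete or valid.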

Contributions

Feel free to contribute via a fork and a pull request to this repository. You can also open an issue.
