# Store RDF data
Store RDF data in a triplestore accessible by querying a SPARQL endpoint.
## Publish to our public GraphDB triplestore

Create a new repository on our GraphDB triplestore at https://graphdb.dumontierlab.com/
### Ask for permissions

After creating an account, ask us for the permissions to create new repositories.
### Create the GraphDB repository

👩💻 Go to **Setup > Repositories > Create Repository**
- Or click here: https://graphdb.dumontierlab.com/repository/create
👨💻 Choose the settings of your repository (leave the defaults if not mentioned here):

- **Ruleset**: use RDFS-Plus (Optimized) by default, or an OWL ruleset if you are performing reasoning using OWL ontologies
- **Supports SHACL validation**: enable if you plan on using SHACL shapes to validate the RDF loaded in the repository.
  - Visit https://maastrichtu-ids.github.io/shapes-of-you to find SHACL shapes
  - Add new shapes to the IDS shapes repository: https://github.com/MaastrichtU-IDS/shacl-shapes
- **Use context index**: enable this to index the contexts (aka. graphs)
- For large datasets:
  - **Entity index size**: increase this to 999999999
  - **Entity ID bit-size**: increase this to 40
To access your repository:

- SPARQL endpoint: https://graphdb.dumontierlab.com/repositories/my-repository
- SPARQL endpoint to run update queries (e.g. `INSERT`): https://graphdb.dumontierlab.com/repositories/my-repository/statements
- GraphDB admin web UI: https://graphdb.dumontierlab.com (change the repository using the button at the top right of the screen)
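As a minimal sketch (assuming a repository named `my-repository` with public read access), the read endpoint can be queried over HTTP following the SPARQL 1.1 Protocol, here with Python's standard library only:

```python
import json
import urllib.parse
import urllib.request

# Hypothetical repository name: replace "my-repository" with your own
ENDPOINT = "https://graphdb.dumontierlab.com/repositories/my-repository"

def sparql_select(endpoint: str, query: str) -> dict:
    """POST a SPARQL SELECT query and return the JSON results."""
    data = urllib.parse.urlencode({"query": query}).encode()
    req = urllib.request.Request(
        endpoint,
        data=data,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires network access and a readable repository):
# results = sparql_select(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
# for row in results["results"]["bindings"]:
#     print(row["s"]["value"])
```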
### Edit your repository access

By default your repository will not be available publicly.
👩💻 Go to Users and Access
- Change the Free Access Settings (top right of the page) to enable public access to read the SPARQL endpoint of your repository
- Find your repository and enable Read access (checkbox on the left)
- You can also give Write access to other users
- We usually give Write access to the `import_user`, to be used in automated workflows (to automatically upload new data to the repository)
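A minimal sketch of such an automated upload, assuming hypothetical `import_user` credentials and the `my-repository` repository: it POSTs Turtle data to the repository's `/statements` endpoint (the RDF4J REST API used by GraphDB) with HTTP Basic auth:

```python
import base64
import urllib.request

# Repository name and credentials below are assumptions for illustration
STATEMENTS_URL = "https://graphdb.dumontierlab.com/repositories/my-repository/statements"
USER, PASSWORD = "import_user", "change-me"  # hypothetical credentials

def build_upload_request(url: str, turtle_data: bytes) -> urllib.request.Request:
    """Build an authenticated POST request that adds Turtle triples to the repository."""
    auth = base64.b64encode(f"{USER}:{PASSWORD}".encode()).decode()
    return urllib.request.Request(
        url,
        data=turtle_data,
        method="POST",
        headers={
            "Content-Type": "text/turtle",
            "Authorization": f"Basic {auth}",
        },
    )

req = build_upload_request(STATEMENTS_URL, b"<http://example.org/s> a <http://example.org/Thing> .")
# urllib.request.urlopen(req)  # uncomment to actually send (requires Write access)
```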
### Optional: enable GraphDB search index

You can easily enable the GraphDB Lucene search index to quickly search strings in your triplestore.

Here is an example to create a search index for the `rdfs:label` and `dct:description` properties.
👨💻 Running this in your GraphDB repository SPARQL editor will insert the triples and create the search index (this might take some time). Feel free to edit which predicates are indexed.
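The insert query itself is not reproduced here; as a rough sketch based on GraphDB's legacy Lucene FTS plugin syntax (the index name `myIndex` is an arbitrary assumption; verify the exact connector syntax for your GraphDB version in the official documentation), it could look like this:

```python
# Sketch of a GraphDB Lucene FTS plugin index over rdfs:label and dct:description.
# The plugin namespace and parameters follow the legacy Lucene plugin; verify
# against your GraphDB version before running.
CREATE_INDEX_QUERY = """
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
INSERT DATA {
    # Index only the listed predicates
    luc:includePredicates luc:setParam
        "http://www.w3.org/2000/01/rdf-schema#label http://purl.org/dc/terms/description" .
    # Create the index (the name 'myIndex' is an arbitrary choice)
    luc:myIndex luc:createIndex "true" .
}
"""
```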
Query the GraphDB search index:
**Wildcard**: we are using a `*` wildcard at the end to match all strings starting with `TEXT_TO_SEARCH`.
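A sketch of the corresponding search query, assuming the index was created under the name `myIndex` with the legacy Lucene plugin as above:

```python
TEXT_TO_SEARCH = "insulin"  # hypothetical search term

# The trailing * matches all strings starting with TEXT_TO_SEARCH
SEARCH_QUERY = f"""
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
SELECT ?subject WHERE {{
    ?subject luc:myIndex "{TEXT_TO_SEARCH}*" .
}}
"""
```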
## List of RDF triplestores

### Ontotext GraphDB

The Ontotext GraphDB™ triplestore includes a web UI, various data visualizations, OntoRefine, SHACL validation, RDFS/OWL reasoning to infer new triples, and the possibility to deploy multiple repositories. It is mainly built on the rdf4j framework.
Download the zip file of the latest GraphDB standalone free version, and place it in the same folder as the `Dockerfile` before building the image.
Access at http://localhost:7200/
See the official Ontotext GraphDB™ documentation and the source code for Docker images for more details.
Obtain a license for more features, such as performance improvements, easy deployment using the official DockerHub image, or distributed deployment on multiple nodes with Kubernetes.
GraphDB allows performing bulk load of large files using a second container:

- Change the repository to be created and loaded in `workspace/graphdb/preload-config.ttl` (default: `demo`)
- Put the files to be loaded in `workspace/import/preload` 📩
- Start the `graphdb-preload` docker container
When the preload has completed, the `graphdb-preload` container will stop. You can then copy the loaded repository from `workspace/graphdb/preload-data/repositories` to the running GraphDB folder:
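The copy step could be sketched like this (the source path comes from the list above; the repository name `demo` and the target data folder of the running instance are assumptions, check where your GraphDB stores its repositories):

```python
import shutil
from pathlib import Path

# Source: repository produced by the graphdb-preload container
src = Path("workspace/graphdb/preload-data/repositories/demo")
# Target: data folder of the running GraphDB instance (assumed location)
dst = Path("workspace/graphdb/data/repositories/demo")

def copy_preloaded_repository(src: Path, dst: Path) -> None:
    """Copy a preloaded repository folder into the running GraphDB data directory."""
    shutil.copytree(src, dst, dirs_exist_ok=True)

# copy_preloaded_repository(src, dst)  # uncomment once the preload has finished
```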
You can then access the newly loaded repository in the running GraphDB instance without downtime.
See the additional d2s documentation about setting up GraphDB.
### Virtuoso

OpenLink Virtuoso triplestore. Available on DockerHub.
Access at http://localhost:8890/ and SPARQL endpoint at http://localhost:8890/sparql.
Admin username: `dba`

CORS can be enabled by following these instructions. See our complete Virtuoso documentation for more details.
Clear the Virtuoso triplestore using this command:
### Blazegraph

A high-performance RDF graph database. See its documentation for Docker. It is mainly based on the rdf4j framework.
The UID and group ID need to be set in order to have the right permissions to bulk load a file (example given for Ubuntu). `RWStore.properties` can be rewritten, see the example.
Access the UI at http://localhost:8082/bigdata

SPARQL endpoint at http://localhost:8080/bigdata/sparql (original port)
To clear the graph, go to the update tab and enter `clear all`.
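The same can be done programmatically: the Blazegraph endpoint accepts SPARQL 1.1 updates, so posting `CLEAR ALL` to it (endpoint URL taken from above) is equivalent to using the update tab. A sketch:

```python
import urllib.parse
import urllib.request

# Blazegraph SPARQL endpoint from above; a SPARQL UPDATE "CLEAR ALL" achieves
# the same as typing "clear all" in the update tab of the web UI.
UPDATE_ENDPOINT = "http://localhost:8080/bigdata/sparql"

data = urllib.parse.urlencode({"update": "CLEAR ALL"}).encode()
req = urllib.request.Request(
    UPDATE_ENDPOINT,
    data=data,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)
# urllib.request.urlopen(req)  # uncomment to wipe all graphs (irreversible)
```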
Follow these instructions to enable CORS on the Blazegraph SPARQL endpoint.
### Jena Fuseki

Fuseki is a SPARQL server on top of the Apache TDB RDF store, for single machines. It is mainly built on the Jena framework.
Access at http://localhost:3030
Bulk load files into the `demo` dataset from `workspace/import` (the container needs to be stopped):
If you don't specify any filenames to `load.sh`, all filenames directly under `/staging` that match these GLOB patterns will be loaded:
### Stardog

Requires downloading the free license first, then placing it in the folder shared with Stardog.
See the official Stardog documentation for Docker. A JavaScript wrapper is available to communicate with Stardog API and SPARQL endpoint.
Access at http://localhost:5820, volume shared at `workspace/stardog`
### AllegroGraph

AllegroGraph® is a modern, high-performance, persistent graph database. It supports SPARQL, RDFS++, and Prolog reasoning from numerous client applications.
Access at http://localhost:10035
Default login: `test` / `xyzzy`
See official documentation for bulk load.
TODO: fix shared volumes
### AnzoGraph

AnzoGraph® DB by Cambridge Semantics. See its official documentation to deploy with Docker.
- The unregistered free edition is limited to 8 GB of RAM, a single user, and single-node deployment.
- Register to access the 16 GB single-node deployment for free.
- Deploy AnzoGraph on a multi-server cluster for horizontal scaling with the Enterprise Edition 💶
Access at http://localhost:8086
Default login: `admin` / `Passw0rd1`
Kubernetes deployment available using Helm.
### Linked Data Fragments Server

Technically not a triplestore: a server supporting the Memento protocol for timestamped SPARQL querying over multiple linked data sources, e.g. HDT files or SPARQL endpoints.
HDT archives go in `workspace/hdt-archives` and the config file is in `workspace/ldfserver-config.json`
Access at http://localhost:8085
## Property graphs

### Neo4j

Neo4j is a property graph database; it does not support RDF. It uses Cypher as its query language.
Access at http://localhost:7474, volume shared at `workspace/neo4j`
Login with `neo4j` / `neo4j` and change the password.
## Additional triplestores

### MarkLogic

Licensed RDF triplestore 📜
Follow the GitHub Docker instructions to deploy it.
You will need to download the MarkLogic Server 📥
### RDFox

Licensed RDF triplestore 📜

RDFox is an in-memory triplestore supporting triples only. It is a main-memory, scalable, centralized data store that allows users to efficiently manage graph-structured data represented according to the RDF data model, run reasoning engines, and query that data using the SPARQL 1.1 query language.

See the documentation to deploy it using Docker.