Start services
Run services such as triplestores to store your RDF knowledge graph, and interfaces or web UIs to access the triplestore data. A specific deployment config can be passed using the -d flag.
Volumes of all containers started by d2s are shared in the workspace/ folder.
d2s uses docker-compose to run the different services 🐳
In this documentation we will use a set of services to build the knowledge graph and access it using various interfaces.
List of services
The service deployments are defined in the d2s-core/docker-compose.yml file.
Start the services described below using:
🔗 Graph databases
See the detailed lists of available graph databases.
- graphdb: commercial triplestore with a web UI and multiple repositories
- virtuoso: Open Source triplestore with a faceted browser
- blazegraph: Open Source lightweight triplestore
- fuseki: Open Source SPARQL server built on top of Apache Jena and TDB.
- allegroGraph: commercial triplestore
- anzoGraph: commercial triplestore
- ldf-server: Open Source Linked Data Fragments server, store and query compressed HDT files
- neo4j: commercial property graph database
🖥️ Interfaces
See the detailed lists of available interfaces.
- biothings-studio: web UI to build and deploy BioThings APIs
- into-the-graph: SPARQL web browser leveraging HCLS metadata, with YASGUI editor
- api: HTTP OpenAPI with Swagger UI to query an RDF triplestore; accepts ReasonerStd queries
- comunica: widget to query heterogeneous interfaces (SPARQL, HDT) using Comunica SPARQL and GraphQL
🔧 Utilities
See the detailed lists of RDF utilities.
- notebook: JupyterLab with template Notebooks to build and query the triplestore.
- spark-notebook: JupyterLab with Apache Spark to process data.
- docket: multiomics tool for dataset overview, comparison and knowledge extraction using Jupyter notebooks.
- rmlstreamer: Apache Flink to process RML mappings
- rmltask: dependency of rmlstreamer; both services are required for it to run
- drill: exposes tabular text files (CSV, TSV, PSV) as SQL using Apache Drill
- postgres: popular Open Source SQL database
- limes: server to perform interlinking between RDF entities using various metrics
- nanobench: web UI to publish Nanopublications
- mapeathor: converts Excel files into RML or YARRRML mappings
Start demo
Different solutions can be used as the final triplestore; here we will use Ontotext GraphDB as the final triplestore for the knowledge graph. In our experience, GraphDB is more stable and faster at performing federated queries, and it offers a user-friendly administration interface.
GraphDB must be downloaded separately for licensing reasons: provide your email address and you will receive an email with the URL to download the GraphDB standalone zip file (graphdb-free-9.1.1-dist.zip).
To install GraphDB easily, we recommend placing the zip file in your home folder before running d2s init. The home folder is the default location when you are asked for the path to the GraphDB zip file.
Start the services required to run the data transformation demonstration workflows: the GraphDB triplestore, Apache Drill, and Virtuoso as a temporary triplestore.
⚠️ GraphDB might fail to start if not enough resources are available. We recommend raising the resource limits for Docker and stopping resource-intensive apps such as Slack, VSCode, or Skype.
- Access the into-the-graph browser for GraphDB at http://localhost:8079
- Access the HTTP Swagger API at http://localhost:8080
- Access GraphDB at http://localhost:7200
- Access the temporary Virtuoso at http://localhost:8890
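Once the services have had time to start, a quick reachability check can tell you which of the URLs above are answering. The following is a minimal Python sketch, not part of d2s: the function names (is_reachable, check_services) and the DEMO_SERVICES mapping are illustrative, and the ports are simply the ones listed above. It only checks that an HTTP server responds, not that the service is fully initialised.

```python
from http.client import HTTPException
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Local URLs listed above for the demo services.
DEMO_SERVICES = {
    "into-the-graph": "http://localhost:8079",
    "api": "http://localhost:8080",
    "graphdb": "http://localhost:7200",
    "virtuoso": "http://localhost:8890",
}


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP server answers at `url`, whatever the status code."""
    try:
        with urlopen(url, timeout=timeout):
            return True
    except HTTPError:
        # The server answered, but with an error status (e.g. 404): still up.
        return True
    except (URLError, HTTPException, OSError):
        # Connection refused, timeout, DNS failure or malformed response.
        return False


def check_services(services, probe=is_reachable):
    """Map each service name to True/False depending on whether its URL answers."""
    return {name: probe(url) for name, url in services.items()}


if __name__ == "__main__":
    for name, up in check_services(DEMO_SERVICES).items():
        print(f"{name:15} {'up' if up else 'not reachable'}")
```

The probe argument is there so the check can be exercised without a network; in normal use you would call check_services(DEMO_SERVICES) as in the last lines.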
If you use Blazegraph or Virtuoso as the final triplestore, you will need to enable CORS requests in your browser to allow communication between the into-the-graph browser and the triplestore.
An add-on to enable CORS can easily be installed for Firefox or Chrome.
Use a deployment config
Services can be started with a specific deployment config. This lets you define variables and Docker parameters for a specific deployment in a complementary YAML file in the d2s-core/deployments folder.
See the deployments/trek.yml config for an example. The following parameters are usually defined in a deployment config:
- the service public URL (nginx Virtual Host)
- a different Docker image tag for a service (to use a different version)
- passwords
- resource limits
Start services with a deployment config:
Feel free to define a new deployment config if your services require different parameters than the ones defined in the main docker-compose.yml.
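To picture what a complementary config does, it can help to think of it as a set of overrides layered on top of the main file. The Python sketch below illustrates that idea only; it assumes a simple recursive merge, whereas this page does not state how d2s actually combines the two files, and Docker Compose itself applies more detailed rules (for instance, some list options are concatenated rather than replaced). The service settings, image names and values are hypothetical and are not taken from d2s-core.

```python
import copy


def merge_config(base, override):
    """Return a new dict where values from `override` are layered over `base`.

    Nested dicts are merged key by key; any other value (string, number, list)
    found in `override` replaces the one in `base`.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical excerpt of a main compose file (not the real d2s-core content).
base = {
    "services": {
        "graphdb": {
            "image": "example/graphdb:9.1.1",
            "environment": {"GDB_HEAP_SIZE": "2g"},
        }
    }
}

# Hypothetical deployment config: another image tag, a public URL for the
# nginx virtual host, and a memory limit.
override = {
    "services": {
        "graphdb": {
            "image": "example/graphdb:9.3.0",
            "environment": {"VIRTUAL_HOST": "graphdb.example.org"},
            "mem_limit": "8g",
        }
    }
}

merged = merge_config(base, override)
print(merged["services"]["graphdb"])
```

In this sketch the deployment config only needs to list what differs: the image tag is replaced, the new environment variable and memory limit are added, and the heap size from the main file is kept.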
Manage services
Show running services
Stop all services
Stop specific services
Show running workflows
You can get process information about running workflows, such as their process IDs.
Stop running workflow
Autocomplete will show only the PID of running workflows.
If autocomplete doesn't work, retrieve the PID using d2s process-running