Translator Prototypes

NCATS Biomedical Data Translator prototypes registry and documentation.

See the official Translator prototypes registry at https://ncatstranslator.github.io

Standard Reasoner Implementations (SRI)#

Services to explore and validate implementations against the Translator standards.

ReasonerAPI#

The standard recommended for serving an API in the Translator project. It consists of a JSON model for graph data and defines how to query Translator APIs and receive answers from them.
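
As a rough illustration of the JSON graph model, the sketch below builds a one-hop query graph as a Python dictionary and serializes it. The layout (a `message` wrapping a `query_graph` whose `nodes` and `edges` are keyed by identifier) is assumed here from the TRAPI 1.x style and may not match the version a given service implements. The identifier, categories, and predicate are illustrative values, and `build_one_hop_query` is a hypothetical helper, not part of any Translator library.

```python
import json


def build_one_hop_query(subject_category, predicate, object_curie, object_category):
    """Return a one-hop query graph (assumed TRAPI 1.x-style layout)."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    # n0 is unbound: the service is asked to fill it in.
                    "n0": {"categories": [subject_category]},
                    # n1 is pinned to a known identifier (CURIE).
                    "n1": {"ids": [object_curie], "categories": [object_category]},
                },
                "edges": {
                    # e0 connects the two nodes with the requested predicate.
                    "e0": {"subject": "n0", "object": "n1", "predicates": [predicate]},
                },
            }
        }
    }


query = build_one_hop_query(
    subject_category="biolink:ChemicalEntity",
    predicate="biolink:treats",
    object_curie="MONDO:0005148",  # illustrative disease identifier
    object_category="biolink:Disease",
)
payload = json.dumps(query, indent=2)
print(payload)
```

A client would typically send `payload` as the body of an HTTP POST to a service's query endpoint; the exact endpoint path depends on the service.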

KGX#

KGX (Knowledge Graph Exchange) is a Python library and set of command line utilities for exchanging Knowledge Graphs (KGs) that conform to or are aligned to the Biolink Model.
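
To give a feel for the kind of graph exchange described above, here is a minimal sketch that writes a tiny graph as two tab-separated tables, one for nodes and one for edges. It does not use the KGX library itself. The column names (`id`, `category`, `name`, `subject`, `predicate`, `object`) are an assumption modelled on a Biolink-style node/edge layout, and the `EX:` identifiers and helper functions are hypothetical.

```python
import csv
import io

# Toy graph with made-up identifiers (the EX: prefix is a placeholder).
nodes = [
    {"id": "EX:gene1", "category": "biolink:Gene", "name": "example gene"},
    {"id": "EX:disease1", "category": "biolink:Disease", "name": "example disease"},
]
edges = [
    {
        "id": "EX:edge1",
        "subject": "EX:gene1",
        "predicate": "biolink:related_to",
        "object": "EX:disease1",
    },
]


def to_tsv(rows, columns):
    """Serialize a list of dictionaries as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def dangling_edges(node_rows, edge_rows):
    """Return edges whose subject or object is missing from the node table."""
    node_ids = {row["id"] for row in node_rows}
    return [
        edge
        for edge in edge_rows
        if edge["subject"] not in node_ids or edge["object"] not in node_ids
    ]


nodes_tsv = to_tsv(nodes, ["id", "category", "name"])
edges_tsv = to_tsv(edges, ["id", "subject", "predicate", "object"])
print(nodes_tsv)
print(edges_tsv)
print("dangling edges:", dangling_edges(nodes, edges))
```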

Knowledge Providers (KP)#

COHD Clinical Data2Services Provider#

Columbia Open Health Data (COHD)
- Clinical associations mined from observational EHR data
- Conditions, drugs, procedures, gender, race, ethnicity
- EHR prevalence and co-occurrence count
- Associations calculated from EHR prevalence and co-occurrence count
- Privacy protection measures
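
As an illustration of how an association can be derived from prevalence and co-occurrence counts, the sketch below computes a prevalence and the natural log of the observed-to-expected co-occurrence ratio, where the expected count assumes the two concepts occur independently. This is one common association measure, offered as an assumption; COHD's own definitions may differ. The counts are made up and the function names are hypothetical.

```python
import math


def prevalence(count, n_patients):
    """Fraction of patients whose record contains the concept."""
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    return count / n_patients


def ln_observed_expected_ratio(count_a, count_b, count_ab, n_patients):
    """ln(observed / expected) co-occurrence, with expected computed under independence."""
    if min(count_a, count_b, count_ab, n_patients) <= 0:
        raise ValueError("all counts must be positive")
    expected = count_a * count_b / n_patients
    return math.log(count_ab / expected)


# Made-up counts: 100,000 patients, 2,000 with concept A, 5,000 with concept B,
# and 400 with both. Under independence we would expect 100 co-occurrences.
n = 100_000
score = ln_observed_expected_ratio(2_000, 5_000, 400, n)
print(prevalence(2_000, n), score)
```

A positive score means the pair co-occurs more often than independence would predict; a negative score means less often.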

Data2Services
- A framework and Command Line Interface for building and deploying Translator data and services in a reproducible manner.
- Documentation and tools to transform your data to a BioLink-compliant RDF knowledge graph
- Automatically deploy a Reasoner API over a BioLink-compliant RDF triplestore
- Deploy additional interfaces to explore the knowledge graph

Access#

COHD Reasoner API at http://cohd.io/api
Data2Services documentation at https://d2s.semanticscience.org
Data2Services Reasoner API over BioLink RDF at http://api.trek.semanticscience.org (see on GitHub)
Into-the-graph web UI to browse a BioLink RDF triplestore leveraging metadata and services at http://trek.semanticscience.org
GitHub repositories for Data2Services project template and command line interface.

Deploy#

pip install d2s cwlref-runner
d2s init my-project

Docker must be installed.

MolePro Molecular Data Provider#

A Molecular Data Provider translating molecular scale to systems scale through a Reasoner API.

Access#

Open API: https://translator.broadinstitute.org/molecular_data_provider/api
Reasoner API: https://smart-api.info/ui/912372f46127b79fb387cd2397203709#

Genetic Knowledge Provider#

A tool to curate genetic associations for complex diseases, interpret their biological effects, and make these data available to the Translator.

Access#

Reasoner API: https://translator.broadinstitute.org/genetics_data_provider/query

ICEES+ KP Exposure Provider#

Patient data + environmental exposures data
Integrated at patient- and visit-level
UNC Health Care System (UNCHCS) + NIEHS Environmental Polymorphisms Registry (EPR)
Observational EHR data, EPR survey data, SNP data, exposures data
Available for years 2010 – 2016

Access#

ICEES+ Open APIs
- API KP: http://icees.renci.org:16339/apidocs
- API UNC: https://icees.renci.org:16335/apidocs
- API NIEHS: https://icees.renci.org:16336/apidocs
- API Duke: https://icees.renci.org:16337/apidocs
- Prototype ICEES+ Wireframe UI: http://robokop.renci.org:3001
TranQL:
- Web UI: https://tranql.renci.org
- Sample TranQL queries
Components
- FHIR-PIT: https://github.com/NCATS-Tangerine/FHIR-PIT
- Secure Multiparty Computation: https://github.com/RENCI-NRIG/impact-smc
- Machine learning: https://github.com/NCATS-Tangerine/iceesnn

Deploy#

git clone https://github.com/NCATS-Tangerine/icees-api.git
cd icees-api
# Edit .env
docker-compose up --build

Text Mining Provider#

An up-to-date, BioLink-compatible knowledge graph composed of assertions mined from the available full-text biomedical literature using high-performance text mining systems.

Connections Hypothesis Provider#

Access Heterogeneous Data
- Researcher clinical data
- Knowledge captured in the Biomedical Data Translator project
Automate Source Selection
Effective Question-Response Ranking
Actionable Information

DOCKET multiomics provider#

Big GIM (Gene Interaction Miner) provides functional interaction data for all pairs of genes. Functional interaction data are available from four different sources: 1) tissue-specific gene expression correlations from healthy tissue samples (GTEx), 2) tissue-specific gene expression correlations from cancer samples (TCGA), 3) tissue-specific probabilities of functional interaction (GIANT), and 4) direct interactions (BioGRID).
Big CLAM (Cell Line Association Miner) integrates large-scale, high-quality data from various cell line resources to uncover associations between genomic and molecular features of cell lines, drug response measurements, and gene knockdown viability scores. The cell line data come from five different sources: 1) CCLE - Cancer Cell Line Encyclopedia, 2) GDSC - Genomics of Drug Sensitivity in Cancer, 3) CTRP - Cancer Therapeutics Response Portal, 4) CMap - Connectivity Map, and 5) CDM - Cancer Dependency Map.

Access#

Reasoner API: http://biggim.ncats.io/api
Running instructions
Big GIM II API: https://github.com/gloriachin/BigGIMII_API

Deploy#

Documentation and integration to d2s started here.

d2s start docket

BioThings API#

Build and deploy BioThings APIs from flat data files.

API-fy knowledge sources on demand
Use BioThings SDK in Python to download and parse input data sources
Integrate your API to a meta-KG using Smart API

Access#

Translator KP APIs powered by BioThings SDK: https://biothings.ncats.io
BioThings SDK
- Docs: https://docs.biothings.io
- PyPI package: https://pypi.org/project/biothings
SmartAPI: https://smart-api.info
- GitHub: https://github.com/SmartAPI/smartAPI
- Up-to-date Meta-KG: https://smart-api.info/registry/translator/meta-kg

Example Disease KP API: https://biothings.ncats.io/DISEASES
- Knowledge source: https://diseases.jensenlab.org
- GitHub: https://github.com/kevinxin90/DISEASES
Service KP milestone dashboard: https://github.com/orgs/biothings/projects/5
BioThings Studio
- GitHub: https://github.com/biothings/biothings_studio
- Docs: https://docs.biothings.io/en/latest/doc/studio.html

Deploy#

Documentation and integration to d2s started here. See the BioThings Studio documentation.

d2s start biothings-studio

Autonomous Relay Agent (ARA)#

BioThings Explorer#

Federated querying of BioThings APIs, done in 2 steps:

- Build a query path plan defining the APIs relevant to answering the query.
- Execute the query path plan to retrieve data from the different APIs.
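
The two steps above can be sketched with a toy, in-memory example. Everything here is hypothetical: the registry entries, API names, identifiers, and the `build_plan` / `execute_plan` helpers are invented for illustration and are not the BioThings Explorer API. Planning is modelled as a breadth-first search over which API converts one entity type into another, and execution chains the calls.

```python
from collections import deque

# Hypothetical registry: each entry says which API links one entity type to another.
API_REGISTRY = [
    {"api": "gene_disease_api", "input": "Gene", "output": "Disease"},
    {"api": "disease_chemical_api", "input": "Disease", "output": "ChemicalSubstance"},
    {"api": "gene_pathway_api", "input": "Gene", "output": "Pathway"},
]

# Toy stand-ins for remote API responses, keyed by API name and input identifier.
FAKE_API_DATA = {
    "gene_disease_api": {"GENE:1": ["DISEASE:9"]},
    "disease_chemical_api": {"DISEASE:9": ["CHEM:3", "CHEM:4"]},
    "gene_pathway_api": {"GENE:1": ["PATHWAY:7"]},
}


def build_plan(source_type, target_type, registry=API_REGISTRY):
    """Step 1: find the shortest chain of APIs leading from source type to target type."""
    queue = deque([(source_type, [])])
    seen = {source_type}
    while queue:
        current, path = queue.popleft()
        if current == target_type:
            return path
        for entry in registry:
            if entry["input"] == current and entry["output"] not in seen:
                seen.add(entry["output"])
                queue.append((entry["output"], path + [entry["api"]]))
    return None  # no chain of APIs connects the two types


def execute_plan(plan, start_ids, call_api):
    """Step 2: run each API in order, feeding its outputs into the next one."""
    current_ids = list(start_ids)
    for api_name in plan:
        next_ids = []
        for identifier in current_ids:
            for hit in call_api(api_name, identifier):
                if hit not in next_ids:  # keep order, drop duplicates
                    next_ids.append(hit)
        current_ids = next_ids
    return current_ids


def fake_call(api_name, identifier):
    """Look up a canned response instead of making a network request."""
    return FAKE_API_DATA[api_name].get(identifier, [])


plan = build_plan("Gene", "ChemicalSubstance")
results = execute_plan(plan, ["GENE:1"], fake_call)
print(plan, results)
```

Separating planning from execution means the plan can be inspected or filtered before any remote call is made.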

Access#

BioThings Explorer UI demo: https://biothings.io/explorer
Docs: https://biothings_explorer.readthedocs.io/en/latest
Jupyter Notebooks on GitHub

ARAGORN#

Autonomous Relay Agent for Generation Of Ranked Networks. A tool to query Knowledge Providers (KPs) and synthesize highly ranked answers relevant to user-specified questions. Its goals are to:

- operate in a federated knowledge environment
- bridge the precision mismatch between the data specificity of KPs and the more abstract level of user queries
- generalize answer ranking

Tools#

Question Augmentation
- Open API: https://questionaugmentation.renci.org/apidocs
- Example Notebooks:
  - https://github.com/ranking-agent/QuestionRewrite/blob/master/documentation/QuestionAugmentationSimilarity_strider.ipynb
  - https://github.com/ranking-agent/QuestionRewrite/blob/master/documentation/QuestionAugmentationEdges.ipynb

Answer Coalescence
- Open API: https://answercoalesce.renci.org/apidocs/
- Example Notebook: https://github.com/ranking-agent/AnswerCoalesce/blob/master/documentation/AnswerCoalescence.ipynb

ReasonerStdAPI Message Jupyter Notebook visualizer: https://github.com/ranking-agent/gamma-viewer

ARAX#

Team Expander Agent: A tool for enhancing query graphs. ARAX exposes all graph reasoning capabilities within a domain specific language: ARAXi. ARAX is a tool for querying, manipulating, filtering, learning on, and exploring biomedical knowledge graphs.

Access#

ARAX
RTX-KG2
- RTX-KG2 Neo4j UI: http://kg2canonicalized2.rtx.ai:7474
- RTX-KG2 API: https://smart-api.info/ui/00bab7d59abe031098d5cb1597f7f1c4
- GitHub: https://github.com/RTXteam/RTX-KG2

Explanatory ARA#

Analogical reasoning engine
Ranking results through explanations
- Explanatory evidence via NLU model
- Explanatory evidence via other methods
- Explaining information vs. explaining decisions
Visualization of biomedical context
Open-world learning

mediKanren 2.0#

Based on the miniKanren logic programming language, for reasoning over a knowledge graph (SemMedDB).

Access#

MediKanren BioLink interface on GitHub

Deploy#

See documentation on GitHub.

(im)proving agent#

SPOKE: a biomedical knowledge metagraph (~25 sources)
- reasoning to support facts with empirical evidence (Electronic Health Records, multi-omics studies)

evidARA
- takes a query from the ARS and extracts a graph q (the output graph) from its internal Knowledge Network (SPOKE)
- checks empirical evidence from the raw data of cohorts (EHR and multi-omics studies)

Access#

SPOKE
- GitHub: https://github.com/baranzini-lab/PSEV
evidARA
- GitHub: https://github.com/brettasmi/evidARA

Deploy#

SPOKE runs on Neo4j. See the documentation on GitHub.

ARS#

ARS registry: https://ars.transltr.io/ars/app/status

Additional Translator resources#

Jupyter Notebook to combine data from various Knowledge Providers, produced during the Relay Days.
