Translator Prototypes

NCATS Biomedical Data Translator prototypes registry and documentation.

See the official Translator prototypes registry at https://ncatstranslator.github.io

Standard Reasoner Implementations (SRI)#

Services to explore and validate implementations against the Translator standards.

ReasonerAPI#

ReasonerAPI

The recommended standard for serving an API in the Translator project. It consists of a JSON model for graph data, and it lets clients query Translator APIs and receive answers.
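As an illustration of the JSON graph model, here is a minimal sketch of a one-hop query message built in Python. The field names (`message`, `query_graph`, `ids`, `categories`, `predicates`) follow the ReasonerAPI 1.x layout as best understood here and should be checked against the version of the standard your target API implements. The helper `build_query` and the example identifier are assumptions made for this sketch.

```python
import json


def build_query(disease_curie: str) -> dict:
    """Build a one-hop query graph: which chemical entities treat a disease?

    Field names are assumed from the ReasonerAPI 1.x layout; verify them
    against the specification version used by the API you are calling.
    """
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    # n0 is pinned to a known identifier (CURIE)
                    "n0": {"ids": [disease_curie], "categories": ["biolink:Disease"]},
                    # n1 is unbound: the API is asked to fill it in
                    "n1": {"categories": ["biolink:ChemicalEntity"]},
                },
                "edges": {
                    "e0": {
                        "subject": "n1",
                        "object": "n0",
                        "predicates": ["biolink:treats"],
                    },
                },
            }
        }
    }


# Example identifier chosen for illustration only
query = build_query("MONDO:0005148")
print(json.dumps(query, indent=2))
```

The resulting dictionary would typically be sent as the JSON body of a POST request to a Translator API's query endpoint. The endpoint URL is deliberately left out because it differs per service.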

KGX#

kgx

KGX (Knowledge Graph Exchange) is a Python library and set of command line utilities for exchanging Knowledge Graphs (KGs) that conform to or are aligned to the Biolink Model.
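To give a feel for what exchanging a Biolink-aligned graph involves, the sketch below writes a tiny graph as a pair of tab-separated node and edge tables using only the Python standard library. It does not call the kgx library. The column names (`id`, `category`, `name`, `subject`, `predicate`, `object`) are assumed from Biolink conventions, and the `EXAMPLE:` identifiers are placeholders.

```python
import csv
import io

# Placeholder nodes and edges; the identifiers are not real database entries
nodes = [
    {"id": "EXAMPLE:gene1", "category": "biolink:Gene", "name": "example gene"},
    {"id": "EXAMPLE:disease1", "category": "biolink:Disease", "name": "example disease"},
]
edges = [
    {
        "subject": "EXAMPLE:gene1",
        "predicate": "biolink:related_to",
        "object": "EXAMPLE:disease1",
    },
]


def to_tsv(rows, columns):
    """Serialize a list of dicts as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def dangling_edges(nodes, edges):
    """Return edges whose subject or object is missing from the node table."""
    known = {node["id"] for node in nodes}
    return [e for e in edges if e["subject"] not in known or e["object"] not in known]


nodes_tsv = to_tsv(nodes, ["id", "category", "name"])
edges_tsv = to_tsv(edges, ["subject", "predicate", "object"])
print(nodes_tsv)
print(edges_tsv)
print("dangling edges:", dangling_edges(nodes, edges))
```

Checking for dangling edges before exchange is a cheap consistency test; the KGX command line utilities provide fuller validation against the Biolink Model.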

Access#

NodeNormalization Service#

NodeNormalization

EdgeNormalization Service#

EdgeNormalization

NameResolution#

NameResolution

BioLink Model Lookup service#

BioLink Lookup

BLCompliance#

BLCompliance

Knowledge Providers (KP)#

COHD Clinical Data2Services Provider#

COHD

  • Columbia Open Health Data (COHD)
    • Clinical associations mined from observational EHR data
    • Conditions, drugs, procedures, gender, race, ethnicity
    • EHR prevalence and Co-occurrence count
    • Associations calculated from EHR prevalence and co-occurrence count
    • Privacy protection measures
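
The bullets above mention associations calculated from prevalence and co-occurrence counts. As a rough sketch of how such measures can be derived, the functions below compute a prevalence, a relative frequency, and a log observed-to-expected ratio from raw counts. These are common association measures; whether COHD computes them in exactly this form is an assumption, and the counts in the example are invented for illustration.

```python
import math


def prevalence(count: int, total_patients: int) -> float:
    """Fraction of patients who have the concept recorded."""
    return count / total_patients


def relative_frequency(pair_count: int, base_count: int) -> float:
    """Fraction of patients with the base concept who also have the other one."""
    return pair_count / base_count


def log_obs_exp_ratio(pair_count: int, count_a: int, count_b: int, total_patients: int) -> float:
    """Natural log of observed co-occurrence over the count expected under independence."""
    expected = count_a * count_b / total_patients
    return math.log(pair_count / expected)


# Invented counts: 1000 patients, 100 with concept A, 50 with concept B, 20 with both
total, a, b, both = 1000, 100, 50, 20
print(prevalence(a, total))
print(relative_frequency(both, a))
print(log_obs_exp_ratio(both, a, b, total))
```

A positive log ratio means the two concepts co-occur more often than independence would predict; a negative value means less often.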

Data2Services documentation

  • Data2Services
    • A framework and Command Line Interface for building and deploying Translator data and services in a reproducible manner.
    • Documentation and tools to transform your data to a BioLink-compliant RDF knowledge graph
    • Automatically deploy a Reasoner API over a BioLink-compliant RDF triplestore
    • Deploy additional interfaces to explore the knowledge graph
Access#
Deploy#
pip install d2s cwlref-runner
d2s init my-project

Docker must be installed.


MolePro Molecular Data Provider#

MolePro

A Molecular Data Provider translating molecular scale to systems scale through a Reasoner API.

Access#

Genetic Knowledge Provider#

Genetic Knowledge Provider

A tool to curate genetic associations for complex diseases, interpret their biological effects, and make these data available to the Translator.

Access#

ICEES+ KP Exposure Provider#

ICEES+ KP Exposure Provider

  • Patient data + environmental exposures data
  • Integrated at patient- and visit-level
  • UNC Health Care System (UNCHCS) + NIEHS Environmental Polymorphisms Registry (EPR)
  • Observational EHR data, EPR survey data, SNP data, exposures data
  • Available for years 2010–2016
Access#
Deploy#
git clone https://github.com/NCATS-Tangerine/icees-api.git
cd icees-api
# Edit .env
docker-compose up --build

Text Mining Provider#

Text Mining Provider roadmap

An up-to-date, BioLink-compatible knowledge graph composed of assertions mined from the available full-text biomedical literature using high-performance text mining systems.


Connections Hypothesis Provider#

  • Access Heterogeneous Data
    • Researcher clinical data
    • Knowledge captured in the Biomedical Data Translator project
  • Automate Source Selection
  • Effective Question-Response Ranking
  • Actionable Information

DOCKET multiomics provider#

DOCKET multiomics provider

  • Big GIM (Gene Interaction Miner) provides functional interaction data for all pairs of genes. The data come from four different sources: 1) tissue-specific gene expression correlations from healthy tissue samples (GTEx), 2) tissue-specific gene expression correlations from cancer samples (TCGA), 3) tissue-specific probabilities of functional interaction (GIANT), and 4) direct interactions (BioGRID).

  • Big CLAM (Cell Line Association Miner), integrates large-scale high-quality data of various cell line resources to uncover associations between genomic and molecular features of cell lines, drug response measurements and gene knockdown viability scores. The cell line data comes from five different sources: 1) CCLE - Cancer Cell Line Encyclopedia, 2) GDSC - Genomics of Drug Sensitivity in Cancer, 3) CTRP - Cancer Therapeutics Response Portal, 4) CMap - Connectivity Map, and 5) CDM - Cancer Dependency Map.

Access#
Deploy#

Documentation and integration to d2s started here.

d2s start docket

BioThings API#

BioThings API

Build and deploy BioThings APIs from flat data files.

  • API-fy knowledge sources on demand
  • Use BioThings SDK in Python to download and parse input data sources
  • Integrate your API to a meta-KG using Smart API
Access#
Deploy#

Documentation and integration to d2s started here. See the BioLink Studio documentation.

d2s start biothings-studio

Autonomous Relay Agent (ARA)#

BioThings Explorer#

BioThings Explorer

Federated querying of BioThings APIs, done in 2 steps:

  • Build a query path plan defining APIs relevant to answer the query
  • Execute the query path plan to retrieve data from the different APIs.
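
The two steps above can be sketched as follows. This is a toy model, not the BioThings Explorer implementation: the registry, the two lookup functions, and all identifiers are hypothetical in-memory stand-ins for remote APIs.

```python
from typing import Callable, Dict, List, Tuple

Api = Callable[[str], List[str]]


# Hypothetical stand-ins for remote APIs; each maps one input ID to output IDs
def gene_to_disease(gene_id: str) -> List[str]:
    return {"GENE:1": ["DISEASE:1"]}.get(gene_id, [])


def disease_to_chemical(disease_id: str) -> List[str]:
    return {"DISEASE:1": ["DRUG:1", "DRUG:2"]}.get(disease_id, [])


# Hypothetical registry: (input type, output type) -> APIs able to answer that hop
REGISTRY: Dict[Tuple[str, str], List[Api]] = {
    ("Gene", "Disease"): [gene_to_disease],
    ("Disease", "ChemicalEntity"): [disease_to_chemical],
}


def build_plan(path: List[str]) -> List[List[Api]]:
    """Step 1: for each hop in the type path, pick the APIs relevant to it."""
    plan = []
    for source, target in zip(path, path[1:]):
        apis = REGISTRY.get((source, target), [])
        if not apis:
            raise ValueError(f"no API registered for {source} -> {target}")
        plan.append(apis)
    return plan


def execute_plan(plan: List[List[Api]], start_ids: List[str]) -> List[str]:
    """Step 2: run each hop, feeding the outputs of one hop into the next."""
    current = set(start_ids)
    for apis in plan:
        next_ids = set()
        for api in apis:
            for identifier in current:
                next_ids.update(api(identifier))
        current = next_ids
    return sorted(current)


plan = build_plan(["Gene", "Disease", "ChemicalEntity"])
result = execute_plan(plan, ["GENE:1"])
print(result)
```

Separating planning from execution means a path with no matching API fails before any remote call is made.
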
Access#

ARAGORN#

Autonomous Relay Agent for Generation Of Ranked Networks. A tool that queries Knowledge Providers (KPs) and synthesizes highly ranked answers relevant to user-specified questions. Its goals are to:

  • operate in a federated knowledge environment
  • bridge the precision mismatch between data specificity in KPs and more abstract level of user queries
  • generalize answer ranking
Tools#

QuestionRewrite

AnswerCoalesce


ARAX#

RTX

Team Expander Agent: A tool for enhancing query graphs. ARAX exposes all graph reasoning capabilities within a domain specific language: ARAXi. ARAX is a tool for querying, manipulating, filtering, learning on, and exploring biomedical knowledge graphs.

Access#

Explanatory ARA#

  • Analogical reasoning engine

  • Ranking results through explanations

    • Explanatory evidence via NLU model
    • Explanatory evidence via other methods
    • Explaining information vs. explaining decisions
  • Visualization of biomedical context

  • Open-world learning


mediKanren 2.0#

mediKanren

Based on the miniKanren logic programming language, for reasoning over a knowledge graph (SemMedDB).
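
To illustrate the kind of query that logic programming over a knowledge graph expresses, here is a small conjunctive pattern matcher over triples written in plain Python. It is not miniKanren and not mediKanren's code. The triples, predicate names, and the `?variable` convention are assumptions made for this sketch.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]


def is_var(term: str) -> bool:
    """Terms starting with '?' are treated as logic variables."""
    return term.startswith("?")


def match(pattern: Triple, triple: Triple, bindings: Bindings) -> Optional[Bindings]:
    """Extend bindings so that pattern equals triple, or return None on conflict."""
    extended = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against its existing value
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def solve(patterns: List[Triple], triples: List[Triple],
          bindings: Optional[Bindings] = None) -> Iterator[Bindings]:
    """Yield every set of bindings that satisfies all patterns at once."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = match(first, triple, bindings)
        if extended is not None:
            yield from solve(rest, triples, extended)


# Invented toy graph
graph = [
    ("DRUG:1", "inhibits", "GENE:1"),
    ("GENE:1", "causes", "DISEASE:1"),
    ("DRUG:2", "inhibits", "GENE:2"),
]

# Which drugs inhibit a gene that causes DISEASE:1?
answers = list(solve([("?drug", "inhibits", "?gene"),
                      ("?gene", "causes", "DISEASE:1")], graph))
print(answers)
```

The shared variable `?gene` is what chains the two patterns into a two-hop path through the graph.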

Access#
Deploy#

(im)proving agent#

PSEV

  • SPOKE: a biomedical knowledge metagraph (~25 sources)
    • reasoning to support facts with empirical evidence (Electronic Health Records, multi-omics studies)

evidARA

  • EvidARA
    • takes query from ARS and extracts a graph q (output graph) from its internal Knowledge Network (SPOKE)
    • checks empirical evidence from raw data of cohorts (EHR and multi-omics studies)
Access#
Deploy#

ARS#

ARS registry: https://ars.transltr.io/ars/app/status


Additional Translator resources#

Jupyter Notebook to combine data from various Knowledge Providers, produced during the Relay Days.

Last updated on by Vincent Emonet