A Roadmap for Navigating the Life Sciences Linked Open Data Cloud
Refereed Conference Meeting Proceeding
The Life Sciences Linked Open Data (LOD) Cloud currently comprises multiple datasets that add high value to biomedical research. The ability to navigate these datasets to derive and discover new, meaningful biological correlations is considered one of the most relevant resources for the future of personalized medicine. However, navigating these datasets is not easy, as most of them are available only as isolated SPARQL endpoints. Researchers, practitioners and pharmaceutical industry workers have a strong desire to use these linked datasets to improve the drug discovery and development process. With the standardization of SPARQL 1.1 and its support for federated queries, it became possible to assemble queries that retrieve data from multiple SPARQL endpoints simultaneously. However, to match data from multiple endpoints, it is first necessary to understand which data exist in each endpoint and how those data can be queried. We have devised an active roadmap for navigating the linked life sciences cloud that illustrates all the possible “roads” or “links” between concepts in the LOD cloud. The methodology for roadmap identification relied on retrieving all “types” (concepts) from sixty Life Sciences-related SPARQL endpoints, together with all properties associated with instances of each of those types. The entities collected (query elements) were then woven together using three different approaches for concept and property matching: syntactic matching, semantic matching and domain matching. Our approach, if generalized to encompass other domains, can be used for road-mapping the entire LOD cloud.
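As an illustration only, the retrieval and syntactic-matching steps described in the abstract could be sketched as follows. This is not the authors' implementation: the SPARQL query shapes, the helper names (`types_query`, `properties_query`, `local_name`, `syntactic_match`), the example URIs and the 0.85 similarity threshold are all assumptions made for this sketch, and the semantic and domain matching steps are not covered.

```python
import re
from difflib import SequenceMatcher


def types_query() -> str:
    """SPARQL query listing every distinct type (concept) in an endpoint."""
    return "SELECT DISTINCT ?type WHERE { ?s a ?type }"


def properties_query(type_uri: str) -> str:
    """SPARQL query listing every property used by instances of one type."""
    return (
        "SELECT DISTINCT ?property WHERE { "
        f"?s a <{type_uri}> ; ?property ?o "
        "}"
    )


def local_name(uri: str) -> str:
    """Normalize a URI to a lowercase label taken from its last segment."""
    name = re.split(r"[#/]", uri.rstrip("#/"))[-1]
    # Split camelCase boundaries, then unify separators (assumed convention).
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return re.sub(r"[_\-\s]+", " ", name).strip().lower()


def syntactic_match(uri_a: str, uri_b: str, threshold: float = 0.85) -> bool:
    """Treat two concepts as candidates for linking if their labels are similar.

    The threshold is a hypothetical value chosen for this sketch.
    """
    ratio = SequenceMatcher(None, local_name(uri_a), local_name(uri_b)).ratio()
    return ratio >= threshold
```

The query strings would be sent to each endpoint with any SPARQL client; the matching function then compares the collected type and property URIs pairwise across endpoints to propose candidate links.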
UL - NUI Galway Alliance Second Annual Research Day - Book of Abstracts
Digital Object Identifier (DOI):
Open access repository: