Showing content from https://pubmed.ncbi.nlm.nih.gov/33482803 below:

A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses

Andra Waagmeester et al. BMC Biol. 2021.

doi: 10.1186/s12915-020-00940-y.

Abstract

Background: Pandemics, even more than other medical problems, require swift integration of knowledge. When a pandemic is caused by a new virus, understanding the underlying biology may help in finding solutions. Where a large number of loosely related projects and initiatives exist, we need common ground, also known as a "commons." Wikidata, a public knowledge graph aligned with Wikipedia, is such a commons and uses unique identifiers to link knowledge in other knowledge bases. However, Wikidata may not always have the right schema for the urgent questions. In this paper, we address this problem by showing how a data schema required for the integration can be modeled with entity schemas represented by Shape Expressions.

Results: As a telling example, we describe the process of aligning resources on the genomes and proteomes of the SARS-CoV-2 virus and related viruses, as well as how Shape Expressions can be defined for Wikidata to model the knowledge, helping others studying the SARS-CoV-2 pandemic. We demonstrate how this model can make data from various resources interoperable by integrating data from NCBI (National Center for Biotechnology Information) Taxonomy, NCBI Genes, UniProt, and WikiPathways. Based on that model, a set of automated applications, or bots, was written for regular updates of these sources in Wikidata and added to a platform for automatically running these updates.

Conclusions: Although this workflow was developed and applied in the context of the COVID-19 pandemic, it was also applied to other human coronaviruses (MERS, SARS, human coronavirus NL63, human coronavirus 229E, human coronavirus HKU1, human coronavirus OC43) to demonstrate its broader applicability.

Keywords: COVID-19; Linked data; Open Science; ShEx; Wikidata.

Conflict of interest statement

All authors declare that they have no competing interests.

Figures

Fig. 1

Structure of a Wikidata item, containing a set of statements which are key-value pairs, with qualifiers and references. Here the item for the angiotensin-converting enzyme 2 (ACE2) protein is given, containing a statement about its molecular function. The statement for this molecular function (peptidyl-dipeptidase activity) carries a reference stating when and where this information was obtained
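The structure described in this caption can be sketched in Python as a minimal illustration. The class and field names (`Item`, `Statement`, `Reference`, `stated_in`, `retrieved`) are assumptions made for this sketch, not Wikidata's actual data model or API, and the reference values are placeholders because the caption does not give them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Reference:
    # Provenance for a statement: where and when the value was obtained.
    stated_in: str
    retrieved: str


@dataclass
class Statement:
    # A key-value pair (property and value) with optional qualifiers and references.
    prop: str
    value: str
    qualifiers: Dict[str, str] = field(default_factory=dict)
    references: List[Reference] = field(default_factory=list)


@dataclass
class Item:
    label: str
    statements: List[Statement] = field(default_factory=list)

    def values_of(self, prop: str) -> List[str]:
        # Collect the values of every statement that uses the given property.
        return [s.value for s in self.statements if s.prop == prop]


# The ACE2 example from the caption; reference values are placeholders.
ace2 = Item(
    label="angiotensin-converting enzyme 2",
    statements=[
        Statement(
            prop="molecular function",
            value="peptidyl-dipeptidase activity",
            references=[
                Reference(stated_in="<source placeholder>", retrieved="<date placeholder>")
            ],
        )
    ],
)
```

Keeping references on the statement rather than on the item mirrors the caption: provenance belongs to each individual claim.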

Fig. 2

Example of an RDF data model representing ACE2, created with RDFShape [32]

Fig. 3

Overview of the ShEx schemas and the relations between them. All shapes, properties, and items are available from within Wikidata

Fig. 4

Application of the drafted ShEx schemas in the EntitySchema extension of Wikidata allows for confirmation of whether a set of on-topic items aligns with expressed expectations. In panel a, the application reports the Wikidata item as invalid because of a missing reference, so the item does not conform to the expressed ShEx, whereas in panel b, the item (Q88292589) conforms to the applied schema
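The kind of check shown in the two panels can be approximated with a small Python sketch. This is a simplified stand-in and not ShEx or the EntitySchema extension: the `SHAPE` dictionary, the `conforms` function, and the item dictionaries are all hypothetical, and the only rule modeled is "a statement for this property must carry at least one reference."

```python
from typing import Dict, List

# Hypothetical, simplified "shape": property label -> is a reference required?
SHAPE: Dict[str, bool] = {"molecular function": True}


def conforms(item: Dict, shape: Dict[str, bool]) -> List[str]:
    """Return a list of problems; an empty list means the item conforms."""
    problems: List[str] = []
    statements = item.get("statements", [])
    for prop, needs_reference in shape.items():
        matching = [s for s in statements if s["property"] == prop]
        if not matching:
            problems.append(f"missing statement: {prop}")
        for s in matching:
            if needs_reference and not s.get("references"):
                problems.append(f"missing reference on: {prop}")
    return problems


# Analogue of panel a: the statement exists but has no reference.
item_a = {
    "statements": [
        {"property": "molecular function",
         "value": "peptidyl-dipeptidase activity",
         "references": []}
    ]
}

# Analogue of panel b: the same statement with a (placeholder) reference.
item_b = {
    "statements": [
        {"property": "molecular function",
         "value": "peptidyl-dipeptidase activity",
         "references": [{"stated in": "<source placeholder>"}]}
    ]
}

print(conforms(item_a, SHAPE))
print(conforms(item_b, SHAPE))
```

Returning a list of problems instead of a boolean makes it easier to see why an item fails, which is the information a curator needs to repair it.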

Fig. 5

Screenshot of the SARS-CoV-2 and COVID-19 Pathway in WikiPathways (wikipathways:WP4846) showing the BridgeDb popup box for the ORF3a protein, with a link out to Scholia via the protein's and gene's Wikidata identifiers

Fig. 6

Screenshot of the Scholia page for the SARS-CoV-2 spike glycoprotein, showing four articles that specifically discuss this protein

Fig. 7

Comparison of two Wikidata entries for the SARS-CoV-2 membrane protein. A Wikidata item and a concept from a primary source need to have some overlap to allow automatic reconciliation. If there is no overlap, duplicates will be created and left for human inspection. Since this screenshot was made, the entries have been merged in a manual curation process
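The reconciliation rule in this caption can be illustrated with a short Python sketch, assuming that "overlap" means sharing at least one external identifier. The function name `reconcile`, the dictionary layout, and all identifier values (`PROT-1`, `GENE-1`, `GENE-2`) are hypothetical placeholders and are not taken from the paper's bots.

```python
from typing import Dict, List, Optional


def reconcile(record_ids: Dict[str, str], items: List[Dict]) -> Optional[str]:
    """Return the key of the first existing item that shares an identifier
    with the source record, or None when there is no overlap."""
    for item in items:
        shared = {k for k, v in record_ids.items() if item["ids"].get(k) == v}
        if shared:
            return item["key"]
    return None


# Placeholder data: one existing item with a single known identifier.
existing_items = [{"key": "item-1", "ids": {"protein_id": "PROT-1"}}]

# Shares protein_id with item-1, so it can be matched automatically.
overlapping = {"protein_id": "PROT-1", "gene_id": "GENE-1"}

# Shares nothing, so a new (possibly duplicate) item would be created
# and left for human inspection.
disjoint = {"gene_id": "GENE-2"}

print(reconcile(overlapping, existing_items))
print(reconcile(disjoint, existing_items))
```

A `None` result is the case the caption warns about: without any shared identifier, the automated process cannot tell that two entries describe the same protein.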

Fig. 8

Flow diagram for entity schema development and the executable workflow for the virus gene protein bot. (a) The workflow of creating shape expressions. (b) The computational workflow showing how information from various public resources was used to populate Wikidata

Fig. 9

JavaScript Object Notation (JSON) output of mygene.info for the gene with NCBI gene identifier 43740571

Fig. 10

The UniProt SPARQL query used to obtain additional protein annotations, descriptions, and external resources
