Showing content from http://www.ensembl.org/info/docs/api/variation/variation_schema.html below:

Schema Documentation

This document gives a high-level description of the tables that make up the Ensembl variation schema. Tables are listed in alphabetical order, and the purpose of each table is explained. It is intended to help people familiarise themselves with the schema when they encounter it for the first time, or when they need to use tables that they have not used before.

This document refers to version 113 of the Ensembl variation schema.

The variation database schema diagram (PDF format) is available here:

Defines various attributes used elsewhere in the database

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| attrib_id | INT(11) | - | Primary key | primary key |
| attrib_type_id | SMALLINT(5) | 0 | Key into the attrib_type table, identifies the type of this attribute | unique key: type_val_idx |
| value | TEXT | - | The value of this attribute | unique key: type_val_idx |

See below the query to display a subset of the attrib entries:

SELECT * FROM attrib WHERE attrib_type_id IN (469,470,471) ORDER BY attrib_id LIMIT 21;

| attrib_id | attrib_type_id | value |
|---|---|---|
| 1 | 469 | SO:0001483 |
| 2 | 470 | SNV |
| 3 | 471 | SNP |
| 4 | 469 | SO:1000002 |
| 5 | 470 | substitution |
| 6 | 469 | SO:0001019 |
| 7 | 470 | copy_number_variation |
| 8 | 471 | CNV |
| 9 | 469 | SO:0000667 |
| 10 | 470 | insertion |
| 11 | 469 | SO:0000159 |
| 12 | 470 | deletion |
| 13 | 469 | SO:1000032 |
| 14 | 470 | indel |
| 15 | 469 | SO:0000705 |
| 16 | 470 | tandem_repeat |
| 17 | 469 | SO:0001059 |
| 18 | 470 | sequence_alteration |
| 19 | 469 | SO:0001628 |
| 20 | 470 | intergenic_variant |
| 21 | 471 | INTERGENIC |

Groups related attributes together

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| attrib_set_id | INT(11) | 0 | Primary key | unique key: set_idx |
| attrib_id | INT(11) | 0 | Key of an attribute in this set | unique key: set_idx; key: attrib_idx |

Defines the set of possible attribute types used in the attrib table

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| attrib_type_id | SMALLINT(5) | 0 | Primary key | primary key |
| code | VARCHAR(20) | '' | A short codename for this type (indexed, so should be used for lookups) | unique key: code_idx |
| name | VARCHAR(255) | '' | The name of this type | |
| description | TEXT | NULL | Longer description of this type | |

See below the command to display a subset of the attrib_type entries:

SELECT * FROM attrib_type WHERE attrib_type_id > 468 LIMIT 10;

| attrib_type_id | code | name | description |
|---|---|---|---|
| 469 | SO_accession | SO accession | Sequence Ontology accession |
| 470 | SO_term | SO term | Sequence Ontology term |
| 471 | display_term | display term | Ensembl display term |
| 472 | NCBI_term | NCBI term | NCBI term |
| 473 | feature_SO_term | feature SO term | Sequence Ontology term for the associated feature |
| 474 | rank | rank | Relative severity of this variation consequence |
| 475 | polyphen_prediction | polyphen prediction | PolyPhen-2 prediction |
| 476 | sift_prediction | sift prediction | SIFT prediction |
| 477 | short_name | Short name | A shorter name for an instance, e.g. a VariationSet |
| 478 | dbsnp_clin_sig | dbSNP/ClinVar clinical significance | The clinical significance of a variant as reported by ClinVar and dbSNP |
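Because `code` is the indexed column of attrib_type, attribute values are usually easiest to find by joining attrib to attrib_type on `attrib_type_id` and filtering on `code`. The sketch below illustrates that join on an in-memory SQLite database loaded with a few of the rows shown above; the real schema runs on MySQL, and the helper name `values_for_code` is hypothetical, so treat this as an illustration rather than Ensembl's own API.

```python
import sqlite3

# In-memory stand-in for the two tables (SQLite here; the real database is MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attrib_type (
    attrib_type_id INTEGER PRIMARY KEY,
    code TEXT UNIQUE,
    name TEXT,
    description TEXT
);
CREATE TABLE attrib (
    attrib_id INTEGER PRIMARY KEY,
    attrib_type_id INTEGER,
    value TEXT
);
""")

# A subset of the rows listed in the example outputs above.
conn.executemany("INSERT INTO attrib_type VALUES (?, ?, ?, ?)", [
    (469, "SO_accession", "SO accession", "Sequence Ontology accession"),
    (470, "SO_term", "SO term", "Sequence Ontology term"),
    (471, "display_term", "display term", "Ensembl display term"),
])
conn.executemany("INSERT INTO attrib VALUES (?, ?, ?)", [
    (1, 469, "SO:0001483"), (2, 470, "SNV"), (3, 471, "SNP"),
    (9, 469, "SO:0000667"), (10, 470, "insertion"),
])

def values_for_code(connection, code):
    """Return attrib values whose type has the given attrib_type code (hypothetical helper)."""
    rows = connection.execute(
        "SELECT a.value FROM attrib a "
        "JOIN attrib_type t ON t.attrib_type_id = a.attrib_type_id "
        "WHERE t.code = ? ORDER BY a.attrib_id",
        (code,),
    )
    return [row[0] for row in rows]

so_terms = values_for_code(conn, "SO_term")
print(so_terms)
```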

Contains alleles that did not pass the Ensembl filters

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| failed_allele_id | INT(11) | - | Primary key, internal identifier. | primary key |
| allele_id | INT(10) | - | Foreign key references to the allele table. | unique key: allele_idx |
| failed_description_id | INT(10) | - | Foreign key references to the failed_description table. | unique key: allele_idx |

This table contains descriptions of reasons for a variation being flagged as failed.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| failed_description_id | INT(10) | - | Primary key, internal identifier. | primary key |
| description | TEXT | - | Text containing the reason why the Variation has been flagged as failed. e.g. "Variation does not map to the genome". | |

See below the list of the descriptions available in the Ensembl variation databases:

SELECT * FROM failed_description;

| failed_description_id | description |
|---|---|
| 1 | Variant maps to more than 3 different locations |
| 2 | None of the variant alleles match the reference allele |
| 3 | Variant has more than 3 different alleles |
| 4 | Loci with no observed variant alleles in dbSNP |
| 5 | Variant does not map to the genome |
| 6 | Variant has no genotypes |
| 7 | Genotype frequencies do not add up to 1 |
| 8 | Variant has no associated sequence |
| 9 | Variant submission has been withdrawn by the 1000 genomes project due to high false positive rate |
| 11 | Additional submitted allele data from dbSNP does not agree with the dbSNP refSNP alleles |
| 12 | Variant has more than 3 different submitted alleles |
| 13 | Alleles contain non-nucleotide characters |
| 14 | Alleles contain ambiguity codes |
| 15 | Mapped position is not compatible with reported alleles |
| 16 | Flagged as suspect by dbSNP |
| 17 | Variant can not be re-mapped to the current assembly |
| 18 | Supporting evidence can not be re-mapped to the current assembly |
| 19 | Variant maps to more than one genomic location |
| 20 | Variant at first base in sequence |
| 21 | Reference allele does not match the bases at this genome location |
| 22 | Alleles cannot be resolved |

For various reasons it may be necessary to store information about a structural variation that has failed quality checks (mappings) in the Structural Variation pipeline. This table acts as a flag for such failures.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| failed_structural_variation_id | INT(11) | - | Primary key, internal identifier. | primary key |
| structural_variation_id | INT(10) | - | Foreign key references to the structural_variation table. | unique key: structural_variation_idx |
| failed_description_id | INT(10) | - | Foreign key references to the failed_description table. | unique key: structural_variation_idx |

For various reasons it may be necessary to store information about a variation that has failed quality checks in the Variation pipeline. This table acts as a flag for such failures.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| failed_variation_id | INT(11) | - | Primary key, internal identifier. | primary key |
| variation_id | INT(10) | - | Foreign key references to the variation table. | unique key: variation_idx |
| failed_description_id | INT(10) | - | Foreign key references to the failed_description table. | unique key: variation_idx |

For various reasons it may be necessary to store information about a variation feature that has failed quality checks. This table acts as a flag for such failures.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| failed_variation_feature_id | INT | - | Primary key, internal identifier. | primary key |
| variation_feature_id | INT | - | Foreign key references to the variation_feature table. | unique key: variation_feature_idx |
| failed_description_id | INT | - | Foreign key references to the failed_description table. | unique key: variation_feature_idx |

This table holds genotypes compressed using the pack() method in Perl. These genotypes are mapped to particular genomic locations rather than variation objects. The data have been compressed to reduce table size and increase the speed of the web code when retrieving strain slices and LD data. Only data from resequenced samples that are used for LD calculations are included in this table.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| sample_id | INT(10) | - | Foreign key references to the sample table. | key: sample_idx |
| seq_region_id | INT(10) | - | Foreign key references seq_region in core db. Refers to the seq_region which this variant is on, which may be a chromosome, a clone, etc... | key: pos_idx |
| seq_region_start | INT(11) | - | The start position of the variation on the seq_region. | key: pos_idx |
| seq_region_end | INT(11) | - | The end position of the variation on the seq_region. | |
| seq_region_strand | TINYINT(4) | - | The orientation of the variation on the seq_region. | |
| genotypes | BLOB | NULL | Encoded representation of the genotype data (described in the text following this table). | |

Each row in the compressed table stores genotypes from one individual/sample in one fixed-size region of the genome (arbitrarily defined as 100 Kb). The compressed string (built with Perl's pack method) consists of a repeating triplet of elements: a distance in base pairs from the previous genotype; a variation dbID; a genotype_code_id identifier.

For example, a given row may have a start position of 1000, indicating the chromosomal position of the first genotype in this row. The unpacked genotypes field may then contain the following elements:

0, 1, 1, 20, 2, 5, 35, 3, 3, ...

The first genotype ("0,1,1") has a position of 1000 + 0 = 1000, and corresponds to the variation with the internal identifier 1 and the genotype_code_id corresponding to the genotype A|G (internal ID 1).
The second genotype ("20,2,5") has a position of 1000 + 20 = 1020, internal variation_id 2 and the genotype_code_id corresponding to the genotype C|C (internal ID 5).
The third genotype ("35,3,3") similarly has a position of 1020 + 35 = 1055, and so on.
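The position arithmetic in this example can be sketched in a few lines of Python. The function below works on the already-unpacked list of integers; it does not attempt the binary unpacking itself, since the pack template is not given here. The function name `decode_region_genotypes` is hypothetical.

```python
def decode_region_genotypes(region_start, unpacked):
    """Turn unpacked (distance, variation_id, genotype_code_id) triplets into
    (absolute_position, variation_id, genotype_code_id) tuples.

    Each distance is relative to the previous genotype, so positions are
    obtained by a running sum starting at the row's seq_region_start.
    """
    if len(unpacked) % 3 != 0:
        raise ValueError("expected a whole number of triplets")
    position = region_start
    decoded = []
    for i in range(0, len(unpacked), 3):
        distance, variation_id, genotype_code_id = unpacked[i:i + 3]
        position += distance
        decoded.append((position, variation_id, genotype_code_id))
    return decoded

# Values taken from the worked example above.
decoded = decode_region_genotypes(1000, [0, 1, 1, 20, 2, 5, 35, 3, 3])
print(decoded)  # [(1000, 1, 1), (1020, 2, 5), (1055, 3, 3)]
```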

This table holds genotypes compressed using the pack() method in Perl. These genotypes are mapped directly to variation objects. The data have been compressed to reduce table size. All genotypes in the database are included in this table (including duplicates of the genotypes contained in the compressed_genotype_region table). This table is optimised for retrieval by variation.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| variation_id | INT(11) | - | Foreign key references to the variation table. | key: variation_idx |
| subsnp_id | INT(11) | NULL | Foreign key references to the subsnp_handle table. | key: subsnp_idx |
| genotypes | BLOB | NULL | Encoded representation of the genotype data (described in the text following this table). | |

Each row in the compressed table stores genotypes from one subsnp of a variation (or one variation if no subsnp is defined). The compressed string (built with Perl's pack method) consists of a repeating pair of elements: an internal sample_id corresponding to a sample; a genotype_code_id identifier.
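As a minimal sketch of the pair layout described above, the function below splits an already-unpacked list of integers into (sample_id, genotype_code_id) pairs. The binary unpacking step is omitted because the pack template is not stated here, and the sample identifiers in the example are invented for illustration.

```python
def decode_variation_genotypes(unpacked):
    """Split a flat list [sample_id, genotype_code_id, sample_id, ...] into pairs."""
    if len(unpacked) % 2 != 0:
        raise ValueError("expected a whole number of (sample_id, genotype_code_id) pairs")
    # Even indices hold sample_ids, odd indices hold genotype_code_ids.
    return list(zip(unpacked[0::2], unpacked[1::2]))

# Hypothetical data: sample 101 has genotype code 1, sample 102 has genotype code 5.
pairs = decode_variation_genotypes([101, 1, 102, 5])
print(pairs)  # [(101, 1), (102, 5)]
```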

This table stores genotypes and frequencies for variations in given populations.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| population_genotype_id | INT(10) | - | Primary key, internal identifier. | primary key |
| variation_id | INT(11) | - | Foreign key references to the variation table. | key: variation_idx |
| subsnp_id | INT(11) | NULL | Foreign key references to the subsnp_handle table. | key: subsnp_idx |
| genotype_code_id | INT(11) | NULL | Foreign key reference to the genotype_code table. | |
| frequency | FLOAT | NULL | Frequency of the genotype in the population. | |
| population_id | INT(10) | NULL | Foreign key references to the population table. | key: population_idx |
| count | INT(10) | NULL | Number of individuals/samples who have this genotype, in this population. | |

This table stores the read coverage of resequenced samples. Each row contains sample ID, chromosomal coordinates and a read coverage level.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| seq_region_id | INT(10) | - | Foreign key references seq_region in core db. Refers to the seq_region which this variant is on, which may be a chromosome, a clone, etc... | key: seq_region_idx |
| seq_region_start | INT | - | The start position of the variation on the seq_region. | key: seq_region_idx |
| seq_region_end | INT | - | The end position of the variation on the seq_region. | |
| level | TINYINT | - | Minimum number of reads. | |
| sample_id | INT(10) | - | Foreign key references to the sample table. | key: sample_idx |

This table holds uncompressed genotypes for given variations.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| variation_id | INT(10) | - | Primary key. Foreign key references to the variation table. | key: variation_idx |
| subsnp_id | INT(15) | NULL | Foreign key references to the subsnp_handle table. | key: subsnp_idx |
| allele_1 | VARCHAR(25000) | NULL | One of the alleles of the genotype, e.g. "TAG". | |
| allele_2 | VARCHAR(25000) | NULL | The other allele of the genotype. | |
| sample_id | INT(10) | NULL | Foreign key references to the sample table. | key: sample_idx |

This table is only needed to create the master schema when running the healthcheck system. It is needed for species other than human, so it is kept.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| variation_id | INT(10) | - | Foreign key references to the variation table. | key: variation_idx |
| subsnp_id | INT(15) | NULL | Foreign key references to the subsnp_handle table. | key: subsnp_idx |
| allele_1 | char(1) | NULL | One of the alleles of the genotype, e.g. "TAG". | |
| allele_2 | char(1) | NULL | The other allele of the genotype. | |
| sample_id | INT(10) | - | Foreign key references to the sample table. | key: sample_idx |

This table stores various metadata relating to the database, generally used by the Ensembl web code.

This table gives the coordinate system used by various tables in the database.

This table stores the relationship between the internal allele identifiers and the alleles themselves.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| allele_code_id | INT(11) | - | Primary key, internal identifier. | primary key |
| allele | VARCHAR(60000) | NULL | String representing the allele. Has a unique constraint on the first 1000 characters (max allowed by MySQL). | unique key: allele_idx |

Stores information about the available co-ordinate systems for the species identified through the species_id field. Note that for each species, there must be one co-ordinate system that has the attribute "top_level" and one that has the attribute "sequence_level".

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| coord_system_id | INT(10) | - | Primary key, internal identifier. | primary key |
| species_id | INT(10) | 1 | Identifies the species for multi-species databases. | unique key: rank_idx; unique key: name_idx; key: species_idx |
| name | VARCHAR(40) | - | Co-ordinate system name, e.g. 'chromosome', 'contig', 'scaffold' etc. | unique key: name_idx |
| version | VARCHAR(255) | NULL | Assembly. | unique key: name_idx |
| rank | INT | - | Co-ordinate system rank. | unique key: rank_idx |
| attrib | SET: default_version, sequence_level | NULL | Co-ordinate system attrib (e.g. "top_level", "sequence_level"). | |

This table stores genotype codes as multiple rows of allele_code identifiers, linked by genotype_code_id and ordered by haplotype_id.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| genotype_code_id | INT(11) | - | Internal identifier. | key: genotype_code_id |
| allele_code_id | INT(11) | - | Foreign key reference to allele_code table. | key: allele_code_id |
| haplotype_id | TINYINT(2) | - | Sorting order of the genotype's alleles. | |
| phased | TINYINT(2) | NULL | Indicates if this genotype is phased | |
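To make the row layout concrete, here is a rough sketch of how a genotype string could be rebuilt from genotype_code rows and an allele_code lookup: group rows by genotype_code_id, order them by haplotype_id, and join the allele strings. The separator convention ("|" when phased, "/" otherwise), the function name and the example identifiers are assumptions made for illustration; they are not taken from the schema.

```python
from collections import defaultdict

def genotype_strings(genotype_code_rows, allele_codes):
    """Rebuild genotype strings from (genotype_code_id, allele_code_id,
    haplotype_id, phased) rows and an {allele_code_id: allele} mapping."""
    alleles_by_genotype = defaultdict(list)
    phased_by_genotype = {}
    for genotype_code_id, allele_code_id, haplotype_id, phased in genotype_code_rows:
        alleles_by_genotype[genotype_code_id].append(
            (haplotype_id, allele_codes[allele_code_id])
        )
        # A NULL (None) phased flag is treated as unphased in this sketch.
        phased_by_genotype[genotype_code_id] = bool(phased)

    result = {}
    for genotype_code_id, alleles in alleles_by_genotype.items():
        # Assumed convention: "|" for phased genotypes, "/" for unphased ones.
        separator = "|" if phased_by_genotype[genotype_code_id] else "/"
        ordered = [allele for _, allele in sorted(alleles)]
        result[genotype_code_id] = separator.join(ordered)
    return result

# Hypothetical allele_code and genotype_code contents.
allele_codes = {1: "A", 2: "C", 3: "G", 4: "T"}
rows = [
    (1, 3, 2, 1),     # genotype 1, second haplotype: G (phased)
    (1, 1, 1, 1),     # genotype 1, first haplotype: A (phased)
    (7, 2, 1, None),  # genotype 7, first haplotype: C (phase unknown)
    (7, 4, 2, None),  # genotype 7, second haplotype: T (phase unknown)
]
genotypes = genotype_strings(rows, allele_codes)
print(genotypes)  # {1: 'A|G', 7: 'C/T'}
```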

This table stores the relationship between Ensembl's internal coordinate system identifiers and traditional chromosome names.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| seq_region_id | INT(10) | - | Primary key. Foreign key references seq_region in core db. Refers to the seq_region which this variant is on, which may be a chromosome, a clone, etc... | primary key |
| name | VARCHAR(255) | - | The name of this sequence region. | unique key: name_cs_idx |
| coord_system_id | INT(10) | - | Foreign key references to the coord_system table. | unique key: name_cs_idx; key: cs_idx |

This table holds a short string to distinguish data submitters

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| handle_id | INT(10) | - | Primary key, internal identifier. | primary key |
| handle | VARCHAR(25) | NULL | Short string assigned to the data submitter. | unique: key |

This table contains the SubSNP(ss) ID and the name of the submitter handle of dbSNP.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| subsnp_id | INT(11) | - | Primary key. It corresponds to the subsnp identifier (ssID) from dbSNP. This ssID is stored in this table without the "ss" prefix. e.g. "120258606" instead of "ss120258606". | primary key |
| handle | VARCHAR(20) | NULL | The name of the dbSNP handler who submitted the ssID. Name of the synonym (a different sample_id). | |
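Since the ssID is stored without its "ss" prefix, client code has to convert between the dbSNP form and the stored integer. A small sketch of that conversion follows; the function names are hypothetical and the validation rules are an assumption about what a sensible caller would want.

```python
def subsnp_id_from_ss(ss_name):
    """Convert a dbSNP ssID such as 'ss120258606' to the integer stored in subsnp_id."""
    if not ss_name.lower().startswith("ss"):
        raise ValueError(f"not an ssID: {ss_name!r}")
    digits = ss_name[2:]
    if not digits.isdigit():
        raise ValueError(f"ssID has a non-numeric body: {ss_name!r}")
    return int(digits)

def ss_from_subsnp_id(subsnp_id):
    """Convert a stored subsnp_id back to the prefixed dbSNP form."""
    return f"ss{subsnp_id}"

stored = subsnp_id_from_ss("ss120258606")
print(stored)                     # 120258606
print(ss_from_subsnp_id(stored))  # ss120258606
```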

This table stores details of the phenotypes associated with phenotype_features.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| phenotype_id | INT(10) | - | Primary key, internal identifier. | primary key |
| stable_id | VARCHAR(255) | NULL | Ensembl stable identifier for the phenotype | key: stable_idx |
| name | VARCHAR(50) | NULL | Phenotype short name. e.g. "CAD". | key: name_idx |
| description | VARCHAR(255) | NULL | Phenotype long name. e.g. "Coronary Artery Disease". | unique key: desc_idx |
| class_attrib_id | INT | NULL | Class of phenotype entry, e.g. trait, non_specified, tumour. Used for filtering. | |

This table stores information linking entities (variants, genes, QTLs) and phenotypes.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| phenotype_feature_id | INT(11) | - | Primary key, internal identifier. | primary key |
| phenotype_id | INT(11) | NULL | Foreign key references to the phenotype table. | key: phenotype_idx |
| source_id | INT(11) | NULL | Foreign key references to the source table. | key: source_idx |
| study_id | INT(11) | NULL | Foreign key references to the study table. | |
| type | ENUM: Gene, Variation, StructuralVariation, SupportingStructuralVariation, QTL, RegulatoryFeature | NULL | Type of object associated. | key: object_idx; key: type_idx |
| object_id | VARCHAR(255) | NULL | Stable identifier for associated object. | key: object_idx |
| is_significant | TINYINT(1) | '1' | Flag indicating if the association is statistically significant in the given study. | |
| seq_region_id | INT(11) | NULL | Foreign key references seq_region in core db. Refers to the seq_region which this feature is on, which may be a chromosome, a clone, etc... | key: pos_idx |
| seq_region_start | INT(11) | NULL | The start position of the feature on the seq_region. | key: pos_idx |
| seq_region_end | INT(11) | NULL | The end position of the feature on the seq_region. | key: pos_idx |
| seq_region_strand | TINYINT(4) | NULL | The orientation of the feature on the seq_region. | |

This table stores additional information on a given phenotype/object association. It is styled as an attrib table to allow for a variety of fields to be populated across different object types.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| phenotype_feature_id | INT(11) | - | Foreign key, references to the phenotype_feature table. | key: phenotype_feature_idx |
| attrib_type_id | INT(11) | NULL | Foreign key references to the attrib_type table. | key: type_value_idx |
| value | VARCHAR(255) | NULL | The value of the attribute. | key: type_value_idx |

This table stores accessions of phenotype ontology terms which have been linked to phenotype.descriptions

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| phenotype_id | INT(11) | - | Foreign key, references to the phenotype table. | primary key |
| accession | VARCHAR(255) | - | The accession of an ontology term held in the ontology database (e.g. EFO:0000378) | primary key; key: accession_idx |
| mapped_by_attrib | SET | NULL | The method used to annotate the phenotype.description with the ontology term | |
| mapping_type | ENUM: is, involves | NULL | The relation defining the association between the ontology term and the phenotype.description | |

Contains encoded protein function predictions for every protein-coding transcript in this species.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| translation_md5_id | INT(11) | - | Identifies the MD5 hash corresponding to the protein sequence to which these predictions apply | primary key |
| analysis_attrib_id | INT(11) | - | Identifies the analysis (sift, polyphen etc.) that produced these predictions | primary key |
| prediction_matrix | MEDIUMBLOB | NULL | A compressed binary string containing the predictions for all possible amino acid substitutions in this protein. See the explanation here | |

Contains information on the data used in protein function predictions

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| translation_md5_id | INT(11) | - | Identifies the MD5 hash corresponding to the protein sequence to which the data used in these predictions apply | primary key |
| analysis_attrib_id | INT(11) | - | Identifies the analysis (sift, polyphen etc.) that produced these values | primary key |
| attrib_type_id | INT(11) | - | Key into the attrib_type table, identifies the type of this attribute | primary key |
| position_values | BLOB | NULL | A compressed binary string containing data relevant to the quality of the predictions | |

Maps a hex MD5 hash of a translation sequence to an ID used for the protein function predictions

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| translation_md5_id | INT(11) | - | Primary key | primary key |
| translation_md5 | char(32) | - | Hex MD5 hash of a translation sequence | unique key: md5_idx |
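A hex MD5 digest is always 32 characters long, which matches the char(32) column type. The sketch below computes such a digest for a peptide string with Python's standard hashlib module. How the sequence is normalised before hashing (case, stop characters, whitespace) is not stated here, so hashing the raw string is an assumption, and the peptide shown is an invented example.

```python
import hashlib

def translation_md5(peptide):
    """Return the 32-character hex MD5 digest of a peptide sequence string.

    Assumption: the sequence is hashed exactly as given, with no normalisation.
    """
    return hashlib.md5(peptide.encode("ascii")).hexdigest()

digest = translation_md5("MKTAYIAKQR")  # hypothetical peptide, for illustration only
print(len(digest))  # 32
```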

Used to store groups of populations displayed separately on the Population Genetics page

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| display_group_id | INT(10) | - | Primary key, internal identifier. | primary key |
| display_priority | INT(10) | - | Priority level for group (smallest number is highest on page) | unique: key |
| display_name | VARCHAR(255) | - | Name of the group to be displayed as the table header. | unique: key |

Stores information about an identifiable individual, including gender and the identifiers of the individual's parents (if known).

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| individual_id | INT(10) | - | Primary key, internal identifier. | primary key |
| name | VARCHAR(255) | NULL | Name of the individual. | |
| description | TEXT | NULL | Description of the individual. | |
| gender | ENUM: Male, Female, Unknown | 'Unknown' | The sex of this individual. | |
| father_individual_id | INT(10) | NULL | Self referential ID, the father of this individual if known. | key: father_individual_idx |
| mother_individual_id | INT(10) | NULL | Self referential ID, the mother of this individual if known. | key: mother_individual_idx |
| individual_type_id | INT(10) | 0 | Foreign key references to the individual_type table. | |

Used to store alternative names for individuals when data comes from multiple sources.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| synonym_id | INT(10) | - | Primary key, internal identifier. | primary key |
| individual_id | INT(10) | - | Foreign key references to the individual table. | key: individual_idx |
| source_id | INT(10) | - | Foreign key references to the source table. | key: name, source_id |
| name | VARCHAR(255) | NULL | Name of the synonym. | key: name, source_id |

This table gives a detailed description for each of the possible individual types: fully_inbred, partly_inbred, outbred, mutant

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| individual_type_id | INT(0) | - | Primary key, internal identifier. | primary key |
| name | VARCHAR(255) | - | Short name of the individual type. e.g. "fully_inbred", "mutant". | |
| description | TEXT | NULL | Long name of the individual type. | |

See below the list of individual types:

SELECT * FROM individual_type;

| individual_type_id | name | description |
|---|---|---|
| 1 | fully_inbred | multiple organisms have the same genome sequence |
| 2 | partly_inbred | single organisms have reduced genome variability due to human intervention |
| 3 | outbred | a single organism which breeds freely |
| 4 | mutant | a single or multiple organisms with the same genome sequence that have a natural or experimentally induced mutation |

Stores information about a population. A population may be an ethnic group (e.g. Caucasian, Hispanic), assay group (e.g. 24 Europeans), phenotypic group (e.g. blue eyed, diabetes) etc. Populations may be composed of other populations by defining relationships in the population_structure table.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| population_id | INT(10) | - | Primary key, internal identifier. | primary key |
| name | VARCHAR(255) | NULL | Name of the population. | key: name_idx |
| size | INT(10) | NULL | Size of the population. | |
| description | TEXT | NULL | Description of the population. | |
| collection | TINYINT(1) | 0 | Flag indicating if the population is defined based on geography (0) or a collection of individuals/samples with respect to some other criteria (1). | |
| freqs_from_gts | TINYINT(1) | NULL | Flag indicating if the population frequencies can be retrieved from the allele table (0) or from the individual/sample genotypes (1). | |
| display | ENUM: LD, MARTDISPLAYABLE, UNDISPLAYABLE | 'UNDISPLAYABLE' | Information used by BioMart. | |
| display_group_id | TINYINT(1) | NULL | Used to group population for display on the Population Genetics page | |

This table stores hierarchical relationships between populations by relating them as populations and sub-populations.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| super_population_id | INT(10) | - | Foreign key references to the population table. | unique key: super_population_idx |
| sub_population_id | INT(10) | - | Foreign key references to the population table. | unique key: super_population_idx; key: sub_population_idx |
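Because each row only links a population to one direct sub-population, collecting every nested sub-population requires walking the hierarchy. The following sketch does that iteratively over (super_population_id, sub_population_id) pairs; the function name and the example identifiers are hypothetical.

```python
from collections import defaultdict

def all_sub_populations(structure_rows, population_id):
    """Return the set of all direct and indirect sub-population IDs."""
    children = defaultdict(list)
    for super_id, sub_id in structure_rows:
        children[super_id].append(sub_id)

    found = set()
    to_visit = [population_id]
    while to_visit:
        current = to_visit.pop()
        for child in children.get(current, []):
            if child not in found:  # guards against cycles in malformed data
                found.add(child)
                to_visit.append(child)
    return found

# Hypothetical hierarchy: 1 contains 2 and 3, and 3 contains 4.
rows = [(1, 2), (1, 3), (3, 4)]
print(all_sub_populations(rows, 1))  # {2, 3, 4}
```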

Used to store alternative names for populations when data comes from multiple sources.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| synonym_id | INT(10) | - | Primary key, internal identifier. | primary key |
| population_id | INT(10) | - | Foreign key references to the population table. | key: population_idx |
| source_id | INT(10) | - | Foreign key references to the source table. | key: name, source_id |
| name | VARCHAR(255) | NULL | Name of the synonym. | key: name, source_id |

Stores information about a sample. A sample belongs to an individual. An individual can have multiple samples. A sample can belong only to one individual. A sample can be associated with a study.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| sample_id | INT(10) | - | Primary key, internal identifier. | primary key |
| individual_id | INT(10) | - | Foreign key references to the individual table. | key: individual_idx |
| name | VARCHAR(255) | NULL | Name of the sample. | |
| description | TEXT | NULL | Description of the sample. | |
| study_id | INT(10) | NULL | Foreign key references to the study table. | key: study_idx |
| display | ENUM: REFERENCE, DEFAULT, DISPLAYABLE, UNDISPLAYABLE, LD, MARTDISPLAYABLE | 'UNDISPLAYABLE' | Information used by the website: samples with little information are filtered from some web displays. | |
| has_coverage | TINYINT(1) | '0' | Indicates if the sample has coverage data populated in the read coverage table | |
| variation_set_id | SET | NULL | Indicates the variation sets for which a sample has genotypes | |

This table resolves the many-to-many relationship between the sample and population tables; i.e. samples may belong to more than one population. Hence it is composed of rows of sample and population identifiers.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| sample_id | INT(10) | - | Foreign key references to the sample table. | key: sample_idx |
| population_id | INT(10) | - | Foreign key references to the population table. | key: population_idx |
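A link table like this can be read in either direction. As a rough illustration, the sketch below indexes (sample_id, population_id) rows both ways, so that the populations of a sample and the samples of a population can each be looked up directly. The function name and the identifiers are invented for the example.

```python
from collections import defaultdict

def index_memberships(rows):
    """Build two lookups from (sample_id, population_id) link rows."""
    populations_by_sample = defaultdict(set)
    samples_by_population = defaultdict(set)
    for sample_id, population_id in rows:
        populations_by_sample[sample_id].add(population_id)
        samples_by_population[population_id].add(sample_id)
    return populations_by_sample, samples_by_population

# Hypothetical rows: sample 10 belongs to populations 1 and 2, sample 11 to population 1.
link_rows = [(10, 1), (10, 2), (11, 1)]
populations_by_sample, samples_by_population = index_memberships(link_rows)
print(sorted(populations_by_sample[10]))  # [1, 2]
print(sorted(samples_by_population[1]))   # [10, 11]
```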

Used to store alternative names for samples when data comes from multiple sources.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| synonym_id | INT(10) | - | Primary key, internal identifier. | primary key |
| sample_id | INT(10) | - | Foreign key references to the sample table. | key: sample_idx |
| source_id | INT(10) | - | Foreign key references to the source table. | key: name, source_id |
| name | VARCHAR(255) | NULL | Name of the synonym. | key: name, source_id |

This table contains identifiers of associated studies (e.g. NHGRI and EGA studies with the same PubMed identifier).

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| study1_id | INT(10) | - | Primary key. Foreign key references to the study table. | primary key |
| study2_id | INT(10) | - | Primary key. Foreign key references to the study table. | primary key |

This table contains details of publications citing variations. This information comes from dbSNP, UCSC and Europe PMC.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| publication_id | INT(10) | - | Primary key, internal identifier. | primary key |
| title | VARCHAR(400) | NULL | Title of the publication | |
| authors | VARCHAR(255) | NULL | Authors of the publication | |
| pmid | INT(10) | NULL | The PubMed id for the publication if available | key: pmid_idx |
| pmcid | VARCHAR(255) | NULL | The PubMed Central id for the publication if available | |
| year | INT(10) | NULL | The year the publication was published | |
| doi | VARCHAR(80) | NULL | The DOI (Digital Object Identifier) for the publication | key: doi_idx |
| ucsc_id | VARCHAR(50) | NULL | The external id used in the UCSC database and URL | |

This table contains details of the source from which a variation is derived. Most commonly this is NCBI's dbSNP; other sources include SNPs called by Ensembl.
You can see the complete list, by species, here.

| Column | Type | Default value | Description | Index |
|---|---|---|---|---|
| source_id | INT(10) | - | Primary key, internal identifier. | primary key |
| name | VARCHAR(24) | - | Name of the source. e.g. "dbSNP" | unique key: name_idx |
| version | INT | NULL | Version number of the source (if available). e.g. "132" | |
| description | VARCHAR(400) | NULL | Description of the source. | |
| url | VARCHAR(255) | NULL | URL of the source. | |
| type | ENUM: chip, lsdb | NULL | Define the type of the source, e.g. 'chip' | |
| somatic_status | ENUM: germline, somatic, mixed | 'germline' | Indicates if this source includes somatic or germline mutations, or a mixture | |
| data_types | SET: variation, variation_synonym, structural_variation, phenotype_feature, study | NULL | Indicates the type(s) of data provided by the source | |

See below the command listing all the data sources in the human variation database:

SELECT * FROM source ORDER BY source_id;

| source_id | name | version | description | url | type | somatic_status | data_types |
|---|---|---|---|---|---|---|---|
| 1 | dbSNP | 156 | Variants (including SNPs and indels) imported from dbSNP | http://www.ncbi.nlm.nih.gov/projects/SNP/ | NULL | mixed | variation |
| 2 | Archive dbSNP | 156 | Former dbSNP variant names, merged by variant | http://www.ncbi.nlm.nih.gov/projects/SNP/ | NULL | mixed | variation_synonym |
| 3 | dbSNP HGVS | 156 | HGVS annotation from dbSNP | http://www.ncbi.nlm.nih.gov/projects/SNP/ | NULL | mixed | variation_synonym |
| 4 | Former dbSNP | 156 | Former dbSNP variant names, merged by allele | http://www.ncbi.nlm.nih.gov/projects/SNP/ | NULL | mixed | variation_synonym |
| 5 | PharmGKB | 20240521 | A pharmacogenomics knowledge resource | https://www.pharmgkb.org/ | NULL | germline | variation_synonym |
| 8 | HGMD-PUBLIC | 20204 | Variants from HGMD-PUBLIC dataset December 2020 | http://www.hgmd.cf.ac.uk/ac/index.php | NULL | germline | variation,phenotype_feature |
| 11 | DGVa | 202001 | Database of Genomic Variants Archive | https://www.ebi.ac.uk/dgva/ | NULL | mixed | structural_variation,phenotype_feature,study |
| 13 | NHGRI-EBI GWAS catalog | 20240520 | Variants associated with phenotype data from the NHGRI-EBI GWAS catalog | https://www.ebi.ac.uk/gwas/ | NULL | germline | phenotype_feature,study |
| 15 | EGA | 202405 | Variants imported from the European Genome-phenome Archive with phenotype association | https://www.ebi.ac.uk/ega/ | NULL | germline | study |
| 16 | UniProt | 20240521 | Variants with protein annotation imported from UniProt | http://www.uniprot.org/ | NULL | mixed | variation_synonym |
| 18 | OMIM | 202404 | Variants linked to entries in the Online Mendelian Inheritance in Man (OMIM) database [imported via ClinVar] | http://www.omim.org/ | NULL | germline | variation_synonym |
| 20 | Illumina_CytoSNP12v1 | 1 | Variants from the Illumina Cyto SNP-12 v1 whole genome SNP genotyping chip designed for cytogenetic analysis | http://www.illumina.com/ | chip | germline | structural_variation |
| 21 | Illumina_Human660W-quad | NULL | Variants from the Illumina Human 660W-Quad whole genome SNP genotyping chip designed for association studies | http://www.illumina.com/ | chip | germline | structural_variation |
| 22 | Illumina_Human1M-duo | 3 | Variants from the Illumina Human 1M-Duo v3 whole genome SNP genotyping chip designed for association studies | http://www.illumina.com/ | chip | germline | structural_variation |
| 23 | Affy GenomeWideSNP_6 CNV | NULL | Copy Number Variation (CNV) probes from the Affymetrix Genome-Wide Human SNP Array 6.0 | http://www.affymetrix.com/ | chip | germline | structural_variation |
| 26 | COSMIC | 99 | Somatic mutations found in human cancers from the COSMIC catalogue | https://cancer.sanger.ac.uk/cosmic/ | NULL | somatic | variation,variation_synonym,phenotype_feature |
| 32 | ClinVar | 202404 | Variants of clinical significance imported from ClinVar | https://www.ncbi.nlm.nih.gov/clinvar/ | NULL | germline | variation,variation_synonym,phenotype_feature |
| 33 | DDG2P | 20221021 | Genotype-to-Phenotype Database | https://www.ebi.ac.uk/gene2phenotype | NULL | germline | phenotype_feature |
| 34 | GIANT | 1 | The Genetic Investigation of ANthropometric Traits (GIANT) consortium is an international collaboration that seeks to identify genetic loci that modulate human body size and shape, including height and measures of obesity. | http://www.broadinstitute.org/collaboration/giant/index.php/Main_Page | NULL | germline | phenotype_feature |
| 35 | MIM morbid | 20240620 | Online Mendelian Inheritance in Man (OMIM) database | https://www.omim.org/ | NULL | germline | phenotype_feature |
| 36 | Orphanet | 20231204 | The portal for rare diseases and drugs | https://www.orpha.net/ | NULL | germline | phenotype_feature |
| 37 | MAGIC | 1 | MAGIC (the Meta-Analyses of Glucose and Insulin-related traits Consortium) represents a collaborative effort to combine data from multiple GWAS to identify additional loci that impact on glycemic and metabolic traits | http://www.magicinvestigators.org/ | NULL | germline | phenotype_feature |
| 39 | HbVar | NULL | A Database of Human Hemoglobin Variants and Thalassemias | http://globin.bx.psu.edu/hbvar/menu.html | lsdb | germline | variation_synonym |
| 40 | LMDD | NULL | Leiden Muscular Dystrophy Database | http://www.dmd.nl/ | lsdb | germline | variation_synonym |
| 46 | dbGaP | 201405 | The database of Genotypes and Phenotypes. | http://www.ncbi.nlm.nih.gov/gap | NULL | germline | phenotype_feature,study |
| 47 | AMDGC | 1 | The AMD Gene consortium is an international collaboration that seeks to identify genetic loci associated with age-related macular degeneration. | http://www.sph.umich.edu/csg/abecasis/public/amdgene2012/ | NULL | germline | phenotype_feature,study |
| 48 | GEFOS | 1 | The GEnetic Factors for OSteoporosis Consortium international collaboration that seeks to identify common risk gene variants for osteoporosis | http://www.gefos.org | NULL | germline | phenotype_feature |
| 50 | Teslovich | 1 | Biological, clinical and population relevance of 95 loci for blood lipids | http://www.sph.umich.edu/csg/abecasis/public/lipids2010/ | NULL | germline | phenotype_feature |
| 54 | Cancer Gene Census | 202406 | Catalog of genes of which mutations have been causally implicated in cancer | https://cancer.sanger.ac.uk/census | NULL | somatic | phenotype_feature,study |
| 55 | dbVar | 202310 | NCBI database of human genomic structural variation | https://www.ncbi.nlm.nih.gov/dbvar/ | NULL | mixed | structural_variation,phenotype_feature,study |
| 57 | PhenCode | NULL | PhenCode is a collaborative project to better understand the relationship between genotype and phenotype in humans | http://phencode.bx.psu.edu/ | NULL | germline | variation_synonym |
| 58 | OIVD | NULL | Osteogenesis Imperfecta Variant Database | https://oi.gene.le.ac.uk/home.php | NULL | germline | variation_synonym |
| 59 | dbPEX | NULL | dbPEX, PEX Gene Database | http://www.dbpex.org/home.php?select_db=PEX1 | NULL | germline | variation_synonym |
| 60 | KAT6BDB | NULL | K(lysine) acetyltransferase 6B database, BCM | https://grenada.lumc.nl/LOVD2/BCM/home.php?select_db=KAT6B | NULL | germline | variation_synonym |
| 61 | Infevers | NULL | The registry of Hereditary Auto-inflammatory Disorders Mutations | http://infevers.umai-montpellier.fr/web/ | NULL | germline | variation_synonym |
| 62 | PAHdb | NULL | Phenylalanine hydroxylase database | http://www.pahdb.mcgill.ca/ | NULL | germline | variation_synonym |
| 63 | G2P | 20240620 | Genotype-to-Phenotype Database | https://www.ebi.ac.uk/gene2phenotype | NULL | germline | phenotype_feature |

This table contains details of the studies. Study information can come from internal studies (DGVa, EGA) or from external studies (UniProt, NHGRI, ...).

Column | Type | Default value | Description | Index
study_id | INT(10) | - | Primary key, internal identifier. | primary key
source_id | INT(10) | - | Foreign key references to the source table. | key: source_idx
name | VARCHAR(255) | NULL | Name of the study. e.g. "EGAS00000000001" |
description | TEXT | NULL | Description of the study. |
url | VARCHAR(255) | NULL | URL to find the study data (http or ftp). |
external_reference | VARCHAR(255) | NULL | The PubMed/id or project name associated with this study. | key: external_reference_idx
study_type | VARCHAR(255) | NULL | Displays the type of the study (e.g. genome-wide association study, control-set, case-set, curated, ...). |
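The source_id foreign key above can be followed with a plain join. The sketch below is illustrative only: it uses an in-memory SQLite database as a stand-in for the MySQL variation database, models just a few of the columns, and the two study rows are invented for the example (the source rows 46 and 55 match the source listing above).

```python
import sqlite3

# Toy stand-in for the real database; only a subset of columns is modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (source_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE study (
    study_id   INTEGER PRIMARY KEY,
    source_id  INTEGER NOT NULL REFERENCES source(source_id),
    name       TEXT,
    study_type TEXT
);
INSERT INTO source VALUES (46, 'dbGaP'), (55, 'dbVar');
-- Hypothetical study rows, invented for illustration only.
INSERT INTO study VALUES (1, 55, 'example_study_A', 'curated'),
                         (2, 46, 'example_study_B', 'case-set');
""")

# Resolve each study to the name of the source it came from.
study_sources = conn.execute("""
    SELECT st.name, so.name
    FROM study st
    JOIN source so ON so.source_id = st.source_id
    ORDER BY st.study_id
""").fetchall()
print(study_sources)
```

Against a real Ensembl variation database the same SELECT should apply unchanged, although connection details and the driver would differ.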

This table contains descriptions of the groups submitting data to public repositories such as ClinVar.

Column | Type | Default value | Description | Index
submitter_id | INT(10) | - | Primary key | primary key
description | VARCHAR(255) | NULL | Description of data submitter |

This table links a variation to a publication

Column | Type | Default value | Description | Index
publication_id | INT(10) | - | Primary key, internal identifier. | primary key
variation_id | INT(10) | - | Primary key, foreign key references variation | primary key
data_source_attrib | SET | NULL | Foreign key references to the attrib table. | key: data_source_attrib_idx

This table stores information about structural variation.

Column | Type | Default value | Description | Index
structural_variation_id | INT(10) | - | Primary key, internal identifier. | primary key
variation_name | VARCHAR(255) | NULL | The external identifier or name of the variation. e.g. "esv9549". | unique key
alias | VARCHAR(255) | NULL | Other structural variation name. |
source_id | INT(10) | - | Foreign key references to the source table. | key: source_idx
study_id | INT(10) | NULL | Foreign key references to the study table. | key: study_idx
class_attrib_id | INT(10) | 0 | Foreign key references to the attrib table. Defines the type of structural variant. The list of structural variation classes is available here. | key: attrib_idx
clinical_significance | SET: uncertain significance, not provided, benign, likely benign, likely pathogenic, pathogenic, drug response, histocompatibility, other, confers sensitivity, risk factor, association, protective, affects, likely pathogenic low penetrance, pathogenic low penetrance, uncertain risk allele, likely risk allele, established risk allele | NULL | A set of clinical significance classes assigned to the structural variant. The list of clinical significances is available here. |
validation_status | ENUM: validated, not validated, high quality | NULL | Validation status of the variant. |
is_evidence | TINYINT(4) | 0 | Flag indicating if the structural variation is a supporting evidence (1) or not (0). |
somatic | TINYINT(1) | 0 | Flags whether this structural variation is known to be somatic or not |
copy_number | TINYINT(2) | NULL | Add the copy number for the CNV supporting structural variants when available. |

This table stores the associations between structural variations and their supporting evidence.

Column | Type | Default value | Description | Index
structural_variation_id | INT(10) | - | Primary key. Foreign key references to the structural_variation table. | primary key; key: structural_variation_idx
supporting_structural_variation_id | INT(10) | - | Primary key. Foreign key references to the structural_variation table. | primary key; key: supporting_structural_variation_idx
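Because both columns of this association table reference structural_variation, listing the supporting evidence for a variant takes a self-join. A minimal sketch, assuming an in-memory SQLite stand-in with only the relevant columns; "esv9549" is the example name used in this document, while the supporting variant names and the helper function supporting_evidence are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE structural_variation (
    structural_variation_id INTEGER PRIMARY KEY,
    variation_name TEXT,
    is_evidence INTEGER DEFAULT 0
);
CREATE TABLE structural_variation_association (
    structural_variation_id INTEGER NOT NULL,
    supporting_structural_variation_id INTEGER NOT NULL,
    PRIMARY KEY (structural_variation_id, supporting_structural_variation_id)
);
-- Rows 2 and 3 are hypothetical supporting variants (is_evidence = 1).
INSERT INTO structural_variation VALUES
    (1, 'esv9549', 0), (2, 'example_ssv_1', 1), (3, 'example_ssv_2', 1);
INSERT INTO structural_variation_association VALUES (1, 2), (1, 3);
""")

def supporting_evidence(conn, variation_name):
    """Return the names of the supporting variants of one structural variant."""
    rows = conn.execute("""
        SELECT ssv.variation_name
        FROM structural_variation sv
        JOIN structural_variation_association sva
          ON sva.structural_variation_id = sv.structural_variation_id
        JOIN structural_variation ssv
          ON ssv.structural_variation_id = sva.supporting_structural_variation_id
        WHERE sv.variation_name = ?
        ORDER BY ssv.structural_variation_id
    """, (variation_name,)).fetchall()
    return [name for (name,) in rows]

print(supporting_evidence(conn, "esv9549"))
```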

This table stores information about structural variation features (i.e. mappings of structural variations to genomic locations).

Column | Type | Default value | Description | Index
structural_variation_feature_id | INT(10) | - | Primary key, internal identifier. | primary key
seq_region_id | INT(10) | - | Foreign key references seq_region in core db. Refers to the seq_region which this variant is on, which may be a chromosome, a clone, etc... | key: pos_idx
outer_start | INT | NULL | The 5' outer bound position of the variation on the seq_region. |
seq_region_start | INT | - | The start position of the variation on the seq_region. | key: pos_idx
inner_start | INT | NULL | The 5' inner bound position of the variation on the seq_region. |
inner_end | INT | NULL | The 3' inner bound position of the variation on the seq_region. |
seq_region_end | INT | - | The end position of the variation on the seq_region. | key: pos_idx
outer_end | INT | NULL | The 3' outer bound position of the variation on the seq_region. |
seq_region_strand | TINYINT | - | The orientation of the variation on the seq_region. |
structural_variation_id | INT(10) | - | Foreign key references to the structural_variation table. | key: structural_variation_idx
variation_name | VARCHAR(255) | NULL | A denormalisation taken from the structural_variation table. This is the name or identifier that is used for displaying the feature (e.g. "esv9549"). |
source_id | INT(10) | - | Foreign key references to the source table. | key: source_idx
study_id | INT(10) | NULL | Foreign key references to the study table | key: study_idx
class_attrib_id | INT(10) | 0 | Foreign key references to the attrib table. Defines the type of structural variant. The list of structural variation classes is available here. | key: attrib_idx
allele_string | LONGTEXT | NULL | The variant allele, where known. |
is_evidence | TINYINT(1) | 0 | Flag indicating if the structural variation is a supporting evidence (1) or not (0). |
variation_set_id | SET | '' | The structural variation feature can belong to a variation_set. | key: variation_set_idx
somatic | TINYINT(1) | 0 | Flags whether this structural variation is known to be somatic or not |
breakpoint_order | TINYINT(4) | NULL | Defines the order of the breakpoints when several events/mutation occurred for a structural variation (e.g. somatic mutations) |
length | INT(10) | NULL | Length of the structural variant. Used for the variants with a class "insertion", when the size of the insertion is known. |
allele_freq | FLOAT | NULL | The frequency reported for this allele in this study. |
allele_count | INT(10) | NULL | The number of times this allele is observed in this study. |
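A common use of the seq_region_id, seq_region_start and seq_region_end columns is to find features that overlap a genomic interval. The sketch below shows the usual overlap condition (feature start at or before the query end, feature end at or after the query start). It runs on an in-memory SQLite stand-in; the feature rows, the column order of pos_idx and the helper overlapping_features are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE structural_variation_feature (
    structural_variation_feature_id INTEGER PRIMARY KEY,
    seq_region_id    INTEGER NOT NULL,
    seq_region_start INTEGER NOT NULL,
    seq_region_end   INTEGER NOT NULL,
    variation_name   TEXT
);
-- Assumed column order for the pos_idx key.
CREATE INDEX pos_idx ON structural_variation_feature
    (seq_region_id, seq_region_start, seq_region_end);
-- Hypothetical features on two seq_regions.
INSERT INTO structural_variation_feature VALUES
    (1, 1, 100, 200, 'sv_a'),
    (2, 1, 150, 400, 'sv_b'),
    (3, 1, 500, 600, 'sv_c'),
    (4, 2, 100, 200, 'sv_d');
""")

def overlapping_features(conn, seq_region_id, start, end):
    """Names of features on seq_region_id that overlap [start, end]."""
    rows = conn.execute("""
        SELECT variation_name
        FROM structural_variation_feature
        WHERE seq_region_id = ?
          AND seq_region_start <= ?   -- feature starts before the query ends
          AND seq_region_end   >= ?   -- feature ends after the query starts
        ORDER BY seq_region_start
    """, (seq_region_id, end, start)).fetchall()
    return [name for (name,) in rows]

print(overlapping_features(conn, 1, 180, 450))
```

A stricter or looser query could compare against outer_start/outer_end or inner_start/inner_end instead, where those bounds are populated.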

This table stores sample and strain information for structural variants and their supporting evidence.

Column | Type | Default value | Description | Index
structural_variation_sample_id | INT(10) | - | Primary key, internal identifier. | primary key
structural_variation_id | INT(10) | - | Foreign key references to the structural_variation table. | key: structural_variation_idx
sample_id | INT(10) | NULL | Foreign key references to the sample table. Defines the individual or sample name. | key: sample_idx
zygosity | TINYINT(1) | NULL | Define the numeric zygosity of the structural variant for the sample, when available. |

This table relates a single allele of a variation_feature to a motif feature (see Regulation documentation). It contains the consequence of the allele.

Column | Type | Default value | Description | Index
motif_feature_variation_id | INT(11) | - | Primary key, internal identifier. | primary key
variation_feature_id | INT(11) | - | Foreign key references to the variation_feature table. | key: variation_feature_idx
feature_stable_id | VARCHAR(128) | NULL | Foreign key to regulation databases. Unique stable id of related regulatory_feature. | key: feature_stable_idx; key: somatic_feature_idx
motif_feature_id | INT(11) | - | Foreign key to regulation databases. Internal id of related motif_feature. |
allele_string | TEXT | NULL | Shows the reference sequence and variant sequence of this allele. |
somatic | TINYINT(1) | 0 | Flags if the associated variation is known to be somatic. | key: somatic_feature_idx
consequence_types | SET: TF_binding_site_variant, TFBS_ablation, TFBS_fusion, TFBS_amplification, TFBS_translocation | NULL | The consequence(s) of the variant allele on this motif_feature. The list of consequence descriptions is available here. | key: consequence_type_idx
binding_matrix_stable_id | VARCHAR(60) | NULL | The stable id of the binding matrix. |
motif_start | INT(11) | NULL | The start position of the variation in the motif. |
motif_end | INT(11) | NULL | The end position of the variation in the motif. |
motif_score_delta | FLOAT | NULL | The deviation from the score (that is derived from alignment software (e.g. MOODS)) caused by the variation. |
in_informative_position | TINYINT(1) | 0 | Flags if the variation is in an informative position. |

This table relates a single allele of a variation_feature to a regulatory feature (see Regulation documentation). It contains the consequence of the allele.

Column | Type | Default value | Description | Index
regulatory_feature_variation_id | INT(11) | - | Primary key, internal identifier. | primary key
variation_feature_id | INT(11) | - | Foreign key references to the variation_feature table. | key: variation_feature_idx
feature_stable_id | VARCHAR(128) | NULL | Foreign key to regulation databases. Unique stable id of related regulatory_feature. | key: feature_stable_idx; key: somatic_feature_idx
feature_type | TEXT | NULL | The name of the feature type. |
allele_string | TEXT | NULL | Shows the reference sequence and variant sequence of this allele. |
somatic | TINYINT(1) | 0 | Flags if the associated variation is known to be somatic. | key: somatic_feature_idx
consequence_types | SET: regulatory_region_variant, regulatory_region_ablation, regulatory_region_fusion, regulatory_region_amplification, regulatory_region_translocation | NULL | The consequence(s) of the variant allele on this regulatory feature. The list of consequence descriptions is available here. | key: consequence_type_idx

This table relates a single allele of a variation_feature to a transcript (see Core documentation). It contains the consequence of the allele (e.g. intron_variant, missense_variant, stop_lost), along with the change in amino acid in the resulting protein, if applicable.

Column | Type | Default value | Description | Index
transcript_variation_id | BIGINT | - | Primary key, internal identifier. | primary key
feature_stable_id | VARCHAR(128) | NULL | Foreign key to core databases. Unique stable id of related transcript. | key: somatic_feature_idx
variation_feature_id | INT(11) | - | Foreign key references to the variation_feature table. | key: variation_feature_idx
allele_string | TEXT | NULL | Shows the reference sequence and variant sequence of this allele |
somatic | TINYINT(1) | 0 | Flags if the associated variation is known to be somatic | key: somatic_feature_idx
consequence_types | SET: splice_acceptor_variant, splice_donor_variant, stop_lost, coding_sequence_variant, missense_variant, stop_gained, synonymous_variant, frameshift_variant, non_coding_transcript_variant, non_coding_transcript_exon_variant, mature_miRNA_variant, NMD_transcript_variant, 5_prime_UTR_variant, 3_prime_UTR_variant, incomplete_terminal_codon_variant, intron_variant, splice_region_variant, downstream_gene_variant, upstream_gene_variant, start_lost, stop_retained_variant, inframe_insertion, inframe_deletion, transcript_ablation, transcript_fusion, transcript_amplification, transcript_translocation, feature_elongation, feature_truncation, protein_altering_variant, start_retained_variant, splice_donor_5th_base_variant, splice_donor_region_variant, splice_polypyrimidine_tract_variant | NULL | The consequence(s) of the variant allele on this transcript. The list of consequence descriptions is available here. | key: consequence_type_idx
cds_start | INT(11) | NULL | The start position of variation in cds coordinates. |
cds_end | INT(11) | NULL | The end position of variation in cds coordinates. |
cdna_start | INT(11) | NULL | The start position of variation in cdna coordinates. |
cdna_end | INT(11) | NULL | The end position of variation in cdna coordinates. |
translation_start | INT(11) | NULL | The start position of variation on peptide. |
translation_end | INT(11) | NULL | The end position of variation on peptide. |
distance_to_transcript | INT(11) | NULL | Only for upstream or downstream variants, it gives the distance from the start or the end of the transcript |
codon_allele_string | TEXT | NULL | The reference and variant codons |
pep_allele_string | TEXT | NULL | The reference and variant peptides |
hgvs_genomic | TEXT | NULL | HGVS representation of this allele with respect to the genomic sequence |
hgvs_transcript | TEXT | NULL | HGVS representation of this allele with respect to the [coding or non-coding] transcript |
hgvs_protein | TEXT | NULL | HGVS representation of this allele with respect to the protein |
polyphen_prediction | ENUM: unknown, benign, possibly damaging, probably damaging | NULL | The PolyPhen prediction for the effect of this allele on the protein |
polyphen_score | FLOAT | NULL | The PolyPhen score corresponding to the prediction |
sift_prediction | ENUM: tolerated, deleterious, tolerated - low confidence, deleterious - low confidence | NULL | The SIFT prediction for the effect of this allele on the protein |
sift_score | FLOAT | NULL | The SIFT score corresponding to this prediction |
display | INT(1) | 1 | Flags whether this transcript_variation should be displayed in browser tracks and returned by default by the API |
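Client code often needs to unpack the string-valued columns of this table. The helpers below are a sketch under two assumptions: that the consequence_types SET arrives as a comma-separated string (the way MySQL returns SET values to clients), and that allele strings separate reference and variant with "/", as in the "A/T" example given for variation features. The function names are hypothetical.

```python
def parse_consequences(set_value):
    """Split a MySQL SET value (comma-separated string) into a Python set.

    NULL (None) or an empty string yields an empty set.
    """
    if not set_value:
        return set()
    return set(set_value.split(","))


def split_alleles(allele_string):
    """Split 'ref/alt' into (ref, alt).

    Assumes '/' separates the reference from the variant; a string without
    '/' is returned as (value, None), and NULL as (None, None).
    """
    if allele_string is None:
        return (None, None)
    ref, _, alt = allele_string.partition("/")
    return (ref, alt or None)


consequences = parse_consequences("missense_variant,splice_region_variant")
print("missense_variant" in consequences)
print(split_alleles("A/T"))
```

On the database side, MySQL's FIND_IN_SET function can filter on a single SET member without fetching the whole column.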

This table is used in web index creation. It links a variation_id to the names of the genes the variation is within

Column | Type | Default value | Description | Index
variation_id | INT(10) | - | Primary key, foreign key references variation |
gene_name | VARCHAR(255) | - | Primary key, display name of gene |

This table is used in web index creation. It links a variation_id to all possible transcript and protein level change descriptions in HGVS annotation.

Column | Type | Default value | Description | Index
variation_id | INT(10) | - | Primary key, foreign key references variation |
hgvs_name | VARCHAR(255) | - | Primary key, HGVS change description |

This table contains the name of sets and subsets of variations stored in the database. It usually represents the name of the project or subproject where a group of variations has been identified.

Column | Type | Default value | Description | Index
variation_set_id | INT(10) | - | Primary key, internal identifier. | primary key
name | VARCHAR(255) | NULL | Name of the set e.g. "Phenotype-associated variations". | key: name_idx
description | TEXT | NULL | Description of the set. |
short_name_attrib_id | INT(10) | NULL | Foreign key references to the attrib table. Short name used for web purpose. |

See below the command to display the list of variation set entries, e.g. for human:

SELECT * FROM variation_set;

variation_set_id | name | description | short_name_attrib_id
1 | All failed variations | Variations that have failed the Ensembl QC checks | 179
2 | Genotyping chip variants | Variants which have assays on commercial chips held in ensembl | 360
3 | Illumina_CytoSNP12v1 | Variants from the Illumina Cyto SNP-12 v1 whole genome SNP genotyping chip designed for cytogenetic analysis | 339
4 | Affy GenomeWideSNP_6.0 | Variants from the Affymetrix Genome-Wide Human SNP Array 6.0 | 333
5 | Illumina_Human660W-quad | Variants from the Illumina Human660W-quad whole genome genotyping array designed for association studies | 334
6 | Illumina_1M-duo | Variants from the Illumina Human1M-duo v3 whole genome genotyping array designed for association studies | 335
7 | Affy GeneChip 500K | Variants from the Affymetrix GeneChip Human Mapping 500K Array Set | 332
8 | Illumina_Cardio-Metabo_Chip | Variants from the Illumina Cardio-Metabo_Chip genotyping array designed to target variants of interest for metabolic and cardiovascular disease traits | 337
9 | Illumina_HumanOmni1-Quad | Variants from the Illumina HumanOmni1-Quad whole genome genotyping array designed for association studies | 338
10 | Illumina_HumanHap650Y | Variants from the Illumina HumanHap650Y v3.0 whole genome genotyping array designed for association studies | 340
11 | Illumina_HumanOmni2.5 | Variants from the Illumina HumanOmni2.5 4v1 whole genome genotyping array designed for association studies | 341
12 | Illumina_Human610_Quad | Variants from the Illumina Human610_Quad v1_B whole genome genotyping array designed for association studies | 342
13 | Illumina_HumanHap550 | Variants from the Illumina Human550 v3.0 whole genome genotyping array designed for association studies | 343
14 | Illumina_HumanOmni5 | Variants from the Illumina HumanOmni5v1 whole genome genotyping array designed for association studies | 354
15 | Illumina_ExomeChip | Variants from the Illumina ExomeChip genotyping array designed to target variants within exons | 358
16 | Illumina_ImmunoChip | Variants from the Illumina ImmunoChip genotyping array designed to target variants of interest for autoimmune and inflammatory diseases | 359
17 | HumanOmniExpress | Variants from the Illumina HumanOmniExpress 12v1-1_a whole genome genotyping array | 373
23 | All phenotype/disease-associated variants | Variants that have been associated with a phenotype or a disease | 195
24 | OMIM phenotype variants | Variants linked to entries in the Online Mendelian Inheritance in Man (OMIM) database | 194
25 | UniProt variants | Variants with annotations provided by UniProt | 196
26 | NHGRI-EBI catalog phenotype variants | Variants associated with phenotype data from the NHGRI-EBI GWAS catalog [http://www.ebi.ac.uk/gwas/] | 193
27 | PhenCode | Variants from the PhenCode Project | 355
28 | COSMIC phenotype variants | Phenotype annotations of somatic mutations found in human cancers from the COSMIC project | 197
29 | HGMD-PUBLIC variants | Variants annotated by HGMD | 191
30 | All ClinVar | Variants with ClinVar annotation | 374
31 | Clinically associated variants | Variants described by ClinVar as being probable-pathogenic, pathogenic, drug-response or histocompatibility | 345
33 | dbPEX | Variants from the PEX Gene Database | 396
34 | KAT6BDB | Variants from the K(lysine) acetyltransferase 6B database, BCM | 399
35 | HbVar | Variants for the Human Hemoglobin Variants and Thalassemias database | 397
36 | LMDD | Variants from the Leiden Muscular Dystrophy Database | 400
37 | OIVD | Variants from the Osteogenesis Imperfecta Variant Database | 401
38 | PAHdb | Variants from the Phenylalanine hydroxylase database | 402
39 | Infevers | Variants from the registry of Hereditary Auto-inflammatory Disorders Mutations | 398
40 | HumanCoreExome-12 | Variants from the Illumina HumanCoreExome-12 v1 genotyping chip. | 390
41 | All LSDB-associated variants | Variants association from one or several Locus Specific DataBase (LSDB) | 419
42 | 1000 Genomes 3 - All | Variants genotyped by the 1000 Genomes project (phase 3) | 404
43 | 1000 Genomes 3 - EUR - common | Variants genotyped in European individuals by the 1000 Genomes project (phase 3) with frequency of at least 1% | 415
44 | 1000 Genomes 3 - SAS - common | Variants genotyped in South Asian individuals by the 1000 Genomes project (phase 3) with frequency of at least 1% | 414
45 | 1000 Genomes 3 - EAS - common | Variants genotyped in East Asian individuals by the 1000 Genomes project (phase 3) with frequency of at least 1% | 413
47 | 1000 Genomes 3 - AMR - common | Variants genotyped in admixed American individuals by the 1000 Genomes project (phase 3) with frequency of at least 1% | 412
48 | 1000 Genomes 3 - AFR - common | Variants genotyped in African individuals by the 1000 Genomes project (phase 3) with frequency of at least 1% | 411
49 | 1000 Genomes 3 - All - common | Variants genotyped by the 1000 Genomes project (phase 3) with frequency of at least 1% | 410
50 | 1000 Genomes 3 - EUR | Variants genotyped in European individuals by the 1000 Genomes project (phase 3) | 409
51 | 1000 Genomes 3 - SAS | Variants genotyped in South Asian individuals by the 1000 Genomes project (phase 3) | 408
52 | 1000 Genomes 3 - EAS | Variants genotyped in East Asian individuals by the 1000 Genomes project (phase 3) | 407
53 | 1000 Genomes 3 - AFR | Variants genotyped in African individuals by the 1000 Genomes project (phase 3) | 405
54 | 1000 Genomes 3 - AMR | Variants genotyped in admixed American individuals by the 1000 Genomes project (phase 3) | 406
55 | ESP_6500 | Variants from the NHLBI Exome Sequencing Project (investigating heart, lung and blood disorders) | 344
56 | ExAC | Variants identified by the Exome Aggregation Consortium (ExAC) - release 0.3 | 420
57 | gnomAD | Variants reported by the Genome Aggregation Database | 586

A table for mapping structural variations to variation_sets.

Column | Type | Default value | Description | Index
structural_variation_id | INT(10) | - | Primary key. Foreign key references to the structural_variation table. | primary key
variation_set_id | INT(10) | - | Primary key. Foreign key references to the variation_set table. | primary key

This table stores hierarchical relationships between variation sets by relating them as variation sets and variation subsets.

Column | Type | Default value | Description | Index
variation_set_super | INT(10) | - | Primary key. Foreign key references to the variation_set table. | primary key; key: sub_idx
variation_set_sub | INT(10) | - | Primary key. Foreign key references to the variation_set table. | primary key; key: sub_idx
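Since a subset can itself have subsets, collecting everything below a given set means walking the super/sub pairs repeatedly. One way to sketch this is a recursive common table expression, shown here on an in-memory SQLite stand-in. The parent/child links and set 99 are invented for illustration (the real links are not listed in this document), and whether a given MySQL server accepts WITH RECURSIVE depends on its version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variation_set (variation_set_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE variation_set_structure (
    variation_set_super INTEGER NOT NULL,
    variation_set_sub   INTEGER NOT NULL,
    PRIMARY KEY (variation_set_super, variation_set_sub)
);
INSERT INTO variation_set VALUES
    (2, 'Genotyping chip variants'),
    (3, 'Illumina_CytoSNP12v1'),
    (4, 'Affy GenomeWideSNP_6.0'),
    (99, 'hypothetical nested subset');
-- Assumed hierarchy: set 2 contains 3 and 4, and set 4 contains 99.
INSERT INTO variation_set_structure VALUES (2, 3), (2, 4), (4, 99);
""")

DESCENDANTS_SQL = """
WITH RECURSIVE descendants(set_id) AS (
    -- direct subsets of the starting set
    SELECT variation_set_sub
    FROM variation_set_structure
    WHERE variation_set_super = ?
    UNION
    -- subsets of subsets, repeated until no new rows appear
    SELECT s.variation_set_sub
    FROM variation_set_structure s
    JOIN descendants d ON s.variation_set_super = d.set_id
)
SELECT vs.variation_set_id, vs.name
FROM descendants d
JOIN variation_set vs ON vs.variation_set_id = d.set_id
ORDER BY vs.variation_set_id
"""

def all_subsets(conn, variation_set_id):
    """Return (id, name) for every direct and indirect subset of a set."""
    return conn.execute(DESCENDANTS_SQL, (variation_set_id,)).fetchall()

print(all_subsets(conn, 2))
```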

A table for mapping variations to variation_sets.

Column | Type | Default value | Description | Index
variation_id | INT(10) | - | Primary key. Foreign key references to the variation table. | primary key; key: variation_set_idx
variation_set_id | INT(10) | - | Primary key. Foreign key references to the variation_set table. | primary key; key: variation_set_idx

This table stores information about each of a variation's alleles, along with population frequencies.

Column | Type | Default value | Description | Index
allele_id | INT(11) | - | Primary key, internal identifier. | primary key
variation_id | INT(11) | - | Foreign key references to the variation table. | key: variation_idx
subsnp_id | INT(11) | NULL | Foreign key references to the subsnp_handle table. | key: subsnp_idx
allele_code_id | INT(11) | - | Foreign key reference to allele_code table. |
population_id | INT(11) | NULL | Foreign key references to the population table. | key: population_idx
frequency | FLOAT | NULL | Frequency of this allele in the population. |
count | INT(11) | NULL | Number of individuals/samples in the population where this allele is found. |
frequency_submitter_handle | INT(10) | NULL | dbSNP handle for submitter of frequency data [may be different to submitter of observed variant] |

This table allows for the allele of a variant to have multiple IDs.

Column | Type | Default value | Description | Index
allele_synonym_id | INT(10) | - | Primary key, internal identifier. | primary key
variation_id | INT(10) | - | Foreign key references to the variation table. | unique key: variation_name_idx
hgvs_genomic | VARCHAR(600) | - | HGVS representation of this allele with respect to the genomic sequence |
name | VARCHAR(255) | - | Name of the allele synonym e.g. CA127784 | unique key: variation_name_idx; key: name_idx

This is the schema's generic representation of a variation, defined as a genetic feature that varies between individuals of the same species. The most common type is the single nucleotide polymorphism (SNP), though the schema also accommodates copy number variations (CNVs) and structural variations (SVs).
In Ensembl, a variation is defined by its flanking sequence rather than its mapped location on a chromosome. A variation may therefore have multiple mappings across a genome, although a variation with multiple mappings fails our quality control checks.
This table stores a variation's name (commonly an ID assigned by dbSNP, e.g. rs123456).

Column | Type | Default value | Description | Index
variation_id | INT(10) | - | Primary key, internal identifier. | primary key
source_id | INT(10) | - | Foreign key references to the source table. | key: source_idx
name | VARCHAR(255) | NULL | Name of the variation. e.g. "rs1333049". | unique key
flipped | TINYINT(1) | NULL | This is set to 1 if the variant is flipped from the negative to the positive strand during import. |
class_attrib_id | INT(10) | 0 | Class of the variation, key into the attrib table. The list of variation classes is available here. |
somatic | TINYINT(1) | 0 | Flags whether this variation is known to be somatic or not |
minor_allele | VARCHAR(50) | NULL | The minor allele of this variant. The minor allele is the second most frequent allele. |
minor_allele_freq | FLOAT | NULL | The 'global' frequency of the minor allele of this variant, as reported by dbSNP. The minor allele frequency is the frequency of the second most frequent allele. |
minor_allele_count | INT(10) | NULL | The number of samples the minor allele of this variant is found in. The minor allele is the second most frequent allele. |
clinical_significance | SET: uncertain significance, not provided, benign, likely benign, likely pathogenic, pathogenic, drug response, histocompatibility, other, confers sensitivity, risk factor, association, protective, affects, likely pathogenic low penetrance, pathogenic low penetrance, uncertain risk allele, likely risk allele, established risk allele | NULL | A set of clinical significance classes assigned to the variant. The list of clinical significances is available here. |
evidence_attribs | SET | NULL | A summary of the evidence supporting a variant as a guide to its potential reliability. See the evidence descriptions here. |
display | INT(1) | 1 | Flags whether this variation should be displayed in browser tracks and returned by default by the API |
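The definition of the minor allele as the second most frequent allele can be made concrete with a small worked example. The function below is only an illustration of that definition applied to a dictionary of allele observation counts. It is not how Ensembl or dbSNP populate these columns, and the alphabetical tie-break is an arbitrary assumption.

```python
def minor_allele(allele_counts):
    """Return (allele, frequency, count) for the second most frequent allele.

    allele_counts maps allele string -> number of observations.
    Returns None when fewer than two alleles were observed.
    Ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    if len(allele_counts) < 2:
        return None
    total = sum(allele_counts.values())
    # Sort by descending count, then by allele string for determinism.
    ranked = sorted(allele_counts.items(), key=lambda item: (-item[1], item[0]))
    allele, count = ranked[1]
    return allele, count / total, count


# Made-up counts: A is the major allele, G the minor allele, T a rarer third allele.
print(minor_allele({"A": 70, "G": 25, "T": 5}))
```

Note that with three or more alleles the minor allele is not the rarest one: in the example, T is rarer than G, but G is the second most frequent and is therefore reported.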

This table stores miscellaneous attributes associated with a variation entry.

Column | Type | Default value | Description | Index
variation_id | INT(11) | - | Foreign key references variation table | key: variation_idx
attrib_id | INT(11) | NULL | Foreign key references attrib table, describes the attribute | key: attrib_value_idx
value | VARCHAR(255) | NULL | Attribute value | key: attrib_value_idx

This table represents mappings of variations to genomic locations. It stores an allele string representing the different possible alleles that are found at that locus e.g. "A/T" for a SNP, as well as a "worst case" consequence of the mutation. It also acts as part of the relationship between variations and transcripts.

This table allows for a variation to have multiple IDs, generally given by multiple sources.

Column | Type | Default value | Description | Index
variation_synonym_id | INT(10) | - | Primary key, internal identifier. | primary key
variation_id | INT(10) | - | Foreign key references to the variation table. | key: variation_idx; unique key: name_idx
subsnp_id | INT(15) | NULL | Foreign key references to the subsnp_handle table. | key: subsnp_idx
source_id | INT(10) | - | Foreign key references to the source table. | unique key: name_idx; key: source_idx
name | VARCHAR(255) | NULL | Name of the synonym variation. e.g. 'rs1333049'. | unique key: name_idx
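Because a variation can be known under several names, a lookup by name usually has to check both the variation table and this synonym table. A minimal sketch on an in-memory SQLite stand-in; the table name variation_synonym is inferred from the primary key, the synonym value and the helper resolve_variation are invented for illustration, and "rs1333049" is the example name used in this document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variation (variation_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE variation_synonym (
    variation_synonym_id INTEGER PRIMARY KEY,
    variation_id INTEGER NOT NULL REFERENCES variation(variation_id),
    source_id    INTEGER NOT NULL,
    name         TEXT,
    UNIQUE (name, source_id)  -- assumed composition of name_idx
);
INSERT INTO variation VALUES (1, 'rs1333049');
-- Hypothetical synonym row, invented for illustration only.
INSERT INTO variation_synonym VALUES (1, 1, 39, 'example_synonym_1');
""")

def resolve_variation(conn, name):
    """Return (variation_id, primary name) for a primary name or a synonym."""
    return conn.execute("""
        SELECT v.variation_id, v.name
        FROM variation v
        WHERE v.name = ?
        UNION
        SELECT v.variation_id, v.name
        FROM variation_synonym vs
        JOIN variation v ON v.variation_id = vs.variation_id
        WHERE vs.name = ?
    """, (name, name)).fetchone()

print(resolve_variation(conn, "example_synonym_1"))
```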
