
Tag SNP

A tag SNP is a representative single nucleotide polymorphism (SNP) in a region of the genome with high linkage disequilibrium, standing in for a group of SNPs that are inherited together as a haplotype. Tag SNPs make it possible to identify genetic variation and its association with phenotypes without genotyping every SNP in a chromosomal region, which reduces the expense and time of mapping genome areas associated with disease. Tag SNPs are useful in whole-genome SNP association studies, in which hundreds of thousands of SNPs across the entire genome are genotyped.
Two loci are said to be in linkage equilibrium (LE) if their alleles are inherited independently. If the alleles at those loci are non-randomly inherited, they are said to be in linkage disequilibrium (LD). LD is most commonly caused by physical linkage of genes. When two genes lie on the same chromosome, they can be in high LD, depending on the distance between them and the likelihood of recombination between the loci. However, LD can also arise from functional interactions, where even genes on different chromosomes jointly confer an evolutionarily selected phenotype or affect the viability of potential offspring. In families, LD is highest because the number of recombination events is lowest (fewest meioses). This is especially true between inbred lines.
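The strength of LD between two biallelic loci is commonly summarized by the coefficient D (the difference between the observed haplotype frequency and the frequency expected under independence) and its normalized forms D′ and r². The following is a minimal sketch of that calculation, assuming phased haplotypes are already available; the function name `ld_stats` and the 0/1 allele encoding are illustrative choices, not part of any particular software package.

```python
def ld_stats(haplotypes):
    """Return (D, D', r^2) for two biallelic loci.

    haplotypes: list of (a, b) pairs, one per phased chromosome,
    where a and b are 0 or 1 (hypothetical encoding: 1 = allele of interest).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # allele frequency, locus 1
    p_b = sum(b for _, b in haplotypes) / n          # allele frequency, locus 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # D: observed haplotype frequency minus the frequency expected under LE.
    d = p_ab - p_a * p_b

    # D' rescales D by its largest possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Two loci whose alleles always travel together (complete LD).
linked = [(0, 0)] * 5 + [(1, 1)] * 5
# Two loci whose alleles combine freely (linkage equilibrium).
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]

print(ld_stats(linked))
print(ld_stats(independent))
```

An r² close to 1 means that one locus predicts the other almost perfectly, which is the property that lets a single SNP stand in for its neighbors.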
In populations, LD exists because of selection, physical closeness of genes that results in low recombination rates, or recent crossing or migration. At the population level, processes that influence linkage disequilibrium include genetic linkage, epistatic natural selection, rate of recombination, mutation, genetic drift, random mating, genetic hitchhiking and gene flow. When a group of SNPs is inherited together because of high LD, the information they carry tends to be redundant. Selecting a tag SNP as a representative of such a group reduces this redundancy when analyzing parts of the genome associated with traits or diseases. The regions of the genome in high LD that harbor a specific set of SNPs inherited together are also known as haplotypes. Tag SNPs are therefore representative of all SNPs within a haplotype.
The selection of tag SNPs depends on the haplotypes present in the genome. Most sequencing technologies provide genotypes rather than haplotypes: they report which bases are present but not phase information (on which specific chromosome each base appears). Haplotypes can be determined by molecular methods (allele-specific PCR, somatic cell hybrids), which distinguish which allele is present on which chromosome by separating the chromosomes before genotyping. These methods can be very time-consuming and expensive, so statistical inference methods have been developed as a less expensive and automated option. Statistical-inference software packages use parsimony, maximum likelihood, and Bayesian algorithms to determine haplotypes. A disadvantage of statistical inference is that a proportion of the inferred haplotypes may be wrong.
When haplotypes are used for genome-wide association studies, it is important to consider the population being studied, because different populations often have different patterns of LD.
One example is the difference between African-descended populations and European- and Asian-descended populations. Since humans originated in Africa and spread into Europe and then the Asian and American continents, African populations are the most genetically diverse and have smaller regions of LD, while European- and Asian-descended populations have larger regions of LD due to the founder effect. When LD patterns differ between populations, SNPs can become dissociated from each other because of changes in haplotype blocks. This means that tag SNPs, as representatives of haplotype blocks, are population-specific, and population differences should be taken into account when performing association studies.
Almost every trait has both genetic and environmental influences. Heritability is the proportion of phenotypic variance in a population that is attributable to genetic variation. Association studies are used to determine the genetic influence on phenotypic presentation. Although mostly used for mapping diseases to genomic regions, they can be used to map the heritability of any phenotype, such as height or eye color.
Genome-wide association studies (GWAS) use single-nucleotide polymorphisms (SNPs) to identify genetic associations with clinical conditions and phenotypic traits. They are hypothesis-free and use a whole-genome approach to investigate traits by comparing a large group of individuals who express a phenotype with a large group of individuals who do not. The ultimate goal of GWAS is to determine genetic risk factors that can be used to predict who is at risk for a disease, to understand the biological underpinnings of disease susceptibility, and to create new prevention and treatment strategies. The National Human Genome Research Institute and the European Bioinformatics Institute publish the GWAS Catalog, a catalog of published genome-wide association studies that highlights statistically significant associations between hundreds of SNPs and a broad range of phenotypes.
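The case-versus-control comparison at a single SNP can be illustrated with a basic allelic test: count how often each allele occurs in each group and ask whether the difference is larger than chance would explain. The sketch below uses a Pearson chi-square test on a 2×2 table of allele counts. It is only one simple option (real studies typically use regression models with covariates and correct for the very large number of SNPs tested), and the function name and the counts are hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). All four arguments are allele counts, i.e. each
    diploid individual contributes two alleles to their group.
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A whole row or column is empty, so the test is undefined;
        # report no evidence of association.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper tail of the chi-square distribution
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: the alternate allele is more frequent among cases.
chi2, p = allelic_chi_square(case_alt=60, case_ref=40,
                             control_alt=40, control_ref=60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

In a genome-wide study this test would be repeated for every genotyped SNP, which is why the significance threshold applied to each individual test has to be far stricter than the conventional 0.05.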
Due to the large number of possible SNP variants (more than 149 million as of June 2015), it is still very expensive to genotype every SNP. That is why GWAS use customizable arrays (SNP chips) to genotype only a subset of the variants, identified as tag SNPs. Most GWAS use products from the two primary genotyping platforms. The Affymetrix platform prints DNA probes on a glass or silicon chip that hybridize to specific alleles in the sample DNA. The Illumina platform uses bead-based technology with longer DNA sequences, which produces better specificity. Both platforms are able to genotype more than a million tag SNPs using either pre-made or custom DNA oligos.
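How a subset of tag SNPs might be chosen from a reference panel can be sketched with a greedy heuristic: compute pairwise r² between SNPs, then repeatedly pick the SNP that is in high LD with the largest number of SNPs not yet covered. This is an illustrative sketch under stated assumptions (phased 0/1 haplotypes, a hypothetical threshold of r² ≥ 0.8, made-up data), not the procedure used by any specific array design.

```python
def r_squared(haplotypes, i, j):
    """Squared allelic correlation between SNP columns i and j."""
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n
    p_j = sum(h[j] for h in haplotypes) / n
    p_ij = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (p_ij - p_i * p_j) ** 2 / denom


def select_tag_snps(haplotypes, threshold=0.8):
    """Greedily choose tag SNPs so every SNP has r^2 >= threshold with a tag.

    haplotypes: list of phased chromosomes, each a list of 0/1 alleles.
    Returns the indices of the chosen tag SNPs in the order they were picked.
    """
    m = len(haplotypes[0])
    # covers[i] = SNPs that SNP i can represent (always includes i itself).
    covers = {i: {i} for i in range(m)}
    for i in range(m):
        for j in range(i + 1, m):
            if r_squared(haplotypes, i, j) >= threshold:
                covers[i].add(j)
                covers[j].add(i)

    untagged = set(range(m))
    tags = []
    while untagged:
        # Pick the untagged SNP covering the most remaining SNPs;
        # ties go to the lowest index.
        best = max(sorted(untagged), key=lambda i: len(covers[i] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Hypothetical panel: SNPs 0 and 1 are identical; SNP 3 mirrors SNP 2.
panel = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]
print(select_tag_snps(panel))
```

Because LD patterns differ between populations, the same procedure run on panels from different populations can return different tag sets.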
