Microsatellites and their importance:
Genomes are scattered with simple repeats called microsatellites. A microsatellite is a stretch of DNA in which a short motif of one to six bases (a mono-, di-, tri-, tetra-, penta- or hexanucleotide unit) is repeated in tandem. Repeats of longer units form minisatellites or, in the extreme case, satellite DNA. The term satellite DNA originates from the observation in the 1960s of a fraction of sheared DNA that showed a distinct buoyant density, detectable as a ‘satellite peak’ in density gradient centrifugation, and that was subsequently identified as large centromeric tandem repeats. When shorter (10–30-bp) tandem repeats were later identified, they came to be known as minisatellites. Finally, with the discovery of tandem iterations of simple sequence motifs, the term microsatellites was coined.
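The definition above (a motif of one to six bases repeated in tandem) can be turned into a small search routine. The following is only a minimal sketch: the function name find_microsatellites and the threshold of three repeats are assumptions made for illustration, not values taken from the text, and published surveys use a variety of length cut-offs.

```python
import re

def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    min_repeats is an illustrative threshold, not a standard value.
    """
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # A motif of `unit` bases followed by at least (min_repeats - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "CACA" is reported as the dinucleotide "CA").
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)
```

For example, find_microsatellites("TTGCACACACACAGG") returns [(3, 'CA', 5)], that is, a (CA)5 dinucleotide repeat starting at position 3 (counting from zero). The sketch only detects perfect repeats; interrupted or imperfect repeats would need a more tolerant matching rule.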
It is appropriate to study the association of microsatellites with coding sequence, as this is related to the mutational and selective forces that operate on different types of repeat. The bulk of simple repeats are embedded in non-coding DNA, either in the intergenic sequence or in the introns. Microsatellites that are used as genetic markers are usually of this type and are generally assumed to evolve neutrally. Their frequency and distribution should therefore reflect the underlying mutation process. In coding DNA, selection against frameshift mutations effectively hinders the expansion of everything other than trinucleotide repeats (Metzgar and Wills, 2000), for which there might be further length constraints related to protein function (Alba and Guigo, 2004).
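The frameshift argument comes down to simple arithmetic: gaining or losing repeat units keeps the codon reading frame intact only when the total number of bases added or removed is a multiple of three. A minimal sketch follows; the function name preserves_reading_frame is hypothetical and chosen only for this illustration.

```python
def preserves_reading_frame(unit_length, repeats_changed):
    """True if gaining or losing `repeats_changed` repeat units
    of `unit_length` bases leaves the codon reading frame intact."""
    return (unit_length * repeats_changed) % 3 == 0

# Motif lengths (1-6 bp) for which a single-unit change is frame-neutral.
frame_neutral = [u for u in range(1, 7) if preserves_reading_frame(u, 1)]
```

Here frame_neutral evaluates to [3, 6]: a one-unit change in a mono-, di-, tetra- or pentanucleotide repeat shifts the frame, whereas a trinucleotide repeat (and, by the same arithmetic, a hexanucleotide repeat) does not.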
Trinucleotide repeats associated with human disease comprise a special class of microsatellites in coding DNA. These loci undergo extensive repeat expansions, the mutational mechanism of which is thought to differ from that of most microsatellites in the genome. For instance, the formation of hairpin structures with a relatively high degree of base-pair complementarity might stabilize loops that are generated during replication slippage.
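As a rough illustration of what base-pair complementarity within a hairpin means, one can fold a repeat tract back on itself and count how many opposing bases form Watson-Crick pairs. This is a deliberately crude sketch, not a structure-prediction method: the function foldback_pairing and its loop parameter are assumptions introduced here, and the model ignores stacking energies, G-T wobble pairs and alternative alignments of the two strands.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def foldback_pairing(seq, loop=0):
    """Fraction of Watson-Crick pairs when `seq` is folded back on itself.

    Base i is opposed to base (len - 1 - i); `loop` unpaired bases are
    left in the middle. A naive model used only for illustration.
    """
    seq = seq.upper()
    n_pairs = (len(seq) - loop) // 2
    if n_pairs <= 0:
        return 0.0
    matched = sum(COMPLEMENT.get(seq[i]) == seq[-1 - i] for i in range(n_pairs))
    return matched / n_pairs
```

Under this simple model, foldback_pairing("CAG" * 4) gives two thirds (C-G and G-C pairs separated by A-A mismatches), whereas foldback_pairing("CA" * 6) gives 0.0, which is consistent with the idea that some trinucleotide motifs can stabilize slipped loops far better than a typical dinucleotide repeat.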
Microsatellite density tends to correlate positively with genome size (Hancock, 1996; Toth et al., 2000; Katti et al., 2001). Among fully sequenced eukaryotic genomes, microsatellite density is highest in mammals. However, in plants, microsatellite frequency is negatively correlated with genome size (Morgante et al., 2002). This has been attributed to the fact that microsatellites are underrepresented in the repetitive parts of the plant genome that are involved in genome expansion, such as the long terminal repeats of retrotransposons (Morgante et al., 2002). The contrasting distributions of microsatellite motifs in different genomes strongly indicate that there is interspecific variation in the mechanisms of mutation or repair of specific motifs. Alternatively, there might be variation in the selective constraints that are associated with different microsatellite motifs.
Microsatellites are also frequently found in the proximity of interspersed repetitive elements such as short interspersed elements (SINEs) and long interspersed elements (LINEs).