Preliminary Discovery of Repetitive Elements in the Genome
of the Sea Cucumber Holothuria scabra Jaeger, 1833

Delbert Almerick T. Boncan1,2, Iris Diana C. Uy1,
Crimson C. Tayco1, and Arturo O. Lluisma1,3*

1Marine Science Institute, College of Science,
University of the Philippines Diliman, Quezon City 1101 Philippines
2National Institute of Molecular Biology and Biotechnology, College of Science,
University of the Philippines Diliman, Quezon City 1101 Philippines
3Philippine Genome Center, University of the Philippines Diliman,
Quezon City 1101 Philippines

*Corresponding author: Arturo O. Lluisma

Various classes of repetitive elements exist in the genomes of organisms. Characterizing these genomic elements is important not only because of the potential insights into the biology and evolution of their hosts' genomes but also because of the potential practical applications that such information might yield. So far, little is known about the types of repetitive elements in the genomes of holothurids. In this study, we generated a partial sequence of the genome of the sea cucumber Holothuria scabra and searched for tandem and interspersed repetitive elements using various approaches. We conducted the same search on another sea cucumber, Parastichopus parvimensis, using its publicly available genome sequence. The perfect microsatellite profiles of both sea cucumbers show similarities to some known patterns in eukaryotes. The combined perfect and imperfect microsatellite data sets also highlight fundamental dissimilarities between the microsatellite profiles of the two holothurids. This study demonstrates that as much as half of the microsatellites in a holothurid genome remain unidentified in perfect repeat scans, and highlights the importance of searches that include imperfect repeats. It also demonstrates that partial genome sequencing may be used as a cheaper and more efficient alternative to the traditional methods of developing microsatellite markers for H. scabra. In addition, a combined approach of sequence similarity-based and de novo searches for interspersed repeats reveals diverse subclasses/superfamilies of transposable elements in the genomes of H. scabra and P. parvimensis. The two species exhibit similar repeat profiles notwithstanding the disparity in the number of predicted transposable elements. Notably, the major subclasses/superfamilies identified in the two genomes include DNA/hAT-Blackjack, DNA/hAT-Tip100, DNA/Maverick, RC/Helitron, LINE/L2, LTR/Gypsy, SINE/MIR and SINE/tRNA. The interspersed repeats identified in this study represent the first survey of transposable elements in the genomes of these two holothurids.

It has long been known that repetitive elements can account for a sizeable fraction of many eukaryotic genomes (Britten and Kohne 1968). Depending on the species, this proportion can vary from a few percent (e.g. 3% in Saccharomyces cerevisiae, Kim et al. 1998) to most of the genome (e.g. > 80% in maize, Schnable et al. 2009). These repeats are classified as tandem or interspersed repeats based on their sequence characteristics and the mechanism of their generation and replication in the genome. Tandem repeats comprise microsatellites and minisatellites. Microsatellites can exhibit high levels of intraspecific polymorphism and thus have emerged as popular genetic markers for a wide range of applications in population genetics, conservation biology and evolutionary biology (Goldstein and Schlotterer 1999). Interspersed repeats, on the other hand, consist mainly of transposable elements (TEs), which were initially considered selfish, junk genetic elements. ...
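To make the notion of a perfect microsatellite concrete, the following is a minimal Python sketch of a perfect-repeat scan. It is an illustration only and not the pipeline used in this study: the function name find_perfect_microsatellites and the minimum copy-number thresholds are assumptions chosen for the example. Imperfect repeats, which contain interruptions, would not be detected by such a scan, which is the limitation the abstract refers to.

```python
import re

# Illustrative thresholds (an assumption, not the study's settings):
# motif length in bp -> minimum number of tandem copies to report.
DEFAULT_MIN_COPIES = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_perfect_microsatellites(seq, min_copies=None):
    """Return sorted (start, end, motif, copies) tuples for perfect repeats."""
    if min_copies is None:
        min_copies = DEFAULT_MIN_COPIES
    seq = seq.upper()
    hits = []
    for unit_len, threshold in min_copies.items():
        # A motif of unit_len bases followed by at least threshold-1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, threshold - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each locus is reported once under its shortest motif.
            reducible = any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            )
            if reducible:
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), match.end(), motif, copies))
    return sorted(hits)

if __name__ == "__main__":
    example = "GGC" + "AT" * 8 + "CCG" + "A" * 12 + "T"
    for start, end, motif, copies in find_perfect_microsatellites(example):
        print(start, end, motif, copies)
```

Coordinates are 0-based and half-open, following Python slicing, so seq[start:end] recovers the repeat tract.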

BAO Z, EDDY SR. 2002. Automated De Novo Identification of Repeat Sequence Families in Sequenced Genomes. Genome Res. 12:1269–1276.
BRITTEN R, KOHNE D. 1968. Repeated sequences in DNA. Hundreds of thousands of copies of DNA sequences have been incorporated into the genomes of higher organisms. Science 161:529–540.
BUSCHIAZZO E, GEMMELL NJ. 2006. The rise, fall and renaissance of microsatellites in eukaryotic genomes. BioEssays 28:1040–50.
CLARK RM, BHASKAR SS, MIYAHARA M, DALGLIESH GL, BIDICHANDANI SI. 2006. Expansion of GAA trinucleotide repeats in mammals. Genomics 87:57–67.
DELGRANGE O, RIVALS E. 2004. STAR: an algorithm to Search for Tandem Approximate Repeats. Bioinformatics 20:2812–2820.
ESTOUP A, WILSON IJ, SULLIVAN C, CORNUET JM, MORITZ C. 2001. Inferring population history from microsatellite and enzyme data in serially introduced cane toads, Bufo marinus. Genetics 159:1671–1687.
ESTOUP A, JARNE P, CORNUET J-M. 2002. Homoplasy and mutation model at microsatellite loci and their consequences for population genetics analysis. Mol. Ecol. 11:1591–1604.
ESTOUP A, SOLIGNAC M, HARRY M, CORNUET JM. 1993. Characterization of (GT)n and (CT)n microsatellites in two insect species: Apis mellifera and Bombus terrestris. Nucleic Acids Res. 21:1427–1431.
FESCHOTTE C, KESWANI U, RANGANATHAN N, GUIBOTSY ML, LEVINE D. 2009. Exploring repetitive DNA landscapes using REPCLASS, a tool that automates the classification of transposable elements in eukaryotic genomes. Genome Biol. Evol. 1:205–220.
FESCHOTTE C, PRITHAM EJ. 2007. DNA transposons and the evolution of eukaryotic genomes. Annu. Rev. Genet. 41:331–368.
GARZA JC, SLATKIN M, FREIMER NB. 1995. Microsatellite allele frequencies in humans and chimpanzees, with implications for constraints on allele size. Mol. Biol. Evol. 12:594–603.
GOLDSTEIN D, SCHLOTTERER C. 1999. Microsatellites: Evolution and Applications. Oxford Univ. Press.
GROVER A, AISHWARYA V, SHARMA PC. 2007. Biased distribution of microsatellite motifs in the rice genome. Mol. Genet. Genomics 277:469–480.
GUICHOUX E, LAGACHE L, WAGNER S, CHAUMEIL P, LÉGER P, LEPAIS O, LEPOITTEVIN C, MALAUSA T, REVARDEL E, SALIN F, ET AL. 2011. Current trends in microsatellite genotyping. Mol. Ecol. Resour. 11:591–611.
KANG JH, KIM YK, KIM MJ, PARK JY, AN CM, KIM BS, JUN JC, KIM SK. 2011. Genetic differentiation among populations and color variants of sea cucumbers (Stichopus japonicus) from Korea and China. Int. J. Biol. Sci. 7:323–332.
KIM J, VANGURI S, BOEKE J, GABRIEL A, VOYTAS D. 1998. Transposable elements and genome organization: a comprehensive survey of retrotransposons revealed by the complete Saccharomyces cerevisiae genome sequence. Genome Res. 8:464–478.
KOLPAKOV R, BANA G, KUCHEROV G. 2003. mreps: efficient and flexible detection of tandem repeats in DNA. Nucleic Acids Res. 31:3672–3678.
KONDO M, AKASAKA K. 2012. Current Status of Echinoderm Genome Analysis - What do we Know? Curr. Genomics 13:134–143.
KRULL M, PETRUSMA M, MAKALOWSKI W, BROSIUS J, SCHMITZ J. 2007. Functional persistence of exonized mammalian-wide interspersed repeat elements (MIRs). Genome Res. 17:1139–1145.
LANDER ES, LINTON LM, BIRREN B, NUSBAUM C, ZODY MC, BALDWIN J, DEVON K, DEWAR K, DOYLE M, FITZHUGH W, ET AL. 2001. Initial sequencing and analysis of the human genome. Nature 409:860–921.
LEGENDRE M, POCHET N, PAK T, VERSTREPEN KJ. 2007. Sequence-based estimation of minisatellite and microsatellite repeat variability. Genome Res. 17:1787–1796.
LERAT E. 2010. Identifying repeats and transposable elements in sequenced genomes: how to find your way through the dense forest of programs. Heredity (Edinb). 104:520–533.
LEVINSON G, GUTMAN GA. 1987. Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4:203–221.
MALPERTUY A, DUJON B, RICHARD GF. 2003. Analysis of microsatellites in 13 hemiascomycetous yeast species: mechanisms involved in genome dynamics. J. Mol. Evol. 56:730–741.
MARGULIES M, EGHOLM M, ALTMAN WE, ATTIYA S, BADER JS, BEMBEN LA, BERKA J, BRAVERMAN MS, CHEN Y-J, CHEN Z, ET AL. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437:376–380.
MEGLÉCZ E, NÈVE G, BIFFIN E, GARDNER MG. 2012. Breakdown of phylogenetic signal: a survey of microsatellite densities in 454 shotgun sequences from 154 non model eukaryote species. PLoS One 7:e40861.
MORGANTE M, HANAFEY M, POWELL W. 2002. Microsatellites are preferentially associated with nonrepetitive DNA in plant genomes. Nat. Genet. 30:194–200.
MUOTRI AR, MARCHETTO MCN, COUFAL NG, GAGE FH. 2007. The necessary junk: new functions for transposable elements. Hum. Mol. Genet. 16 Spec No:R159–167.
O’DUSHLAINE CT, EDWARDS RJ, PARK SD, SHIELDS DC. 2005. Tandem repeat copy-number variation in protein-coding regions of human genes. Genome Biol. 6:R69.
PEARSON CE, NICHOL EDAMURA K, CLEARY JD. 2005. Repeat instability: mechanisms of dynamic mutations. Nat. Rev. Genet. 6:729–742.
PELIZZOLA M, ECKER JR. 2011. The DNA methylome. FEBS Lett. 585:1994–2000.
PRICE AL, JONES NC, PEVZNER PA. 2005. De novo identification of repeat families in large genomes. Bioinformatics 21 Suppl 1:i351–358.
PRIMMER CR, SAINO N, MØLLER AP, ELLEGREN H. 1996. Directional evolution in germline microsatellite mutations. Nat. Genet. 13:391–393.
RICHARD G-F, KERREST A, DUJON B. 2008. Comparative genomics and molecular dynamics of DNA repeats in eukaryotes. Microbiol. Mol. Biol. Rev. 72:686–727.
SAINUDIIN R, DURRETT RT, AQUADRO CF, NIELSEN R. 2004. Microsatellite mutation models: insights from a comparison of humans and chimpanzees. Genetics 168:383–395.
SAINZ J, PRATS E, RUIZ S, CORNUDELLA L. 1992. Organization of repetitive DNA sequences in the genome of the echinoderm Holothuria tubulosa. Biochimie 74:1067–1074.
SEA URCHIN SEQUENCING CONSORTIUM. 2006. The genome of the sea urchin Strongylocentrotus purpuratus. Science 314:941–952.
SELKOE KA, TOONEN RJ. 2006. Microsatellites for ecologists: a practical guide to using and evaluating microsatellite markers. Ecol. Lett. 9:615–629.
SHARMA PC, GROVER A, KAHL G. 2007. Mining microsatellites in eukaryotic genomes. Trends Biotechnol. 25:490–498.
SIBLY RM, MEADE A, BOXALL N, WILKINSON MJ, CORNE DW, WHITTAKER JC. 2003. The structure of interrupted human AC microsatellites. Mol. Biol. Evol. 20:453–459.
SMIT A, HUBLEY R, GREEN P. 2010. RepeatMasker Open-3.0.
STEFANINI FM, FELDMAN MW. 2000. Bayesian estimation of range for microsatellite loci. Genet. Res. 75:167–177.
TANG J, BALDWIN SJ, JACOBS JM, VAN DER LINDEN CG, VOORRIPS RE, LEUNISSEN JA, VAN ECK H, VOSMAN B. 2008. Large-scale identification of polymorphic microsatellites using an in silico approach. BMC Bioinformatics 9:374.
TEMNYKH S, DECLERCK G, LUKASHOVA A, LIPOVICH L, CARTINHOUR S, MCCOUCH S. 2001. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Res. 11:1441–1452.
THIEL T, MICHALEK W, VARSHNEY RK, GRANER A. 2003. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor. Appl. Genet. 106:411–422.
THIEL T. 2004. MISA - MIcroSAtellite identification tool (update, September 2010).
THOMAS EE. 2005. Short, local duplications in eukaryotic genomes. Curr. Opin. Genet. Dev. 15:640–4.
THOMPSON JM, SALIPANTE SJ. 2009. PeakSeeker: a program for interpreting genotypes of mononucleotide repeats. BMC Res. Notes 2:17.
USTINOVA J, ACHMANN R, CREMER S, MAYER F. 2006. Long repeats in a huge genome: microsatellite loci in the grasshopper Chorthippus biguttulus. J. Mol. Evol. 62:158–167.
VERSTREPEN KJ, JANSEN A, LEWITTER F, FINK GR. 2005. Intragenic tandem repeats generate functional variability. Nat. Genet. 37:986–90.
WHITTAKER JC, HARBORD RM, BOXALL N, MACKAY I, DAWSON G, SIBLY RM. 2003. Likelihood-based estimation of microsatellite mutation rates. Genetics 164:781–787.
ZANE L, BARGELLONI L, PATARNELLO T. 2002. Strategies for microsatellite isolation: a review. Mol. Ecol. 11:1–16.
ZHAO X-N, USDIN K. 2015. The Repeat Expansion Diseases: The dark side of DNA repair. DNA Repair (Amst). 32:96–105.