Differential Expression Analysis in High-yielding and Low-yielding Philippine Coconut through Transcriptome Sequencing

Ma. Regina Punzalan1,2, Gamaliel Lysander Cabria1,2, Ma. Anita Bautista1,2,
Ernesto Emmanuel3, Ramon Rivera3, Susan Rivera3, and Cynthia Saloma1,2*

1Philippine Genome Center (PGC), University of the Philippines,
Diliman, Quezon City 1101 Philippines
2National Institute of Molecular Biology and Biotechnology, University of the Philippines,
Diliman, Quezon City 1101 Philippines
3Philippine Coconut Authority – Zamboanga Research Center (PCA-ZRC),
San Ramon, Zamboanga City 7000 Philippines

*Corresponding Author





The demand for coconut oil (CNO) continues to rise in the global market, putting pressure on coconut-producing countries such as the Philippines to increase CNO and copra production. Baybay Tall (BAYT) is known to have the highest copra yield among the tall coconut varieties in the Philippines. However, traditional breeding techniques that rely on morphological markers are limited, laborious, and time-consuming. To improve breeding strategies for increased copra production, differential gene expression analysis was performed on the coconut shell and kernel of high-yielding and low-yielding palms. High-quality RNA was isolated from the endosperm (ES or kernel) and endocarp (EC or shell) of nut tissues, followed by transcriptome sequencing using Illumina HiSeq2000. De novo transcriptome assembly was performed using Trinity. Read abundance was estimated using Corset, and differentially expressed genes were identified using edgeR. In total, 1,945 genes were found to be differentially expressed (FDR < 0.05) in the nut tissues. Annotation of the transcripts revealed that only 82 of the differentially expressed genes had significant annotation. Potential gene-targeted markers (GTMs) were designed for 64 candidate genes, which can be further validated for possible use in the marker-assisted selection of high-yielding palms. Microsatellite (SSR) sequences were identified in 19,147 unigenes in the EC and 17,394 in the ES. However, only two SSRs were found among differentially expressed genes in the EC and only one in the ES. Functional analysis revealed that high nut yield could arise from the concerted actions of several transcription activators and regulatory proteins, leading to increased cell division, secondary cell wall formation, enhanced energy metabolism, and an activated stress response. Taken together, these processes contribute to increased kernel volume and thus to increased copra yield. The genes identified in this study can be used as potential targets for improving productivity in the Philippine coconut.
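
The FDR < 0.05 threshold mentioned above can be illustrated with a short sketch. The abstract does not state which multiple-testing correction was applied; the example below assumes the Benjamini-Hochberg procedure, a common choice for this kind of analysis. The function names and the p-values are hypothetical and serve only to show how a raw p-value below 0.05 may still fail the FDR cutoff.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used here for illustration only; the abstract
    does not name the correction method.
    """
    m = len(pvalues)
    # Indices of the p-values sorted in ascending order
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never decrease with rank (monotonicity step)
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_de_genes(gene_pvalues, fdr_cutoff=0.05):
    """Return the genes whose adjusted p-value is below the cutoff."""
    genes = list(gene_pvalues)
    adjusted = benjamini_hochberg([gene_pvalues[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q < fdr_cutoff]


# Hypothetical per-gene p-values (not data from the study)
toy_pvalues = {
    "gene_a": 0.001,
    "gene_b": 0.008,
    "gene_c": 0.039,
    "gene_d": 0.041,
    "gene_e": 0.60,
}

selected = select_de_genes(toy_pvalues)
print(selected)
```

In this toy input, gene_c and gene_d have raw p-values below 0.05 but adjusted values slightly above it, so only gene_a and gene_b are retained.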



Copra, or dried coconut kernel, is a highly resourced commodity in the tropics. CNO is extracted from copra, and the residue from extraction can be used as livestock feed. CNO is favored over other vegetable oils because of its lower long-chain fatty acid content, higher medium-chain saturated fatty acid content, higher burning point, and perceived medical advantages (Young 1983, Dyer et al. 2008, DebMandal and Mandal 2011).



Amano Y, Tsubouchi H, Shinohara H, Ogawa M, Matsubayashi Y. 2007. Tyrosine-sulfated glycopeptide involved in cellular proliferation and expansion in Arabidopsis. Proceedings of the National Academy of Sciences 104(46): 18333–18338.
Augustine RC, York SL, Rytz TC, Vierstra RD. 2016. Defining the SUMO system in maize: SUMOylation is up-regulated during endosperm development and rapidly induced by stress. Plant Physiology 171(3): 2191–2210.
Balandín M, Royo J, Gómez E, Muniz LM, Molina A, Hueros G. 2005. A protective role for the embryo surrounding region of the maize endosperm, as evidenced by the characterisation of ZmESR-6, a defensin gene specifically expressed in this region. Plant Mol. Biol. 58: 269–282.
Batugal P, Bourdeix R, Baudouin L. 2009. Chapter 10: Coconut Breeding. In: Jain SM, Priyadarshan PM eds. Breeding Plantation Tree Crops: Tropical Species. New York: Springer New York.
Batugal P, Rao VR, Oliver J eds. 2005. Coconut Genetic Resources. Serdang, Malaysia: International Plant Genetic Resources Institute – Regional Office for Asia, the Pacific and Oceania (IPGRI-APO).
Blilou I, Frugier F, Folmer S, Serralbo O, Willemsen V, Wolkenfelt H, Eloy NB, Ferreira PC, Weisbeek P, Scheres B. 2002. The Arabidopsis HOBBIT gene encodes a CDC27 homolog that links the plant cell cycle to progression of cell differentiation. Genes & Development 16(19): 2566–2575.
Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30(15): 2114–2120.
Camardella L, Carratore V, Ciardiello MA, Servillo L, Balestrieri C, Giovane A. 2000. Kiwi protein inhibitor of pectin methylesterase: Amino-acid sequence and structural importance of two disulfide bridges. European Journal of Biochemistry 267(14): 4561–4565.
Carrari F, Fernie AR. 2006. Metabolic regulation underlying tomato fruit development. J Exp. Bot. 57: 1883–1897.
Cho SK, Kim JE, Park JA, Eom TJ, Kim WT. 2006. Constitutive expression of abiotic stress-inducible hot pepper CaXTH3, which encodes a xyloglucan endotransglucosylase/hydrolase homolog, improves drought and salt tolerance in transgenic Arabidopsis plants. FEBS Letters 580(13): 3136–3144.
Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. 2005. Blast2GO: A universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21: 3674–3676.
Davidson NM, Oshlack A. 2014. Corset: Enabling differential gene expression analysis for de novo assembled transcriptomes. Genome Biol. 15: 410.
DebMandal M, Mandal S. 2011. Coconut (Cocos nucifera L.: Arecaceae): In health promotion and disease prevention. Asian Pac. J Trop. Med. 4: 241–247.
Dyer JM, Stymne S, Green AG, Carlsson AS. 2008. High-value oils from plants. Plant J 54: 640–655.
Eastmond PJ, Germain V, Lange PR, Bryce JH, Smith SM, Graham IA. 2000. Postgerminative growth and lipid catabolism in oilseeds lacking the glyoxylate cycle. Proc. Natl. Acad. Sci. 97: 5669–5674.
Fan H, Xiao Y, Yang Y, Xia W, Mason AS, Xia Z, Qiao F, Zhao S, Tang H. 2013. RNA-Seq analysis of Cocos nucifera: Transcriptome sequencing and de novo assembly for subsequent functional genomics approaches. PLoS One 8(3): e59997.
[FAO] Food and Agriculture Organization. 2001. Non-forest tree plantations. Report based on the work of W. Killmann. Forest Plantation Thematic Papers, Working Paper 6.
Gao LL, Xue HW. 2012. Global analysis of expression profiles of rice receptor-like kinase genes. Molecular Plant 5(1): 143–153.
Götz S, García-Gómez JM, Terol J, Williams TD, Nagaraj SH, Nueda MJ, Robles M, Talón M, Dopazo J, Conesa A. 2008. High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Research 36(10): 3420–3435.
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology 29(7): 644–652.
Herrán A, Estioko L, Becker D, Rodriguez MJB, Rohde W, Ritter E. 2000. Linkage mapping and QTL analysis in coconut (Cocos nucifera L.). Theor. Appl. Genet. 101: 292–300.
Index Mundi. 2019. Philippine Coconut Oil Exports by Year. Retrieved on 11 Mar 2019.
Jones P, Binns D, Chang HY, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, Pesseat S, Quinn AF, Sangrador-Vegas A, Scheremetjew M, Yong SY, Lopez R, Hunter S. 2014. InterProScan 5: Genome-scale protein function classification. Bioinformatics 30: 1236–1240.
Kim SY, Paeng SK, Nawkar GM, Maibam P, Lee ES, Kim KS, Lee DH, Park DJ, Kang SB, Kim MR, Lee JH, Kim YH, Kim WY, Kang CH. 2011. The 1-Cys peroxiredoxin, a regulator of seed dormancy, functions as a molecular chaperone under oxidative stress conditions. Plant Science 181(2): 119–124.
Kozieradzka-Kiszkurno M, Płachno BJ. 2012. Are there symplastic connections between the endosperm and embryo in some angiosperms? A lesson from the Crassulaceae family. Protoplasma 249: 1081–1089.  
Liu Y, He Z, Appels R, Xia X. 2012. Functional markers in wheat: Current status and future prospects. Theor. Appl. Genet. 125: 1–10.
Lujaji F, Bereczky A, Janosi L, Novak C, Mbarawa M. 2010. Cetane number and thermal properties of vegetable oil, biodiesel, 1-butanol and diesel blends. J Therm. Anal. Calorim. 102: 1175–1181.
Murphy DJ, Rawsthorne S, Hills MJ. 1993. Storage lipid formation in seeds. Seed Sci. Res. 3: 79–95.
Musacchia F, Basu S, Petrosino G, Salvemini M, Sanges R. 2015. Annocript: A flexible pipeline for the annotation of transcriptomes able to identify putative long noncoding RNAs. Bioinformatics 31: 2199–2201.
Nejat N, Cahill DM, Vadamalai G, Ziemann M, Rookes J, Naderali N. 2015. Transcriptomics-based analysis using RNA-Seq of the coconut (Cocos nucifera) leaf in response to yellow decline phytoplasma infection. Mol. Genet. Genomics 290: 1899–1910.
[PCA] Philippine Coconut Authority. 2019. History of the coconut industry in the Philippines. Retrieved on 3 Jul 2019.
Poczai P, Varga I, Laos M, Cseh A, Bell N, Valkonen JP, Hyvönen J. 2013. Advances in plant gene-targeted and functional markers: A review. Plant Methods 9: 1–32.
Rautengarten C, Usadel B, Neumetzler L, Hartmann J, Büssis D, Altmann T. 2008. A subtilisin-like serine protease essential for mucilage release from Arabidopsis seed coats. The Plant Journal 54(3): 466–480.
Rivera R, Edwards KJ, Barker J, Arnold GM, Ayad G, Hodgkin T, Karp A. 1999. Isolation and characterization of polymorphic microsatellites in Cocos nucifera L. Genome 42: 668–675.
Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140.
Rozen S, Skaletsky H. 2000. Primer3 on the WWW for general users and for biologist programmers. Methods in Molecular Biology 132: 365–386.
Salmani MH, Rehman S, Zaidi K, Hasan AK. 2015. Study of ignition characteristics of microemulsion of coconut oil under off diesel engine conditions. Eng. Sci. Technol. Int. J 18: 318–324.
Schulz MH, Zerbino DR, Vingron M, Birney E. 2012. Oases: Robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics 28: 1086–1092.
Sebaa HS, Harche MK. 2014. Anatomical structure and ultrastructure of the endocarp cell walls of Argania spinosa (L.) Skeels (Sapotaceae). Micron 67: 100–106.
Stadler R, Lauterbach C, Sauer N. 2005. Cell-to-cell movement of green fluorescent protein reveals post-phloem transport in the outer integument and identifies symplastic domains in Arabidopsis seeds and embryos. Plant Physiology 139(2): 701–712.
Tamagnone L, Merida A, Parr A, Mackay S, Culianez-Macia FA, Roberts K, Martin C. 1998. The AmMYB308 and AmMYB330 transcription factors from Antirrhinum regulate phenylpropanoid and lignin biosynthesis in transgenic tobacco. The Plant Cell 10(2): 135–154.
Tang S, Lomsadze A, Borodovsky M. 2015. Identification of protein coding regions in RNA transcripts. Nucleic Acids Res. 43: e78.
Tatusov RL, Galperin MY, Natale DA, Koonin EV. 2000. The COG database: A tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28(1): 33–36.
Teulat B, Aldam C, Trehin R, Lebrun P, Barker JHA, Arnold GM, Karp A, Baudouin L, Rognon F. 2000. An analysis of genetic diversity in coconut (Cocos nucifera) populations from across the geographic range using sequence-tagged microsatellites (SSRs) and AFLPs. Theor. Appl. Genet. 100: 764–771.
Thiel J, Rolletschek H, Friedel S, Lunn JE, Nguyen TH, Feil R, Tschiersch H, Müller M, Borisjuk L. 2011. Seed-specific elevation of non-symbiotic hemoglobin AtHb1: Beneficial effects and underlying molecular networks in Arabidopsis thaliana. BMC Plant Biology 11(1): 48.
Thiel T, Michalek W, Varshney R, Graner A. 2003. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor. Appl. Genet. 106: 411.
To JP, Reiter WD, Gibson SI. 2002. Mobilization of seed storage lipid by Arabidopsis seedlings is retarded in the presence of exogenous sugars. BMC Plant Biol. 2: 4.
Varshney RK, Graner A, Sorrells ME. 2005. Genic microsatellite markers in plants: Features and applications. Trends Biotechnol. 23: 48–55.
Vigeolas H, van Dongen JT, Waldeck P, Hühn D, Geigenberger P. 2003. Lipid storage metabolism is limited by the prevailing low oxygen concentrations within developing seeds of oilseed rape. Plant Physiol. 133: 2048–2060.
Wang Y, Coleman-Derr D, Chen G, Gu YQ. 2015. OrthoVenn: A web server for genome wide comparison and annotation of orthologous clusters across multiple species. Nucleic Acids Res. 43: W78–W84.
Wolf S, Mouille G, Pelloux J. 2009. Homogalacturonan methyl-esterification and plant development. Molecular Plant 2(5): 851–860.
Xie Y, Wu G, Tang J, Luo R, Patterson J, Liu S, Huang W, He G, Gu S, Li S, Zhou X, Lam TW, Li Y, Xu X, Wong GK, Wang J. 2014. SOAPdenovo-Trans: De novo transcriptome assembly with short RNA-Seq reads. Bioinformatics 30(12): 1660–1666.
Ye J, Fang L, Zheng H, Zhang Y, Chen J, Zhang Z, Wang J, Li S, Li R, Bolund L, Wang J. 2006. WEGO: A web tool for plotting GO annotations. Nucleic Acids Res. 34: W293–W297.
Young FVK. 1983. Palm kernel and coconut oils: Analytical characteristics, process technology and uses. J Am. Oil Chem. Soc. 60: 374–379.
Zdobnov EM, Apweiler R. 2001. InterProScan – An integration platform for the signature-recognition methods in InterPro. Bioinformatics 17: 847–848.