Mixture distribution approach for identifying differentially expressed genes in microarray data of Arabidopsis thaliana
Keywords:
Differential gene expression, Microarray, Mixture distribution, Normal distribution
Abstract
The basic aim of analyzing gene expression data is to identify genes whose expression patterns differ between treatment samples and control or healthy samples. Microarray technology measures the relative expression of thousands of genes simultaneously, within a particular cell population or tissue, in a single experiment through the hybridization of RNA. The present paper uses a mixture distribution approach to identify differentially expressed genes in sequence data of Arabidopsis thaliana under two conditions, salt-stressed and control. A two-component normal mixture model was fitted to the normalized data, and its parameters were estimated using the EM algorithm. A likelihood ratio test (LRT) was performed to assess goodness of fit. The two-component normal mixture model captured more variability than a single-component normal distribution and identified the differentially expressed genes more accurately.
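The fitting procedure described in the abstract can be illustrated with a minimal Python/NumPy sketch. It is not the authors' implementation: the function names (`fit_two_component_em`, `single_normal_loglik`, `lrt_statistic`), the median-split starting values, the convergence tolerance, and the synthetic data are all assumptions made for illustration. The sketch fits a two-component normal mixture by EM, fits a single normal by maximum likelihood, and returns the likelihood ratio statistic comparing the two. It reports only the statistic, since the usual chi-square reference distribution does not strictly hold when testing the number of mixture components.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def component_densities(x, weights, means, sds):
    """Weighted density of each component; one column per component."""
    return weights * normal_pdf(x[:, None], means, sds)


def fit_two_component_em(x, n_iter=500, tol=1e-8):
    """Fit a two-component normal mixture by EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    # Assumed starting values: split the data at the median.
    med = np.median(x)
    lower, upper = x[x <= med], x[x > med]
    weights = np.array([0.5, 0.5])
    means = np.array([lower.mean(), upper.mean()])
    sds = np.array([lower.std(), upper.std()]) + 1e-6
    prev = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        dens = component_densities(x, weights, means, sds)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted updates of mixing proportions, means, and SDs.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        sds = np.sqrt((resp * (x[:, None] - means) ** 2).sum(axis=0) / nk)
        sds = np.maximum(sds, 1e-6)  # guard against a collapsing component
        loglik = np.log(component_densities(x, weights, means, sds).sum(axis=1)).sum()
        if loglik - prev < tol:
            break
        prev = loglik
    # Recompute responsibilities under the final parameter values.
    dens = component_densities(x, weights, means, sds)
    resp = dens / dens.sum(axis=1, keepdims=True)
    return {"weights": weights, "means": means, "sds": sds,
            "loglik": loglik, "resp": resp}


def single_normal_loglik(x):
    """Log-likelihood of a single normal fitted by maximum likelihood."""
    x = np.asarray(x, dtype=float)
    return np.log(normal_pdf(x, x.mean(), x.std())).sum()


def lrt_statistic(x, fit):
    """2 * (mixture log-likelihood - single-normal log-likelihood)."""
    return 2.0 * (fit["loglik"] - single_normal_loglik(x))


# Synthetic stand-in for normalized expression values (not real data).
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(0.0, 1.0, 800), rng.normal(4.0, 1.0, 200)])
fit = fit_two_component_em(values)
print("mixing proportions:", fit["weights"])
print("component means:", fit["means"])
print("LRT statistic:", lrt_statistic(values, fit))
```

The `resp` array holds, for each value, the posterior probability of belonging to each component. One plausible use, again an assumption rather than the paper's stated rule, is to flag a gene as differentially expressed when its posterior probability for the non-null component exceeds a chosen cutoff.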
License
Copyright (c) 2020 The Indian Journal of Agricultural Sciences

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The copyright of the articles published in The Indian Journal of Agricultural Sciences is vested with the Indian Council of Agricultural Research, which reserves the right to enter into any agreement with any organization in India or abroad, for reprography, photocopying, storage and dissemination of information. The Council has no objection to using the material, provided the information is not being utilized for commercial purposes and wherever the information is being used, proper credit is given to ICAR.