Performance evaluation of support vector machine (SVM)-based predictors in genomic selection
Keywords:
Genetic architecture, Genomic breeding values, QTL effects, SNP, Support vector machine
Abstract
The aim was to compare the predictive performance of SVM-based predictors constructed using different kernel functions (radial, sigmoid, linear and polynomial) under different genetic architectures of a trait (number of QTL, distribution of QTL effects) and heritability levels. To this end, a genome comprising five chromosomes, one Morgan each, was simulated, on which 10,000 bi-allelic single nucleotide polymorphisms (SNP) were distributed. Cross-validation employing a grid search was used to tune the meta-parameters of each kernel function. Pearson’s correlation between the true and predicted genomic breeding values (rp,t) and the mean squared error of predicted genomic breeding values (MSEp) were used as measures of predictive accuracy and overall fit, respectively. Meta-parameter optimization had a significant effect on the predictive performance of SVM-based predictors: with improper meta-parameters, the predictive power of the models decreased significantly. In all models, prediction accuracy increased as heritability increased and the number of QTL decreased. In most scenarios, radial- and sigmoid-based SVM predictors outperformed polynomial and linear models. The linear- and polynomial-based SVM had lower rp,t and higher MSEp and, therefore, are not recommended for genomic selection. The prediction accuracy of radial and sigmoid models was approximately the same in most of the studied scenarios; however, considering all pros and cons of the radial and sigmoid kernels, the radial kernel is recommended as the best kernel function for constructing SVM. All of the studied SVM-based predictors were efficient in their use of time and memory.
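The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not give the exact formula for MSEp, so the sketch assumes the usual mean of squared differences between predicted and true genomic breeding values, and the function names and toy values are hypothetical.

```python
import math


def pearson_r(true_values, predicted_values):
    """Pearson correlation between true and predicted values (r_p,t)."""
    n = len(true_values)
    if n != len(predicted_values) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(true_values) / n
    mean_p = sum(predicted_values) / n
    # Sums of cross-products and squares around the means.
    cov = sum((t - mean_t) * (p - mean_p)
              for t, p in zip(true_values, predicted_values))
    ss_t = sum((t - mean_t) ** 2 for t in true_values)
    ss_p = sum((p - mean_p) ** 2 for p in predicted_values)
    if ss_t == 0 or ss_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(ss_t * ss_p)


def mse(true_values, predicted_values):
    """Mean squared error of predictions (assumed definition of MSE_p)."""
    n = len(true_values)
    if n != len(predicted_values) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - t) ** 2
               for t, p in zip(true_values, predicted_values)) / n


# Hypothetical toy data: every prediction is biased upward by 0.5.
true_gbv = [1.0, 2.0, 3.0, 4.0]
pred_gbv = [1.5, 2.5, 3.5, 4.5]

print(pearson_r(true_gbv, pred_gbv))
print(mse(true_gbv, pred_gbv))
```

In this toy case the predictions rank the individuals perfectly, so the correlation is at its maximum, while the constant bias still shows up in the mean squared error. This is one reason to report both measures: the correlation reflects accuracy of ranking and the MSE reflects overall fit.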
Copyright (c) 2017 The Indian Journal of Animal Sciences

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The copyright of the articles published in The Indian Journal of Animal Sciences is vested with the Indian Council of Agricultural Research, which reserves the right to enter into any agreement with any organization in India or abroad, for reprography, photocopying, storage and dissemination of information. The Council has no objection to using the material, provided the information is not being utilized for commercial purposes and wherever the information is being used, proper credit is given to ICAR.