TY - JOUR
T1 - Incremental Gaussian Discriminant Analysis based on Graybill and Deal weighted combination of estimators for brain tumour diagnosis
AU - Tortajada, Salvador
AU - Fuster-Garcia, Elies
AU - Vicente, Javier
AU - Wesseling, Pieter
AU - Howe, Franklyn A.
AU - Julià-Sapé, Margarida
AU - Candiota, Ana Paula
AU - Monleón, Daniel
AU - Moreno-Torres, Àngel
AU - Pujol, Jesús
AU - Griffiths, John R.
AU - Wright, Alan
AU - Peet, Andrew C.
AU - Martínez-Bisbal, M. Carmen
AU - Celda, Bernardo
AU - Arús, Carles
AU - Robles, Montserrat
AU - García-Gómez, Juan Miguel
N1 - Funding Information:
This work has been partially funded by the Spanish Instituto de Salud Carlos III (ISCiii) through the RETICS Combiomed (RD07/0067/2001). The authors thank the Programa Torres Quevedo from Ministerio de Educación y Ciencia, co-funded by the European Social Fund (PTQ05-02-03386 and PTQ08-01-06802). We thank eTUMOUR, HEALTHAGENTS and INTERPRET partners for providing data, in particular W. Gajewicz (MUL), J. Calvar (FLENI), A. Heerschap (RUNMC), J. Capellades (IDI-Badalona), C. Majós (IDI-Bellvitge), and W. Semmier (DKFZ-Heidelberg). CIBER-BBN is an initiative funded by the VI National R&D&I Plan 2008-2011; CIBER Actions are financed by the Instituto de Salud Carlos III with assistance from the European Regional Development Fund.
PY - 2011/8
Y1 - 2011/8
N2 - In the last decade, machine learning (ML) techniques have been used for developing classifiers for automatic brain tumour diagnosis. However, the development of these ML models relies on a single training set, and learning stops once this set has been processed. Training these classifiers requires a representative amount of data, but the gathering, preprocessing, and validation of samples are expensive and time-consuming. Therefore, for a classical, non-incremental approach to ML, it is necessary to wait long enough to collect all the required data. In contrast, an incremental learning approach may allow us to build an initial classifier with a smaller number of samples and update it incrementally when new data are collected. In this study, an incremental learning algorithm for Gaussian Discriminant Analysis (iGDA) based on the Graybill and Deal weighted combination of estimators is introduced. Each time a new set of data becomes available, a new estimation is carried out and combined with the previous estimation. iGDA does not require access to the previously used data and is able to include new classes that were not in the original analysis, thus allowing the customization of the models to the distribution of data at a particular clinical center. Five benchmark databases have been used to evaluate the behaviour of the iGDA algorithm in terms of stability-plasticity, class inclusion, and order effect. Finally, the iGDA algorithm has been applied to automatic brain tumour classification with magnetic resonance spectroscopy and compared with two state-of-the-art incremental algorithms. The empirical results show the ability of the algorithm to learn in an incremental fashion, improving the performance of the models when new information is available and converging over time. Furthermore, the algorithm shows a negligible instance and concept order effect, avoiding the bias that such effects could introduce.
AB - In the last decade, machine learning (ML) techniques have been used for developing classifiers for automatic brain tumour diagnosis. However, the development of these ML models relies on a single training set, and learning stops once this set has been processed. Training these classifiers requires a representative amount of data, but the gathering, preprocessing, and validation of samples are expensive and time-consuming. Therefore, for a classical, non-incremental approach to ML, it is necessary to wait long enough to collect all the required data. In contrast, an incremental learning approach may allow us to build an initial classifier with a smaller number of samples and update it incrementally when new data are collected. In this study, an incremental learning algorithm for Gaussian Discriminant Analysis (iGDA) based on the Graybill and Deal weighted combination of estimators is introduced. Each time a new set of data becomes available, a new estimation is carried out and combined with the previous estimation. iGDA does not require access to the previously used data and is able to include new classes that were not in the original analysis, thus allowing the customization of the models to the distribution of data at a particular clinical center. Five benchmark databases have been used to evaluate the behaviour of the iGDA algorithm in terms of stability-plasticity, class inclusion, and order effect. Finally, the iGDA algorithm has been applied to automatic brain tumour classification with magnetic resonance spectroscopy and compared with two state-of-the-art incremental algorithms. The empirical results show the ability of the algorithm to learn in an incremental fashion, improving the performance of the models when new information is available and converging over time. Furthermore, the algorithm shows a negligible instance and concept order effect, avoiding the bias that such effects could introduce.
KW - Automatic brain tumour diagnosis
KW - Graybill-Deal estimator
KW - Incremental learning
KW - Machine learning
KW - Magnetic resonance
UR - http://www.scopus.com/inward/record.url?scp=79960562027&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2011.02.009
DO - 10.1016/j.jbi.2011.02.009
M3 - Article
C2 - 21377545
AN - SCOPUS:79960562027
SN - 1532-0464
VL - 44
SP - 677
EP - 687
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
IS - 4
ER -