Publication

Linear discriminant analysis with more variables than observations: a not so naive approach

Abstract(s)

A new linear discrimination rule, designed for two-group problems with many correlated variables, is proposed. This proposal tries to incorporate the most important patterns revealed by the empirical correlations while approximating the optimal Bayes rule as the number of variables grows without limit. To achieve this goal, the new rule relies on covariance matrix estimates derived from Gaussian factor models with small intrinsic dimensionality. Asymptotic results show that, when the model assumed for the covariance matrix estimate is a reasonable approximation to the true data generating process, the expected error rate of the new rule converges to an error close to that of the optimal Bayes rule, even in several cases where the number of variables grows faster than the number of observations. Simulation results suggest that the new rule clearly outperforms both Fisher's and naive linear discriminant rules in the data conditions it was designed for.
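
The general idea of a linear rule built on a low-dimensional factor covariance estimate can be illustrated with a minimal sketch. This is not the estimator from the paper, whose details are not given in this record. It assumes a simple principal-component approximation of a q-factor model, Sigma ≈ B Bᵀ + diag(d), and inverts it with the Woodbury identity so that no p × p inverse is needed. All function names (`factor_covariance`, `solve_factor`, `fit_rule`, `classify`) and the choice of q are hypothetical.

```python
import numpy as np

def factor_covariance(X1, X2, q):
    """Approximate the pooled covariance as B @ B.T + diag(d) with q factors.

    Assumption: a principal-component shortcut, not the paper's estimator.
    """
    n1, n2 = len(X1), len(X2)
    Xc = np.vstack([X1 - X1.mean(axis=0), X2 - X2.mean(axis=0)])
    S = Xc.T @ Xc / (n1 + n2 - 2)            # pooled covariance, singular if p > n
    vals, vecs = np.linalg.eigh(S)           # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:q]         # indices of the q largest eigenvalues
    B = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
    d = np.diag(S) - np.sum(B ** 2, axis=1)  # variable-specific variances
    return B, np.clip(d, 1e-6, None)         # keep the estimate positive definite

def solve_factor(B, d, v):
    """Return (B @ B.T + diag(d))^{-1} @ v using the Woodbury identity."""
    Dinv_v = v / d
    Dinv_B = B / d[:, None]
    M = np.eye(B.shape[1]) + B.T @ Dinv_B    # small q x q system
    return Dinv_v - Dinv_B @ np.linalg.solve(M, B.T @ Dinv_v)

def fit_rule(X1, X2, q):
    """Linear rule: weights w and threshold c from the factor covariance."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    B, d = factor_covariance(X1, X2, q)
    w = solve_factor(B, d, m1 - m2)
    c = w @ (m1 + m2) / 2.0
    return w, c

def classify(X, w, c):
    """Assign each row of X to group 1 if it falls above the threshold, else 2."""
    return np.where(X @ w > c, 1, 2)

# Usage on simulated data with more variables (p = 200) than observations (n = 40).
rng = np.random.default_rng(0)
p, n, q = 200, 20, 2
loadings = rng.normal(size=(p, q))           # hypothetical true factor loadings
def draw(shift, size):
    return rng.normal(size=(size, q)) @ loadings.T + rng.normal(size=(size, p)) + shift
w, c = fit_rule(draw(0.3, n), draw(-0.3, n), q)
test_error = np.mean(classify(draw(0.3, 500), w, c) != 1)
print(f"estimated error rate on group 1: {test_error:.3f}")
```

Setting q = 0 in such a scheme would leave only the diagonal term, which corresponds to the naive rule that ignores correlations; a small positive q lets the rule use the dominant correlation patterns while remaining invertible when p exceeds n.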

Keywords

Discriminant analysis; Naive Bayes; Minimax regret

Citation

DUARTE SILVA, A.P. - Linear discriminant analysis with more variables than observations: a not so naive approach. In Classification as a Tool for Research: Proceedings of the 11th IFCS Biennial Conference and 33rd Annual Conference of the Gesellschaft für Klassifikation e.V., Dresden, March 13-18, 2009. New York: Springer Verlag, 2010. (Studies in Classification, Data Analysis, and Knowledge Organization). ISBN 978-3-642-10744-3. e-ISBN 978-3-642-10745-0. p. 227-234

Publisher

Springer Verlag
