Article

On the distance concentration awareness of certain data reduction techniques

Journal

PATTERN RECOGNITION
Volume 44, Issue 2, Pages 265-277

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2010.08.018

Keywords

Distance concentration; Dimensionality reduction; Feature selection; Projection pursuit; Sure independence screening

Funding

  1. Medical Research Council (MRC) [G0701858]

We present a first investigation of a recently raised concern about the suitability of existing data analysis techniques when faced with the counter-intuitive properties of high-dimensional data spaces, such as the phenomenon of distance concentration. Under the structural assumption of a generic linear model with a latent variable and additive unstructured noise, we find that dimension reduction that explicitly guards against distance concentration recovers the well-known techniques of Fisher's linear discriminant analysis, Fisher's discriminant ratio, and a variant of projection pursuit. Extrapolation to regression uncovers a close link to sure independence screening, a recently proposed technique for variable selection in ultra-high-dimensional feature spaces. Hence, these techniques may be seen as distance concentration aware, even though they were not explicitly designed to have this property. Throughout our analysis, apart from the dependency structure implied by the linear model, we make no assumptions about the distributions of the variables involved. (C) 2010 Elsevier Ltd. All rights reserved.
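
The two ideas the abstract connects can be illustrated with a small sketch. It is not the paper's derivation or the authors' code. It assumes i.i.d. uniform coordinates to show distance concentration (the relative variance of norms shrinking as the dimension grows), and a toy latent-variable data set to show marginal-correlation ranking, which is the core step of sure independence screening. All function names, sample sizes, and noise levels are hypothetical choices made for this illustration.

```python
import math
import random


def relative_variance_of_norms(dim, n_points=2000, seed=0):
    """Estimate Var(||x||) / E(||x||)^2 for points with i.i.d. uniform coordinates.

    A value near zero means all points lie at almost the same distance from
    the origin, i.e. distances have concentrated.
    """
    rng = random.Random(seed)
    norms = []
    for _ in range(n_points):
        squared = sum(rng.uniform(-1.0, 1.0) ** 2 for _ in range(dim))
        norms.append(math.sqrt(squared))
    mean = sum(norms) / n_points
    var = sum((r - mean) ** 2 for r in norms) / n_points
    return var / mean ** 2


def marginal_correlation_ranking(X, y):
    """Return feature indices sorted by |Pearson correlation with y|, largest first."""
    n = len(y)
    y_mean = sum(y) / n
    y_dev = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(d * d for d in y_dev))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        c_dev = [v - c_mean for v in col]
        c_norm = math.sqrt(sum(d * d for d in c_dev))
        corr = sum(a * b for a, b in zip(c_dev, y_dev)) / (c_norm * y_norm)
        scores.append((abs(corr), j))
    return [j for _, j in sorted(scores, reverse=True)]


def make_latent_data(n=300, dim=50, informative=(3, 7), seed=1):
    """Toy data (assumed for illustration): a latent z drives y and a few features."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                          # latent variable
        row = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # unstructured noise
        for j in informative:
            row[j] += 2.0 * z                            # signal in informative features
        X.append(row)
        y.append(z + 0.1 * rng.gauss(0.0, 1.0))
    return X, y


if __name__ == "__main__":
    for d in (2, 20, 200):
        print(d, relative_variance_of_norms(d))
    X, y = make_latent_data()
    print(marginal_correlation_ranking(X, y)[:2])
```

In this toy setting the relative variance falls roughly in proportion to 1/dim, and the two features that carry the latent signal rank first. A full screening procedure would also need a rule for how many features to keep, which this sketch leaves out.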
