Article

The Geometry of Nonlinear Embeddings in Kernel Discriminant Analysis

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2022.3192726

Keywords

Kernel; Sociology; Covariance matrices; Linear discriminant analysis; Geometry; Eigenvalues and eigenfunctions; Principal component analysis; Discriminant analysis; feature map; Gaussian kernel; polynomial kernel; Rayleigh quotient; spectral analysis

Fisher's linear discriminant analysis is limited to linear features, while kernel discriminant analysis overcomes this limitation with nonlinear feature mapping. This study examines the geometry of nonlinear embeddings in discriminant analysis using polynomial and Gaussian kernels. The discriminant function is obtained by solving a generalized eigenvalue problem with covariance operators. The results provide insight into the interaction between data distribution and kernel in determining the nonlinear embedding for discrimination, guiding the choice of kernel and its parameters.
Fisher's linear discriminant analysis is a classical method for classification, yet it can capture only linear features. Kernel discriminant analysis, as an extension, is known to alleviate this limitation through a nonlinear feature mapping. We study the geometry of nonlinear embeddings in discriminant analysis with polynomial kernels and the Gaussian kernel by identifying the population-level discriminant function, which depends on the data distribution and the kernel. To obtain the discriminant function, we solve a generalized eigenvalue problem with between-class and within-class covariance operators. The polynomial discriminants are shown to capture the class difference explicitly through the population moments. To approximate the Gaussian discriminant, we use a particular representation of the Gaussian kernel based on the exponential generating function for Hermite polynomials. We also show that the Gaussian discriminant can be approximated using randomized projections of the data. Our results illuminate how the data distribution and the kernel interact in determining the nonlinear embedding for discrimination, and provide a guideline for the choice of the kernel and its parameters.
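
As a rough illustration of how a polynomial discriminant can pick up a class difference through moments, the sketch below works with finite samples and an explicit feature map rather than the population-level covariance operators studied in the paper. It assumes a one-dimensional input, a hypothetical degree-2 feature map phi(x) = (x, x^2), and made-up toy data in which both classes have mean zero but different spread. For two classes the between-class scatter has rank one, so the leading solution of the generalized eigenvalue problem is proportional to S_w^{-1}(m_a - m_b), which is what the code computes. All function names and the small ridge term are assumptions for illustration only.

```python
# Two-class Fisher discriminant on an explicit degree-2 polynomial feature map.
# Illustrative sketch only: toy data, hypothetical names, finite-sample scatter.

def feature_map(x):
    """Hypothetical explicit feature map phi(x) = (x, x^2) for a 1-D input."""
    return (x, x * x)

def mean_vec(rows):
    """Component-wise mean of a list of 2-D feature vectors."""
    n = len(rows)
    return tuple(sum(r[j] for r in rows) / n for j in range(2))

def scatter(rows, mu):
    """2x2 scatter matrix: sum of outer products of the centered rows."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - mu[0], r[1] - mu[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b, ridge=1e-9):
    """Return w = S_w^{-1} (m_a - m_b) in feature space, plus both class means."""
    fa = [feature_map(x) for x in class_a]
    fb = [feature_map(x) for x in class_b]
    ma, mb = mean_vec(fa), mean_vec(fb)
    sa, sb = scatter(fa, ma), scatter(fb, mb)
    # Pooled within-class scatter; the tiny ridge is an assumption to keep it invertible.
    s = [[sa[i][j] + sb[i][j] + (ridge if i == j else 0.0) for j in range(2)]
         for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    d = (ma[0] - mb[0], ma[1] - mb[1])
    # Closed-form 2x2 solve of s @ w = d.
    w = ((s[1][1] * d[0] - s[0][1] * d[1]) / det,
         (s[0][0] * d[1] - s[1][0] * d[0]) / det)
    return w, ma, mb

def score(w, x):
    """Discriminant score w . phi(x)."""
    phi = feature_map(x)
    return w[0] * phi[0] + w[1] * phi[1]

# Toy data: same first moment (zero), different second moment.
narrow = [-1.0, -0.5, 0.0, 0.5, 1.0]
wide = [-4.0, -3.0, 3.0, 4.0]

w, m_narrow, m_wide = fisher_direction(narrow, wide)
print("direction (linear, quadratic):", w)
print("score at 0.5:", score(w, 0.5), " score at 3.5:", score(w, 3.5))
```

In this toy setting the linear coordinate of the direction vanishes because the class means coincide, and the discriminant rests entirely on the x^2 coordinate, that is, on the difference in second moments. A kernel implementation would work with Gram matrices instead of an explicit feature map; the explicit map is used here only to keep the example short.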