Estimating the Mahalanobis distance in high-dimensional data
Linnaeus University, School of Business and Economics, Department of Economics and Statistics. ORCID iD: 0000-0002-0789-5826
2013 (English) Conference paper, Poster (Other academic)
Abstract [en]

The Mahalanobis distance is a fundamental statistic in many fields such as outlier detection, normality testing and cluster analysis. However, the standard estimator developed by Mahalanobis (1936) and Wilks (1963) is not well behaved when the dimension (p) of the parent variable increases proportionally to the sample size (n). This case is frequently referred to as Increasing Dimension Asymptotics (IDA). Specifically, the sample covariance matrix on which the Mahalanobis distance depends becomes degenerate under IDA settings, which in turn produces stochastically unstable Mahalanobis distances. This research project consists of several parts. It (a) shows that a previously suggested family of “improved” shrinkage estimators of the covariance matrix produces inoperable Mahalanobis distances under both classical and increasing dimension asymptotics. It (b) develops a risk function specifically designed to assess the Mahalanobis distance and identifies good estimators thereof, and (c) develops a family of resolvent-type estimators of the Mahalanobis distance. This family of estimators is shown to remain well behaved even under IDA settings. Sufficient conditions for the proposed estimator to outperform the traditional estimator are also supplied. The proposed estimator is argued to be a useful tool for descriptive statistics, such as assessment of influential values or cluster analysis, in cases when the dimension of the data is proportional to the sample size.
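
The abstract does not give the exact form of the proposed resolvent-type estimators, so the following is only a minimal sketch under stated assumptions. It contrasts the classical squared distance (x - mean)' S^{-1} (x - mean) with a generic resolvent form (x - mean)' (S + kI)^{-1} (x - mean), where the tuning constant `k` and the helper names `sample_mean_cov`, `solve` and `mahalanobis_sq` are hypothetical and chosen for illustration. It uses only the Python standard library.

```python
import random

def sample_mean_cov(X):
    """Return the sample mean vector and the unbiased sample covariance matrix S."""
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    S = [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in X) / (n - 1)
          for j in range(p)] for i in range(p)]
    return mean, S

def solve(A, b, tol=1e-9):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(A)
    M = [list(A[i]) + [b[i]] for i in range(m)]  # augmented matrix [A | b]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < tol:
            raise ValueError("matrix is numerically singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):  # back substitution
        x[r] = (M[r][m] - sum(M[r][c] * x[c] for c in range(r + 1, m))) / M[r][r]
    return x

def mahalanobis_sq(x, mean, S, k=0.0):
    """Squared distance (x - mean)' (S + k I)^{-1} (x - mean).

    k = 0 gives the classical form; k > 0 is an assumed, illustrative
    resolvent-type regularisation, not the estimator from the poster.
    """
    p = len(x)
    d = [x[i] - mean[i] for i in range(p)]
    A = [[S[i][j] + (k if i == j else 0.0) for j in range(p)] for i in range(p)]
    z = solve(A, d)
    return sum(d[i] * z[i] for i in range(p))

if __name__ == "__main__":
    random.seed(1)
    n, p = 10, 15  # dimension exceeds sample size, so S has rank at most n - 1
    X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    mean, S = sample_mean_cov(X)
    try:
        mahalanobis_sq(X[0], mean, S)  # classical form, k = 0
    except ValueError as err:
        print("classical estimator failed:", err)
    print("regularised distance:", mahalanobis_sq(X[0], mean, S, k=1.0))
```

When p is at least n the sample covariance matrix has rank at most n - 1, so the classical inverse does not exist, whereas S + kI is positive definite for any k > 0. How k should be chosen, and whether this particular form matches the proposed family, is not stated in the abstract.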

Place, publisher, year, edition, pages
2013.
National Category
Probability Theory and Statistics
Research subject
Statistics/Econometrics
Identifiers
URN: urn:nbn:se:lnu:diva-40914
OAI: oai:DiVA.org:lnu-40914
DiVA: diva2:795880
Conference
3rd joint Statistical Meeting DAGStat, Freiburg, Germany, March 18-23, 2013
Available from: 2015-03-17 Created: 2015-03-17 Last updated: 2015-05-23 Bibliographically approved

Open Access in DiVA

No full text

Author
Dai, Deliang
