Radovanovic, Milos and Nanopoulos, Alexandros and Ivanovic, Mirjana (2010) Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data. Journal of Machine Learning Research, 11 (Sep). pp. 2487-2531. ISSN 1532-4435 (FP7-215006)
Abstract
Different aspects of the curse of dimensionality are known to present serious challenges to various machine-learning methods and tasks. This paper explores a new aspect of the dimensionality curse, referred to as hubness, that affects the distribution of k-occurrences: the number of times a point appears among the k nearest neighbors of other points in a data set. Through theoretical and empirical analysis involving synthetic and real data sets we show that under commonly used assumptions this distribution becomes considerably skewed as dimensionality increases, causing the emergence of hubs, that is, points with very high k-occurrences which effectively represent “popular” nearest neighbors. We examine the origins of this phenomenon, showing that it is an inherent property of data distributions in high-dimensional vector space, discuss its interaction with dimensionality reduction, and explore its influence on a wide range of machine-learning tasks directly or indirectly based on measuring distances, belonging to supervised, semi-supervised, and unsupervised learning families.
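The k-occurrence count defined in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `k_occurrences` and `skewness`, the choice of Euclidean distance, the use of i.i.d. standard normal data, and the sample size, seed, and value of k are all assumptions made for illustration. The sketch counts how often each point appears in the k-nearest-neighbor lists of the other points, then reports the skewness of that distribution for a low and a high dimensionality.

```python
import numpy as np


def k_occurrences(X, k):
    """Count how many times each point appears among the k nearest
    neighbors of the other points (Euclidean distance, assumed here)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Squared pairwise distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # A point is never counted as its own neighbor.
    np.fill_diagonal(d2, np.inf)
    # Indices of the k smallest distances in each row (order within k is irrelevant).
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]
    # Tally appearances of each index across all neighbor lists.
    return np.bincount(nn.ravel(), minlength=n)


def skewness(values):
    """Standardized third moment (population form)."""
    v = np.asarray(values, dtype=float)
    c = v - v.mean()
    return np.mean(c ** 3) / np.mean(c ** 2) ** 1.5


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # hypothetical seed, for repeatability only
    n, k = 1000, 5                  # hypothetical sample size and k
    for d in (3, 100):
        X = rng.standard_normal((n, d))
        occ = k_occurrences(X, k)
        print(f"d={d:3d}  max N_k={occ.max():3d}  skewness={skewness(occ):.2f}")
```

The counts always sum to n times k, so the mean k-occurrence is fixed at k; what changes with dimensionality is how unevenly those occurrences are spread across points, which the skewness statistic summarizes.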
Keywords
nearest neighbors; curse of dimensionality; classification; semi-supervised learning; clustering
Funders
Serbian Ministry of Science
EU
Projects
Abstract Methods and Applications in Computer Science, no. 144017A
MyMedia
Item Type: | Article |
---|---|
FP7 Grant Agreement Number: | 215006 |
FrameWork Programmes: | SP1-Cooperation |
Scientific Areas: | Information and Communication Technologies |
Contact Email Address: | radacha@dmi.uns.ac.rs |
Last Modified: | 26 Sep 2012 09:57 |
Access rights: | Open access |
Output type: | Article |
URI: | http://eprints.kobson.nb.rs/id/eprint/30 |