Scalable optimization of neighbor embedding for visualization
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review
Details
Original language | English |
---|---|
Title of host publication | 30th International Conference on Machine Learning, ICML 2013 |
Publisher | International Machine Learning Society (IMLS) |
Pages | 786-794 |
Number of pages | 9 |
Edition | PART 1 |
Publication status | Published - 2013 |
Publication type | A4 Article in a conference publication |
Event | 30th International Conference on Machine Learning, ICML 2013 - Atlanta, GA, United States Duration: 16 Jun 2013 → 21 Jun 2013 |
Conference
Conference | 30th International Conference on Machine Learning, ICML 2013 |
---|---|
Country | United States |
City | Atlanta, GA |
Period | 16/06/13 → 21/06/13 |
Abstract
Neighbor embedding (NE) methods have found their use in data visualization but are limited in big data analysis tasks due to their O(n2) complexity for n data samples. We demonstrate that the obvious approach of subsampling produces inferior results and propose a generic approximated optimization technique that reduces the NE optimization cost to O(n log n). The technique is based on realizing that in visualization the embedding space is necessarily very low-dimensional (2D or 3D), and hence efficient approximations developed for n-body force calculations can be applied. In gradient-based NE algorithms the gradient for an individual point decomposes into "forces" exerted by the other points. The contributions of close-by points need to be computed individually but far-away points can be approximated by their "center of mass", rapidly computable by applying a recursive decomposition of the visualization space into quadrants. The new algorithm brings a significant speed-up for medium-size data, and brings "big data" within reach of visualization.