Tampere University of Technology

TUTCRIS Research Portal

Generative modeling for maximizing precision and recall in information visualization

Research output: Contribution to journal › Article › Scientific › peer-review


Original language: English
Pages (from-to): 579-587
Number of pages: 9
Journal: Journal of Machine Learning Research
Publication status: Published - 2011
Publication type: A1 Journal article-refereed


Information visualization has recently been formulated as an information retrieval problem, where the goal is to find similar data points based on the visualized nonlinear projection, and the visualization is optimized to maximize a compromise between (smoothed) precision and recall. We turn the visualization into a generative modeling task where a simple user model parameterized by the data coordinates is optimized, neighborhood relations are the observed data, and straightforward maximum likelihood estimation corresponds to Stochastic Neighbor Embedding (SNE). While SNE maximizes pure recall, adding a mixture component that "explains away" misses allows our generative model to focus on maximizing precision as well. The resulting model is a generative solution to maximizing tradeoffs between precision and recall. The model outperforms earlier models in terms of precision and recall and in external validation by unsupervised classification.
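
The correspondence between maximum likelihood and SNE described above can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation. The function names, the fixed Gaussian width `sigma`, and the mixing weight `gamma` are all hypothetical. In particular, the uniform background component in `mixture_log_likelihood` is only a placeholder for the "explaining away" component; the abstract does not specify its form.

```python
import numpy as np


def neighbor_probs(Y, sigma=1.0):
    """Row-stochastic matrix q[i, j]: probability that point j is picked
    as a neighbor of point i, given coordinates Y (softmax over negative
    squared distances, as in SNE)."""
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    logits = -sq / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)            # a point is never its own neighbor
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


def _expected_log_lik(P, R):
    """sum_ij P[i, j] * log R[i, j], skipping entries where P is zero."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(np.maximum(R[mask], 1e-300))))


def sne_log_likelihood(P, Y, sigma=1.0):
    """Expected log-likelihood of the observed neighborhoods P under the
    user model defined by Y. Maximizing this over Y is equivalent to
    minimizing the SNE cost sum_i KL(p_i || q_i), because the entropy of
    P does not depend on Y."""
    return _expected_log_lik(P, neighbor_probs(Y, sigma))


def mixture_log_likelihood(P, Y, gamma=0.1, sigma=1.0):
    """Same likelihood with an extra mixture component.
    ASSUMPTION: a uniform background over the other points stands in for
    the component that "explains away" misses; the actual component used
    in the paper may differ."""
    n = Y.shape[0]
    q = neighbor_probs(Y, sigma)
    background = (1.0 - np.eye(n)) / (n - 1)
    r = (1.0 - gamma) * q + gamma * background
    return _expected_log_lik(P, r)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 5))   # toy high-dimensional data
    P = neighbor_probs(X)          # observed neighborhood relations
    Y = rng.normal(size=(20, 2))   # candidate 2-D visualization coordinates
    print(sne_log_likelihood(P, Y))
    print(mixture_log_likelihood(P, Y, gamma=0.1))
```

In practice the coordinates `Y` would be optimized by gradient ascent on either likelihood; the sketch only evaluates the objectives. With `gamma=0` the mixture likelihood reduces to the plain SNE likelihood.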