TUTCRIS - Tampere University of Technology


Generative modeling for maximizing precision and recall in information visualization

Research output: peer-reviewed

Details

Original language: English
Pages: 579-587
Number of pages: 9
Journal: Journal of Machine Learning Research
Volume: 15
Status: Published - 2011
OKM publication type: A1 Original article

Abstract

Information visualization has recently been formulated as an information retrieval problem, where the goal is to find similar data points based on the visualized nonlinear projection, and the visualization is optimized to maximize a compromise between (smoothed) precision and recall. We turn the visualization into a generative modeling task where a simple user model parameterized by the data coordinates is optimized, neighborhood relations are the observed data, and straightforward maximum likelihood estimation corresponds to Stochastic Neighbor Embedding (SNE). While SNE maximizes pure recall, adding a mixture component that "explains away" misses allows our generative model to focus on maximizing precision as well. The resulting model is a generative solution to maximizing tradeoffs between precision and recall. The model outperforms earlier models in terms of precision and recall and in external validation by unsupervised classification.
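The link between maximum likelihood and SNE described in the abstract can be illustrated with a minimal sketch. It assumes Gaussian neighbor probabilities, as in standard SNE, and treats the input-space neighborhoods P as the observed data and the display-space neighborhoods Q as the model. The function names, the `sigma` and `lam` parameters, and in particular the uniform background component are illustrative assumptions; the uniform term is only a stand-in for the paper's "explaining away" mixture component, whose exact form is not given in the abstract.

```python
import math


def neighbor_probs(points, sigma=1.0):
    """Return P where P[i][j] is the probability of picking j as a neighbor of i."""
    n = len(points)
    probs = []
    for i in range(n):
        sq = [math.dist(points[i], p) ** 2 for p in points]
        # Shift by the smallest off-diagonal distance for numerical stability.
        d_min = min(d for j, d in enumerate(sq) if j != i)
        # Gaussian weights; a point is never its own neighbor.
        weights = [
            0.0 if j == i else math.exp(-(d - d_min) / (2.0 * sigma ** 2))
            for j, d in enumerate(sq)
        ]
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs


def log_likelihood(P, Q, eps=1e-12):
    """Expected log-likelihood of observed neighborhoods P under model Q.

    Maximizing this over the display coordinates is equivalent to minimizing
    the SNE cost, sum_i KL(P_i || Q_i), because the entropy of P is constant.
    """
    return sum(
        p * math.log(q + eps)
        for p_row, q_row in zip(P, Q)
        for p, q in zip(p_row, q_row)
    )


def mixture_log_likelihood(P, Q, lam=0.9, eps=1e-12):
    """Same likelihood, with a hypothetical uniform background mixed into Q.

    Assumption: the uniform component is an illustrative placeholder, not the
    paper's actual mixture component.
    """
    n = len(P)
    R = [
        [lam * Q[i][j] + (0.0 if i == j else (1.0 - lam) / (n - 1))
         for j in range(n)]
        for i in range(n)
    ]
    return log_likelihood(P, R, eps)


# Toy data: four points in the input space and a candidate 2-D display.
X = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [3.0, 3.0, 1.0]]
Y = [[0.0, 0.0], [0.5, 0.0], [2.0, 1.0], [0.0, 3.0]]
P = neighbor_probs(X)
Q = neighbor_probs(Y)
print(log_likelihood(P, Q), mixture_log_likelihood(P, Q, lam=0.9))
```

In this sketch the mixed-in component puts a floor under the model probability of every neighbor, so a true neighbor that the display misses costs less than it does under the plain likelihood. That is the general mechanism by which a mixture term can shift weight away from pure recall.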