TUTCRIS - Tampereen teknillinen yliopisto

TUTCRIS

Limited random walk algorithm for big graph data clustering

Tutkimustuotosvertaisarvioitu

Standard

Limited random walk algorithm for big graph data clustering. / Zhang, Honglei; Raitoharju, Jenni; Kiranyaz, Serkan; Gabbouj, Moncef.

julkaisussa: Journal of Big Data, Vuosikerta 3, Nro 1, 2016, s. 26.

Tutkimustuotosvertaisarvioitu

Harvard

Zhang, H, Raitoharju, J, Kiranyaz, S & Gabbouj, M 2016, 'Limited random walk algorithm for big graph data clustering', Journal of Big Data, Vuosikerta. 3, Nro 1, Sivut 26. https://doi.org/10.1186/s40537-016-0060-5

APA

Vancouver

Author

Zhang, Honglei ; Raitoharju, Jenni ; Kiranyaz, Serkan ; Gabbouj, Moncef. / Limited random walk algorithm for big graph data clustering. Julkaisussa: Journal of Big Data. 2016 ; Vuosikerta 3, Nro 1. Sivut 26.

Bibtex - Lataa

@article{b1118dbd0c4442e5ad327c2a125ec8ce,
title = "Limited random walk algorithm for big graph data clustering",
abstract = "Graph clustering is an important technique to understand the relationships between the vertices in a big graph. In this paper, we propose a novel random-walk-based graph clustering method. The proposed method restricts the reach of the walking agent using an inflation function and a normalization function. We analyze the behavior of the limited random walk procedure and propose a novel algorithm for both global and local graph clustering problems. Previous random-walk-based algorithms depend on the chosen fitness function to find the clusters around a seed vertex. The proposed algorithm tackles the problem in an entirely different manner. We use the limited random walk procedure to find attractor vertices in a graph and use them as features to cluster the vertices. According to the experimental results on the simulated graph data and the real-world big graph data, the proposed method is superior to the state-of-the-art methods in solving graph clustering problems. Since the proposed method uses the embarrassingly parallel paradigm, it can be efficiently implemented and embedded in any parallel computing environment such as a MapReduce framework. Given enough computing resources, we are capable of clustering graphs with millions of vertices and hundreds millions of edges in a reasonable time.",
author = "Honglei Zhang and Jenni Raitoharju and Serkan Kiranyaz and Moncef Gabbouj",
year = "2016",
doi = "10.1186/s40537-016-0060-5",
language = "English",
volume = "3",
pages = "26",
journal = "Journal of Big Data",
issn = "2196-1115",
publisher = "Springer Verlag",
number = "1",

}

RIS (suitable for import to EndNote) - Lataa

TY - JOUR

T1 - Limited random walk algorithm for big graph data clustering

AU - Zhang, Honglei

AU - Raitoharju, Jenni

AU - Kiranyaz, Serkan

AU - Gabbouj, Moncef

PY - 2016

Y1 - 2016

N2 - Graph clustering is an important technique to understand the relationships between the vertices in a big graph. In this paper, we propose a novel random-walk-based graph clustering method. The proposed method restricts the reach of the walking agent using an inflation function and a normalization function. We analyze the behavior of the limited random walk procedure and propose a novel algorithm for both global and local graph clustering problems. Previous random-walk-based algorithms depend on the chosen fitness function to find the clusters around a seed vertex. The proposed algorithm tackles the problem in an entirely different manner. We use the limited random walk procedure to find attractor vertices in a graph and use them as features to cluster the vertices. According to the experimental results on the simulated graph data and the real-world big graph data, the proposed method is superior to the state-of-the-art methods in solving graph clustering problems. Since the proposed method uses the embarrassingly parallel paradigm, it can be efficiently implemented and embedded in any parallel computing environment such as a MapReduce framework. Given enough computing resources, we are capable of clustering graphs with millions of vertices and hundreds millions of edges in a reasonable time.

AB - Graph clustering is an important technique to understand the relationships between the vertices in a big graph. In this paper, we propose a novel random-walk-based graph clustering method. The proposed method restricts the reach of the walking agent using an inflation function and a normalization function. We analyze the behavior of the limited random walk procedure and propose a novel algorithm for both global and local graph clustering problems. Previous random-walk-based algorithms depend on the chosen fitness function to find the clusters around a seed vertex. The proposed algorithm tackles the problem in an entirely different manner. We use the limited random walk procedure to find attractor vertices in a graph and use them as features to cluster the vertices. According to the experimental results on the simulated graph data and the real-world big graph data, the proposed method is superior to the state-of-the-art methods in solving graph clustering problems. Since the proposed method uses the embarrassingly parallel paradigm, it can be efficiently implemented and embedded in any parallel computing environment such as a MapReduce framework. Given enough computing resources, we are capable of clustering graphs with millions of vertices and hundreds millions of edges in a reasonable time.

U2 - 10.1186/s40537-016-0060-5

DO - 10.1186/s40537-016-0060-5

M3 - Article

VL - 3

SP - 26

JO - Journal of Big Data

JF - Journal of Big Data

SN - 2196-1115

IS - 1

ER -