Analysis of the Semi-synchronous Approach to Large-scale Parallel Community Finding

Authors: 

Erika Duriakova, Neil Hurley, Deepak Ajwani, Alessandra Sala

Publication Type: 
Refereed Conference Meeting Proceeding
Abstract: 
Community-finding in graphs is the process of identifying highly cohesive vertex subsets. Recently the vertex-centric approach has been found effective for scalable graph processing and is implemented in systems such as GraphLab and Pregel. In the vertex-centric approach, the analysis is decomposed into a set of local computations at each vertex of the graph, with results propagated to neighbours along the vertex’s edges. Many community finding algorithms are amenable to this approach as they are based on the optimisation of an objective through a process of iterative local update (ILU), in which vertices are successively moved to the community of one of their neighbours in order to achieve the highest local gain in the quality of the objective. The sequential processing of such iterative algorithms generally benefits from an asynchronous approach, where a vertex update uses the most recent state as generated by the previous update of vertices in its neighbourhood. When vertices are distributed over a parallel machine, the asynchronous approach can encounter race conditions that impact on its performance and destroy the consistency of the results. Alternatively, a semi-synchronous approach ensures that only non-conflicting vertices are updated simultaneously. In this paper we study the semi-synchronous approach to ILU algorithms for community finding on social networks. Because of the heavy-tailed vertex distribution, the order in which vertex updates are applied in asynchronous ILU can greatly impact both convergence time and quality of the found communities. We study the impact of ordering on the distributed label propagation and modularity maximisation algorithms implemented on a shared-memory multicore architecture. We demonstrate that the semi-synchronous ILU approach is competitive in time and quality with the asynchronous approach, while allowing the analyst to maintain consistent control over update ordering. 
Thus, our implementation results in a more robust and predictable performance and provides control over the order in which the node labels are updated, which is crucial to obtaining the correct trade-off between running time and quality of communities on many graph classes.
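
The semi-synchronous idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that "non-conflicting" vertices are found by a greedy graph colouring (vertices of one colour share no edge, so they can be updated together), and it uses plain label propagation as the local update. The function names `greedy_colouring` and `semi_synchronous_label_propagation`, the tie-breaking rule, and the `order` parameter are all hypothetical choices made for illustration.

```python
from collections import Counter


def greedy_colouring(adj, order):
    """Give each vertex the smallest colour unused by its already-coloured neighbours."""
    colour = {}
    for v in order:
        used = {colour[u] for u in adj[v] if u in colour}
        c = 0
        while c in used:
            c += 1
        colour[v] = c
    return colour


def semi_synchronous_label_propagation(adj, order=None, max_rounds=100):
    """Label propagation in which each colour class is updated as one batch.

    adj   : dict mapping vertex -> list of neighbours (undirected graph)
    order : vertex ordering used for the colouring (assumed knob; defaults to sorted)
    """
    order = sorted(adj) if order is None else list(order)
    colour = greedy_colouring(adj, order)

    # Group vertices by colour; no two vertices in a class are adjacent.
    classes = {}
    for v in order:
        classes.setdefault(colour[v], []).append(v)

    label = {v: v for v in adj}  # every vertex starts in its own community
    for _ in range(max_rounds):
        changed = False
        for c in sorted(classes):
            # All updates in this class read the same snapshot of `label`.
            # Because the class is an independent set, this could run in
            # parallel without race conditions.
            updates = {}
            for v in classes[c]:
                if not adj[v]:
                    continue
                counts = Counter(label[u] for u in adj[v])
                best = max(counts.values())
                if counts.get(label[v], 0) == best:
                    continue  # current label is already among the most frequent
                # Assumed tie-break: smallest label among the most frequent.
                updates[v] = min(l for l, n in counts.items() if n == best)
            if updates:
                label.update(updates)
                changed = True
        if not changed:
            break
    return label


# Two triangles joined by a single edge (2-3).
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
labels = semi_synchronous_label_propagation(graph)
```

On this toy graph the two triangles end up with different labels. Changing `order` changes the colouring and therefore the sequence in which batches are applied, which is the kind of control over update ordering the abstract refers to.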
Conference Name: 
ACM Conference on Online Social Networks (COSN'14)
Proceedings: 
ACM Conference on Online Social Networks (COSN'14)
Digital Object Identifier (DOI): 
10.na
Publication Date: 
01/10/2014
Conference Location: 
Ireland
Institution: 
National University of Ireland, Dublin (UCD)
Project Acknowledges: 
Open access repository: 
Yes
Publication document: