来源:软件工程学院

华东师范大学可信计算论坛系列报告

来源:华东师范大学软件工程学院发布时间:2013-05-22浏览次数:4775

报告题目:I/O Efficient: Computing SCCs in Massive Graphs

报告人:Jeffrey Xu Yu 教授(香港大学)

报告时间:524 15:0016:30

报告地点:中北校区数学馆201

主持人:金澈清 教授

报告摘要:

A strongly connected component (SCC) is a maximal subgraph of a directed graph G in which every pair of nodes are reachable from each other in the SCC. With such a property, a general directed graph can be represented by a directed acyclic graph (DAG) by contracting an SCC of G to a node in DAG. In many real applications that need graph pattern matching, topological sorting, or reachability query processing, the best way to deal with a general directed graph is to deal with its DAG representation. Therefore, finding all SCCs in a directed graph G is a critical operation.  The existing in-memory algorithms based on depth first search (DFS) can find all SCCs in linear time w.r.t. the size of a graph. However, when a graph cannot resident entirely in the main memory, the existing external or semi-external algorithms to find all SCCs have limitation to achieve high I/O efficiency.  In this talk, we discuss new I/O efficient semi-external algorithms to find all SCCs for a massive directed graph G that cannot reside in main memory entirely. To overcome the deficiency of the existing DFS based semi-external algorithm that heavily relies on a total order, we explore a weak order based on which we investigate new algorithms.  We give a new two phase algorithm, namely, tree construction and tree search. In the tree construction phase, a spanning tree of G can be constructed in bounded sequential scans of G.  In the tree search phase, it needs to sequentially scan the graph once to find all SCCs.  In addition, we discuss a new single phase algorithm, which combines the tree construction and tree search phases into a single phase, with three new optimization techniques. They are early acceptance, early rejection, and batch processing.  By the single phase algorithm with the new optimization techniques, we can significantly reduce the number of I/Os and CPU cost. 

 

个人简介:

Dr Jeffrey Xu Yu is a Professor in the Department of Systems Engineering and Engineering Management, the Chinese University of Hong Kong. His current main research interests include graph mining, graph query processing, graph pattern matching, and keywords search in relational databases. Dr. Yu served as an Information Director and a member in ACM SIGMOD executive committee (2007-2011), and an associateeditor of IEEE Transactions on Knowledge and Data Engineering (2004-2008). Currently he servers as an associate editor in VLDB Journal, WWW Journal, the International Journal of Cooperative Information Systems, and the journal on Health Information Science and Systems (HISS). Dr. Yu served/serves in over 280 organization committees and program committees in international conferences/workshops including PC Co-chair of APWeb'04, WAIM'06,APWeb/WAIM'07, WISE'09, PAKDD'10, DASFAA'11, ICDM'12, and NDBC'13, and Conference Co-Chair of APWeb'13.  He has published over 280 papers including papers published in reputed journals and major international conferences including ACM TODS, VLDB J, IEEE TKDE, ACM SIGMOD, ACM SIGKDD, VLDB, IEEE ICDE, and IEEE ICDM.