Mechanism of heuristic strategy based partition algorithm for massive semantic data flow

CHEN Feng-jiao, FU Hai-dong, WU Gang, GU Jin-guang

Systems Engineering - Theory & Practice ›› 2014, Vol. 34 ›› Issue (s1) : 248-254.

PDF(921 KB)
PDF(921 KB)
Systems Engineering - Theory & Practice ›› 2014, Vol. 34 ›› Issue (s1) : 248-254. DOI: 10.12011/1000-6788(2014)s1-248

Mechanism of heuristic strategy based partition algorithm for massive semantic data flow

  • CHEN Feng-jiao1,2, FU Hai-dong1,2, WU Gang1,2, GU Jin-guang1,2
Author information +
History +

Abstract

The dramatic growth of massive semantic data has brought about great challenges to the distributed storage for large data. The core technology of distributed storage is graph partitioning. This paper discussed the partition mechanism for graph data flow and the strategy of partitioning heuristic function, and the partitioning algorithm and implementation process of graph data flow for the RDF file. The experiment verified the validity of the partitioning algorithm for graph data flow by comparing with METIS (a multi-level graph partitioning algorithm) and hash partitioning methods, using several true RDF datasets.

Key words

graph partition / graph data flow / heuristic function / RDF datasets

Cite this article

Download Citations
CHEN Feng-jiao , FU Hai-dong , WU Gang , GU Jin-guang. Mechanism of heuristic strategy based partition algorithm for massive semantic data flow. Systems Engineering - Theory & Practice, 2014, 34(s1): 248-254 https://doi.org/10.12011/1000-6788(2014)s1-248

References

[1] 黄智生, 钟宁. 海量语义数据处理——平台、技术与应用[M].北京: 高等教育出版社, 2012.
[2] 于戈,谷峪,鲍玉斌,等.云计算环境下的大规模图数据处理技术[J].计算机学报, 2011, 34(10): 1753-1767.Yu Ge, Gu Yu, Bao Yubin, et al. Large scale graph data processing on cloud computing environments[J]. Chinese Journal of Computers, 2011, 34(10): 1753-1767.
[3] Pothen A. Graph partitioning algorithms with applications to scientific computing. Parallel Numerical Algorithms[M]. Kluwer Academic Press, 1996.
[4] Ou C W, Ranka S. Parallel incremental graph partitioning using linear programming[J]. Proceedings Supercomputing'94, 1994, 14(18): 458-467.
[5] Chan T F, Gilbert J R, Teng S H. Geometric spectral partitioning[R]. Xerox Palo Alto Research Center Technical Report, 1994.
[6] Chung Y, Ranka S. Mapping finite element graphs on hypercubes[J]. Journal of Supercomputing, 1992, 6(3): 257-282.
[7] Hendrickson B, Leland R. An improved spectral graph partitioning algorithm for mapping parallel computations[J]. SIAM Journal of Scientific Computing, 1995, 16(2): 452-469.
[8] Barnard S, Simon H. A fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems[C]//Proceedings 6th SIAM Conference Parallel Processing for Scientific Computing, 1993: 711-718.
[9] 王会青,陈俊杰.基于图划分的谱聚类方法的研究[J].计算机工程与设计, 2011, 32(1): 289-292.Wang Huiqing, Chen Junjie. Research of spectral clustering based on graph partition[J]. Computer Engineering and Design, 2011, 32(1): 289-292.
[10] Goehring T, Saad Y. Heuristic algorithms for automatic graph partitioning[J]. University of Minnesota Supercomputer Institute, Minneapolis, MN, 1994, 55415: 94-29.
[11] Kernighan B, Lin S. An efficient heuristic procedure for partitioning graphs[J]. The Bell System Technical Journal, 1970, 49(2): 291-307.
[12] Hendrickson B, Leland R. A multi-level algorithm for partitioning graphs[C]//Proceedings Supercomputing'95, 1995: 28.
[13] Hauck S, Borriello G. An evaluation of bipartitioning technique[C]//Proceedings Chapel Hi 11 Conference on Advanced Research in VLSI, 1995.
[14] Karypis G, Kumar V. A fast and high quality multilevel scheme for partitioning irregular graphs[J]. SIAM Journal on Scientific Computing, 1998, 20(1): 359-392.
[15] Karypis G, Kumar V. Multilevel k-way partitioning scheme for irregular graphs[J]. Journal of Parallel and Distributed Computing, 1998, 48(1): 96-129.
[16] 王启冬.基于数学规划的图划分模型研究[D].大连:大连理工大学, 2009.Wang Qidong. Research on mathematical programming based graph partitioning model[D]. Dalian: Dalian University of Technology, 2009.
[17] Karypis G, Kumar V. A fast and high quality multilevel scheme for partitioning irregular graphs[J]. SIAM Journal on Scientific Computing, 1998, 20(1): 359-392.
[18] Devine K, Boman E. Parallel hypergraph partitioning for scientific computing[C]//Proceedings of 20th International Parallel and Distributed Processing Symposium (IPDPS'06), IEEE, 2006.
[19] Devine K, Boman E. Hypergraph-based dynamic load balancing for adaptive Scientific computations[C]//Proceedings of 20th International Parallel and Distributed Processing Symposium (IPDPS'08), IEEE, 2008.
[20] 冯国栋,肖仰华.大图的分布式存储[J].中国计算机学会通讯, 2012, 8(11): 12-12.
[21] Franke C, Morin S, Chebotko A, et al. Distributed semantic web data management in HBase and MySQL cluster[C]//2011 IEEE International Conference on Cloud Computing (CLOUD), IEEE, 2011: 105-112.
[22] RDF Semantics[EB/OL]. http://www.w3.org/TR/2004/REC-rdf-mt-20040210/.
[23] SNOMED CT[EB/OL]. http://www.ihtsdo.org/snomed-ct/.
PDF(921 KB)

Accesses

Citation

Detail

Sections
Recommended

/