MapReduce based Distributed Graph Grep using Edge Occurrence Index
DOI:
https://doi.org/10.61841/pbts2977Keywords:
graph query; graph dataset; bigdata; parallel processing; MapReduce; Distributed graph query processing; Join techniqueAbstract
Graph query processing is a very important application for the graph data. The graph data set size increases day by day due to digitization of all types of data, in order to process the large amount of graph data using number of machines not by single machine. Graph query processing using distributed frameworks like Hadoop is a challenging task. Many users are giving graph queries to process in distributed environment, in an interactive way it has to process all the queries. It becomes hard to process graph queries from a big graph dataset. This paper mainly emphasis on processing of multiple graph queries over a large set of graphs, using MapReduce framework. We introduced edge occurrence index to process multiple queries using filter and verify technique in MapReduce. We are using structure based graph partitioning to distribute all the graphs to the machines in the cluster based on structure of the graphs. The proposed algorithm is called as MapReduce based Distributed Graph Grep using Edge Occurrence Index MRDGG. Extensive experimental result analysis on various real-world graph datasets proved that the proposed work improves the performance and reduces the time for multiple graph query processing for massive collections of graphs.
Downloads
References
[1] Aggarwal, C.C., Wang, H(eds): Managing and Mining graph data. Kluwer Academic
Publishers, Dordrecht (2010)
[2] Mansurul A Bhuiyan, Mohammad AI Hasan : MIRAGE An Iterative Map Reduce based
Frequent Subgraph Mining Algorithm 2013.
[3] M. Kuramochi and G. Karypis, “Finding frequent patterns in a large sparse graph*,” Data
mining and knowledge discovery, vol. 11, no. 3, pp.243–271, 2005
[4] Giugno, R., Shasha, D: Graph Grep: A Fast and Universal Method for Querying Graphs.
Proceedings of ICPR 2, 112-115 (2002)
[5] Yan,X.,Yu,P., Han, J.: Graph Indexing Based on Discriminative Frequent Structure Analysis.
ACM Transactions on Database Systems 30(4),960-993(2005)
[6] Cheng, J., Ke, Y., Ng, W., Lu,.: FG-Index: towards verification-free query processing on graph
databases in: Proceedings of ICDE (2007)
[7] He,H.,Singh, A.K.:Closure-Tree.: An Index Structure for Graph Queries. In Graphs,
Proceedings of ICDE (2006)
[8] Haoliang Jiang ; Haixun Wang ; Yu, P.S. Shuigeng Zhou,: GString: A Novel Approach for
Efficient Search in Graph Databases Proceedings of ICDE (2007)
[9] Song-Hyon Kim, Kyong-Ha Lee, Hyebong Choi and Yoon-Joon Lee, “Parallel
Processing of Multiple Graph Queries Using MapReduce”,DBKDA,2013
[10] Fathimabi Shaik, R.B.V. Subramanyam, and D.V.L.N. Somaya- julu. ”MSP: Multiple Sub-graph
Query Processing using Structure-based Graph Partitioning Strategy and Map-Reduce.” Journal of
King Saud University-Computer and Information Sciences (2016).
[11] Fathimabishaik, R.B.V. Subramanyam, D.V.L.N Somayajulu,”Map- Reduce based Multiple
SubGraph Enumeration Using Dominating-Set Graph Partition”, International Journal of
Information Engineering and Electronic Business(IJIEEB), Vol.9, No.2, pp.36-44, 2017.
DOI:10.5815/ijieeb.2017.02.05.
[12] Willet, P.: Chemical similarity searching. J.Chem. Info. Comput.Sci.
38,983-996(1998)
[13] Dean, J., Ghemawat, S.: MapReduce: Simplified data processing on large cluster. In:
Proceedings of OSDI, pp 137-150(2004)
[14] James Cheng, YipingKe, Ada Wai-chee Fu, Jeffrey Xu Yu: Fast Graph Query Processing with a
Low-Cost Index.VLDB 2011.
[15] YifengLuo,Jihng Gun ndShugeng Zhou, Towards Efficient Subgraph- Search in Cloud
Computing Environments. Spinger-verlag Berlin Heidel- berg 2011.
[16] G. Malewicz, M. Austern, A. Bik, J. Dehnert, I.Horn,N. Leiser, and G. Czajkowski,
“Pregel: a system for large-scale graph processing,” in Proceedings of the 2010 interna- tional
conference on Management of data. ACM, 2010, pp.135–146.
[17] U. Kang, C. Tsourakakis, and C. Faloutsos, “Pegasus: mining peta- scale graphs,”
Knowledge and information systems, vol. 27, no. 2, pp.303–325, 2011.
[18] S. Ghemawat, H.Gobioff, S.T.Leung. The Google File System, in:proceedings of the
19th ACM Symposium on Operating Systems Prin- ciples, vol.37 of SOSP ’03, ACM, New York,
USA, 2003.
[19] S. Blanas, J. Patel, V. Ercegovac, J. Rao, E.Shekita, and Y. Tian, “A comparison of join
algorithms for log processing in mapreduce,” in Proceedings of the 2010 international conference
on Management of data. ACM, 2010, pp. 975–986
[20] F. N. Afrati, D. Fotakis, and J. D. Ullman. Enumerating subgraph instances using mapreduce. In ICDE, pages 62–73, 2013
[21] Yarramalli, Sai Sravya, et al. "Digital Procurement on Systems Applications and Products (SAP)
Cloud Solutions." 2020 Second International Conference on Inventive Research in Computing
Applications (ICIRCA). IEEE, 2020.
[22] S. Fathimabi, E. Jangam and A. Srisaila, "MapReduce based Heart Disease Prediction System,"
2021 8th International Conference on Computing for Sustainable Global Development (INDIACom),
2021, pp. 281-286, doi: 10.1109/INDIACom51348.20
Downloads
Published
Issue
Section
License
Copyright (c) 2024 Author
This work is licensed under a Creative Commons Attribution 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
- The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution — You must give appropriate credit , provide a link to the license, and indicate if changes were made . You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation .
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.