Itemset Data Mining Parallelization and Distribution Using the MapReduce Approach
Keywords:
Frequent itemsets, frequent items ultrametric tree (FIU-tree), Hadoop cluster, load balance, MapReduce
Abstract
Existing parallel algorithms for mining frequent itemsets lack a mechanism that provides
automatic parallelization, load balancing, data distribution, and fault tolerance on large clusters. To address this
problem, we develop a parallel frequent-itemset mining algorithm called FiDoop using the MapReduce programming
model. To achieve compressed storage and to avoid building conditional pattern bases, FiDoop incorporates the
frequent items ultrametric tree (FIU-tree) rather than conventional FP-trees. In FiDoop, three MapReduce jobs are
implemented to complete the mining task. In the crucial third MapReduce job, the mappers independently
decompose itemsets, the reducers perform combination operations by constructing small ultrametric trees, and
these trees are then mined separately. We implement FiDoop on our in-house Hadoop cluster. We show that FiDoop
on the cluster is sensitive to data distribution and dimensions, because itemsets of different lengths have
different decomposition and construction costs. To improve FiDoop's performance, we develop a workload-balance
metric to measure load balance across the cluster's computing nodes. We also develop FiDoop-HD, an extension of
FiDoop, to speed up mining performance for high-dimensional data analysis.
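The decompose-then-combine pattern of the third MapReduce job described above can be sketched in a few lines. This is a simplified, single-process illustration of the mapper and reducer roles only, not the paper's Hadoop implementation; the function names (`mapper`, `reducer`, `run_job`) and the toy data are our own, and the grouping loop stands in for Hadoop's shuffle phase.

```python
from itertools import combinations
from collections import defaultdict

def mapper(itemset):
    """Decompose a k-itemset into its (k-1)-item subsets,
    emitting (subset, source-itemset) pairs."""
    k = len(itemset)
    for sub in combinations(sorted(itemset), k - 1):
        yield sub, tuple(sorted(itemset))

def reducer(key, values):
    """Combine all itemsets that share the same (k-1)-subset;
    in FiDoop such groups seed the small ultrametric trees
    that are then mined locally on each node."""
    return key, sorted(set(values))

def run_job(frequent_itemsets):
    # Stand-in for Hadoop's shuffle: group mapper output by key.
    grouped = defaultdict(list)
    for itemset in frequent_itemsets:
        for key, value in mapper(itemset):
            grouped[key].append(value)
    return dict(reducer(k, v) for k, v in grouped.items())

# Two frequent 3-itemsets sharing the subset ('a', 'b'):
result = run_job([("a", "b", "c"), ("a", "b", "d")])
```

Because decomposition cost grows with itemset length, longer itemsets generate more mapper output; this is the length-sensitivity that motivates the workload-balance metric mentioned in the abstract.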