TY - GEN
T1 - MR-VDENCLUE
T2 - Intelligent Systems Conference, IntelliSys 2022
AU - Al-Naymat, Ghazi
AU - Khader, Mariam
AU - Al-Betar, Mohammed Azmi
AU - Hriez, Raghda
AU - Hadi, Ali
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
AB - The volume of data generated, processed, and consumed in the digital world is exponentially increasing. The clustering of such a huge volume of data, known as big data, necessitates the development of highly scalable clustering methods. Density-based algorithms have attracted researchers’ interest because they help to better understand complex patterns in spatial datasets. As a result, they are capable of discovering clusters with varying shapes. However, most of the density-based algorithms are challenged by the discovery of clusters with varying density and the ability to cluster big datasets. The VDENCLUE algorithm was proposed to discover clusters with varying densities. However, VDENCLUE incurs high computation overhead, which is impractical for large datasets. In this paper, a parallel approximated variant of VDENCLUE is proposed, called MR-VDENCLUE. Besides discovering clusters with arbitrary shapes, MR-VDENCLUE can discover clusters with varying densities and scale up to handle big datasets.
KW - Big data
KW - Clustering
KW - DENCLUE
KW - Density clustering
KW - Distributed clustering
KW - MapReduce framework
UR - https://www.scopus.com/pages/publications/85137976656
U2 - 10.1007/978-3-031-16072-1_55
DO - 10.1007/978-3-031-16072-1_55
M3 - Conference contribution
AN - SCOPUS:85137976656
SN - 9783031160714
T3 - Lecture Notes in Networks and Systems
SP - 771
EP - 788
BT - Intelligent Systems and Applications - Proceedings of the 2022 Intelligent Systems Conference IntelliSys Volume 1
A2 - Arai, Kohei
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 1 September 2022 through 2 September 2022
ER -