TY - JOUR
T1 - Performance evaluation results of evolutionary clustering algorithm star for clustering heterogeneous datasets
AU - Hassan, Bryar A.
AU - Rashid, Tarik A.
AU - Mirjalili, Seyedali
N1 - Funding Information:
The researchers would like to express their gratitude to the referees for their insightful comments. Based on their feedback, the technical content of this paper has been greatly improved. The authors also wish to express their gratitude to the University of Kurdistan Hewler, the Centre for Artificial Intelligence Research and Optimisation, and the Kurdistan Institution for Strategic Studies and Scientific Research for their continued support in conducting this research.
Publisher Copyright:
© 2021
PY - 2021/6
Y1 - 2021/6
N2 - This article presents the data used to evaluate the performance of evolutionary clustering algorithm star (ECA*) compared to five traditional and modern clustering algorithms. Two experimental methods are employed to examine the performance of ECA* against genetic algorithm for clustering++ (GENCLUST++), learning vector quantisation (LVQ), expectation maximisation (EM), K-means++ (KM++) and K-means (KM). These algorithms are applied to 32 heterogeneous and multi-featured datasets to determine which one performs well on the three tests. First, the paper examines the efficiency of ECA* against its counterpart algorithms using clustering evaluation measures. These validation criteria are objective function and cluster quality measures. Second, it suggests a performance rating framework to measure the performance sensitivity of these algorithms to various dataset features (cluster dimensionality, number of clusters, cluster overlap, cluster shape and cluster structure). The contributions of these experiments are twofold: (i) ECA* exceeds its counterpart algorithms in its ability to find the right number of clusters; (ii) ECA* is less sensitive to dataset features than its competitor techniques. Nonetheless, the results of the experiments demonstrate some limitations of ECA*: (i) ECA* is not fully applied based on the premise that no prior knowledge exists; (ii) adapting and utilising ECA* in several real applications has not been achieved yet.
AB - This article presents the data used to evaluate the performance of evolutionary clustering algorithm star (ECA*) compared to five traditional and modern clustering algorithms. Two experimental methods are employed to examine the performance of ECA* against genetic algorithm for clustering++ (GENCLUST++), learning vector quantisation (LVQ), expectation maximisation (EM), K-means++ (KM++) and K-means (KM). These algorithms are applied to 32 heterogeneous and multi-featured datasets to determine which one performs well on the three tests. First, the paper examines the efficiency of ECA* against its counterpart algorithms using clustering evaluation measures. These validation criteria are objective function and cluster quality measures. Second, it suggests a performance rating framework to measure the performance sensitivity of these algorithms to various dataset features (cluster dimensionality, number of clusters, cluster overlap, cluster shape and cluster structure). The contributions of these experiments are twofold: (i) ECA* exceeds its counterpart algorithms in its ability to find the right number of clusters; (ii) ECA* is less sensitive to dataset features than its competitor techniques. Nonetheless, the results of the experiments demonstrate some limitations of ECA*: (i) ECA* is not fully applied based on the premise that no prior knowledge exists; (ii) adapting and utilising ECA* in several real applications has not been achieved yet.
KW - ECA performance evaluation
KW - ECA performance ranking framework
KW - ECA statistical performance evaluation
KW - Evolutionary clustering algorithm star
UR - http://www.scopus.com/inward/record.url?scp=85104462944&partnerID=8YFLogxK
U2 - 10.1016/j.dib.2021.107044
DO - 10.1016/j.dib.2021.107044
M3 - Article
AN - SCOPUS:85104462944
SN - 2352-3409
VL - 36
JO - Data in Brief
JF - Data in Brief
M1 - 107044
ER -