TY - GEN
T1 - Unsupervised feature selection technique based on harmony search algorithm for improving the text clustering
AU - Abualigah, Laith Mohammad
AU - Khader, Ahamad Tajudin
AU - Al-Betar, Mohammed Azmi
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/8/23
Y1 - 2016/8/23
AB - The increasing amount of text information on Internet web pages affects clustering analysis. Text clustering is a useful analysis technique for partitioning a massive amount of information into clusters. However, a major problem that affects text clustering is the presence of uninformative and sparse features in text documents. Feature selection (FS) is an important unsupervised technique used to eliminate uninformative features and thereby improve text clustering. Recently, meta-heuristic algorithms have been successfully applied to several optimization problems. In this paper, we propose a harmony search (HS) algorithm to solve the feature selection problem (FSHSTC). The proposed method enhances the text clustering (TC) technique by obtaining a new subset of informative features. Experiments were conducted on four benchmark text datasets. The results show that the proposed FSHSTC improves the performance of the k-means clustering algorithm, as measured by F-measure and accuracy.
KW - Harmony Search Algorithm
KW - Informative features
KW - K-mean Text Clustering
KW - Sparse features
KW - Unsupervised Feature Selection
UR - https://www.scopus.com/pages/publications/84987617552
U2 - 10.1109/CSIT.2016.7549456
DO - 10.1109/CSIT.2016.7549456
M3 - Conference contribution
AN - SCOPUS:84987617552
T3 - Proceedings - CSIT 2016: 2016 7th International Conference on Computer Science and Information Technology
BT - Proceedings - CSIT 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th International Conference on Computer Science and Information Technology, CSIT 2016
Y2 - 13 July 2016 through 14 July 2016
ER -