TY - GEN
T1 - Feature Selection with β-Hill Climbing Search for Text Clustering Application
AU - Abualigah, Laith Mohammad
AU - Khader, Ahamad Tajudin
AU - Al-Betar, Mohammed Azmi
AU - Alyasseri, Zaid Abdi Alkareem
AU - Alomari, Osama Ahmad
AU - Hanandeh, Essam Said
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/9/14
Y1 - 2017/9/14
N2 - As the volume of text information grows, handling it has become increasingly complicated. Text clustering is a suitable technique for dealing with a large number of text documents by grouping them into clusters. However, text documents that contain sparse, non-uniformly distributed, and uninformative features are difficult to cluster. Text feature selection is a primary unsupervised learning method used to choose a new subset of informative text features. In this paper, a new algorithm based on the β-hill climbing technique is proposed for the text feature selection problem to improve text clustering (B-FSTC). The results of the proposed β-hill climbing method and the original hill climbing method (i.e., H-FSTC) are examined using k-means text clustering and compared with each other. Experiments were conducted on four standard text datasets with varying characteristics. Interestingly, the proposed β-hill climbing algorithm obtains superior results in comparison with the other well-regarded techniques by producing a new subset of informative text features. Lastly, the β-hill climbing-based feature selection method helps the k-means clustering algorithm achieve more precise clusters.
KW - Informative features
KW - K-means Text Document Clustering
KW - Unsupervised Feature Selection
KW - β-Hill Climbing
UR - https://www.scopus.com/pages/publications/85032278369
U2 - 10.1109/PICICT.2017.30
DO - 10.1109/PICICT.2017.30
M3 - Conference contribution
AN - SCOPUS:85032278369
T3 - Proceedings - 2017 Palestinian International Conference on Information and Communication Technology, PICICT 2017
SP - 22
EP - 27
BT - Proceedings - 2017 Palestinian International Conference on Information and Communication Technology, PICICT 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd Palestinian International Conference on Information and Communication Technology, PICICT 2017
Y2 - 8 May 2017 through 9 May 2017
ER -