ISSN : 2663-2187

A Topical Reflection Measure Based Classification Model for Improved Document Clustering

Main Article Content

Dr. Abhijeet Madhukar Haval, Akanksha Mishra
doi: 10.33472/AFJBS.6.Si2.2024.2998-3006

Abstract

Document clustering has been approached with a variety of techniques in the literature. The popular k-means clustering algorithm uses Euclidean distance as its similarity measure when clustering documents. Other approaches measure the similarity among documents with measures such as Euclidean distance or term frequency. However, these methods suffer from poor clustering accuracy and cluster overlap. To address this, this article presents an efficient Topical Reflection Measure based Classification Model (TRMCM). The proposed TRMCM preprocesses the document text to obtain meaningful terms and applies feature selection to identify a subset of terms. The selected features are used to compute the TRM value in the classification phase. Based on the TRM value, the method identifies the class of each document in order to perform clustering. The proposed TRMCM improves clustering accuracy with less overlap.
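The pipeline outlined in the abstract (preprocessing, feature selection, a per-class score, class assignment) can be illustrated with a minimal Python sketch. The abstract does not define how TRM is computed, so `topical_score` below is only a stand-in: the share of a document's terms that fall in a class's selected term set. The stop-word list, the top-k frequency rule for feature selection, the function names, and the toy documents are likewise assumptions made for illustration, not the authors' method.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "with"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def select_features(class_docs, top_k=5):
    """Keep the top_k most frequent terms per class (an assumed selection rule)."""
    features = {}
    for label, docs in class_docs.items():
        counts = Counter(t for d in docs for t in preprocess(d))
        features[label] = {term for term, _ in counts.most_common(top_k)}
    return features


def topical_score(doc_terms, class_terms):
    """Stand-in for TRM: share of the document's terms found in the class term set."""
    if not doc_terms:
        return 0.0
    hits = sum(1 for t in doc_terms if t in class_terms)
    return hits / len(doc_terms)


def classify(text, features):
    """Assign the document to the class with the highest stand-in score."""
    terms = preprocess(text)
    return max(features, key=lambda label: topical_score(terms, features[label]))


# Toy labeled documents, invented purely for this example.
class_docs = {
    "sports": ["The team won the football match", "A football player scored a goal"],
    "finance": ["The bank raised interest rates", "Stock market and bank shares fell"],
}
features = select_features(class_docs, top_k=5)
label = classify("The football team scored", features)
```

Under this sketch, documents assigned the same label form one cluster, which is the sense in which a classification step can drive clustering; the actual TRM formula would replace `topical_score`.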

Article Details