ISSN : 2663-2187

Advanced Clustering Techniques: Graph Partitioning and Adaptive Algorithms for Enhanced Document Classification


Dr. M. Meena Krithika
doi: 10.33472/AFJBS.6.6.2024.9325-9335

Abstract

Clustering is an automatic learning technique that groups a set of objects into subsets, or clusters. The goal is to create clusters that are internally coherent but substantially different from each other. In plain words, objects in the same cluster should be as similar as possible, whereas objects in one cluster should be as dissimilar as possible from objects in the other clusters. Automatic text clustering plays an important role in many fields, such as information retrieval and data mining. The aim of this thesis is to improve the efficiency and accuracy of document clustering. We discuss two clustering algorithms and the settings in which they perform better than the known standard clustering algorithms. The first approach improves the graph partitioning techniques used for text clustering: we preprocess the graph using a heuristic and then apply the standard graph partitioning algorithms, which improves the quality of the clusters considerably. The second approach is entirely different: the words are clustered first, and the word clusters are then used to cluster the documents. This reduces the noise in the data and thus improves the quality of the clusters. We also consider adaptive techniques for AP (affinity propagation) clustering: adaptive adjustment of the damping factor to eliminate oscillations (called adaptive damping), adaptive escape from oscillations, and adaptive search of the preference parameter space to find the optimal clustering solution for a data set (called adaptive preference scanning). With these adaptive techniques, adaptive AP is expected to outperform the SAP algorithm in clustering quality and oscillation elimination, and to find optimal clustering solutions.
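
As an illustration of the adaptive damping idea mentioned in the abstract, the following Python sketch runs affinity propagation (taken here to be the meaning of AP) on a similarity matrix and raises the damping factor when the number of exemplars oscillates. It is a minimal sketch under stated assumptions, not the author's method: the function name, the window-based oscillation test, the step size, and the cap on the damping factor are all hypothetical choices made for illustration. Adaptive escape and adaptive preference scanning are not implemented; the preference is simply read from the diagonal of the matrix.

```python
def adaptive_affinity_propagation(S, lam=0.5, max_iter=200, window=10,
                                  lam_step=0.05, lam_max=0.95):
    """Affinity propagation with a simple adaptive damping heuristic.

    S is an n x n similarity matrix (n >= 2) whose diagonal holds the
    preference. Returns (labels, final_lam); labels[i] is the index of
    the exemplar that point i is assigned to.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    history = []                       # exemplar sets of recent iterations
    exemplars = ()
    for _ in range(max_iter):
        # Responsibility update, damped by lam.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(v for j, v in enumerate(scores) if j != k)
                R[i][k] = lam * R[i][k] + (1 - lam) * (S[i][k] - rival)
        # Availability update, damped by lam.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(pos) - pos[k]  # positive support from points other than k
            for i in range(n):
                new = support if i == k else min(0.0, R[k][k] + support - pos[i])
                A[i][k] = lam * A[i][k] + (1 - lam) * new
        exemplars = tuple(k for k in range(n) if R[k][k] + A[k][k] > 0)
        history.append(exemplars)
        recent = history[-window:]
        if len(recent) == window:
            if exemplars and len(set(recent)) == 1:
                break  # exemplar set unchanged for a full window
            # Assumed oscillation test: the exemplar count both rose and
            # fell inside the window. If so, damp more strongly.
            counts = [len(e) for e in recent]
            rose = any(b > a for a, b in zip(counts, counts[1:]))
            fell = any(b < a for a, b in zip(counts, counts[1:]))
            if rose and fell:
                lam = min(lam_max, lam + lam_step)
                history.clear()
    if not exemplars:
        raise RuntimeError("no exemplars found; try a larger preference")
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return labels, lam


# Usage on two well-separated groups of 1-D points (illustrative data).
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
S = [[-(a - b) ** 2 for b in points] for a in points]
for i in range(len(points)):
    S[i][i] = -10.0  # preference, chosen by hand for this toy example
labels, final_lam = adaptive_affinity_propagation(S)
print(labels, final_lam)
```

Raising the damping factor slows the message updates, which is the usual way to suppress oscillations at the cost of more iterations; scanning the preference value instead would change how many clusters are produced.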
