Today I learnt about the different clustering methods discussed in class.
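K-Means, a partitioning method, groups data points into K clusters, where K is chosen in advance, by assigning each point to its nearest cluster center. Each center is the mean (centroid) of the points currently assigned to its cluster. The algorithm alternates between assigning points and recomputing the means, which reduces the within-cluster sum of squared distances to the centers. It is widely used in practice.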
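The assign-then-update loop described above can be sketched in plain Python. This is only an illustrative sketch, not a reference implementation: the names `sq_dist` and `kmeans`, the `iterations` and `seed` parameters, and the choice to start from K randomly picked input points are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster a list of tuples into k groups; returns (centers, labels)."""
    rng = random.Random(seed)
    # Assumption for this sketch: initial centers are k randomly chosen points.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center alone
        if new_centers == centers:
            break  # no center moved, so the assignments are stable
        centers = new_centers
    return centers, labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
print(centers, labels)
```

Which cluster gets label 0 and which gets label 1 depends on the random starting centers, so only the grouping itself is meaningful, not the label values.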
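K-Medoids is similar to K-Means but represents each cluster by its medoid, the actual data point with the smallest total dissimilarity to the other points in the cluster. Because a medoid must be one of the data points, and because the method sums dissimilarities rather than squared distances, a few extreme values pull it far less than they pull a mean. This is why K-Medoids is preferred in scenarios where robustness to outliers is crucial.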
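A small example helps show why the medoid is the more robust representative. The sketch below only compares the mean and the medoid of a single cluster that contains one outlier; it is not a full K-Medoids algorithm. The function names `mean` and `medoid` and the sample data are made up for illustration, and Euclidean distance is assumed as the dissimilarity.

```python
import math


def mean(points):
    """Coordinate-wise mean of a list of tuples (the K-Means representative)."""
    return tuple(sum(xs) / len(points) for xs in zip(*points))


def medoid(points):
    """The input point with the smallest total distance to all the others."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


# Four points close together plus one far-away outlier.
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0), (100.0, 100.0)]

print(mean(cluster))    # (21.2, 21.2): dragged towards the outlier
print(medoid(cluster))  # (2.0, 2.0): still one of the four nearby points
```

The mean ends up in an empty region between the group and the outlier, while the medoid stays inside the group.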
DBSCAN is a density-based approach that identifies clusters as dense regions separated by areas of lower density, making it particularly suitable for datasets with irregularly shaped clusters and noise. It takes two parameters: an epsilon distance threshold that defines a point's neighborhood, and the minimum number of data points that must fall within that neighborhood for the region to count as dense. DBSCAN determines the number of clusters from the data instead of requiring it up front, and it labels points in sparse regions as noise. Unlike K-Means, it needs no initial cluster centers, so its result does not depend on a random starting configuration, although it does depend on the choice of the two parameters.
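The role of the two parameters is easier to see in code. The following is a minimal, unoptimized sketch of the usual DBSCAN procedure in plain Python. The names `dbscan`, `eps`, `min_pts` and `NOISE` are chosen for this example, Euclidean distance is assumed, and a point counts as its own neighbor.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels


points = [
    (0, 0), (0, 1), (1, 0), (1, 1),          # first dense group
    (10, 10), (10, 11), (11, 10), (11, 11),  # second dense group
    (50, 50),                                # isolated point
]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Note that the number of clusters (two here) is never passed in; it falls out of `eps` and `min_pts`, and the isolated point is reported as noise instead of being forced into a cluster.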