Davies Bouldin Index

Easy, machine learning

Problem

Clustering is an important class of unsupervised learning. Clustering algorithms group data points based on the similarity between them. Unlike classification, the groups are not predefined. Types of clustering algorithms include exclusive, overlapping, hierarchical, and probabilistic clustering.

Measuring the quality of a classification is straightforward, because predictions can be compared against known labels. Quantifying the quality of a clustering instead requires similarity measures, which can be costly to compute. The Davies Bouldin Index is one such measure of the quality of a clustering.

The Davies Bouldin Index is calculated for a given number of clusters, n_clusters (nc), as follows:

[Formula images: db_formula, db_formula (continued), and the accompanying symbol explanation]

(Davies Bouldin Index: https://en.wikipedia.org/wiki/Davies%E2%80%93Bouldin_index)
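For reference, the standard definition of the index can be written as below. This is a sketch of the commonly used Euclidean form; the symbol names (sigma, c, d) are chosen here for illustration and are assumed, not confirmed, to match the formula images of the original statement.

```latex
DB = \frac{1}{n_c} \sum_{i=1}^{n_c} \max_{j \neq i}
     \left( \frac{\sigma_i + \sigma_j}{d(c_i, c_j)} \right)
% n_c         : number of clusters
% c_i         : centroid of cluster i
% \sigma_i    : average distance from the points of cluster i to c_i
% d(c_i, c_j) : distance between the centroids of clusters i and j
```

Lower values indicate compact clusters that are well separated from each other.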

(K-means clustering using scikit-learn: http://scikit-learn.org/stable/tutorial/statistical_inference/unsupervised_learning.html)
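Once a clustering step has produced a label for every point and a centroid for every cluster, the index itself can be computed by hand. The sketch below assumes Euclidean distance and the standard definition of the index; the function name davies_bouldin and its argument layout are hypothetical, and skipping coincident centroids is a choice made here to avoid division by zero.

```python
import math


def davies_bouldin(points, labels, centroids):
    """Davies Bouldin Index (Euclidean form) for an existing clustering.

    points    : list of coordinate sequences, one per data point
    labels    : list of cluster ids (0 .. k-1), one per data point
    centroids : list of coordinate sequences, one per cluster
    """
    k = len(centroids)

    # Scatter of each cluster: mean distance from its members to its centroid.
    dist_sums = [0.0] * k
    counts = [0] * k
    for point, label in zip(points, labels):
        dist_sums[label] += math.dist(point, centroids[label])
        counts[label] += 1
    # Assumption: an empty cluster is given a scatter of 0.
    sigma = [s / c if c else 0.0 for s, c in zip(dist_sums, counts)]

    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if i == j:
                continue
            separation = math.dist(centroids[i], centroids[j])
            # Assumption: coincident centroids are skipped.
            if separation > 0:
                worst = max(worst, (sigma[i] + sigma[j]) / separation)
        total += worst
    return total / k


# Two tight clusters, 10 units apart, each with scatter 1.
pts = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(davies_bouldin(pts, [0, 0, 1, 1], [(0, 1), (10, 1)]))  # 0.2
```

If scikit-learn's KMeans is used for the clustering step, the fitted model's labels_ and cluster_centers_ attributes could supply the labels and centroids arguments.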

Input format

Line 1: M, the number of data points

Line 2: N, the number of features per data point

Line 3: n_clusters, the number of clusters

Lines 4 to M+3: N space-separated floating point values, v1 v2 ... vN, one data point per line
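The input layout above could be read as in the following sketch. The function name parse_input is hypothetical, and the code assumes the whole input is available as one string; it splits on any whitespace, so it tolerates extra spaces or blank lines.

```python
def parse_input(text):
    """Parse the problem input into (points, n_clusters).

    text holds M, N and n_clusters followed by M rows of N floats.
    """
    tokens = text.split()
    m, n, n_clusters = int(tokens[0]), int(tokens[1]), int(tokens[2])

    values = [float(tok) for tok in tokens[3:3 + m * n]]
    if len(values) != m * n:
        raise ValueError("expected %d values, got %d" % (m * n, len(values)))

    # Regroup the flat list of values into M rows of N features each.
    points = [values[row * n:(row + 1) * n] for row in range(m)]
    return points, n_clusters


sample = "3\n2\n2\n1.0 2.0\n3.5 4.5\n0 1\n"
points, n_clusters = parse_input(sample)
print(len(points), n_clusters)  # 3 2
```

In a submission, the same function could be called with sys.stdin.read() as its argument.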

Output format

Output a single integer, the Davies Bouldin Index for the given input when the k-means clustering algorithm is applied to it with the given number of cluster centers.

Apart from the output, you will be judged on your understanding of the concept and its implementation.

NOTE: You may use external libraries for clustering; however, using libraries to calculate the index will lead to disqualification from the contest.

Time Limit: 10

Memory Limit: 256

Source Limit: