Wednesday, 10 June 2015

Difference between classification and clustering

Classification – The task of assigning instances to pre-defined classes.
– E.g. deciding whether a particular patient record can be associated with a specific disease.

Classification is a supervised learning technique that assigns a pre-defined label to each instance on the basis of its features. A classification algorithm therefore requires labeled training data: a model is first built from the training data, and that model is then used to classify new instances.
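As a rough illustration of the train-then-classify workflow, here is a minimal sketch in Python. It uses a nearest-centroid rule, which is only one of many possible classifiers and was chosen here for brevity. The feature columns (body temperature, cough severity), the labels "healthy" and "flu", and the function names are all hypothetical, made up for this example.

```python
import math

def train_nearest_centroid(features, labels):
    """Build a model: one mean feature vector (centroid) per pre-defined class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def classify(model, x):
    """Assign the label of the class whose centroid is closest to x."""
    return min(model, key=lambda y: math.dist(model[y], x))

# Hypothetical labeled training data: [body temperature in C, cough severity 0-10]
train_X = [[36.6, 1], [36.8, 0], [38.9, 7], [39.2, 8]]
train_y = ["healthy", "healthy", "flu", "flu"]

model = train_nearest_centroid(train_X, train_y)  # training step
prediction = classify(model, [38.7, 6])           # classify a new instance
print(prediction)
```

Note that the labels exist before training starts; the model only learns how the features relate to them.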

Clustering – The task of grouping related data points together without labeling them. 
– E.g. grouping patient records with similar symptoms without knowing what the symptoms indicate.

Clustering is an unsupervised technique that groups similar instances on the basis of their features. It does not require labeled training data, and it does not assign a pre-defined label to the groups it finds.
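For comparison, a minimal clustering sketch in Python follows. It uses a simplified k-means, which is just one possible clustering method, with a deterministic initialisation (the first k points) that is an assumption made to keep the example reproducible. The records and the function name are hypothetical. No labels are passed in, and the output is only a group index per record.

```python
import math

def k_means(points, k, iterations=10):
    """Group points into k clusters; returns a cluster index for each point."""
    # Assumption for this sketch: start from the first k points as centroids.
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Hypothetical unlabeled records: [body temperature in C, cough severity 0-10]
records = [[36.6, 1], [38.9, 7], [36.8, 0], [39.2, 8]]
groups = k_means(records, k=2)
print(groups)
```

The indices say which records belong together, but not what each group means; interpreting the groups is left to whoever inspects them.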
