Difference between revisions of "Clustering"

Latest revision as of 00:42, 7 February 2011

This is a technical method discussed in Social Media Analysis 10-802 in Spring 2010.

What problem does it address

Clustering refers to partitioning a set of elements into multiple groups whose members have similar attribute values. Each such group of elements, with comparable values across the attributes of a domain, is called a cluster.

Algorithm

Input -

         d : data instances
         a : data attributes
         dFunc : distance function between data instances

Output - c : clusters of data instances

 - Choose initial seed centroids for the clusters
 - Assign each data instance in d to the cluster in c whose centroid is nearest according to dFunc
 - Recompute the centroid of each cluster from the instances assigned to it
 - Repeat the assignment and recomputation steps until some clustering evaluation criterion is met (for example, the assignments stop changing)
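
The steps above can be sketched in Python roughly as follows. This is a minimal k-means-style illustration, not a definitive implementation. It assumes several things the description leaves open: the number of clusters k is supplied by the caller, data instances are numeric tuples, the centroid is the coordinate-wise mean, dFunc defaults to Euclidean distance, and the stopping criterion is "assignments no longer change". The names cluster, euclidean, mean_point, max_iter and seed are hypothetical.

```python
import math
import random


def euclidean(p, q):
    """Assumed default for dFunc: Euclidean distance between two instances."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def mean_point(points):
    """Assumed centroid definition: coordinate-wise mean of the instances."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def cluster(d, k, d_func=euclidean, max_iter=100, seed=0):
    """Group the instances in d into k clusters around centroids."""
    rng = random.Random(seed)
    # Step 1: choose initial seed centroids (here: k distinct random instances).
    centroids = rng.sample(d, k)
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign each instance to its nearest centroid under d_func.
        new_assignment = [
            min(range(k), key=lambda j: d_func(p, centroids[j])) for p in d
        ]
        # Step 4: stop once the assignments no longer change (assumed criterion).
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: recompute each centroid from its current members.
        for j in range(k):
            members = [p for p, a in zip(d, assignment) if a == j]
            if members:  # keep the old centroid if a cluster became empty
                centroids[j] = mean_point(members)
    c = [[p for p, a in zip(d, assignment) if a == j] for j in range(k)]
    return c, centroids


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    clusters, centers = cluster(data, k=2)
    for members, center in zip(clusters, centers):
        print(center, members)
```

Passing a different d_func (for instance a cosine-based distance for documents) changes only the assignment step. The mean-based centroid update is a separate assumption and may not suit every distance function.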

Used in

This technique is widely used in practice, e.g. for clustering similar documents and for summarization.

