Difference between revisions of "Cosine similarity"

From Cohen Courses

Revision as of 23:46, 6 February 2011

Cosine similarity measures how similar two vectors are by taking the cosine of the angle between them. The cosine of the angle between two vectors can be derived from the Euclidean dot product formula:

<math>A \cdot B = \|A\| \, \|B\| \cos\theta</math>

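Given two vectors of attributes, A and B, separated by an angle θ, the cosine similarity, cos(θ), is expressed in terms of the dot product and the vector magnitudes as

<math>\cos\theta = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}}</math>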
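As a minimal sketch, the cosine of the angle between two vectors is their dot product divided by the product of their magnitudes. The function name cosine_similarity and the choice to raise an error for zero-length vectors (where the angle is undefined) are assumptions made for this illustration and are not part of the original page.

```python
import math


def cosine_similarity(a, b):
    """Return cos(theta) for two equal-length numeric vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")

    # Dot product: sum of element-wise products.
    dot = sum(x * y for x, y in zip(a, b))

    # Euclidean magnitudes of each vector.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))

    # Assumption for this sketch: the angle is undefined for a zero vector,
    # so raise an error rather than return an arbitrary value.
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")

    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # parallel vectors
    print(cosine_similarity([1, 0], [0, 1]))        # orthogonal vectors
```

Vectors pointing in the same direction give a value close to 1, orthogonal vectors give 0, and opposite vectors give -1, regardless of their magnitudes.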

In text domains, a document is generally treated as a bag of words, where each unique word in the vocabulary is a dimension of the vector. The similarity between two documents can therefore be assessed by computing the cosine similarity between their corresponding vectors. Each element of vector A and vector B is generally taken to be the tf-idf weight of the corresponding word in that document.
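The bag-of-words approach can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: tokens are produced by lowercasing and splitting on whitespace, tf is the raw term count, and idf is log(N / df), which is only one of several common tf-idf variants. The helper names tfidf_vectors and cosine, the sample documents, and the convention of returning 0.0 for an all-zero vector are all hypothetical choices.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one tf-idf vector per document over a shared vocabulary.

    Assumptions for this sketch: whitespace tokenization, raw-count tf,
    and idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    idf = {word: math.log(n_docs / df[word]) for word in vocab}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # One dimension per vocabulary word; absent words get weight 0.
        vectors.append([tf[word] * idf[word] for word in vocab])
    return vocab, vectors


def cosine(a, b):
    """Cosine of the angle between a and b; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch only
    return dot / (norm_a * norm_b)


docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock markets fell sharply",
]
vocab, vectors = tfidf_vectors(docs)

sim_01 = cosine(vectors[0], vectors[1])  # documents sharing several words
sim_02 = cosine(vectors[0], vectors[2])  # documents sharing no words

if __name__ == "__main__":
    print(sim_01, sim_02)
```

Because tf-idf weights are non-negative, the similarity between two documents computed this way lies between 0 and 1. Documents with no words in common score 0.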