==Citation==

Chun-Nam John Yu and Thorsten Joachims. Learning structural SVMs with latent variables. In Proceedings of the 26th International Conference on Machine Learning, Montréal, Québec, Canada, 2009.

== Online version ==

[http://www.cs.cornell.edu/~cnyu/papers/icml09_latentssvm.pdf]
| |
== Summary ==

In this [[Category::paper]] the authors introduce latent variables into the structural SVM. The paper also identifies a formulation for which an efficient algorithm exists to find a local optimum using convex-concave optimization techniques. The paper argues that this is the first time latent variables have been used in large-margin classifiers. Experiments were performed in several domains (computational biology, IR, and NLP) to demonstrate the generality of the proposed method.
== Method Used ==

This paper extends the structural SVM formulation of Tsochantaridis et al. to include a latent variable.

Consider a set of structured input-output pairs

<math> S = \{(x_1,y_1),\ldots,(x_n,y_n)\} \in (X \times Y)^n </math>.
The prediction rule is

<math> f_w(x) = \arg\max_{y \in Y} \left[ w \cdot G(x,y) \right] </math>
where G is the joint feature vector that describes the relation between input and output. This paper introduces an extra latent variable h, so the prediction rule becomes

<math> f_w(x) = \arg\max_{(y,h) \in Y \times H} \left[ w \cdot G(x,y,h) \right] </math>
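As a concrete illustration, here is a minimal Python sketch of this joint prediction rule. It assumes a small, enumerable Y and H and a toy joint feature map; all names here are illustrative and not from the paper, which instead requires an efficient problem-specific argmax rather than exhaustive enumeration.

<pre>
import numpy as np
from itertools import product

def predict(w, x, Y, H, G):
    # Latent prediction rule f_w(x): maximize w . G(x, y, h)
    # jointly over (y, h), then return only the label part y.
    y_best, h_best = max(product(Y, H), key=lambda yh: w @ G(x, *yh))
    return y_best

# Toy joint feature map: 2-D input, binary label, binary latent variable.
# One feature block per (y, h) combination holds a copy of the input.
def G(x, y, h):
    phi = np.zeros(8)
    block = 4 * y + 2 * h
    phi[block:block + 2] = x
    return phi

w = np.random.randn(8)
print(predict(w, np.array([0.5, -1.0]), Y=[0, 1], H=[0, 1], G=G))
</pre>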
Similarly, extending the loss function <math> \triangle </math> to include the latent variable gives

<math> \triangle\left( (y_i, h_i^*(w)),\, (y_i^{opt}(w), h_i^{opt}(w)) \right) </math>

where

<math> h_i^*(w) = \arg\max_{h \in H} w \cdot G(x_i, y_i, h) </math>

<math> (y_i^{opt}(w), h_i^{opt}(w)) = \arg\max_{(y,h) \in Y \times H} w \cdot G(x_i, y, h) </math>

The loss therefore compares the pair returned by the prediction rule with the latent variable <math> h_i^*(w) </math> that best explains the labeled example <math> (x_i, y_i) </math>.
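Under the same illustrative setup as above, these two argmax problems correspond to two inference subroutines. The sketch below again uses exhaustive enumeration in place of the efficient problem-specific procedures the paper assumes; the function names are hypothetical.

<pre>
import numpy as np
from itertools import product

def impute_h(w, x_i, y_i, H, G):
    # h_i^*(w): the latent value that best explains the labeled pair (x_i, y_i).
    return max(H, key=lambda h: w @ G(x_i, y_i, h))

def best_pair(w, x_i, Y, H, G):
    # (y_i^opt(w), h_i^opt(w)): the jointly highest-scoring output/latent pair,
    # i.e. what the prediction rule would return on x_i.
    return max(product(Y, H), key=lambda yh: w @ G(x_i, *yh))
</pre>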
As in the case of the structural SVM, we can derive an upper bound on this loss:

<math> \triangle\left( (y_i, h_i^*(w)),\, (y_i^{opt}(w), h_i^{opt}(w)) \right) \le \max_{(\hat{y},\hat{h}) \in Y \times H} \left[ w \cdot G(x_i,\hat{y},\hat{h}) + \triangle\left( (y_i, h_i^*(w)),\, (\hat{y},\hat{h}) \right) \right] - \max_{h \in H} w \cdot G(x_i, y_i, h) </math>

The bound holds because <math> (y_i^{opt}(w), h_i^{opt}(w)) </math> is one of the candidates inside the first max, so the first term is at least <math> w \cdot G(x_i, y_i^{opt}(w), h_i^{opt}(w)) </math> plus the loss, while that score is by definition at least the second term. Summing this bound over the training set and adding the usual <math> \tfrac{1}{2}\|w\|^2 </math> regularizer gives a training objective that is a difference of two convex functions of w, which can be minimized to a local optimum with the concave-convex procedure (CCCP): alternately impute <math> h_i^*(w) </math> with the current model, then solve the resulting standard structural SVM problem.
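Here is a compact Python sketch of this CCCP-style training loop, continuing the illustrative setup above. Enumeration stands in for efficient inference, the loss is assumed to depend only on the labels, and the inner subgradient loop (with <code>C</code>, <code>lr</code>, and the iteration counts) is a simplification of the exact cutting-plane solver used in the paper.

<pre>
import numpy as np
from itertools import product

def cccp_train(S, Y, H, G, Delta, dim, C=1.0, outer=10, inner=100, lr=0.01):
    # Latent structural SVM training via the concave-convex procedure (sketch).
    w = np.zeros(dim)
    for _ in range(outer):
        # Step 1 (linearize the concave part): impute h_i^* with the current w.
        h_star = [max(H, key=lambda h: w @ G(x, y, h)) for x, y in S]
        # Step 2 (convex part): minimize the upper-bound objective for fixed
        # h_i^* by subgradient descent (the paper solves it exactly instead).
        for _ in range(inner):
            grad = w.copy()  # gradient of the (1/2)||w||^2 regularizer
            for (x, y), hs in zip(S, h_star):
                # Loss-augmented inference: the first max in the upper bound.
                yh_hat = max(product(Y, H),
                             key=lambda c: w @ G(x, *c) + Delta(y, c[0]))
                # Subgradient of this example's hinge term.
                grad += C * (G(x, *yh_hat) - G(x, y, hs))
            w -= lr * grad
    return w
</pre>

For example, with the toy <code>G</code> above (<code>dim=8</code>) and a zero-one label loss <code>Delta = lambda y, yp: float(y != yp)</code>, this converges to a local optimum of the non-convex objective; different initializations of w can yield different solutions, which is the expected behavior of CCCP.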