10-601 PAC


1) Reading: PAC learning, VC dimension, etc., and the relevant chapters from Tom Mitchell's Machine Learning (exact chapter to be confirmed)

2) An example of PAC learnability of conjunctions of Boolean literals (from http://www.cis.temple.edu/~giorgio/cis587/readings/pac.html)


Learning a Boolean Conjunction

We can efficiently PAC learn concepts that are represented as a conjunction of Boolean literals (i.e. positive or negated Boolean variables). Here is a learning algorithm. Start with a hypothesis h which is the conjunction of every variable and its negation, x1 & ~x1 & x2 & ~x2 & ... & xn & ~xn. Do nothing with negative instances. On a positive instance a, eliminate ~xi from h if ai is 1, and eliminate xi if ai is 0. For example, if a positive instance is 01100, then eliminate x1, ~x2, ~x3, x4, and x5.
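To make the elimination rule concrete, here is a minimal Python sketch; the function names learn_conjunction and predict are illustrative, not from the source, and literals are stored as strings such as x2 and ~x1.

def predict(hypothesis, instance):
    """Return True iff the instance (a bit-string) satisfies every literal in the hypothesis."""
    for lit in hypothesis:
        i = int(lit.lstrip("~x")) - 1          # variable index (names are 1-based)
        value = (instance[i] == "1")
        if lit.startswith("~"):
            value = not value
        if not value:
            return False
    return True

def learn_conjunction(n, positive_examples):
    """Start with all 2n literals; drop any literal contradicted by a positive example."""
    h = {f"x{i}" for i in range(1, n + 1)} | {f"~x{i}" for i in range(1, n + 1)}
    for a in positive_examples:                # negative examples are ignored
        for i, bit in enumerate(a, start=1):
            # eliminate ~xi if a_i = 1, eliminate xi if a_i = 0
            h.discard(f"~x{i}" if bit == "1" else f"x{i}")
    return h

# The positive instance 01100 eliminates x1, ~x2, ~x3, x4, x5:
print(sorted(learn_conjunction(5, ["01100"])))                  # ['x2', 'x3', '~x1', '~x4', '~x5']
print(predict(learn_conjunction(5, ["01100"]), "01100"))        # True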

In this algorithm the set of instances that h classifies as positive is non-decreasing and is at all times contained in the set denoted by c. (By induction: this certainly holds initially, since the initial h is unsatisfiable; and a positive instance of c satisfies every literal of c, so no literal of c is ever eliminated, which means h always retains all the literals of c.) Therefore h can only err on instances that are positive for c, and this happens exactly when h still contains a literal z that is not in c and the instance falsifies z.

We first compute the probability that a literal z is deleted from h because of one positive example. Clearly this probability is 0 if z occurs in c, and it is 1 if ~z occurs in c. At issue are the literals z where neither z nor ~z is in c; call these the free literals. We would like to eliminate both z and ~z from h; if one of them remains, we get an error on any instance a that is positive for c but falsifies that literal (so a is negative for h). We have:

error(h) is less than or equal to the sum, over the free literals z in h, of the probability that z is eliminated by one positive example.

Since there are at most 2*n literals in h, if h is a bad hypothesis, i.e. a hypothesis with error greater than epsilon, then for at least one free literal z in h we must have Probability[free literal z is eliminated from h by one positive example] > epsilon/(2*n).

From this we obtain, for such a literal z: Probability[free literal z survives one positive example] = 1 - Probability[free literal z is eliminated from h by one positive example] < 1 - epsilon/(2*n)

Probability[such a free literal z survives m positive examples] < (1 - epsilon/(2*n))^m

Probability[some such free literal z survives m positive examples] < 2*n*(1 - epsilon/(2*n))^m <= 2*n*(e^(-epsilon/(2*n)))^m = 2*n*e^(-(m*epsilon)/(2*n)), using 1 - x <= e^(-x). Note that if no such literal survives, then error(h) <= 2*n * epsilon/(2*n) = epsilon, so this quantity also bounds the probability that the learned hypothesis has error greater than epsilon.

Requiring 2*n*e^(-(m*epsilon)/(2*n)) to be at most delta, we obtain m > (2*n/epsilon)*(ln(2*n) + ln(1/delta)), which is polynomial in n, 1/epsilon and 1/delta.
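As a quick numeric check, the bound can be evaluated for concrete values of n, epsilon, and delta; the helper name sample_bound below is ours, not from the source.

from math import ceil, log

def sample_bound(n, epsilon, delta):
    """An integer m large enough that 2n * exp(-m*epsilon/(2n)) <= delta,
    i.e. m >= (2n/epsilon) * (ln(2n) + ln(1/delta))."""
    return ceil((2 * n / epsilon) * (log(2 * n) + log(1 / delta)))

# e.g. n = 10 variables, epsilon = 0.1, delta = 0.05:
print(sample_bound(10, 0.1, 0.05))   # 1199 examples suffice by this bound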


3) What you should remember:

- Relationships between sample complexity, error bound, and the “capacity” of the chosen hypothesis space
- Within the PAC learning setting, we can bound the probability that the learner will output a hypothesis with a given error:
  - for ANY consistent learner (the case where c is in H)
  - for ANY “best fit” hypothesis (agnostic learning, where perhaps c is not in H)
- VC dimension as a measure of the complexity of H
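For reference, the bounds alluded to above are m >= (1/epsilon)*(ln|H| + ln(1/delta)) for any consistent learner and m >= (1/(2*epsilon^2))*(ln|H| + ln(1/delta)) for the agnostic case. Below is a small sketch evaluating them, assuming the conjunction hypothesis space with |H| = 3^n (each variable appears positively, negated, or not at all); the function names are illustrative, not from the source.

from math import ceil, log

def m_consistent(size_H, epsilon, delta):
    """Consistent learner, c in H: m >= (1/epsilon) * (ln|H| + ln(1/delta))."""
    return ceil((log(size_H) + log(1 / delta)) / epsilon)

def m_agnostic(size_H, epsilon, delta):
    """Agnostic setting (best-fit h): m >= (1/(2*epsilon^2)) * (ln|H| + ln(1/delta))."""
    return ceil((log(size_H) + log(1 / delta)) / (2 * epsilon ** 2))

# Conjunctions over n = 10 Boolean variables: |H| = 3**10 = 59049.
print(m_consistent(3 ** 10, 0.1, 0.05))   # about 140 examples
print(m_agnostic(3 ** 10, 0.1, 0.05))     # about 700 examples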