Integer Linear Programming

== Summary ==

Integer Linear Programming (ILP) is a [[category::method]] for optimizing a linear objective function such as:
  
:: maximize <math> \sum_{i=1}^m{c_i x_i} </math>

where <math>c_i\,\!</math>'s are known and <math>x_i\,\!</math>'s are unknown, subject to linear equality or inequality constraints such as:
  
:: <math> \sum_{i=1}^m{a_i x_i} \le b</math>
  
where <math>a_i\,\!</math>'s and <math>b\,\!</math> are known, and where <math>x_i\,\!</math>'s can only take integer values.
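For example, consider the following small instance (a 0-1 knapsack-style problem, given here purely for illustration):

:: maximize <math> 5 x_1 + 4 x_2 + 3 x_3 \,\!</math>
:: subject to <math> 2 x_1 + 3 x_2 + x_3 \le 4 \,\!</math> and <math> x_1, x_2, x_3 \in \{0,1\} \,\!</math>

The optimal assignment is <math>x_1 = 1, x_2 = 0, x_3 = 1\,\!</math>, with objective value 8; taking all three items would violate the constraint, and every other feasible assignment has a smaller objective value.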
 
In other words, it is a method to find the optimal solution (i.e. the best assignment of unknown variables such as <math>x_i\,\!</math>) that maximizes the objective function while meeting a list of requirements expressed as linear equality or inequality relationships.
 
The strength of ILP is in its joint inference. Instead of making local, isolated assignments of each <math>x_i\,\!</math>, it makes joint assignments of all the <math>x_i\,\!</math>'s at the same time, respecting the global constraints while optimizing the given objective function.
 
ILP is known to be NP-hard. However, many off-the-shelf solvers, both commercial and non-commercial, are available. One such solver is [http://scip.zib.de/ SCIP], which is currently the fastest non-commercial mixed integer programming solver.
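In practice such a solver is usually called through a modelling library rather than by writing the matrix form by hand. As a minimal sketch (this example assumes the open-source PuLP library for Python, which is not mentioned in this article and is used purely for illustration; SCIP, CPLEX, and other solvers expose similar modelling interfaces), the small instance above could be written and solved as follows:

<pre>
# Minimal sketch: solve the illustrative 0-1 instance from the Summary with PuLP.
# PuLP and its default CBC backend are assumptions made for this example only;
# any ILP solver front-end works in essentially the same way.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

c = [5, 4, 3]   # known objective coefficients c_i
a = [2, 3, 1]   # known constraint coefficients a_i
b = 4           # known right-hand side

prob = LpProblem("ilp_example", LpMaximize)

# Unknown 0/1 variables x_i
x = [LpVariable("x%d" % i, cat="Binary") for i in range(len(c))]

# Objective: maximize sum_i c_i * x_i
prob += lpSum(c[i] * x[i] for i in range(len(c)))

# Constraint: sum_i a_i * x_i <= b
prob += lpSum(a[i] * x[i] for i in range(len(c))) <= b

prob.solve()  # hands the whole problem to the backend solver

print([int(value(xi)) for xi in x])  # expected: [1, 0, 1]
print(value(prob.objective))         # expected: 8.0
</pre>

Note that the solver returns the joint assignment of all the <math>x_i\,\!</math>'s at once, which is exactly the joint inference behaviour described above.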
  
 
== Procedure ==
 
'''Input''':
* The linear objective function
* The linear constraints

'''Output''':
* The assignment of unknown variables that optimizes the objective function and is consistent with the constraints
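The linear constraints listed as input above often encode logical requirements over the 0/1 variables. The encodings below are standard illustrations rather than anything specific to this article:

* "at least one of <math>x_1, x_2\,\!</math> is selected" becomes <math>x_1 + x_2 \ge 1\,\!</math>
* "selecting <math>x_1\,\!</math> forces <math>x_2\,\!</math> to be selected" becomes <math>x_1 \le x_2\,\!</math>
* "exactly one of <math>x_1, \ldots, x_k\,\!</math> is selected" becomes <math>\sum_{i=1}^k{x_i} = 1\,\!</math>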
 
== Related Methods ==
  
The ILP approach to joint inference is related to other methods such as [[Markov_Logic_Networks|Markov Logic Networks]], a combination of first-order logic and Markov networks, which also incorporate global learning and inference to improve local classifiers' decisions. There, global constraints are enforced through the addition of weighted first-order logic formulae. The difference from ILP is that Markov Logic Networks allow for non-deterministic (soft) constraints: constraints are assigned different weights during the learning phase, allowing for constraints that tend to hold but do not always have to. ILP, on the other hand, enforces hard constraints. It is possible to incorporate weighted constraints into ILPs, but it is not obvious how the weights should be learnt.
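One common way to do so, sketched here only for illustration, is to soften a hard constraint with a penalized slack variable: the implication constraint <math>x_1 \le x_2\,\!</math> can be replaced by <math>x_1 \le x_2 + z\,\!</math> with a new 0/1 variable <math>z\,\!</math>, while a penalty term <math>w z\,\!</math> (for some weight <math>w > 0\,\!</math>) is subtracted from the maximization objective; the solver may then violate the original constraint, but only at cost <math>w\,\!</math>.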
 
Another related method for joint inference is a [http://www.cs.cmu.edu/~nlao/publication/2011/2011.emnlp.nell.pdf version] of the Path Ranking Algorithm that conducts soft inference based on a combination of constrained, weighted random walks through a network of local decisions and their relations.
 
  
 
== References / Links ==
 
* Nemhauser, G.L. and Wolsey, L.A. ''Integer and Combinatorial Optimization'', Volume 18, Wiley, New York, 1988. - [http://www.ulb.tu-darmstadt.de/tocs/103803378.pdf]
* Wikipedia article on Integer Programming - [http://en.wikipedia.org/wiki/Integer_programming]
  
 
== Relevant Papers ==
 
== Comments ==
I feel like the ILP software packages are the whole point of formulating your problem as an ILP: you're hoping that highly optimized ILP solvers will be better than whatever heuristic search algorithm you can cook up yourself. I mean, they're doing heuristic search too (it's all NP-hard, so it's all heuristics), but they might be doing it a lot better. My labmate [http://www.cs.cmu.edu/~dipanjan/ Dipanjan] found that when he moved a component of his semantic role labeler from a beam search to ILP, it went 100 times faster (!). This was with CPLEX, a commercial solver (that CMU has a license for); I get the impression it's pretty popular?

Speculation: it's worth considering that operations research people have been working on ILP for decades, and giant corporations and government organizations use these systems to plan logistics and operations, where every 1% performance improvement is worth millions of dollars; if you're using CPLEX you get all those benefits.

Tangentially related paper: Geoff Gordon, Sue Ann Hong, and Miro Dudik have a cool paper on adding first-order logic to mixed ILPs: http://www.cs.cmu.edu/~ggordon/gordon-hong-dudik-fop.pdf There's some sort of relationship to MLNs... like what you said, I think, but furthermore first-order ILPs can do quantifiers and predicates just like MLNs.

--[[User:Brendan|Brendan]] 23:06, 13 October 2011 (UTC)
