GBDT is an additive regression algorithm consisting of an ensemble of trees, fitted to the current residuals, i.e., the gradients of the loss function, in a forward step-wise manner. It iteratively fits an additive model as

$$f_t(x) = f_{t-1}(x) + \lambda \beta_t T_t(x; \Theta_t)$$

such that a certain loss function $L(y, f_T(x))$ is minimized, where $T_t(x; \Theta_t)$ is a tree at iteration $t$, weighted by the parameter $\beta_t$, with a finite number of parameters $\Theta_t$, and $\lambda$ is the learning rate. At iteration $t$, the tree $T_t(x; \Theta_t)$ is induced to fit the negative gradient by least squares. That is,

$$\hat{\Theta}_t = \arg\min_{\Theta} \sum_{i=1}^{N} \left( -G_{it} - \beta_t T_t(x_i; \Theta) \right)^2,$$

where $G_{it}$ is the gradient over the current prediction function,

$$G_{it} = \left[ \frac{\partial L(y_i, f(x_i))}{\partial f(x_i)} \right]_{f = f_{t-1}}.$$

The optimal weights of the trees, $\beta_t$, are determined by

$$\beta_t = \arg\min_{\beta} \sum_{i=1}^{N} L\big( y_i, f_{t-1}(x_i) + \beta T_t(x_i; \Theta) \big).$$
Source: Dong et al., WWW 2010.
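For concreteness, here is a minimal sketch of this loop for squared-error loss, where the negative gradient $-G_{it}$ reduces to the ordinary residual $y_i - f_{t-1}(x_i)$ and the line search for $\beta_t$ has a closed form. The synthetic data, the tree depth, and the use of scikit-learn's DecisionTreeRegressor as the base learner are illustrative assumptions, not details from Dong et al.

```python
# Gradient boosting with trees fit to negative gradients (squared-error loss).
# Assumed/illustrative: synthetic sine data, depth-3 trees, 50 iterations.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

n_trees, learning_rate = 50, 0.1        # T and lambda
f = np.full_like(y, y.mean())           # f_0(x): constant initial model
trees, betas = [], []

for t in range(n_trees):
    neg_grad = y - f                    # -G_it for L = (1/2)(y - f)^2
    tree = DecisionTreeRegressor(max_depth=3).fit(X, neg_grad)
    h = tree.predict(X)                 # T_t(x; Theta_t), fit by least squares
    # Line search for beta_t; closed form under squared error:
    # argmin_beta sum_i (neg_grad_i - beta * h_i)^2 = <neg_grad, h> / <h, h>
    beta = neg_grad @ h / (h @ h)
    f += learning_rate * beta * h       # f_t = f_{t-1} + lambda * beta_t * T_t
    trees.append(tree)
    betas.append(beta)

print(f"training MSE after {n_trees} trees: {np.mean((y - f) ** 2):.4f}")
```

For other losses (e.g., logistic), the same loop applies with `neg_grad` replaced by the loss-specific negative gradient and the $\beta_t$ line search solved numerically rather than in closed form.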