Gradient Boosted Decision Tree
GBDT is an additive regression algorithm consisting of an ensemble of trees, each fitted to the current residuals, i.e., the gradients of the loss function, in a forward step-wise manner. It iteratively fits an additive model as

<math>f_{t}(x)=f_{t-1}(x)+\lambda\beta_{t}T_{t}(x;\Theta_{t})</math>
such that a certain loss function <math>L(y_{i},f_{t}(x_{i}))</math> is minimized, where <math>T_{t}(x;\Theta_{t})</math> is a tree at iteration <math>t</math>, weighted by parameter <math>\beta_{t}</math>, with a finite number of parameters <math>\Theta_{t}</math>, and <math>\lambda</math> is the learning rate. At iteration <math>t</math>, tree <math>T_{t}(x;\Theta_{t})</math> is induced to fit the negative gradient by least squares. That is,

<math>\hat{\Theta}_{t}=\operatorname{argmin}_{\Theta}\overset{N}{\underset{i}{\sum}}(-G_{it}-\beta T(x_{i};\Theta))^{2}</math>
where <math>G_{it}</math> is the gradient over the current prediction function,

<math>G_{it}=\left[\frac{\partial L(y_{i},f(x_{i}))}{\partial f(x_{i})}\right]_{f=f_{t-1}}</math>
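For example, under the squared-error loss <math>L(y,f)=\frac{1}{2}(y-f)^{2}</math> (the loss is left generic above), the negative gradient reduces to the ordinary residual,

<math>-G_{it}=y_{i}-f_{t-1}(x_{i})</math>

so each tree is literally fitted to the current residuals, as stated in the opening description.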
The optimal weights of trees <math>\beta_{t}</math> are determined by

<math>\beta_{t}=\operatorname{argmin}_{\beta}\overset{N}{\underset{i}{\sum}}L(y_{i},f_{t-1}(x_{i})+\beta T(x_{i},\theta))</math>
(Source: [Dong et al., WWW 2010])
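Below is a minimal sketch of this procedure in Python, assuming squared-error loss so that the negative gradient is the residual and the line search for <math>\beta_{t}</math> has a closed form. The helper names (<code>gbdt_fit</code>, <code>gbdt_predict</code>) and the use of scikit-learn's DecisionTreeRegressor as the base learner are illustrative choices, not the implementation used by Dong et al.

<pre>
# Minimal GBDT sketch for squared-error loss L(y, f) = 1/2 * (y - f)^2.
# Assumptions (not from the source): scikit-learn regression trees as the
# base learners, a constant initial model f_0, and lam for the learning
# rate (lambda in the text above).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbdt_fit(X, y, n_trees=100, lam=0.1, max_depth=3):
    f = np.full(len(y), y.mean())      # f_0: constant initial prediction
    trees, betas = [], []
    for t in range(n_trees):
        neg_grad = y - f               # -G_it = y_i - f_{t-1}(x_i) for squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, neg_grad)          # induce T_t by least squares on the negative gradient
        pred = tree.predict(X)
        # Closed-form line search for beta_t under squared loss:
        # argmin_beta sum_i (y_i - f_{t-1}(x_i) - beta * T(x_i))^2
        beta = (neg_grad @ pred) / (pred @ pred + 1e-12)
        f += lam * beta * pred         # f_t = f_{t-1} + lambda * beta_t * T_t
        trees.append(tree)
        betas.append(beta)
    return y.mean(), trees, betas, lam

def gbdt_predict(model, X):
    f0, trees, betas, lam = model
    f = np.full(X.shape[0], f0)
    for tree, beta in zip(trees, betas):
        f += lam * beta * tree.predict(X)
    return f
</pre>

With a generic loss, the residual line would be replaced by the gradient formula for <math>G_{it}</math>, and the closed-form line search by the numerical <math>\operatorname{argmin}</math> over <math>\beta</math> given above.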