Tuesday, September 16, 2014

Classification and Regression Trees: rpart

How to prune a decision tree?
Prune back the tree to avoid overfitting the data. Typically, you will want to select a tree size that minimizes the cross-validated error, the xerror column printed by printcp( ).
Prune the tree to the desired size using
prune(fit, cp= )
Specifically, use printcp( ) to examine the cross-validated error results, select the complexity parameter associated with minimum error, and place it into the prune( ) function. Alternatively, you can use the code fragment
     fit$cptable[which.min(fit$cptable[,"xerror"]),"CP"]
to automatically select the complexity parameter associated with the smallest cross-validated error. Thanks to HSAUR for this idea.
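As a rough illustration of the steps above, here is a minimal sketch. It assumes the kyphosis data set that ships with the rpart package; the names best_cp and pruned are made up for this example and are not part of rpart. Because the cross-validation folds are drawn at random, the xerror values (and possibly the selected CP) can change from run to run unless a seed is set.

```r
library(rpart)

# Cross-validation inside rpart() is random, so fix the seed for repeatability
set.seed(42)

# Grow a classification tree on the kyphosis data bundled with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Inspect the complexity table; xerror is the cross-validated error
printcp(fit)

# Pick the CP value from the row with the smallest cross-validated error
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]

# Prune the tree back to that complexity
pruned <- prune(fit, cp = best_cp)
print(pruned)
```

On a small data set like this one the minimum xerror may sit at the root-only tree, in which case the pruned tree has no splits at all.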
http://www.mayo.edu/hsr/techrpt/61.pdf
http://www.statmethods.net/advstats/cart.html - classification, regression trees, random forests
