Standard random forest: randomly select K features, enumerate all observed values of each as candidate split points, and pick the best split (a sketch follows the link below).
LINKS:https://en.wikipedia.org/wiki/Random_forest
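A minimal sketch of that split search, assuming a regression setting with weighted variance as the impurity (the function name `best_split_rf` and the impurity choice are illustrative, not taken from the link):

```python
import numpy as np

def best_split_rf(X, y, k):
    """Standard random-forest node split: sample k candidate features,
    enumerate every observed value of each as a threshold, and keep the
    split with the lowest weighted variance (regression impurity)."""
    n_samples, n_features = X.shape
    features = np.random.choice(n_features, size=k, replace=False)
    best = (None, None, np.inf)  # (feature, threshold, score)
    for f in features:
        for threshold in np.unique(X[:, f]):  # enumerate all values
            left = y[X[:, f] <= threshold]
            right = y[X[:, f] > threshold]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * left.var() + len(right) * right.var()) / n_samples
            if score < best[2]:
                best = (f, threshold, score)
    return best
```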
Extremely randomized trees: randomly select K features, draw one random split value per feature, and pick the best of those K random splits.
Ensembles of extremely randomized trees: every tree is trained on the full data set (no bagging).
LINKS:http://docs.opencv.org/2.4/modules/ml/doc/ertrees.html
- Extremely randomized trees do not apply the bagging procedure to construct the training sample set for each tree; the same input training set is used to train all trees.
- Extremely randomized trees pick each node split completely at random (both the variable index and the variable splitting value are chosen randomly), whereas Random Forest finds the best split (the optimal variable index and splitting value) among a random subset of variables.
Extremely randomized trees use all of the samples as the training set; they randomly pick one feature and one value as the splitting criterion (sketched below).
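To contrast with the random-forest sketch above, here is the same node-split search with the extra-trees rule of one random threshold per candidate feature (again an illustrative sketch, with variance as the assumed impurity):

```python
import numpy as np

def best_split_extra(X, y, k):
    """Extra-trees node split: sample k candidate features, draw ONE
    random threshold per feature (instead of enumerating all values),
    then keep the best of those k random splits."""
    n_samples, n_features = X.shape
    features = np.random.choice(n_features, size=k, replace=False)
    best = (None, None, np.inf)  # (feature, threshold, score)
    for f in features:
        lo, hi = X[:, f].min(), X[:, f].max()
        if lo == hi:
            continue
        threshold = np.random.uniform(lo, hi)  # single random cut point
        left = y[X[:, f] <= threshold]
        right = y[X[:, f] > threshold]
        if len(left) == 0 or len(right) == 0:
            continue
        score = (len(left) * left.var() + len(right) * right.var()) / n_samples
        if score < best[2]:
            best = (f, threshold, score)
    return best
```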
LINKS:http://scikit-learn.org/stable/modules/generated/sklearn.tree.ExtraTreeRegressor.html#sklearn.tree.ExtraTreeRegressor
This class implements a meta estimator that fits a number of randomized decision trees (a.k.a. extra-trees) on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
Extra-trees differ from classic decision trees in the way they are built. When looking for the best split to separate the samples of a node into two groups, random splits are drawn for each of the max_features randomly selected features and the best split among those is chosen. When max_features is set to 1, this amounts to building a totally random decision tree.
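A short usage sketch of the two scikit-learn estimators involved (the dataset and parameter values are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.tree import ExtraTreeRegressor

X, y = make_regression(n_samples=200, n_features=10, random_state=0)

# Ensemble of extra-trees; bootstrap=False by default, i.e. every
# tree sees the full training set.
ensemble = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(X, y)

# A single extra-tree with max_features=1: one random feature and one
# random threshold per node, i.e. a totally random decision tree.
tree = ExtraTreeRegressor(max_features=1, random_state=0).fit(X, y)
```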
The extra-trees ensemble uses bagging; then multiple features are selected and, for each feature, one value is drawn at random as the split threshold when building each tree.
One way to implement this:
Bag the samples, randomly select n features and k random split values per feature, take the best of those candidate splits, and build the tree.
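A self-contained sketch of that recipe, assuming a regression impurity; `random_split`, `fit_bagged_forest`, and the `build_tree` callback are all hypothetical names invented for this illustration:

```python
import numpy as np

def random_split(Xn, yn, n_feats, k_vals):
    """Try n_feats random features with k_vals random thresholds each,
    and return the (feature, threshold) pair with the lowest weighted
    variance (illustrative regression impurity)."""
    best = (None, None, np.inf)
    feats = np.random.choice(Xn.shape[1], size=n_feats, replace=False)
    for f in feats:
        lo, hi = Xn[:, f].min(), Xn[:, f].max()
        if lo == hi:
            continue
        for t in np.random.uniform(lo, hi, size=k_vals):  # k random cut points
            left, right = yn[Xn[:, f] <= t], yn[Xn[:, f] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * left.var() + len(right) * right.var()) / len(yn)
            if score < best[2]:
                best = (f, t, score)
    return best

def fit_bagged_forest(X, y, n_trees, n_feats, k_vals, build_tree):
    """Bagging plus the random split above; `build_tree` stands in for a
    tree-growing routine that recursively applies the split function."""
    trees = []
    for _ in range(n_trees):
        idx = np.random.randint(0, len(X), size=len(X))  # bootstrap sample
        trees.append(build_tree(X[idx], y[idx],
                                lambda Xn, yn: random_split(Xn, yn, n_feats, k_vals)))
    return trees
```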