# Growing decision trees

*Last updated 2025-07-27 UTC.*

Like all supervised machine learning models, decision trees are trained to best explain a set of training examples. The optimal training of a decision tree is an NP-hard problem. Therefore, training is generally done using heuristics: easy-to-create learning algorithms that give a non-optimal, but close-to-optimal, decision tree.

Most algorithms used to train decision trees work with a greedy divide-and-conquer strategy. The algorithm starts by creating a single node (the root) and recursively and greedily grows the decision tree.

At each node, all the possible conditions are evaluated and scored. The algorithm selects the "best" condition, that is, the condition with the highest score.
For now, just know that the score is a metric that correlates with the task, and conditions are selected to maximize that metric.

For example, in the [Palmer Penguins](https://allisonhorst.github.io/palmerpenguins/articles/intro.html) dataset (used for code examples later in this course), most Adelie and Chinstrap penguins have a bill length greater than 16 mm, while most Gentoo penguins have shorter bills. Therefore, the condition `bill_length_mm ≥ 16` makes almost perfect predictions for the Gentoo penguins, but can't differentiate between Adelies and Chinstraps. The algorithm will probably pick this condition.

**Figure 7. The first step in growing a tree.**

The algorithm then repeats recursively and independently on both child nodes. When no satisfying condition is found, the node becomes a leaf. The leaf prediction is determined as the most representative label value in the examples.

The algorithm is as follows:

**Note:** Unless you are implementing your own decision forest library, you won't need to write this algorithm.
    def train_decision_tree(training_examples):
      root = create_root()  # Create a decision tree with a single empty root.
      grow_tree(root, training_examples)  # Grow the root node.
      return root

    def grow_tree(node, examples):
      condition = find_best_condition(examples)  # Find the best condition.

      if condition is None:
        # No satisfying condition was found, so the growth of this branch stops.
        set_leaf_prediction(node, examples)
        return

      # Create two children for the node.
      positive_child, negative_child = split_node(node, condition)

      # List the training examples used by each child.
      negative_examples = [example for example in examples if not condition(example)]
      positive_examples = [example for example in examples if condition(example)]

      # Continue growing the children.
      grow_tree(negative_child, negative_examples)
      grow_tree(positive_child, positive_examples)

Let's go through the steps of training a particular decision tree in more detail.

**Step 1:** Create a root:

**Step 2:** Grow node #1. The condition "x1 ≥ 1" was found. Two child nodes are created:

**Step 3:** Grow node #2. No satisfying condition was found, so the node becomes a leaf:

**Step 4:** Grow node #3. The condition "x2 ≥ 0.5" was found. Two child nodes are created.

Other methods exist to grow decision trees. A popular alternative is to optimize nodes globally instead of using a divide-and-conquer strategy.

**YDF Code:** In YDF, decision trees are grown with the "greedy divide and conquer" algorithm described above by default. Alternatively, you can enable global growth with the `growing_strategy="BEST_FIRST_GLOBAL"` parameter.
See [the documentation](https://ydf.readthedocs.io/en/latest/hyperparameters/#growing_strategy) for more details.

Depending on the number and type of input features, the number of possible conditions for a given node can be huge, and is generally infinite. For example, given a threshold condition $\mathrm{feature}_i \geq t$, the combination of all the possible [threshold values](http://developers.google.com/machine-learning/glossary/#threshold) for $t \in \mathbb{R}$ is infinite.

The routine responsible for finding the best condition is called the **splitter**. Because it needs to test many possible conditions, splitters are the bottleneck when training a decision tree.

The score maximized by the splitter depends on the task. For example:

- [Information gain](/machine-learning/glossary#information-gain) and [Gini](/machine-learning/glossary#gini-impurity) (both covered later) are commonly used for classification.
- Mean squared error is commonly used for regression.

There are many splitter algorithms, each with varying support for:

- The type of features; for example, numerical, categorical, text
- The task; for example, binary classification, multi-class classification, regression
- The type of condition; for example, threshold condition, in-set condition, oblique condition
- The regularization criteria; for example, exact or approximated splitters for threshold conditions

In addition, there are equivalent splitter variants with different trade-offs regarding memory usage, CPU usage, computation distribution, and so on.

**YDF Code:** In YDF, splitters are selected implicitly from the automatically detected (or manually specified) type of the feature, the hyperparameters, and the estimated speed of each splitter (during training, YDF runs a small model to predict the speed of each candidate splitter).
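To make the splitter's job concrete, here is a minimal sketch of an exact numeric splitter scored with information gain. It is not YDF's implementation; the data is a toy sample (not the penguin dataset) and all names are illustrative. The key observation is that even though $t$ ranges over all of $\mathbb{R}$, only thresholds between consecutive distinct observed values produce different splits, so an exact splitter only needs to evaluate the midpoints between sorted unique values.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, mask):
    """Entropy reduction from splitting `labels` by the boolean `mask`."""
    pos = [label for label, m in zip(labels, mask) if m]
    neg = [label for label, m in zip(labels, mask) if not m]
    n = len(labels)
    children = (len(pos) / n) * entropy(pos) + (len(neg) / n) * entropy(neg)
    return entropy(labels) - children

# Toy feature values and class labels (illustrative only).
values = [14.2, 15.1, 16.8, 17.4, 18.0, 19.3]
labels = ["a", "a", "b", "b", "b", "b"]

# Only thresholds between consecutive distinct values give different splits,
# so test midpoints instead of all real numbers.
uniq = sorted(set(values))
candidates = [(lo + hi) / 2 for lo, hi in zip(uniq, uniq[1:])]

best_t = max(candidates,
             key=lambda t: information_gain(labels, [v >= t for v in values]))
print(best_t)  # A threshold between 15.1 and 16.8, which separates the classes.
```

On this toy data, the chosen threshold splits the two classes perfectly, so the information gain of the split equals the entropy of the parent node.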
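Putting the pieces together, the `grow_tree` pseudocode above can be turned into a short runnable sketch. This is a toy implementation for intuition only, assuming numeric features, a classification task, and an information-gain splitter over threshold conditions; every name and the dataset are illustrative, and no real library's API is used.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def find_best_condition(examples):
    """Return the (feature_index, threshold) with the highest information
    gain, or None when no split improves on the parent node."""
    labels = [label for _, label in examples]
    best, best_gain = None, 0.0
    num_features = len(examples[0][0])
    for i in range(num_features):
        uniq = sorted({features[i] for features, _ in examples})
        for lo, hi in zip(uniq, uniq[1:]):  # Midpoints between distinct values.
            t = (lo + hi) / 2
            pos = [label for f, label in examples if f[i] >= t]
            neg = [label for f, label in examples if f[i] < t]
            gain = entropy(labels) - (
                len(pos) * entropy(pos) + len(neg) * entropy(neg)) / len(labels)
            if gain > best_gain:
                best, best_gain = (i, t), gain
    return best

def grow_tree(examples):
    """Recursively grow a tree; a node is either a dict or a leaf label."""
    condition = find_best_condition(examples)
    if condition is None:
        # No satisfying condition: make a leaf predicting the majority label.
        return Counter(label for _, label in examples).most_common(1)[0][0]
    i, t = condition
    return {
        "feature": i, "threshold": t,
        "pos": grow_tree([(f, l) for f, l in examples if f[i] >= t]),
        "neg": grow_tree([(f, l) for f, l in examples if f[i] < t]),
    }

def predict(tree, features):
    """Route an example down the tree until a leaf label is reached."""
    while isinstance(tree, dict):
        branch = "pos" if features[tree["feature"]] >= tree["threshold"] else "neg"
        tree = tree[branch]
    return tree

# Toy 2D dataset: (features, label) pairs.
examples = [((0.0, 0.2), "y"), ((0.5, 0.8), "z"),
            ((1.5, 0.1), "x"), ((2.0, 0.9), "x")]
tree = grow_tree(examples)
print(predict(tree, (1.7, 0.5)))  # "x"
```

A production splitter would add regularization (minimum examples per leaf, maximum depth) as stopping criteria instead of growing until every leaf is pure; here the recursion stops only when no split has positive gain.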