💖 Dear tech enthusiasts, a warm welcome to Kant2048's blog! I'm Thomas Kant, and I'm glad to meet you here on CSDN. 💖

Sklearn Machine Learning House Price Prediction: Training a Model with GBDT

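House price prediction is a classic problem in applied machine learning.

This article shows how to use Sklearn's GBDT (Gradient Boosting Decision Tree) to build and train a model, and compares it with linear regression and random forest to help readers better understand the strengths of GBDT.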

📌 1. Preparing the Dataset

We use the Boston Housing Dataset, a classic regression dataset. The target is the median value of homes in each area (MEDV, in units of $1,000).

from sklearn.datasets import load_boston
import pandas as pd

# Load the dataset (load_boston only exists in scikit-learn versions before 1.2)
boston = load_boston()
X = pd.DataFrame(boston.data, columns=boston.feature_names)
y = boston.target

print(X.head())
print(y[:10])

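⚠️ Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so the code above only runs on older versions. For production use, or on any current version, the California housing dataset (fetch_california_housing) is the recommended replacement.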
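For readers on a recent scikit-learn, the following is a minimal sketch of loading the California housing data in the same X / y shape used here. The helper name load_california is hypothetical and only for illustration; fetch_california_housing downloads the data on first use, so the call needs network access once.

```python
def load_california():
    """Return (X, y) for the California housing data as pandas objects."""
    # Imported inside the helper (illustrative choice) so scikit-learn is
    # only required when the data is actually requested.
    from sklearn.datasets import fetch_california_housing

    # as_frame=True returns a DataFrame and a Series instead of NumPy arrays.
    # The data is downloaded and cached locally on the first call.
    housing = fetch_california_housing(as_frame=True)
    X = housing.data    # 8 numeric features such as MedInc and HouseAge
    y = housing.target  # median house value, in units of $100,000
    return X, y


# Typical use (not run here because it triggers a download):
# X, y = load_california()
# print(X.head())
# print(y[:10])
```

Note that this dataset has different features and a different target scale than the Boston data, so any numbers obtained with it are not comparable to results on the Boston data.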

📊 2. Splitting and Preprocessing the Dataset

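Before training, we split the dataset into training and test sets and standardize the features. Standardization mainly matters for the linear regression baseline; tree-based models such as GBDT and random forest are largely insensitive to feature scale.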

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split the data: 80% for training, 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features: fit on the training set only, then apply the same transform to the test set
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
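To make the effect of fitting the scaler on the training split concrete, here is a small sketch on synthetic data. The data from make_regression and the names X_demo, demo_scaler, X_tr and X_te are illustrative assumptions, not part of the walkthrough above; the 13 features simply mirror the Boston feature count.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: 200 rows, 13 features)
X_demo, y_demo = make_regression(
    n_samples=200, n_features=13, noise=10.0, random_state=42
)
X_demo = X_demo * 50.0 + 10.0  # give the features a non-unit scale and offset

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

demo_scaler = StandardScaler()
X_tr_scaled = demo_scaler.fit_transform(X_tr)  # learns mean/std from training rows only
X_te_scaled = demo_scaler.transform(X_te)      # reuses the training mean/std

# Each training column now has mean 0 and standard deviation 1
print(np.allclose(X_tr_scaled.mean(axis=0), 0.0))
print(np.allclose(X_tr_scaled.std(axis=0), 1.0))

# Test columns are only approximately centered, because their statistics
# were never seen by the scaler
print(np.abs(X_te_scaled.mean(axis=0)).max())
```

Fitting on the training split alone keeps test-set statistics out of preprocessing, so the test score remains an honest estimate of performance on unseen data.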