Done in half an hour! Handwritten digit recognition in Python: a hands-on machine learning primer for complete beginners

Article link: https://blog.csdn.net/weixin_46040164/article/details/145906500

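Why this project?

Machine learning sounds intimidating, but do you really need a PhD in mathematics to get started? This article walks through a hands-on handwritten digit recognition example, in fewer than 100 lines of Python, so you can experience a complete machine learning workflow. Whether you are:

a programmer who has just finished learning Python basics
a developer looking to move into AI
a student interested in intelligent recognition

this project lets you see machine learning at work within 30 minutes!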

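🛠️ Environment setup (3-minute quick start)

# Install the required libraries (skip if already installed)
# The leading "!" is Jupyter/Colab notebook syntax; drop it when running in a terminal
!pip install numpy matplotlib tensorflow scikit-learn

# Verify the environment
import tensorflow as tf
print("TensorFlow version:", tf.__version__)  # 2.x recommended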

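📊 Loading and exploring the data (MNIST dataset)

Why MNIST?

It contains handwritten digits 0-9: 60,000 training images plus 10,000 test images
Each image is a 28x28-pixel grayscale picture (a 784-dimensional feature vector once flattened)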
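
To make the "784-dimensional feature vector" concrete, here is a minimal NumPy sketch of flattening a single 28x28 image. The array `fake_image` is a synthetic stand-in made up for this illustration, not real MNIST data; it only assumes the same shape and `uint8` dtype that `mnist.load_data()` returns.

```python
import numpy as np

# Synthetic stand-in for one MNIST image: 28x28 values in the range 0-255.
# (Hypothetical data, used only to show the shapes involved.)
fake_image = (np.arange(28 * 28) % 256).astype(np.uint8).reshape(28, 28)

# Flatten row by row: the 28 rows are laid end to end into one vector.
flat = fake_image.reshape(-1)

print(fake_image.shape)  # (28, 28)
print(flat.shape)        # (784,)

# Pixel (row=1, col=0) ends up at index 1 * 28 + 0 = 28 of the flat vector.
print(flat[28] == fake_image[1, 0])  # True
```

A dense network sees only this flat vector, so the 2D neighbourhood structure of the image is not given to it explicitly.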

from tensorflow.keras.datasets import mnist

# Downloads the dataset automatically on first use (about 11 MB)
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Visualize the first 15 training images with their labels
import matplotlib.pyplot as plt

plt.figure(figsize=(10,5))
for i in range(15):
    plt.subplot(3,5,i+1)
    plt.imshow(train_images[i], cmap='gray')
    plt.title(f"Label: {train_labels[i]}")
    plt.axis('off')
plt.tight_layout()
plt.show()

🧠 Building the model (neural network in practice)

Data preprocessing

# Flatten each 28x28 image into a 784-element vector and scale pixels from 0-255 to 0-1
train_images = train_images.reshape((60000, 28*28)).astype('float32') / 255
test_images = test_images.reshape((10000, 28*28)).astype('float32') / 255

# One-hot encode the labels (e.g. 3 -> [0,0,0,1,0,0,0,0,0,0])
from tensorflow.keras.utils import to_categorical
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
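
The two preprocessing steps can be reproduced by hand to see what they do. The sketch below is a plain NumPy illustration of the idea, not the Keras implementation; the helper `one_hot` and the sample values are made up for this example.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) float32 matrix with a single 1.0 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Row i gets a 1.0 in the column given by labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical sample labels, just for illustration.
sample_labels = [5, 0, 4]
encoded = one_hot(sample_labels)
print(encoded.shape)  # (3, 10)
print(encoded[0])     # 1.0 at index 5, zeros elsewhere

# Normalization: uint8 pixel values 0-255 become floats in the range 0-1.
pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = pixels.astype("float32") / 255
print(scaled.min(), scaled.max())  # 0.0 1.0
```

One-hot labels are what the `categorical_crossentropy` loss expects: one target value per output unit of the final 10-way softmax layer.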

Building the neural network

from tensorflow.keras import models, layers

model = models.Sequential([
    layers.Dense(512, activation='relu', input_shape=(28*28,)),
    layers.Dropout(0.2),
    layers.Dense(256, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
             loss='categorical_crossentropy',
             metrics=['accuracy'])
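
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand: a dense layer has one weight per input-output pair plus one bias per unit, and Dropout adds no parameters. The following is a small sketch of that arithmetic; `dense_params` is a helper name introduced here for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input size followed by the unit counts of the three Dense layers above.
layer_sizes = [28 * 28, 512, 256, 10]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [401920, 131328, 2570]
print(total)      # 535818
```

The total should agree with the "Total params" figure reported by `model.summary()`. Most of the parameters sit in the first layer, because it connects all 784 input pixels to 512 units.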

🚀 Training and evaluating the model

# Start training (faster with a GPU)
history = model.fit(train_images, train_labels,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)

# Evaluate on the test set
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f'\nTest accuracy: {test_acc:.4f}')

Typical output:

Epoch 1/10
375/375 [=======] - 3s 6ms/step - loss: 0.2543 - accuracy: 0.9255
...
Test accuracy: 0.9786
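
The "375/375" in the log is the number of batches per epoch, and it follows from the arguments passed to `model.fit`. The sketch below reproduces the arithmetic; it assumes, as the Keras documentation describes, that `validation_split=0.2` holds out the last 20% of the training data before shuffling.

```python
import math

n_train_total = 60000     # MNIST training images
validation_split = 0.2    # fraction held out for validation
batch_size = 128

# Samples actually used for gradient updates in each epoch.
n_fit = round(n_train_total * (1 - validation_split))
n_val = n_train_total - n_fit

# One step per batch; a final partial batch would still count as a step.
steps_per_epoch = math.ceil(n_fit / batch_size)

print(n_fit, n_val)      # 48000 12000
print(steps_per_epoch)   # 375
```

Without `validation_split`, the same batch size would give `math.ceil(60000 / 128)`, that is 469 steps per epoch.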

Accuracy curve visualization

plt.plot(history.history['accuracy'], label='Training accuracy')
plt.plot(history.history['val_accuracy'], label='Validation accuracy')
plt.title('Training progress')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend()
plt.show()
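
Besides looking at the plot, the same history dictionary can be inspected numerically, for example to find the epoch with the highest validation accuracy and the gap between training and validation accuracy (a widening gap is a common sign of overfitting). This is a minimal sketch: `best_epoch` is a helper made up for this example, and the numbers in `illustrative_history` are invented for illustration, not results from the model above.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (1-based epoch number, value) where the metric is highest."""
    values = history_dict[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Made-up numbers for illustration only; not measured from the model above.
illustrative_history = {
    "accuracy": [0.92, 0.97, 0.98, 0.99],
    "val_accuracy": [0.95, 0.97, 0.975, 0.972],
}

epoch, value = best_epoch(illustrative_history)
print(epoch, value)  # 3 0.975

# Gap between training and validation accuracy at the final epoch.
gap = illustrative_history["accuracy"][-1] - illustrative_history["val_accuracy"][-1]
print(round(gap, 3))  # 0.018
```

With the real training run, the call would be `best_epoch(history.history)`, since `history.history` is a dictionary of per-epoch lists keyed by metric name.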