This tutorial demonstrates how to train a simple convolutional neural network (CNN) to classify MNIST digits. This simple network will reach over 99% accuracy on the MNIST test set. Because the tutorial uses the Keras Sequential API, creating and training our model takes only a few lines of code.
Note: CNNs train faster on a GPU.
```python
from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf

from tensorflow.keras import datasets, layers, models
```
```python
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()

train_images = train_images.reshape((60000, 28, 28, 1))
test_images = test_images.reshape((10000, 28, 28, 1))

# Scale pixel values to the [0, 1] range
train_images, test_images = train_images / 255.0, test_images / 255.0
```
The six lines of code below define the convolutional base using a common pattern: a stack of Conv2D and MaxPooling2D layers.

As input, a CNN takes tensors of shape (image_height, image_width, color_channels), ignoring the batch size. MNIST has one color channel (because the images are grayscale), whereas a color image has three (R, G, B). In this example, we will configure the CNN to process inputs of shape (28, 28, 1), which is the format of MNIST images. We do this by passing the argument input_shape to the first layer.
```python
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.summary()  # display the model's architecture
```
```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928
=================================================================
...
```
Above, you can see that the output of every Conv2D and MaxPooling2D layer is a 3D tensor of shape (height, width, channels). The width and height dimensions tend to shrink as we go deeper into the network. The number of output channels for each Conv2D layer is controlled by its first argument (e.g., 32 or 64). Typically, as the width and height shrink, we can afford (computationally) to add more output channels to each Conv2D layer.
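The shrinking shapes and the parameter counts in the summary above can be checked with plain-Python arithmetic. This is a sketch assuming the settings used in this model: 3x3 kernels, 'valid' padding, stride-1 convolutions, and non-overlapping 2x2 pooling (the Keras defaults for the arguments shown).

```python
# Sanity-check the output shapes and parameter counts from model.summary().
def conv_out(size, kernel=3):
    return size - kernel + 1       # 'valid' padding, stride 1

def pool_out(size, window=2):
    return size // window          # non-overlapping 2x2 max pooling

h = conv_out(28)   # 26 after the first Conv2D
h = pool_out(h)    # 13 after the first MaxPooling2D
h = conv_out(h)    # 11 after the second Conv2D
h = pool_out(h)    # 5  after the second MaxPooling2D
h = conv_out(h)    # 3  after the third Conv2D

# Conv2D parameters = kernel_h * kernel_w * in_channels * filters + filters (biases)
params = [
    3 * 3 * 1 * 32 + 32,    # first Conv2D  -> 320
    3 * 3 * 32 * 64 + 64,   # second Conv2D -> 18496
    3 * 3 * 64 * 64 + 64,   # third Conv2D  -> 36928
]
print(h, params)  # 3 [320, 18496, 36928]
```

Pooling layers contribute no parameters, which is why their Param # column is 0.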
To complete our model, we feed the last output tensor from the convolutional base (of shape (3, 3, 64)) into one or more Dense layers to perform classification. Dense layers take vectors (1D) as input, while the current output is a 3D tensor. First, we flatten (or unroll) the 3D output to 1D, then add one or more Dense layers on top. MNIST has 10 output classes, so we use a final Dense layer with 10 outputs and a softmax activation.
```python
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()  # display the model's architecture
```
```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928
_________________________________________________________________
flatten (Flatten)            (None, 576)               0
_________________________________________________________________
dense (Dense)                (None, 64)                36928
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650
=================================================================
...
```
As you can see above, our (3, 3, 64) output is flattened into a vector of length 576 before passing through the two Dense layers.
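The flattened length and the Dense-layer parameter counts in the summary follow directly from these shapes; the arithmetic is:

```python
# The (3, 3, 64) convolutional output flattens to a 576-element vector,
# and Dense parameters = inputs * units + units (biases).
flat = 3 * 3 * 64            # 576
dense_1 = flat * 64 + 64     # 36928 parameters for the 64-unit Dense layer
dense_2 = 64 * 10 + 10       # 650 parameters for the 10-way output layer
print(flat, dense_1, dense_2)  # 576 36928 650
```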
```python
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=5)
```
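The `sparse_categorical_crossentropy` loss expects integer class labels, which is exactly what `mnist.load_data()` returns, rather than one-hot vectors. For a single sample it is simply the negative log of the probability the softmax assigns to the true class. A minimal NumPy sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical softmax output for one image (10 class probabilities)
probs = np.array([0.02, 0.02, 0.70, 0.05, 0.05, 0.04, 0.04, 0.03, 0.03, 0.02])
label = 2  # sparse label: just the class index, not a one-hot vector

# sparse_categorical_crossentropy for one sample:
# the negative log-probability of the true class
loss = -np.log(probs[label])
```

Had the labels been one-hot encoded, we would use `categorical_crossentropy` instead.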
```
...
Epoch 5/5
60000/60000 [==============================] - 15s 258us/sample - loss: 0.0190 - accuracy: 0.9941
```
```python
test_loss, test_acc = model.evaluate(test_images, test_labels)
```
```
10000/10000 [==============================] - 1s 92us/sample - loss: 0.0272 - accuracy: 0.9921
```
```python
print(test_acc)
```
```
0.9921
```
As you can see, our simple CNN has achieved a test accuracy of over 99%. Not bad for a few lines of code! Another way to write a CNN is shown here (using the Keras Subclassing API and a GradientTape).
Latest version: https://www.mashangxue123.com/tensorflow/tf2-tutorials-images-intro_to_cnns.html
English version: https://tensorflow.google.cn/beta/tutorials/images/save_and_restore_models
Translation suggestion PR: https://github.com/mashangxue/tensorflow2-zh/edit/master/r2/tutorials/images/intro_to_cnns.md