This tutorial demonstrates how to train a simple convolutional neural network (CNN) to classify MNIST digits. This simple network will reach over 99% accuracy on the MNIST test set. Because the tutorial uses the Keras Sequential API, creating and training our model takes only a few lines of code.
Note: CNNs train faster on a GPU.
```python
from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf

from tensorflow.keras import datasets, layers, models
```
```python
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()

train_images = train_images.reshape((60000, 28, 28, 1))
test_images = test_images.reshape((10000, 28, 28, 1))

# Scale pixel values to the [0, 1] range
train_images, test_images = train_images / 255.0, test_images / 255.0
```
The six lines of code below define the convolutional base using a common pattern: a stack of Conv2D and MaxPooling2D layers.

As input, a CNN takes tensors of shape (image_height, image_width, color_channels), ignoring the batch size. MNIST has one color channel (because the images are grayscale), whereas a color image has three (R, G, B). In this example, we will configure the CNN to process inputs of shape (28, 28, 1), which is the format of MNIST images. We do this by passing the argument input_shape to the first layer.
```python
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.summary()  # display the model's architecture
```
```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928
=================================================================
...
```
Above, you can see that the output of every Conv2D and MaxPooling2D layer is a 3D tensor of shape (height, width, channels). The width and height dimensions tend to shrink as we go deeper into the network. The number of output channels for each Conv2D layer is controlled by its first argument (e.g., 32 or 64). Typically, as the width and height shrink, we can afford (computationally) to add more output channels to each Conv2D layer.
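The shrinking shapes and the parameter counts in the summary above can be checked with plain-Python arithmetic. This is a sketch assuming the settings used in this model: 3x3 kernels, 'valid' padding, stride-1 convolutions, and non-overlapping 2x2 pooling (the Keras defaults for the arguments shown).

```python
# Sanity-check the output shapes and parameter counts from model.summary().
def conv_out(size, kernel=3):
    return size - kernel + 1       # 'valid' padding, stride 1

def pool_out(size, window=2):
    return size // window          # non-overlapping 2x2 max pooling

h = conv_out(28)   # 26 after the first Conv2D
h = pool_out(h)    # 13 after the first MaxPooling2D
h = conv_out(h)    # 11 after the second Conv2D
h = pool_out(h)    # 5  after the second MaxPooling2D
h = conv_out(h)    # 3  after the third Conv2D

# Conv2D parameters = kernel_h * kernel_w * in_channels * filters + filters (biases)
params = [
    3 * 3 * 1 * 32 + 32,    # first Conv2D  -> 320
    3 * 3 * 32 * 64 + 64,   # second Conv2D -> 18496
    3 * 3 * 64 * 64 + 64,   # third Conv2D  -> 36928
]
print(h, params)  # 3 [320, 18496, 36928]
```

Pooling layers contribute no parameters, which is why their Param # column is 0.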
To complete our model, we feed the last output tensor from the convolutional base (of shape (3, 3, 64)) into one or more Dense layers to perform classification. Dense layers take vectors (1D) as input, while the current output is a 3D tensor. First, we flatten (or unroll) the 3D output to 1D, then add one or more Dense layers on top. MNIST has 10 output classes, so we use a final Dense layer with 10 outputs and a softmax activation.
```python
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()  # display the model's architecture
```
```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928
_________________________________________________________________
flatten (Flatten)            (None, 576)               0
_________________________________________________________________
dense (Dense)                (None, 64)                36928
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650
=================================================================
...
```
As you can see above, our (3, 3, 64) output is flattened into a vector of length 576 before passing through the two Dense layers.
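The flattened length and the Dense-layer parameter counts in the summary follow directly from these shapes; the arithmetic is:

```python
# The (3, 3, 64) convolutional output flattens to a 576-element vector,
# and Dense parameters = inputs * units + units (biases).
flat = 3 * 3 * 64            # 576
dense_1 = flat * 64 + 64     # 36928 parameters for the 64-unit Dense layer
dense_2 = 64 * 10 + 10       # 650 parameters for the 10-way output layer
print(flat, dense_1, dense_2)  # 576 36928 650
```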
```python
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=5)
```
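The `sparse_categorical_crossentropy` loss expects integer class labels, which is exactly what `mnist.load_data()` returns, rather than one-hot vectors. For a single sample it is simply the negative log of the probability the softmax assigns to the true class. A minimal NumPy sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical softmax output for one image (10 class probabilities)
probs = np.array([0.02, 0.02, 0.70, 0.05, 0.05, 0.04, 0.04, 0.03, 0.03, 0.02])
label = 2  # sparse label: just the class index, not a one-hot vector

# sparse_categorical_crossentropy for one sample:
# the negative log-probability of the true class
loss = -np.log(probs[label])
```

Had the labels been one-hot encoded, we would use `categorical_crossentropy` instead.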
```
...
Epoch 5/5
60000/60000 [==============================] - 15s 258us/sample - loss: 0.0190 - accuracy: 0.9941
```
```python
test_loss, test_acc = model.evaluate(test_images, test_labels)
```
```
10000/10000 [==============================] - 1s 92us/sample - loss: 0.0272 - accuracy: 0.9921
```
```python
print(test_acc)
```
```
0.9921
```
As you can see, our simple CNN has achieved a test accuracy of over 99%. Not bad for a few lines of code! Another way to write a CNN is shown here (using the Keras Subclassing API and a GradientTape).
Latest version: https://www.mashangxue123.com/tensorflow/tf2-tutorials-images-intro_to_cnns.html
English version: https://tensorflow.google.cn/beta/tutorials/images/save_and_restore_models
Translation suggestion PR: https://github.com/mashangxue/tensorflow2-zh/edit/master/r2/tutorials/images/intro_to_cnns.md