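Eager Execution Overview

TensorFlow's eager execution is an imperative programming environment that evaluates operations immediately, without building graphs: operations return concrete values instead of constructing a computational graph to run later. This makes it easy to get started with TensorFlow and to debug models, and it reduces boilerplate as well. To follow along with this guide, run the code samples below in an interactive python interpreter.

Eager execution is a flexible machine learning platform for research and experimentation, providing:

An intuitive interface - Structure your code naturally and use Python data structures. Quickly iterate on small models and small datasets.
Easier debugging - Call ops directly to inspect running models and test changes. Use standard Python debugging tools for immediate error reporting.
Natural control flow - Use Python control flow instead of graph control flow, which simplifies the specification of dynamic models.

Eager execution supports most TensorFlow operations and GPU acceleration.

Note: Some models may experience increased overhead with eager execution enabled. Performance improvements are ongoing; if you find a problem, please file a bug and share your benchmark results.

Setup and basic usage

Upgrade to the latest version of TensorFlow: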


from __future__ import absolute_import, division, print_function, unicode_literals

# pip install tensorflow==2.0.0-alpha0
import tensorflow as tf

In TensorFlow 2.0, eager execution is enabled by default.


tf.executing_eagerly()

      True

Now you can run TensorFlow operations, and the results are returned immediately:


x = [[2.]]
m = tf.matmul(x, x)
print("hello, {}".format(m))

      hello, [[4.]]

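Enabling eager execution changes how TensorFlow operations behave: they now evaluate immediately and return their values to Python. tf.Tensor objects reference concrete values instead of symbolic handles to nodes in a computational graph. Since there is no computational graph to build and run later in a session, it is easy to inspect results using print() or a debugger. Evaluating, printing, and checking tensor values does not break the flow for computing gradients.

Eager execution works well with NumPy. NumPy operations accept tf.Tensor arguments. TensorFlow math operations convert Python objects and NumPy arrays to tf.Tensor objects. The tf.Tensor.numpy method returns the object's value as a NumPy ndarray.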


a = tf.constant([[1, 2],
                 [3, 4]])
print(a)

      tf.Tensor(
      [[1 2]
       [3 4]], shape=(2, 2), dtype=int32)

# Broadcasting support
b = tf.add(a, 1)
print(b)

      tf.Tensor(
      [[2 3]
       [4 5]], shape=(2, 2), dtype=int32)

# Operator overloading is supported
print(a * b)

      tf.Tensor(
      [[ 2  6]
       [12 20]], shape=(2, 2), dtype=int32)


# Use NumPy values
import numpy as np

c = np.multiply(a, b)
print(c)

      [[ 2  6]
       [12 20]]

# Obtain the NumPy value from a tensor:
print(a.numpy())
# => [[1 2]
#     [3 4]]
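As a small sketch of the round trip described in this section (the variable names are illustrative and not part of the original guide), the following passes a NumPy array to a TensorFlow op and converts the result back:

```python
import numpy as np
import tensorflow as tf

# A NumPy array is converted to a tf.Tensor when passed to a TensorFlow op.
ones = np.ones((2, 2), dtype=np.float32)
doubled = tf.multiply(ones, 2.0)

# Tensor.numpy() returns the value as a NumPy ndarray.
back = doubled.numpy()

print(isinstance(doubled, tf.Tensor))
print(type(back).__name__, back.dtype)
```

The dtype of the NumPy input (float32 here) carries through to the tensor and back to the ndarray.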

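Dynamic control flow

A major benefit of eager execution is that all the functionality of the host language is available while your model is executing. So, for example, it is easy to write fizzbuzz.

The FizzBuzz problem: write a program that counts from 1 to 100. For multiples of 3, print "Fizz" instead of the number; for multiples of 5, print "Buzz"; for numbers that are multiples of both 3 and 5, print "FizzBuzz". The function below does this for an arbitrary upper bound: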


def fizzbuzz(max_num):
  counter = tf.constant(0)
  max_num = tf.convert_to_tensor(max_num)
  for num in range(1, max_num.numpy()+1):
    num = tf.constant(num)
    if int(num % 3) == 0 and int(num % 5) == 0:
      print('FizzBuzz')
    elif int(num % 3) == 0:
      print('Fizz')
    elif int(num % 5) == 0:
      print('Buzz')
    else:
      print(num.numpy())
    counter += 1

fizzbuzz(15)

      1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz

This code has conditionals that depend on tensor values, and it prints those values at runtime.
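Data-dependent loops work the same way as data-dependent conditionals. The sketch below is an illustration that is not part of the original guide (`collatz_steps` is a hypothetical helper): it counts Collatz steps, with both the loop condition and the branch reading concrete tensor values.

```python
import tensorflow as tf

def collatz_steps(start):
  """Counts Collatz steps from `start` down to 1 using tensor arithmetic."""
  n = tf.constant(start, dtype=tf.int64)
  steps = 0
  # The loop condition and the branch both read concrete tensor values.
  while int(n) != 1:
    if int(n % 2) == 0:
      n = n // 2
    else:
      n = 3 * n + 1
    steps += 1
  return steps

print(collatz_steps(6))  # => 8 (6, 3, 10, 5, 16, 8, 4, 2, 1)
```

The number of iterations is not known until the values are computed, which is the kind of control flow that is awkward to express with graph control-flow ops.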

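Build a model

Many machine learning models are represented by composing layers. When using TensorFlow with eager execution, you can either write your own layers or use a layer provided in the tf.keras.layers package.

While you can use any Python object to represent a layer, TensorFlow provides the convenient base class tf.keras.layers.Layer. Inherit from it to implement your own layer, and set self.dynamic=True in the constructor if the layer must be executed imperatively: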


class MySimpleLayer(tf.keras.layers.Layer):
  def __init__(self, output_units):
    super(MySimpleLayer, self).__init__()
    self.output_units = output_units
    self.dynamic = True

  def build(self, input_shape):
    # The build method gets called the first time your layer is used.
    # Creating variables in build() lets their shapes depend on the input
    # shape, so the user does not need to specify full shapes. You can also
    # create variables in __init__() if you already know their full shapes.
    self.kernel = self.add_variable(
      "kernel", [input_shape[-1], self.output_units])

  def call(self, input):
    # Override call() instead of __call__ so we can perform some bookkeeping.
    return tf.matmul(input, self.kernel)

Use the tf.keras.layers.Dense layer instead of MySimpleLayer above, since it has a superset of its functionality (it can also add a bias).
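The deferred variable creation described for build() can be observed on a Dense layer. This is a minimal sketch with illustrative shapes that are not taken from the guide:

```python
import tensorflow as tf

layer = tf.keras.layers.Dense(3)

# Calling the layer on a batch of 5-feature inputs triggers build(),
# which creates a kernel shaped from the input and a bias.
outputs = layer(tf.zeros([2, 5]))

kernel_shape = tuple(layer.kernel.shape)
bias_shape = tuple(layer.bias.shape)
print(kernel_shape, bias_shape, tuple(outputs.shape))
```

Only the number of output units is given up front; the kernel's first dimension comes from the last dimension of the first input.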

When composing layers into models, you can use tf.keras.Sequential to represent models that are a linear stack of layers. It is well suited to basic models:


model = tf.keras.Sequential([
  tf.keras.layers.Dense(10, input_shape=(784,)),  # must declare input shape
  tf.keras.layers.Dense(10)
])

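Alternatively, organize models in classes by inheriting from tf.keras.Model. This is a container for layers that is itself a layer, which allows tf.keras.Model objects to contain other tf.keras.Model objects.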


class MNISTModel(tf.keras.Model):
  def __init__(self):
    super(MNISTModel, self).__init__()
    self.dense1 = tf.keras.layers.Dense(units=10)
    self.dense2 = tf.keras.layers.Dense(units=10)

  def call(self, input):
    """Run the model."""
    result = self.dense1(input)
    result = self.dense2(result)
    result = self.dense2(result)  # reuse variables from dense2 layer
    return result

model = MNISTModel()

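It is not required to set an input shape for the tf.keras.Model class, since the parameters are set the first time input is passed to the layer.

tf.keras.layers classes create and contain their own model variables, which are tied to the lifetime of their layer objects. To share layer variables, share their objects.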
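To illustrate sharing variables by sharing objects, this sketch (layer sizes are arbitrary) compares applying one layer object twice with applying two separate layer objects:

```python
import tensorflow as tf

x = tf.ones([1, 4])

# Applying the same layer object twice reuses its kernel and bias.
shared = tf.keras.layers.Dense(4)
y = shared(shared(x))

# Two distinct layer objects each create their own kernel and bias.
separate_a = tf.keras.layers.Dense(4)
separate_b = tf.keras.layers.Dense(4)
z = separate_b(separate_a(x))

num_shared = len(shared.trainable_variables)
num_separate = (len(separate_a.trainable_variables) +
                len(separate_b.trainable_variables))
print(num_shared, num_separate)  # => 2 4
```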

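Eager training

Computing gradients

Automatic differentiation is useful for implementing machine learning algorithms such as backpropagation for training neural networks. During eager execution, use tf.GradientTape to trace operations so that gradients can be computed later.

tf.GradientTape is an opt-in feature that provides maximal performance when not tracing. Since different operations can occur during each call, all forward-pass operations get recorded to a "tape". To compute the gradient, play the tape backwards and then discard it. A particular tf.GradientTape can only compute one gradient; subsequent calls throw a runtime error.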


w = tf.Variable([[1.0]])
with tf.GradientTape() as tape:
  loss = w * w

grad = tape.gradient(loss, w)
print(grad)  # => tf.Tensor([[ 2.]], shape=(1, 1), dtype=float32)
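The one-gradient-per-tape rule can be seen directly, and a tape created with `persistent=True` lifts it. A minimal sketch with illustrative values:

```python
import tensorflow as tf

w = tf.Variable(3.0)

# A default tape releases its resources after the first gradient() call.
with tf.GradientTape() as tape:
  y = w * w
first = tape.gradient(y, w)
try:
  tape.gradient(y, w)
  second_call_failed = False
except RuntimeError:
  second_call_failed = True

# A persistent tape can be queried more than once.
with tf.GradientTape(persistent=True) as persistent_tape:
  y = w * w
  z = y * w
dy_dw = persistent_tape.gradient(y, w)  # d(w^2)/dw = 2w
dz_dw = persistent_tape.gradient(z, w)  # d(w^3)/dw = 3w^2
del persistent_tape  # drop the reference so the tape can be released

print(second_call_failed, dy_dw.numpy(), dz_dw.numpy())  # => True 6.0 27.0
```

A persistent tape keeps its recorded operations alive until it is garbage collected, so deleting it once the gradients are computed is a reasonable habit.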

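Train a model

The following example creates a multi-layer model that classifies the standard MNIST handwritten digits. It demonstrates the optimizer and layer APIs for building trainable graphs in an eager execution environment.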


# Fetch and format the MNIST data
(mnist_images, mnist_labels), _ = tf.keras.datasets.mnist.load_data()

dataset = tf.data.Dataset.from_tensor_slices(
  (tf.cast(mnist_images[...,tf.newaxis]/255, tf.float32),
   tf.cast(mnist_labels,tf.int64)))
dataset = dataset.shuffle(1000).batch(32)

# Build the model
mnist_model = tf.keras.Sequential([
  tf.keras.layers.Conv2D(16,[3,3], activation='relu',
                         input_shape=(None, None, 1)),
  tf.keras.layers.Conv2D(16,[3,3], activation='relu'),
  tf.keras.layers.GlobalAveragePooling2D(),
  tf.keras.layers.Dense(10)
])

Even without training, you can call the model and inspect the output in eager execution:


for images,labels in dataset.take(1):
  print("Logits: ", mnist_model(images[0:1]).numpy())

      Logits: [[-1.9521490e-02 2.2975644e-02 2.8935237e-02 2.0388789e-02 -1.8511273e-02 -6.4317137e-05 6.0662534e-03 -1.7174225e-02 5.4899108e-02 -2.8871424e-02]]

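While keras models have a built-in training loop (using the fit method), sometimes you need more customization. Here is an example of a training loop implemented with eager execution: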


optimizer = tf.keras.optimizers.Adam()
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

loss_history = []

for (batch, (images, labels)) in enumerate(dataset.take(400)):
  if batch % 10 == 0:
    print('.', end='')
  with tf.GradientTape() as tape:
    logits = mnist_model(images, training=True)
    loss_value = loss_object(labels, logits)

  loss_history.append(loss_value.numpy().mean())
  grads = tape.gradient(loss_value, mnist_model.trainable_variables)
  optimizer.apply_gradients(zip(grads, mnist_model.trainable_variables))


import matplotlib.pyplot as plt

plt.plot(loss_history)
plt.xlabel('Batch #')
plt.ylabel('Loss [entropy]')

      Text(0, 0.5, 'Loss [entropy]')
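To make the `from_logits=True` argument in the training loop concrete, the plain-Python sketch below computes sparse categorical cross-entropy by hand for a single example: a softmax over the raw logits, followed by the negative log of the probability assigned to the true class. The helper name is hypothetical, and this only illustrates the formula; it is not how TensorFlow implements the loss.

```python
import math

def sparse_softmax_cross_entropy(logits, label):
  """Cross-entropy of one example from raw logits (hypothetical helper)."""
  # Subtract the max logit before exponentiating for numerical stability.
  shift = max(logits)
  exps = [math.exp(v - shift) for v in logits]
  prob_of_label = exps[label] / sum(exps)
  return -math.log(prob_of_label)

# Two equal logits mean a 50/50 prediction, so the loss is log(2).
print(round(sparse_softmax_cross_entropy([0.0, 0.0], 0), 4))  # => 0.6931
```

A confident correct prediction drives this value towards 0, while a confident wrong one makes it large, which is why the plotted loss falls as training progresses.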

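This example uses the dataset.py module from the TensorFlow MNIST example; download that file to your local directory. Run the following to download the MNIST data files to your working directory and prepare a tf.data.Dataset for training: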

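Variables and optimizers

tf.Variable objects store mutable tf.Tensor values accessed during training, which makes automatic differentiation easier. The parameters of a model can be encapsulated in classes as variables.

Model parameters are better encapsulated by using tf.Variable together with tf.GradientTape. For example, the automatic differentiation example above can be rewritten as: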


class Model(tf.keras.Model):
  def __init__(self):
    super(Model, self).__init__()
    self.W = tf.Variable(5., name='weight')
    self.B = tf.Variable(10., name='bias')
  def call(self, inputs):
    return inputs * self.W + self.B

# A toy dataset of points around 3 * x + 2
NUM_EXAMPLES = 2000
training_inputs = tf.random.normal([NUM_EXAMPLES])
noise = tf.random.normal([NUM_EXAMPLES])
training_outputs = training_inputs * 3 + 2 + noise

# The loss function to be optimized
def loss(model, inputs, targets):
  error = model(inputs) - targets
  return tf.reduce_mean(tf.square(error))

def grad(model, inputs, targets):
  with tf.GradientTape() as tape:
    loss_value = loss(model, inputs, targets)
  return tape.gradient(loss_value, [model.W, model.B])

# Define:
# 1. A model.
# 2. Derivatives of a loss function with respect to model parameters.
# 3. A strategy for updating the variables based on the derivatives.
model = Model()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

print("Initial loss: {:.3f}".format(loss(model, training_inputs, training_outputs)))

# Training loop
for i in range(300):
  grads = grad(model, training_inputs, training_outputs)
  optimizer.apply_gradients(zip(grads, [model.W, model.B]))
  if i % 20 == 0:
    print("Loss at step {:03d}: {:.3f}".format(i, loss(model, training_inputs, training_outputs)))

print("Final loss: {:.3f}".format(loss(model, training_inputs, training_outputs)))
print("W = {}, B = {}".format(model.W.numpy(), model.B.numpy()))
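For intuition about what the optimizer's update step does with the gradients, here is a sketch of plain gradient descent written by hand with `assign_sub`, on a noise-free toy problem. The data and learning rate are illustrative assumptions, and this is not a description of how `tf.keras.optimizers.SGD` is implemented internally.

```python
import tensorflow as tf

# Toy data following y = 2 * x exactly (illustrative values).
x = tf.constant([1.0, 2.0, 3.0])
y = 2.0 * x

w = tf.Variable(0.0)
learning_rate = 0.05

for _ in range(100):
  with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(w * x - y))
  grad = tape.gradient(loss, w)
  # Plain gradient descent: w <- w - learning_rate * grad
  w.assign_sub(learning_rate * grad)

print(round(float(w.numpy()), 3))
```

The variable is updated in place, so the same `w` object is read by the next forward pass.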

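Use objects for state during eager execution

With TF 1.x graph execution, program state (such as variables) is stored in global collections, and its lifetime is managed by the tf.Session object. In contrast, during eager execution the lifetime of state objects is determined by the lifetime of their corresponding Python objects.

Variables are objects

During eager execution, variables persist until the last reference to the object is removed, and are then deleted.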


if tf.test.is_gpu_available():
  with tf.device("gpu:0"):
    v = tf.Variable(tf.random.normal([1000, 1000]))
    v = None  # v no longer takes up GPU memory

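Object-based saving

This section is an abbreviated version of the guide to training checkpoints.

tf.train.Checkpoint can save tf.Variable objects to checkpoints and restore them: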


x = tf.Variable(10.)
checkpoint = tf.train.Checkpoint(x=x)

x.assign(2.)   # Assign a new value to the variable and save.
checkpoint_path = './ckpt/'
checkpoint.save(checkpoint_path)

x.assign(11.)  # Change the variable after saving.

# Restore values from the checkpoint
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_path))

print(x)  # => 2.0

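To save and load models, tf.train.Checkpoint stores the internal state of objects, without requiring hidden variables. To record the state of a model, an optimizer, and a global step, pass them to a tf.train.Checkpoint: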


import os

model = tf.keras.Sequential([
  tf.keras.layers.Conv2D(16,[3,3], activation='relu'),
  tf.keras.layers.GlobalAveragePooling2D(),
  tf.keras.layers.Dense(10)
])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
checkpoint_dir = 'path/to/model_dir'
if not os.path.exists(checkpoint_dir):
  os.makedirs(checkpoint_dir)
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
root = tf.train.Checkpoint(optimizer=optimizer,
                           model=model)

root.save(checkpoint_prefix)
root.restore(tf.train.latest_checkpoint(checkpoint_dir))

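Note: In many training loops, variables are created after tf.train.Checkpoint.restore is called. These variables are restored as soon as they are created, and assertions are available to ensure that a checkpoint has been fully loaded. See the guide to training checkpoints for details.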
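The assertion mentioned in the note can be sketched as follows. This example writes to a temporary directory rather than the paths used above; the variable name and values are illustrative.

```python
import os
import tempfile

import tensorflow as tf

step = tf.Variable(1.0)
checkpoint = tf.train.Checkpoint(step=step)

with tempfile.TemporaryDirectory() as tmp_dir:
  save_path = checkpoint.save(os.path.join(tmp_dir, "ckpt"))
  step.assign(5.0)  # change the value after saving

  status = checkpoint.restore(save_path)
  # Raises if the checkpoint and the Python objects were not fully matched.
  status.assert_consumed()

restored_value = float(step.numpy())
print(restored_value)  # => 1.0
```

`restore` returns a status object, and calling `assert_consumed()` on it is one way to catch a checkpoint that was only partially loaded.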

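Object-oriented metrics

tf.keras.metrics are stored as objects. Update a metric by passing new data to the callable, and retrieve the result using the metric's result method, for example: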


m = tf.keras.metrics.Mean("loss")
m(0)
m(5)
m.result()  # => 2.5
m([8, 9])
m.result()  # => 5.5
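The results 2.5 and 5.5 follow from the metric accumulating state across calls. The toy class below is a plain-Python stand-in written for illustration (it is not the Keras implementation) that reproduces the same arithmetic:

```python
class RunningMean:
  """Toy stand-in for a stateful mean metric (not the Keras class)."""

  def __init__(self):
    self.total = 0.0
    self.count = 0

  def __call__(self, values):
    # Accept a single number or a list of numbers.
    if not isinstance(values, (list, tuple)):
      values = [values]
    self.total += sum(values)
    self.count += len(values)

  def result(self):
    return self.total / self.count

m = RunningMean()
m(0)
m(5)
print(m.result())  # => 2.5, the mean of 0 and 5
m([8, 9])
print(m.result())  # => 5.5, the mean of 0, 5, 8 and 9
```

The second result is the mean over all four values seen so far, not just over the last batch.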

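Advanced automatic differentiation topics

Dynamic models

tf.GradientTape can also be used in dynamic models. This example of a backtracking line search algorithm looks like normal NumPy code, except that there are gradients and it is differentiable, despite the complex control flow: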


def line_search_step(fn, init_x, rate=1.0):
  with tf.GradientTape() as tape:
    # Variables are automatically recorded, but manually watch a tensor
    tape.watch(init_x)
    value = fn(init_x)
  grad = tape.gradient(value, init_x)
  grad_norm = tf.reduce_sum(grad * grad)
  init_value = value
  while value > init_value - rate * grad_norm:
    x = init_x - rate * grad
    value = fn(x)
    rate /= 2.0
  return x, value
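As a usage sketch, the function can be applied to a simple quadratic. The objective and starting point are illustrative choices, not from the original guide; the function body is repeated so that the snippet runs on its own.

```python
import tensorflow as tf

def line_search_step(fn, init_x, rate=1.0):
  with tf.GradientTape() as tape:
    # Variables are automatically recorded, but manually watch a tensor
    tape.watch(init_x)
    value = fn(init_x)
  grad = tape.gradient(value, init_x)
  grad_norm = tf.reduce_sum(grad * grad)
  init_value = value
  while value > init_value - rate * grad_norm:
    x = init_x - rate * grad
    value = fn(x)
    rate /= 2.0
  return x, value

# Illustrative objective: f(x) = sum(x^2), minimized at x = 0.
def sum_of_squares(x):
  return tf.reduce_sum(x * x)

start = tf.constant([3.0])
new_x, new_value = line_search_step(sum_of_squares, start)
print(sum_of_squares(start).numpy(), new_x.numpy(), new_value.numpy())
```

Starting from x = 3 (value 9), the first trial step with rate 1.0 overshoots to x = -3, and the halved rate then lands on x = 0, where the loop stops.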

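Custom gradients

Custom gradients are an easy way to override gradients. Within the forward function, define the gradient with respect to the inputs, outputs, or intermediate results. For example, here is a simple way to clip the norm of the gradients in the backward pass: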


@tf.custom_gradient
def clip_gradient_by_norm(x, norm):
  y = tf.identity(x)
  def grad_fn(dresult):
    return [tf.clip_by_norm(dresult, norm), None]
  return y, grad_fn

Custom gradients are commonly used to provide a numerically stable gradient for a sequence of operations:


def log1pexp(x):
  return tf.math.log(1 + tf.exp(x))

def grad_log1pexp(x):
  with tf.GradientTape() as tape:
    tape.watch(x)
    value = log1pexp(x)
  return tape.gradient(value, x)


# The gradient computation works fine at x = 0.
grad_log1pexp(tf.constant(0.)).numpy()   # => 0.5

0.5

# However, x = 100 fails because of numerical instability.
grad_log1pexp(tf.constant(100.)).numpy()  # => nan

nan

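Here, the log1pexp function can be analytically simplified with a custom gradient. The implementation below reuses the value of tf.exp(x) computed during the forward pass, which makes it more efficient by eliminating redundant calculations: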


@tf.custom_gradient
def log1pexp(x):
  e = tf.exp(x)
  def grad(dy):
    return dy * (1 - 1 / (1 + e))
  return tf.math.log(1 + e), grad

def grad_log1pexp(x):
  with tf.GradientTape() as tape:
    tape.watch(x)
    value = log1pexp(x)
  return tape.gradient(value, x)


# As before, the gradient computation works fine at x = 0.
grad_log1pexp(tf.constant(0.)).numpy()    # => 0.5

# And the gradient computation also works at x = 100.
grad_log1pexp(tf.constant(100.)).numpy()   # => 1.0
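The difference between the two results can be reproduced with plain floats. exp(100) exceeds the largest float32 value, so it overflows to infinity; the sketch below uses `float("inf")` as a stand-in for that overflowed value. It assumes the automatically derived gradient multiplies the chain-rule factors roughly as written in `naive`, which is an illustration rather than a trace of TensorFlow's actual kernels.

```python
import math

# exp(100) overflows float32; float("inf") stands in for the overflowed value.
e = float("inf")

# Chain-rule style product: d/dx log(1 + e^x) = (1 / (1 + e^x)) * e^x
naive = (1.0 / (1.0 + e)) * e      # 0.0 * inf
# Rearranged form used by the custom gradient: 1 - 1 / (1 + e^x)
stable = 1.0 - 1.0 / (1.0 + e)     # 1.0 - 0.0

print(math.isnan(naive), stable)  # => True 1.0
```

Both expressions are equal for finite values; only the rearranged one avoids multiplying zero by infinity once the exponential overflows.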

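Performance

Computation is automatically offloaded to GPUs during eager execution. If you want control over where a computation runs, you can enclose it in a tf.device('/gpu:0') block (or the CPU equivalent):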


import time

def measure(x, steps):
  # TensorFlow initializes a GPU the first time it's used, exclude from timing.
  tf.matmul(x, x)
  start = time.time()
  for i in range(steps):
    x = tf.matmul(x, x)
  # tf.matmul can return before completing the matrix multiplication
  # (e.g., can return after enqueuing the operation on a CUDA stream).
  # The x.numpy() call below will ensure that all enqueued operations
  # have completed (and will also copy the result to host memory,
  # so we're including a little more than just the matmul operation
  # time).
  _ = x.numpy()
  end = time.time()
  return end - start

shape = (1000, 1000)
steps = 200
print("Time to multiply a {} matrix by itself {} times:".format(shape, steps))

# Run on CPU:
with tf.device("/cpu:0"):
  print("CPU: {} secs".format(measure(tf.random.normal(shape), steps)))

# Run on GPU, if available:
if tf.test.is_gpu_available():
  with tf.device("/gpu:0"):
    print("GPU: {} secs".format(measure(tf.random.normal(shape), steps)))
else:
  print("GPU: not found")

      Time to multiply a (1000, 1000) matrix by itself 200 times:
      CPU: 0.7741374969482422 secs
      GPU: not found

A tf.Tensor object can be copied to a different device to execute its operations:


if tf.test.is_gpu_available():
  x = tf.random.normal([10, 10])

  x_gpu0 = x.gpu()
  x_cpu = x.cpu()

  _ = tf.matmul(x_cpu, x_cpu)    # Runs on CPU
  _ = tf.matmul(x_gpu0, x_gpu0)  # Runs on GPU:0
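To check where a tensor actually lives, its `device` attribute can be inspected. A minimal CPU-only sketch, so that it runs on machines without a GPU:

```python
import tensorflow as tf

with tf.device("/cpu:0"):
  x = tf.random.normal([10, 10])
  y = tf.matmul(x, x)

# Tensor.device holds the full name of the device that stores the tensor.
print(x.device)
print(y.device)
```

The printed names are fully qualified device strings that end in the device type and index.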

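Benchmarks

For compute-heavy models, such as ResNet50 training on a GPU, eager execution performance is comparable to tf.function execution. The gap grows larger for models with less computation, however, and there is still work to be done to optimize hot code paths for models with many small operations.

Using `tf.function`

While eager execution makes development and debugging more interactive, TensorFlow 1.x style graph execution has advantages for distributed training, performance optimization, and production deployment. To bridge this gap, TensorFlow 2.0 introduces functions through the tf.function API. For more information, see the Autograph guide.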
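As a minimal sketch of that API (the function names here are illustrative), the same Python function can be called eagerly or wrapped with `tf.function`, and both calls return the same value:

```python
import tensorflow as tf

def sum_of_squares(x):
  return tf.reduce_sum(x * x)

# tf.function traces the Python function into a graph on first call.
graph_sum_of_squares = tf.function(sum_of_squares)

@tf.function
def runs_eagerly():
  # Evaluated while tracing, where eager execution is switched off.
  return tf.executing_eagerly()

x = tf.constant([1.0, 2.0, 3.0])
eager_result = sum_of_squares(x).numpy()
graph_result = graph_sum_of_squares(x).numpy()
print(eager_result, graph_result)  # => 14.0 14.0
print(tf.executing_eagerly(), runs_eagerly().numpy())  # => True False
```

This makes it practical to develop and debug a function eagerly and wrap it only once it behaves as intended.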

Latest version: https://www.mashangxue123.com/tensorflow/tf2-guide-eager.html
English version: https://tensorflow.google.cn/beta/guide/eager