TensorFlow Dynamic Loss Scaling¶
Attention
This document presents dynamic loss scaling for TensorFlow. For PyTorch, see PyTorch Dynamic Loss Scaling.
See also
Dynamic Loss Scaling on Cerebras system.
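For context, loss scaling matters in mixed precision because fp16 cannot represent gradient magnitudes below roughly 6e-8 (the smallest fp16 subnormal); multiplying the loss by a large factor scales every gradient by the chain rule, keeping small gradients representable, and dividing the factor back out before the optimizer update restores the true magnitudes. A framework-agnostic sketch of one scaled backward pass, with hypothetical helper names (this is illustrative, not the CGC code path):

```python
def scaled_backward(loss, compute_grads, scale):
    """One loss-scaled backward pass (illustrative sketch only).

    `compute_grads` stands in for autodiff: it maps a loss value to a
    list of gradients, so scaling the loss scales every gradient.
    """
    scaled_loss = loss * scale
    scaled_grads = compute_grads(scaled_loss)
    # Divide the scale back out so the optimizer sees true magnitudes.
    return [g / scale for g in scaled_grads]
```

With a toy linear `compute_grads`, scaling by a power of two and dividing it back out is exact, so the returned gradients match the unscaled ones.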
Enabling dynamic loss scaling¶
To enable dynamic loss scaling (DLS) with TensorFlow, use the Trainer optimizer supported on the CS system.
Trainer¶
The Trainer optimizer builds the train ops from the given configuration parameters. It initializes several parameters that apply to DLS, such as the initial loss scaling factor and the number of steps before the loss scale factor changes. These settings are optimized for the CS system.
Parameters¶
params: Input. Datatype dict. Configuration parameters for the Trainer optimizer.
tf_summary: Input. Datatype bool. The flag for summaries. Defaults to False.
mixed_precision: Input. Datatype bool. The flag for mixed precision. Defaults to False.
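The authoritative key names under params are defined by the Cerebras Model Zoo configuration files. Purely as an illustration, a configuration dict for a run with dynamic loss scaling might look like the following; the specific keys shown here are hypothetical, modeled on common TensorFlow loss-scaling options:

```python
# Hypothetical configuration for illustration only -- consult the
# Cerebras Model Zoo for the actual key names and defaults.
params = {
    "optimizer": {
        "loss_scaling_factor": "dynamic",  # request DLS rather than a fixed scale
        "initial_loss_scale": 2.0 ** 15,   # starting scale factor
        "steps_per_increase": 2000,        # overflow-free steps before raising the scale
    },
    "training": {
        "mixed_precision": True,           # DLS is only meaningful with mixed precision
    },
}
```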
Example¶
The following is an example showing how to use the Trainer optimizer in your code:
First, create an instance of the Trainer optimizer in the __init__(self) section in your code.
# Model trainer
self.trainer = Trainer(
    params=params["optimizer"],
    tf_summary=tf_summary,
    mixed_precision=params["training"]["mixed_precision"],
)
Then build the train ops.
def build_train_ops(self, total_loss):
    """
    Setup optimizer and build train ops.
    """
    return self.trainer.build_train_ops(total_loss)
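Behind these train ops, dynamic loss scaling follows a standard update rule: skip the optimizer step and halve the scale when gradients overflow, and double the scale after a run of overflow-free steps. A minimal pure-Python sketch of that conventional policy (not the Cerebras implementation; the class and parameter names here are illustrative):

```python
import math

class DynamicLossScalePolicy:
    """Sketch of the standard dynamic loss-scaling rule (illustrative only)."""

    def __init__(self, initial_scale=2.0 ** 15, increment_period=2000, multiplier=2.0):
        self.scale = initial_scale
        self.increment_period = increment_period
        self.multiplier = multiplier
        self.good_steps = 0

    def update(self, grads):
        """Adjust the scale from the unscaled gradients; return whether to apply them."""
        if any(math.isinf(g) or math.isnan(g) for g in grads):
            # Overflow: lower the scale and tell the caller to skip this step.
            self.scale /= self.multiplier
            self.good_steps = 0
            return False
        self.good_steps += 1
        if self.good_steps >= self.increment_period:
            # A long run of finite gradients: try a larger scale.
            self.scale *= self.multiplier
            self.good_steps = 0
        return True
```

For example, with `increment_period=2`, one overflow halves the scale, and two subsequent finite-gradient steps double it back.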
For more details on the CSDynamicLossScale and the Trainer optimizer, refer to the code in the Cerebras Model Zoo repository.
Note
To access the Python code for CSDynamicLossScale and the Trainer optimizer, you need read permission for the Cerebras Model Zoo Git repository.
The CSDynamicLossScale object in the Cerebras Graph Compiler (CGC) implements dynamic loss scaling. See LossScale.py.
This CSDynamicLossScale object is used by the Trainer optimizer. See Trainer.py.