Supported PyTorch Learning Rate Schedulers
- LRScheduler
Cerebras-specific learning rate scheduler base class.
- ConstantLR
Maintains a constant learning rate for each parameter group (no decay).
- PolynomialLR
Decays the learning rate of each parameter group using a polynomial function over the given decay_steps.
- ExponentialLR
Decays the learning rate of each parameter group by decay_rate every step.
- InverseExponentialTimeDecayLR
Decays the learning rate inverse-exponentially over time, as described here.
- InverseSquareRootDecayLR
Decays the learning rate proportionally to the inverse square root of the step count.
- CosineDecayLR
Applies the cosine decay schedule as described here.
- SequentialLR
Receives a list of schedulers that are called sequentially during the optimization process, along with milestone points that specify exactly which scheduler is active at a given step (see the usage sketch after this list).
- PiecewiseConstantLR
Adjusts the learning rate to a predefined constant at each milestone and holds this value until the next milestone.
- MultiStepLR
Decays the learning rate of each parameter group by gamma once the number of steps reaches one of the milestones.
- StepLR
Decays the learning rate of each parameter group by gamma every step_size steps (see the usage sketch after this list).
- CosineAnnealingLR
Sets the learning rate of each parameter group using a cosine annealing schedule, where η_max is set to the initial lr and T_cur is the number of steps since the last restart in SGDR.
- LambdaLR
Sets the learning rate of each parameter group to the initial lr times a given function (specified by overriding set_lr_lambda; see the sketch after this list).
- CosineAnnealingWarmRestarts
Sets the learning rate of each parameter group using a cosine annealing schedule, where η_max is set to the initial lr, T_cur is the number of steps since the last restart, and T_i is the number of steps between two warm restarts in SGDR.
- MultiplicativeLR
Multiplies the learning rate of each parameter group by the supplied coefficient.
- ChainedScheduler
Chains a list of learning rate schedulers so that they are applied together at each step (see the sketch after this list).
- CyclicLR
Sets the learning rate of each parameter group according to the cyclical learning rate (CLR) policy.
- OneCycleLR
Sets the learning rate of each parameter group according to the 1cycle learning rate policy.
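The sketches below illustrate typical usage of several of these schedulers. They are written against the standard torch.optim.lr_scheduler classes of the same names, on the assumption that the Cerebras variants mirror the upstream constructor signatures; exact argument names in the Cerebras API may differ, so treat these as illustrative rather than definitive.

A minimal training loop with StepLR, decaying the learning rate by gamma every step_size steps:

```python
import torch
from torch.optim.lr_scheduler import StepLR

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Multiply the learning rate by gamma=0.1 every 30 steps.
scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

for step in range(100):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 10)).sum()
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the schedule once per optimizer step
```

MultiStepLR and ExponentialLR follow the same pattern: MultiStepLR(optimizer, milestones=[30, 80], gamma=0.1) decays only when the listed milestones are reached, while ExponentialLR(optimizer, gamma=0.99) decays on every step.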
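A sketch of the two cosine schedules, where η_max is the initial lr. The T_max, T_0, T_mult, and eta_min argument names are those of the upstream PyTorch classes and are an assumption with respect to the Cerebras versions:

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR, CosineAnnealingWarmRestarts

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=0.1)

# Anneal from the initial lr (η_max) down to eta_min over T_max steps.
cosine = CosineAnnealingLR(optimizer, T_max=1000, eta_min=1e-5)

# Same curve, but restart it every T_0 steps, doubling the period
# after each restart (T_mult=2), as in SGDR.
warm_restarts = CosineAnnealingWarmRestarts(optimizer, T_0=200, T_mult=2, eta_min=1e-5)
```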
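SequentialLR composes other schedulers over disjoint step ranges. A minimal warmup-then-decay sketch, holding 10% of the base lr for 500 steps before handing off to cosine decay (the factor and total_iters names come from the upstream ConstantLR):

```python
import torch
from torch.optim.lr_scheduler import ConstantLR, CosineAnnealingLR, SequentialLR

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=0.1)

warmup = ConstantLR(optimizer, factor=0.1, total_iters=500)  # hold 10% of base lr
decay = CosineAnnealingLR(optimizer, T_max=9500)             # then anneal

# Use `warmup` until step 500, then switch to `decay`.
scheduler = SequentialLR(optimizer, schedulers=[warmup, decay], milestones=[500])
```

PiecewiseConstantLR expresses a related idea declaratively: a sequence of constant learning rates together with the milestones at which to switch between them.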
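LambdaLR scales the initial lr by an arbitrary function of the step count, which also makes it a convenient way to express a schedule such as inverse square root decay. Per the list above, the Cerebras class takes the function via a set_lr_lambda override; the upstream class used in this sketch takes an lr_lambda argument instead:

```python
import math
import torch
from torch.optim.lr_scheduler import LambdaLR

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=0.1)

# lr(step) = initial_lr / sqrt(max(step, 1)): inverse square root decay.
scheduler = LambdaLR(optimizer, lr_lambda=lambda step: 1.0 / math.sqrt(max(step, 1)))
```

MultiplicativeLR looks similar but multiplies the current lr (not the initial lr) by the returned coefficient each step, e.g. MultiplicativeLR(optimizer, lr_lambda=lambda step: 0.95) for a fixed 5% decay per step.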
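ChainedScheduler differs from SequentialLR: instead of switching between schedulers at milestones, it applies every chained scheduler on every step. A minimal sketch:

```python
import torch
from torch.optim.lr_scheduler import ChainedScheduler, ConstantLR, ExponentialLR

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=0.1)

# Each scheduler.step() applies both members: a 0.5x constant factor
# during the first 100 steps, combined with per-step exponential decay.
scheduler = ChainedScheduler([
    ConstantLR(optimizer, factor=0.5, total_iters=100),
    ExponentialLR(optimizer, gamma=0.999),
])
```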
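Finally, the two cyclical policies. CyclicLR oscillates between base_lr and max_lr indefinitely, while OneCycleLR runs a single up-then-down cycle over total_steps. cycle_momentum is disabled below because the example optimizer is momentum-free SGD:

```python
import torch
from torch.optim.lr_scheduler import CyclicLR, OneCycleLR

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=0.1)

# Triangular CLR: ramp from base_lr up to max_lr over 200 steps, then back down.
cyclic = CyclicLR(optimizer, base_lr=1e-3, max_lr=0.1,
                  step_size_up=200, cycle_momentum=False)

# 1cycle: a single warmup-then-anneal cycle spanning the whole run.
one_cycle = OneCycleLR(optimizer, max_lr=0.1, total_steps=1000, cycle_momentum=False)
```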