cerebras.modelzoo.data.nlp.dpo.DPOSyntheticDataset.DPOSyntheticDataProcessor
- class cerebras.modelzoo.data.nlp.dpo.DPOSyntheticDataset.DPOSyntheticDataProcessor
Bases: object
Synthetic dataset generator.
- Parameters
params (dict) – Dict containing the training input parameters used to create the dataset.
Expects the following fields:
“num_examples” (int): Number of training examples.
“vocab_size” (int): Vocabulary size.
“max_seq_length” (int): Maximum length of the sequence to generate.
“batch_size” (int): Batch size.
“shuffle” (bool): Flag to enable data shuffling.
“shuffle_seed” (int): Shuffle seed.
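As a minimal sketch of the params dict described above: the key names and types come from the field list, while the concrete values and the check_params helper are hypothetical and used only for illustration. The sketch also assumes that all six fields are required, which the list suggests but does not state outright.

```python
# Example values are hypothetical; only the key names and their types are
# taken from the documented field list.
params = {
    "num_examples": 1000,
    "vocab_size": 32000,
    "max_seq_length": 128,
    "batch_size": 8,
    "shuffle": True,
    "shuffle_seed": 0,
}

# Expected Python type for each documented field.
EXPECTED_TYPES = {
    "num_examples": int,
    "vocab_size": int,
    "max_seq_length": int,
    "batch_size": int,
    "shuffle": bool,
    "shuffle_seed": int,
}


def check_params(params):
    """Hypothetical helper: raise if a documented field is missing or mistyped.

    Assumes every documented field is required.
    """
    for key, expected in EXPECTED_TYPES.items():
        if key not in params:
            raise KeyError(f"missing field: {key}")
        value = params[key]
        # bool is a subclass of int, so reject bools for int fields explicitly.
        if expected is int and isinstance(value, bool):
            raise TypeError(f"{key} must be int, got bool")
        if not isinstance(value, expected):
            raise TypeError(
                f"{key} must be {expected.__name__}, got {type(value).__name__}"
            )
    return params


check_params(params)

# With the Cerebras Model Zoo installed, the dict would then be passed to the
# class constructor (not executed here, since it needs that package):
# from cerebras.modelzoo.data.nlp.dpo.DPOSyntheticDataset import (
#     DPOSyntheticDataProcessor,
# )
# processor = DPOSyntheticDataProcessor(params)
```

Checking the dict before constructing the processor is a convenience for catching a misspelled or mistyped field early; it is not part of the class itself.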
Methods
Create dataloader.