cerebras.modelzoo.data.nlp.gpt.HuggingFaceDataProcessorEli5.HuggingFaceDataProcessorEli5Config
class cerebras.modelzoo.data.nlp.gpt.HuggingFaceDataProcessorEli5.HuggingFaceDataProcessorEli5Config(*args, **kwargs)
Methods
check_for_deprecated_fields
check_literal_discriminator_field
copy
get_orig_class
get_orig_class_args
model_copy
model_post_init
post_init
Attributes
batch_size
Batch size.
data_dir
discriminator
discriminator_value
drop_last
If True and the dataset size is not divisible by the batch size, the last incomplete batch will be dropped.
model_config
num_workers
How many subprocesses to use for data loading.
persistent_workers
If True, the data loader will not shut down the worker processes after a dataset has been consumed once.
prefetch_factor
Number of batches loaded in advance by each worker.
shuffle
Flag to enable data shuffling.
shuffle_buffer
Size of shuffle buffer in samples.
shuffle_seed
Shuffle seed.
split
data_processor
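The loader-related attributes above can be illustrated with a small, self-contained sketch. It does not import or call the Cerebras Model Zoo. The dictionary keys are taken from the attribute list, but every value is a hypothetical placeholder, and the two helper functions (`batches_per_epoch`, `batches_in_flight`) are illustrative names that are not part of the library. The sketch assumes `drop_last` and `prefetch_factor` behave as their descriptions state.

```python
import math

# Hypothetical values; only the key names come from the attribute list above.
eli5_params = {
    "batch_size": 32,
    "shuffle": True,
    "shuffle_seed": 1234,
    "shuffle_buffer": 10_000,
    "num_workers": 4,
    "prefetch_factor": 2,
    "persistent_workers": True,
    "drop_last": True,
}


def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Count batches in one pass over the data.

    With drop_last=True the final incomplete batch is discarded,
    otherwise it is kept as a smaller batch.
    """
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


def batches_in_flight(num_workers: int, prefetch_factor: int) -> int:
    """Batches loaded in advance in total: prefetch_factor per worker."""
    return num_workers * prefetch_factor


# 1000 samples at batch size 32 leave a partial batch of 8 samples.
print(batches_per_epoch(1000, eli5_params["batch_size"], drop_last=True))   # 31
print(batches_per_epoch(1000, eli5_params["batch_size"], drop_last=False))  # 32
print(batches_in_flight(eli5_params["num_workers"], eli5_params["prefetch_factor"]))  # 8
```

Setting `drop_last` to True keeps every batch the same size, at the cost of skipping up to `batch_size - 1` samples per pass.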