cerebras.modelzoo.data.nlp.gpt.GptHDF5DataProcessor.GptHDF5DataProcessor
- class cerebras.modelzoo.data.nlp.gpt.GptHDF5DataProcessor.GptHDF5DataProcessor(config)
Bases: cerebras.modelzoo.data.common.HDF5IterableDataProcessor.HDF5IterableDataProcessor
An HDF5 dataset processor for GPT pre-training. Loads data from HDF5 files.
- Parameters
config (cerebras.modelzoo.data.nlp.gpt.config.GptHDF5DataProcessorConfig) – The configuration object for the GPT HDF5 data processor.
Methods
collate_fn
create_dataloader – Classmethod to create the dataloader object.
- create_dataloader()
Classmethod to create the dataloader object.
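The call pattern described here (build a config, pass it to the processor, then call create_dataloader()) can be sketched roughly as follows. This is a minimal illustration that does not import the Cerebras package: StandInConfig, StandInProcessor, and the field names data_dir and batch_size are hypothetical stand-ins, the samples are placeholders rather than data read from HDF5 files, and create_dataloader is modeled as a plain instance method for simplicity.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class StandInConfig:
    """Hypothetical stand-in for GptHDF5DataProcessorConfig."""

    data_dir: str    # assumed field name, for illustration only
    batch_size: int  # assumed field name, for illustration only


class StandInProcessor:
    """Hypothetical stand-in that mirrors the documented call pattern."""

    def __init__(self, config: StandInConfig):
        self.config = config
        # A real processor would read token sequences from HDF5 files;
        # these placeholder samples only keep the sketch runnable.
        self._samples = [[i, i + 1, i + 2] for i in range(6)]

    def create_dataloader(self) -> Iterator[List[List[int]]]:
        """Yield the placeholder samples in groups of batch_size."""
        size = self.config.batch_size
        for start in range(0, len(self._samples), size):
            yield self._samples[start:start + size]


config = StandInConfig(data_dir="./hdf5_data", batch_size=2)
processor = StandInProcessor(config)
batches = list(processor.create_dataloader())
```

With the real class, the config would be a GptHDF5DataProcessorConfig instance, and the object returned by create_dataloader() would be iterated in the training loop instead of being collected into a list.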