cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.process_dataset
- cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.process_dataset(files, dataset_processor, processes)
Process a dataset and write it into HDF5 format.
- Parameters
files (list) – List of files to process.
dataset_processor – Class containing methods that specify how the dataset will be processed and written into HDF5 files.
processes (int) – Number of processes to use.
- Returns
- Dictionary containing the results of execution: the number of processed, discarded, and successful files, as well as the total number of examples across all processes.
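The following is a minimal sketch of the general pattern this function describes: split the file list across workers, let each worker run the processor over its share, and merge the per-worker counts into one dictionary. It is not the Model Zoo implementation. `ToyProcessor`, `process_file`, `process_dataset_sketch`, and the result keys (`processed`, `discarded`, `successful`, `examples`) are hypothetical names chosen for illustration, no HDF5 files are written, and a thread pool stands in for the worker processes so that the example runs anywhere.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class ToyProcessor:
    """Hypothetical stand-in for a dataset processor (not the Model Zoo class)."""

    def process_file(self, path):
        # Assumption for illustration: files ending in ".bad" are discarded,
        # and every other file yields exactly two examples.
        if path.endswith(".bad"):
            return {"discarded": 1, "successful": 0, "examples": 0}
        return {"discarded": 0, "successful": 1, "examples": 2}


def _run_worker(args):
    """Process one shard of files and return that worker's counts."""
    files, processor = args
    totals = Counter(processed=0, discarded=0, successful=0, examples=0)
    for path in files:
        totals["processed"] += 1
        totals.update(processor.process_file(path))
    return totals


def process_dataset_sketch(files, dataset_processor, processes):
    """Hypothetical sketch: shard files, process each shard, merge the counts."""
    # Round-robin split so each worker gets a near-equal share of the files.
    shards = [files[i::processes] for i in range(processes)]
    merged = Counter()
    # A thread pool is used here as a stand-in for separate worker processes.
    with ThreadPoolExecutor(max_workers=processes) as pool:
        jobs = [(shard, dataset_processor) for shard in shards]
        for partial in pool.map(_run_worker, jobs):
            merged.update(partial)
    return dict(merged)


if __name__ == "__main__":
    stats = process_dataset_sketch(
        ["a.txt", "b.txt", "c.bad", "d.txt"], ToyProcessor(), processes=2
    )
    print(stats)
```

With the four sample files and two workers, the merged dictionary reports 4 processed files, 1 discarded, 3 successful, and 6 examples. The keys returned by the real function may be named differently.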