cerebras.modelzoo.data.common.h5_map_dataset.dataset.MLMHDF5Dataset

class cerebras.modelzoo.data.common.h5_map_dataset.dataset.MLMHDF5Dataset(*args, **kwargs)

Bases: cerebras.modelzoo.data.common.h5_map_dataset.dataset.HDF5Dataset

Dataset class that handles text preprocessing for BERT masked language modeling (MLM) datasets.
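
For readers unfamiliar with MLM preprocessing, the following is a minimal, generic sketch of the token-masking step commonly used for BERT-style training. It is not taken from this class and does not reflect its internals or its config options. The function name `mask_tokens`, all of its parameters, the 15% selection rate, the 80/10/10 replacement split, and the `-100` ignore label are assumptions chosen for illustration.

```python
import random


def mask_tokens(token_ids, mask_id, vocab_size, special_ids=(),
                mask_prob=0.15, ignore_label=-100, rng=None):
    """Hypothetical helper: return (input_ids, labels) for one MLM sample.

    Illustrative only; this is not the MLMHDF5Dataset implementation.
    """
    rng = rng or random.Random()
    input_ids = list(token_ids)
    # Positions that are not selected keep the ignore label.
    labels = [ignore_label] * len(input_ids)

    for i, tok in enumerate(token_ids):
        # Never mask special tokens; otherwise select with probability mask_prob.
        if tok in special_ids or rng.random() >= mask_prob:
            continue
        labels[i] = tok  # the model is trained to recover the original token
        roll = rng.random()
        if roll < 0.8:
            input_ids[i] = mask_id  # 80%: replace with the mask token
        elif roll < 0.9:
            input_ids[i] = rng.randrange(vocab_size)  # 10%: random token
        # remaining 10%: leave the token unchanged

    return input_ids, labels


# Example call with made-up token ids and a fixed seed for repeatability.
ids, labels = mask_tokens(
    [101, 7592, 2088, 102],
    mask_id=103,
    vocab_size=30522,
    special_ids={101, 102},
    rng=random.Random(0),
)
```

Passing a seeded `random.Random` keeps the sketch deterministic, which is convenient when comparing masked samples across runs.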

Parameters

config – The configuration object used to set up the dataset.

Methods

generate_sample

load_state_dict

map

state_dict

Attributes

by_sample