cerebras.modelzoo.data.nlp.bert.BertTokenClassifierDataProcessor.BertTokenClassifierDataProcessorConfig
- class cerebras.modelzoo.data.nlp.bert.BertTokenClassifierDataProcessor.BertTokenClassifierDataProcessorConfig(*args, **kwargs)
Bases: cerebras.modelzoo.config.data_config.DataConfig
Methods
check_for_deprecated_fields
check_literal_discriminator_field
copy
get_label_vocab_file
get_orig_class
get_orig_class_args
get_vocab_file
model_copy
model_post_init
post_init
Attributes
attn_mask_pad_id
batch_size: The batch size.
data_dir: Path to the data files to use.
discriminator
discriminator_value
do_lower: Whether to lowercase the text.
drop_last: Whether to drop the last batch of an epoch if it is incomplete.
include_padding_in_loss
input_pad_id
label_vocab_file: Path to a JSON file that maps class names to class indices.
labels_pad_id
mask_whole_word: Whether to mask the entire word.
max_sequence_length
model_config
num_workers: The number of PyTorch processes used in the dataloader.
persistent_workers: Whether or not to keep workers persistent between epochs.
prefetch_factor: The number of batches to prefetch in the dataloader.
shuffle: Whether or not to shuffle the dataset.
shuffle_buffer: Buffer size to shuffle samples across.
shuffle_seed: The seed used for deterministic shuffling.
vocab_file: Path to the vocabulary file.
data_processor
- data_dir = Ellipsis
Path to the data files to use.
- batch_size = Ellipsis
The batch size.
- vocab_file = Ellipsis
Path to the vocabulary file.
- label_vocab_file = None
Path to a JSON file that maps class names to class indices.
- mask_whole_word = False
Whether to mask the entire word.
- do_lower = False
Whether to lowercase the text.
- shuffle = True
Whether or not to shuffle the dataset.
- shuffle_seed = None
The seed used for deterministic shuffling.
- shuffle_buffer = None
Buffer size to shuffle samples across. If None and shuffle is enabled, 10*batch_size is used.
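The shuffle_buffer fallback described above can be sketched in a few lines. This is an illustration only: resolve_shuffle_buffer is a hypothetical helper, not part of the Cerebras API, and what happens when shuffle is disabled is an assumption (the value is returned unchanged).

```python
from typing import Optional


def resolve_shuffle_buffer(
    shuffle: bool, shuffle_buffer: Optional[int], batch_size: int
) -> Optional[int]:
    """Hypothetical helper mirroring the documented shuffle_buffer fallback."""
    # Documented rule: if shuffle_buffer is None and shuffle is enabled,
    # 10 * batch_size is used.
    if shuffle and shuffle_buffer is None:
        return 10 * batch_size
    # Assumption: an explicit value, or a disabled shuffle, leaves it as given.
    return shuffle_buffer


if __name__ == "__main__":
    print(resolve_shuffle_buffer(True, None, 32))
    print(resolve_shuffle_buffer(True, 1000, 32))
```

With batch_size set to 32 and shuffle left at its default of True, the resolved buffer covers 320 samples unless shuffle_buffer is set explicitly.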
- num_workers = 0
The number of PyTorch processes used in the dataloader.
- prefetch_factor = 10
The number of batches to prefetch in the dataloader.
- persistent_workers = True
Whether or not to keep workers persistent between epochs.
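The three worker settings interact. As far as I know, PyTorch's torch.utils.data.DataLoader raises a ValueError when prefetch_factor is set, or persistent_workers is True, while num_workers is 0. This page does not show how the data processor forwards these values, so the sketch below is an assumption about one reasonable way to do it: dataloader_worker_kwargs is a hypothetical helper that passes the two options on only when worker processes are in use.

```python
from typing import Any, Dict


def dataloader_worker_kwargs(
    num_workers: int = 0,
    prefetch_factor: int = 10,
    persistent_workers: bool = True,
) -> Dict[str, Any]:
    """Hypothetical helper: build worker-related DataLoader keyword arguments.

    Defaults mirror the documented config defaults. Whether the Cerebras
    data processor filters the options this way is an assumption.
    """
    kwargs: Dict[str, Any] = {"num_workers": num_workers}
    if num_workers > 0:
        # These options are only meaningful with worker processes.
        kwargs["prefetch_factor"] = prefetch_factor
        kwargs["persistent_workers"] = persistent_workers
    return kwargs


if __name__ == "__main__":
    print(dataloader_worker_kwargs())
    print(dataloader_worker_kwargs(num_workers=4))
```

Under this sketch, the documented defaults reduce to a single-process dataloader, and prefetch_factor and persistent_workers take effect only once num_workers is raised above 0.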
- drop_last = True
Whether to drop the last batch of an epoch if it is incomplete.
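The effect of drop_last on epoch length is simple arithmetic. The sketch below uses a hypothetical helper, batches_per_epoch, and assumes a single dataloader iterating over num_samples examples; any sharding across workers or systems is ignored.

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool = True) -> int:
    """Hypothetical helper: number of batches yielded in one epoch."""
    if drop_last:
        # The trailing incomplete batch, if any, is discarded.
        return num_samples // batch_size
    # Otherwise the remainder forms one final, smaller batch.
    return math.ceil(num_samples / batch_size)


if __name__ == "__main__":
    print(batches_per_epoch(1000, 32, drop_last=True))
    print(batches_per_epoch(1000, 32, drop_last=False))
```

For 1000 samples and a batch size of 32, the default drop_last=True yields 31 full batches and discards the remaining 8 samples, while drop_last=False yields 32 batches with the last one holding those 8 samples.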