cerebras.modelzoo.data.nlp.bert.BertSumCSVDataProcessor.BertSumCSVDataProcessorConfig

class cerebras.modelzoo.data.nlp.bert.BertSumCSVDataProcessor.BertSumCSVDataProcessorConfig(*args, **kwargs)

Methods

`check_for_deprecated_fields`
`check_literal_discriminator_field`
`copy`
`get_orig_class`
`get_orig_class_args`
`get_vocab_file`
`model_copy`
`model_post_init`
`post_init`

Attributes

`batch_size`	The batch size.
`data_dir`	Path to the data files to use.
`discriminator`
`discriminator_value`
`do_lower`	Whether to lowercase the text.
`drop_last`	Whether to drop the last batch of an epoch if it is incomplete.
`mask_whole_word`	Whether to mask entire words.
`max_cls_tokens`
`max_sequence_length`
`model_config`
`num_workers`	The number of PyTorch processes used in the dataloader.
`pad_id`
`persistent_workers`	Whether or not to keep workers persistent between epochs.
`prefetch_factor`	The number of batches to prefetch in the dataloader.
`shuffle`	Whether or not to shuffle the dataset.
`shuffle_buffer`	Buffer size to shuffle samples across.
`shuffle_seed`	The seed used for deterministic shuffling.
`vocab_file`	Path to the vocabulary file.
`data_processor`

shuffle_buffer = None: Buffer size to shuffle samples across. If None and shuffle is enabled, 10 * batch_size is used.
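
The fallback described for `shuffle_buffer` can be sketched as follows. This is a minimal illustration, not the library's code: `resolve_shuffle_buffer` is a hypothetical helper name, and returning the value unchanged when shuffling is disabled is an assumption, since the reference only describes the case where shuffling is enabled.

```python
from typing import Optional


def resolve_shuffle_buffer(
    shuffle: bool, shuffle_buffer: Optional[int], batch_size: int
) -> Optional[int]:
    """Hypothetical helper mirroring the documented default for shuffle_buffer."""
    # Documented rule: if no buffer size is given and shuffling is enabled,
    # fall back to 10 * batch_size.
    if shuffle and shuffle_buffer is None:
        return 10 * batch_size
    # Assumption: in every other case the configured value is used as-is.
    return shuffle_buffer
```

Under this sketch, a batch size of 32 with shuffling enabled and no explicit buffer resolves to a buffer of 320 samples.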

persistent_workers = True: Whether or not to keep workers persistent between epochs.

drop_last = True: Whether to drop the last batch of an epoch if it is incomplete.
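
To illustrate what `drop_last` means for the number of batches per epoch, here is a small sketch. It assumes the processor batches samples the way a standard PyTorch `DataLoader` does; `batches_per_epoch` is a hypothetical helper and not part of the Model Zoo API.

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool = True) -> int:
    """Hypothetical helper: count the batches produced in one epoch."""
    if drop_last:
        # The trailing incomplete batch is discarded.
        return num_samples // batch_size
    # The trailing incomplete batch is kept as a smaller final batch.
    return math.ceil(num_samples / batch_size)
```

For example, 1000 samples with a batch size of 32 would give 31 full batches when `drop_last` is true, and 32 batches (the last holding 8 samples) when it is false.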
