cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.DatasetStats
- class cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.DatasetStats
Bases: object
DatasetStats(num_sequences: int, num_tokens: int, detokenized_bytes: int, detokenized_chars: int, non_pad_tokens: int, loss_valid_tokens: int)
Methods
Attributes
num_sequences
num_tokens
detokenized_bytes
detokenized_chars
non_pad_tokens
loss_valid_tokens
- __init__(num_sequences: int, num_tokens: int, detokenized_bytes: int, detokenized_chars: int, non_pad_tokens: int, loss_valid_tokens: int) -> None
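The constructor signature above can be illustrated with a minimal sketch. It does not import the Cerebras package. Instead it defines a hypothetical stand-in, `DatasetStatsSketch`, with the same six integer fields, on the assumption that the real class behaves like a plain record of counters. The `combine` helper and the padding-fraction calculation are illustrative assumptions, not part of the documented API, and the field meanings are inferred from their names only.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetStatsSketch:
    """Hypothetical stand-in mirroring the documented DatasetStats fields."""

    num_sequences: int
    num_tokens: int
    detokenized_bytes: int
    detokenized_chars: int
    non_pad_tokens: int
    loss_valid_tokens: int


def combine(a: DatasetStatsSketch, b: DatasetStatsSketch) -> DatasetStatsSketch:
    """Sum two stats records field by field (illustrative helper, not library API)."""
    return DatasetStatsSketch(
        *(getattr(a, f.name) + getattr(b, f.name) for f in fields(a))
    )


# Example values are made up for illustration.
stats = DatasetStatsSketch(
    num_sequences=2,
    num_tokens=16,
    detokenized_bytes=60,
    detokenized_chars=58,
    non_pad_tokens=12,
    loss_valid_tokens=10,
)

# Share of token positions that are padding, assuming num_tokens counts pads.
pad_fraction = 1 - stats.non_pad_tokens / stats.num_tokens

# Aggregate counters across two shards of a dataset.
total = combine(stats, stats)
```

Keeping the counters as plain integers makes per-shard records easy to sum. Whether the real class supports this kind of aggregation directly is not stated in this reference.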