Detokenizer for wikitext. Used for special handling of data for substrings.
string (str) – String to detokenize before tokenization.
Detokenized string
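
The sketch below illustrates the kind of cleanup a wikitext detokenizer typically performs. It is a minimal illustration, not the modelzoo implementation: the function name `sketch_wikitext_detokenizer` is hypothetical, and the rules shown are assumptions based on the WikiText convention of writing `@-@`, `@,@` and `@.@` for hyphens and number separators.

```python
import re


def sketch_wikitext_detokenizer(string: str) -> str:
    """Hypothetical sketch: undo common WikiText tokenization artifacts."""
    # WikiText escapes hyphens and number separators with "@" markers,
    # e.g. "well @-@ known", "1 @,@ 000", "3 @.@ 5".
    string = string.replace(" @-@ ", "-")
    string = string.replace(" @,@ ", ",")
    string = string.replace(" @.@ ", ".")
    # Re-attach punctuation that was split off as a separate token.
    string = string.replace(" , ", ", ")
    string = string.replace(" . ", ". ")
    string = string.replace(" ; ", "; ")
    string = string.replace(" : ", ": ")
    # Remove padding just inside parentheses: "( x )" -> "(x)".
    string = re.sub(r"\(\s*([^)]*?)\s*\)", r"(\1)", string)
    return string


if __name__ == "__main__":
    print(sketch_wikitext_detokenizer("a well @-@ known total of 1 @,@ 000"))
    # prints: a well-known total of 1,000
```

Running the detokenizer before tokenization means the tokenizer sees natural text rather than the pre-split WikiText form, which matters when token counts or substring boundaries are computed on the result.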