cerebras.modelzoo.data_preparation.data_preprocessing.vsl_finetuning_token_generator

This module provides the VSLFinetuningTokenGenerator class, which extends FinetuningTokenGenerator to process tokenized text data for variable-length sequence summarization (VSLS). The class includes methods that process chunks of tokenized text, encode documents for text summarization, and optimize the representation of tokenized data by merging shorter sequences so that each merged sequence fits within a specified maximum sequence length.
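The idea of merging shorter sequences under a maximum sequence length can be illustrated with a small, generic sketch. This is not the code used by VSLFinetuningTokenGenerator: the function name `pack_sequences`, its parameters, the greedy first-fit strategy, and the choice to skip over-long sequences are all assumptions made for illustration.

```python
from typing import List


def pack_sequences(
    sequences: List[List[int]], max_seq_length: int
) -> List[List[List[int]]]:
    """Group tokenized sequences so each group's total length fits the limit.

    Hypothetical helper: uses greedy first-fit packing, which may differ
    from the strategy the actual class applies.
    """
    groups: List[List[List[int]]] = []  # each group holds several sequences
    group_lengths: List[int] = []       # running token count per group

    for seq in sequences:
        if len(seq) > max_seq_length:
            # Assumption: sequences longer than the limit are skipped here.
            continue
        for i, used in enumerate(group_lengths):
            # Place the sequence in the first group with enough room left.
            if used + len(seq) <= max_seq_length:
                groups[i].append(seq)
                group_lengths[i] += len(seq)
                break
        else:
            # No existing group can hold it, so start a new one.
            groups.append([seq])
            group_lengths.append(len(seq))

    return groups


if __name__ == "__main__":
    token_ids = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]
    for group in pack_sequences(token_ids, max_seq_length=5):
        print(group)
```

With a limit of 5 tokens, the four example sequences fit into two groups instead of four separately padded samples, which is the kind of saving that merging short sequences is meant to provide.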

Classes

VSLFinetuningTokenGenerator

Token generator for variable-length sequence summarization (VSLS).