cerebras.modelzoo.data.nlp.gpt.InferenceDataProcessor

This module defines the InferenceDataProcessor class and its subclasses, along with the EvalHarnessDataset class, for preprocessing and loading evaluation harness data.

Functions

get_token_ids

Get encoded token ids from a string using the specified tokenizer.

tokenize_stop_words

Helper to construct a list of stop token sequences from the given list of stop words using the specified tokenizer.
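
The two helpers above can be illustrated with a minimal sketch. This page does not show their signatures, so everything below is an assumption: `ToyTokenizer`, `sketch_get_token_ids`, and `sketch_tokenize_stop_words` are hypothetical stand-ins that only mirror the one-line descriptions, not the module's actual implementation. The sketch assumes a tokenizer object exposing an `encode(text)` method that returns a list of integer ids.

```python
from typing import Dict, List


class ToyTokenizer:
    """Hypothetical stand-in tokenizer: one id per whitespace-separated word."""

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {}

    def encode(self, text: str) -> List[int]:
        ids = []
        for word in text.split():
            # Assign the next free id the first time a word is seen.
            ids.append(self.vocab.setdefault(word, len(self.vocab)))
        return ids


def sketch_get_token_ids(text: str, tokenizer: ToyTokenizer) -> List[int]:
    # Assumed behavior: encode the string and return plain integer ids.
    return list(tokenizer.encode(text))


def sketch_tokenize_stop_words(
    stop_words: List[str], tokenizer: ToyTokenizer
) -> List[List[int]]:
    # Assumed behavior: one token-id sequence per stop word, in input order.
    return [sketch_get_token_ids(word, tokenizer) for word in stop_words]


tokenizer = ToyTokenizer()
prompt_ids = sketch_get_token_ids("def add ( a , b ) :", tokenizer)
stop_sequences = sketch_tokenize_stop_words(["def", "a , b"], tokenizer)
print(prompt_ids)
print(stop_sequences)
```

Each stop word maps to a sequence rather than a single id because a real subword tokenizer may split one stop word into several tokens; a generation loop would then stop when the tail of its output matches any of these sequences.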

Classes

EvalHarnessDataset

InferenceDataProcessor

InferenceDataProcessorBCEH

Subclass for processing BigCode data, i.e. bigcode_eh requests.

InferenceDataProcessorGU

Subclass for processing EEH (EleutherAI Evaluation Harness) generate_until requests.

InferenceDataProcessorLL

Subclass for processing EEH loglikelihood requests.

RequestType

An enumeration.
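
The class list above suggests one processor subclass per request type. The following is a rough sketch of how an enumeration could drive that selection. The member names, the `Sketch*` classes, and `select_processor` are all hypothetical and chosen only for illustration; this page does not document the members of RequestType or how the real subclasses are instantiated.

```python
from enum import Enum
from typing import Dict, Type


class SketchRequestType(Enum):
    # Hypothetical members, named after the request kinds mentioned above.
    LOGLIKELIHOOD = "loglikelihood"
    GENERATE_UNTIL = "generate_until"
    BIGCODE_EH = "bigcode_eh"


class SketchProcessor:
    """Hypothetical base class standing in for a data processor."""

    handles: SketchRequestType

    def describe(self) -> str:
        return f"{type(self).__name__} handles {self.handles.value}"


class SketchProcessorLL(SketchProcessor):
    handles = SketchRequestType.LOGLIKELIHOOD


class SketchProcessorGU(SketchProcessor):
    handles = SketchRequestType.GENERATE_UNTIL


class SketchProcessorBCEH(SketchProcessor):
    handles = SketchRequestType.BIGCODE_EH


def select_processor(request_type: SketchRequestType) -> SketchProcessor:
    # Map each enum member to the subclass that declares it handles it.
    registry: Dict[SketchRequestType, Type[SketchProcessor]] = {
        cls.handles: cls
        for cls in (SketchProcessorLL, SketchProcessorGU, SketchProcessorBCEH)
    }
    return registry[request_type]()


processor = select_processor(SketchRequestType("generate_until"))
print(processor.describe())
```

Looking a member up by value, as in `SketchRequestType("generate_until")`, raises `ValueError` for an unknown request kind, so a mistyped request type fails early instead of silently falling through to a default processor.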