Data preprocessing scripts
Using Hugging Face datasets for auto-regressive LM
Creating HDF5 dataset for GPT models
Shuffling Samples for HDF5 dataset of GPT models
Optimizing SlimPajama dataset pre-processing