Collect statistics of the dataset.
data_arr (numpy.ndarray) – NumPy array containing the dataset.
args (ValidationArgs) – Arguments for verifying the HDF5 dataset.
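
The entry above does not show how the statistics are computed, so the following is only a minimal sketch of what collecting statistics over such an array might look like. Everything in it is an assumption made for illustration and not the library's implementation: the array layout (num_sequences, 3, max_seq_length) with rows for input IDs, loss mask, and labels, the stand-in class SketchArgs with a pad_id field (the fields of the real ValidationArgs are not documented here), the helper name collect_stats_sketch, and the returned dictionary keys.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SketchArgs:
    """Hypothetical stand-in for ValidationArgs; the real fields may differ."""

    pad_id: int = 0


def collect_stats_sketch(data_arr: np.ndarray, args: SketchArgs) -> dict:
    """Count sequences and tokens in an array of assumed shape (N, 3, L).

    Assumed layout: data_arr[:, 0] holds input IDs and data_arr[:, 1] holds
    a 0/1 loss mask. This layout is an illustration, not a documented format.
    """
    input_ids = data_arr[:, 0, :]
    loss_mask = data_arr[:, 1, :]

    num_pad_tokens = int(np.sum(input_ids == args.pad_id))
    return {
        "num_sequences": int(input_ids.shape[0]),
        "num_tokens": int(input_ids.size),
        "num_pad_tokens": num_pad_tokens,
        "non_pad_tokens": int(input_ids.size) - num_pad_tokens,
        "loss_valid_tokens": int(np.sum(loss_mask)),
    }


# Two toy sequences of length 4, padded with the assumed pad_id of 0.
toy = np.array(
    [
        [[5, 6, 0, 0], [1, 1, 0, 0], [6, 7, 0, 0]],
        [[8, 9, 10, 0], [1, 1, 1, 0], [9, 10, 11, 0]],
    ]
)
stats = collect_stats_sketch(toy, SketchArgs(pad_id=0))
```

Returning a plain dictionary keeps the sketch easy to compare against expected values when verifying a dataset; the actual function may report or store its results differently.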