Tokenization is a prevalent approach to preprocessing text data for training language models.
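To make the preprocessing step concrete, the following is a minimal sketch of tokenization: raw text is mapped to the integer ID sequence a language model consumes. The names (`build_vocab`, `encode`, the whitespace splitting, and the toy corpus) are illustrative assumptions, not a method or library described in the text.

```python
from collections import Counter

def build_vocab(corpus, specials=("<unk>",)):
    """Assign an integer ID to every whitespace-delimited token in the corpus."""
    counts = Counter(tok for line in corpus for tok in line.split())
    vocab = {tok: i for i, tok in enumerate(specials)}  # reserve IDs for special tokens
    for tok, _ in counts.most_common():                 # frequent tokens get small IDs
        vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Convert raw text into the sequence of integer IDs fed to the model."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in text.split()]

# Hypothetical toy corpus for illustration only.
corpus = ["the cat sat on the mat", "the dog sat"]
vocab = build_vocab(corpus)
print(encode("the cat sat", vocab))  # -> [1, 3, 2] with this toy corpus
```

Production tokenizers typically operate on subword units (e.g. BPE or unigram models) rather than whitespace-separated words, but the interface is the same: text in, integer IDs out.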