Konubinix' opinionated web of thoughts

Bag-of-Words Model

Fleeting

Introduction to the Bag-of-Words (BoW) Model - PyImageSearch

overfitting on text data is desirable

important point to note is that when it comes to text data, we would most likely want our model to overfit the training data for the best results.

https://pyimagesearch.com/2022/07/04/introduction-to-the-bag-of-words-bow-model/

because when it comes to text data, your training text data becomes your unquestioned commandment.

https://pyimagesearch.com/2022/07/04/introduction-to-the-bag-of-words-bow-model/

text data differs greatly from image data. An assumption we consider while making the overfitting statement is that the training data will cover almost all instances of a word appearing in different contexts.

https://pyimagesearch.com/2022/07/04/introduction-to-the-bag-of-words-bow-model/
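
To make the coverage assumption concrete, here is a minimal sketch of a bag-of-words vectorizer in plain Python. It is not the code from the PyImageSearch article. The function names (`build_vocabulary`, `vectorize`) and the toy corpus are made up for illustration. It shows that a word absent from the training data has no column in the vector, so the model cannot see it at all. This is probably why the quoted argument depends on the training data covering almost every word.

```python
from collections import Counter


def build_vocabulary(corpus):
    """Map each distinct lowercase token of the training corpus to a column index."""
    tokens = sorted({word for sentence in corpus for word in sentence.lower().split()})
    return {word: index for index, word in enumerate(tokens)}


def vectorize(sentence, vocabulary):
    """Return the count vector of a sentence and the tokens the vocabulary does not know."""
    counts = Counter(sentence.lower().split())
    vector = [0] * len(vocabulary)
    unknown = []
    for word, count in counts.items():
        if word in vocabulary:
            vector[vocabulary[word]] = count
        else:
            # No column exists for this word: it is silently lost.
            unknown.append(word)
    return vector, unknown


# Hypothetical toy training corpus (an assumption, not taken from the article).
training_corpus = ["the cat sat on the mat", "the dog sat on the log"]
vocabulary = build_vocabulary(training_corpus)

# Every word was seen during training: nothing is lost.
seen_vector, seen_unknown = vectorize("the cat sat on the log", vocabulary)

# "bird" and "fence" never appeared in training: they vanish from the vector.
unseen_vector, unseen_unknown = vectorize("the bird sat on the fence", vocabulary)

print(vocabulary)
print(seen_vector, seen_unknown)
print(unseen_vector, unseen_unknown)
```

The second sentence keeps only the counts for "the", "sat" and "on", so everything that distinguishes it from the training sentences is gone before any model sees it.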