Understanding Naive Bayes in the real world

Gautham Santhosh

2 min read · Feb 6, 2020

A classifier is a machine learning model that is used to discriminate between different objects based on certain features.

Microsoft - Weed out fake marketing leads.

Microsoft's marketers and sellers want to connect with potential customers, but people fill out online forms with fake names, gibberish, or even profanity.

Spam detection

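The same word can occur in spam and in legitimate messages, so a simple detector that only looks for certain words will perform badly.

But we can use the number of occurrences of each word in spam and in non-spam messages.

From these counts we calculate the probability of a word appearing in spam.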
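The counting step can be sketched in a few lines of Python. This is only an illustration: the messages are made up, and the helper names `word_counts` and `word_probability` are hypothetical, not taken from any library.

```python
from collections import Counter

# Hypothetical toy messages, made up for illustration only.
spam_messages = ["win money now", "free money offer"]
ham_messages = ["meeting at noon", "send the money report now"]


def word_counts(messages):
    """Count how often each word occurs across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(message.lower().split())
    return counts


def word_probability(word, counts):
    """Fraction of all word occurrences in a class that are `word`."""
    total = sum(counts.values())
    return counts[word] / total


spam_counts = word_counts(spam_messages)
ham_counts = word_counts(ham_messages)

# "money" shows up in both classes, but it makes up a larger share of spam.
p_money_spam = word_probability("money", spam_counts)
p_money_ham = word_probability("money", ham_counts)
print(p_money_spam, p_money_ham)
```

The word "money" appears in both kinds of message, which is why its presence alone is not enough; what matters is how much more likely it is in one class than in the other.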

Categorizing things like this is a very common problem in the real world.

Algorithm

1. Calculate the probability of each word belonging to a type
2. Calculate the probability of each word not belonging to that type
3. Multiply the probabilities for each case; the case with the largest value is the answer

As in this image, which is used to say whether a word is related to sports or not:

1 is added to every count so that no probability is zero. Otherwise a single unseen word would turn the whole product into zero.
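Put together, the steps and the add-one trick might look like the sketch below. The training sentences are invented, the function names are hypothetical, and the class prior is left out because both classes have the same number of examples here; a fuller implementation would multiply it in as well.

```python
from collections import Counter

# Hypothetical training sentences, invented for this sketch.
training = [
    ("great game tonight", "sports"),
    ("the match was close", "sports"),
    ("election results tonight", "not sports"),
    ("the vote was close", "not sports"),
]


def train(examples):
    """Count words per label and collect the overall vocabulary."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    vocabulary = {word for label_counts in counts.values() for word in label_counts}
    return counts, vocabulary


def score(text, label, counts, vocabulary):
    """Multiply the smoothed probability of every word in `text` for one label."""
    label_counts = counts[label]
    total = sum(label_counts.values())
    result = 1.0
    for word in text.lower().split():
        # Add 1 to the count so an unseen word never zeroes the product.
        result *= (label_counts[word] + 1) / (total + len(vocabulary))
    return result


def classify(text, counts, vocabulary):
    """Return the label with the largest score."""
    return max(counts, key=lambda label: score(text, label, counts, vocabulary))


counts, vocabulary = train(training)
prediction = classify("a close game", counts, vocabulary)
print(prediction)
```

With this toy data, "game" never occurs under "not sports", yet its smoothed probability stays above zero, so the comparison between the two labels still works.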

Weather forecast

Calculating the probability of a day having a particular kind of weather is one of the use cases of Naive Bayes.
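As a rough sketch of the idea, the same counting approach works on categorical observations. The features (sky, humidity), the records, and the function `weather_probabilities` are all assumptions made up for this example and are not taken from the referenced paper.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (sky, humidity) and the weather that followed.
observations = [
    (("cloudy", "high"), "rain"),
    (("cloudy", "high"), "rain"),
    (("cloudy", "normal"), "dry"),
    (("clear", "high"), "dry"),
    (("clear", "normal"), "dry"),
    (("clear", "normal"), "dry"),
]


def weather_probabilities(features, data):
    """Return a normalized probability for each weather type given the features."""
    class_counts = Counter(label for _, label in data)
    # feature_counts[label][position][value] = number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)
    for row, label in data:
        for position, value in enumerate(row):
            feature_counts[label][position][value] += 1
            values[position].add(value)

    scores = {}
    for label, n in class_counts.items():
        score = n / len(data)  # how common this weather is overall
        for position, value in enumerate(features):
            seen = feature_counts[label][position][value]
            # Add-one smoothing over the possible values of this feature.
            score *= (seen + 1) / (n + len(values[position]))
        scores[label] = score

    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


forecast = weather_probabilities(("cloudy", "high"), observations)
print(forecast)
```

Here the scores are normalized so they sum to 1, which makes them easier to read as "chance of rain" versus "chance of a dry day".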

Simple Python code

There is no point in reinventing the wheel at the moment. You can implement a basic classifier in 3 or 4 lines of Python using sklearn.
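A minimal sketch with scikit-learn could look like this. The messages and labels are made up for illustration; `CountVectorizer` and `MultinomialNB` are the scikit-learn classes for word counting and multinomial Naive Bayes, and this assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data, made up for this example.
texts = [
    "win free money now",
    "free prize offer",
    "meeting at noon",
    "project report attached",
]
labels = ["spam", "spam", "ham", "ham"]

# Turn each message into a vector of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# The default alpha=1.0 corresponds to adding 1 to every count.
model = MultinomialNB()
model.fit(X, labels)

prediction = model.predict(vectorizer.transform(["free money offer"]))[0]
print(prediction)
```

The actual model is the three lines that create the vectorizer, fit the classifier, and call `predict`; everything else is toy data.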

Thanks for reading and thank you Ananya Agrawal for helping out.

You can view the thread on Twitter

References

https://scikit-learn.org/stable/modules/naive_bayes.html

https://www.researchgate.net/publication/290685616_Weather_Forecasting_Using_Naive_Bayesian

https://monkeylearn.com/blog/practical-explanation-naive-bayes-classifier/

Microsoft IT Showcase

Discover the inside story of how Microsoft does IT. IT Showcase shares the blueprint of Microsoft's reinvention…

www.microsoft.com

I am working on a customer feedback tracker. Visit https://www.featuremonkey.com/, a great alternative to canny, hellonext, and uservoice that can be used for feature request tracking, internal feedback, public roadmaps, etc.