# Sentiment analysis

## February 06, 2019

### Summary

Maybe you want to see how people are responding to a story or blog of yours – do they love it or hate it; does it make them excited? Maybe you want to detect the tone of people mentioning your company or product. Or perhaps you want to keep tabs on the mood of stocks in order to make money. We can use sentiment analysis for all of these things.

In this multi-part series, we will look at different methods of sentiment and emotion analysis in both Python and R. We will compare performance on a standard dataset, and also scrape our own live tweets for analysis. Finally, we will check performance on stock-related text snippets from news headlines and stocktwits.

Currently, if you Google ‘Python sentiment analysis package’, the top results include textblob and NLTK. Both offer Naive Bayes models, which are pretty weak for this task (textblob’s default analyzer is a rule-based lookup dictionary instead). Another option is the VADER lookup dictionary, which has a preset score for a number of words.

We can also train our own neural network and add our own custom data, which makes the result more valuable. Sure, you can also train your own Naive Bayes model (which I’ll show you how to do), but for reasons we’ll get into, it doesn’t work super well.

### Classic models

Let’s begin with the simplest model: a lookup dictionary. You take the words in your text and check whether each one is in the dictionary. If it is, you return the sentiment score stored for it. Here is an example of using the textblob lookup dictionary:

This will output:

We can see it has polarity (sentiment, where +1 is most positive and -1 most negative), subjectivity (0 is completely objective, i.e. factual; 1 is completely subjective, i.e. an opinion), and assessments. The ‘assessments’ are the chunks textblob uses to score the sentence. The first phrase has a modifier, ‘not’, which changes the score of the word ‘good’: ‘not’ flips the sign of the polarity and also multiplies it by 0.5, so the dictionary score of 0.7 for ‘good’ becomes -0.35 for ‘not good’. The last item in each assessment, which is None for ‘not good’ and ‘mood’ for the smiley, is the ‘semantic label’. For emoticons, textblob sets this label to ‘mood’; an exclamation mark in parentheses, like (!), is labeled as irony (which makes that assessment completely subjective).
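To make the negation rule concrete, here is a minimal pure-Python sketch of a lookup dictionary with a ‘not’ modifier. It is not textblob’s actual code: the tiny `LEXICON`, the `NEGATIONS` set, and the function names are all made up for illustration, and only the 0.7 score for ‘good’ and the flip-and-halve rule come from the description above.

```python
# Hypothetical mini-lexicon (word -> polarity). Only the 0.7 for "good"
# comes from the text above; the other entries are invented.
LEXICON = {"good": 0.7, "great": 0.8, "bad": -0.7}
NEGATIONS = {"not", "never"}   # assumed list of negating modifiers
NEGATION_FACTOR = -0.5         # flip the sign and halve the strength


def assess(text):
    """Return a list of (chunk, polarity) assessments for the text."""
    assessments = []
    negated = False
    for word in text.lower().split():
        if word in NEGATIONS:
            negated = True  # remember the modifier for the next word
        elif word in LEXICON:
            score = LEXICON[word]
            chunk = [word]
            if negated:
                score *= NEGATION_FACTOR
                chunk = ["not", word]
            assessments.append((chunk, score))
            negated = False
        else:
            negated = False  # an unknown word breaks the negation (a simplification)
    return assessments


def polarity(text):
    """Average the polarity of all assessed chunks; 0.0 if nothing matched."""
    scores = [score for _, score in assess(text)]
    return sum(scores) / len(scores) if scores else 0.0


print(assess("this is not good"))
print(polarity("this is not good"))  # -0.35
```

Averaging the chunk scores is also an assumption of this sketch; it just keeps the overall polarity inside the -1 to +1 range.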

As you can tell, the default sentiment analysis in textblob is very rule-based.

### Next step up: Naive Bayes

The next step up is a Naive Bayes model. Its prediction follows the equation:

$$\hat{y}_c \propto P(y_c) \prod_{i} P(x_i \vert y_c)$$

where $\hat{y}_c$ is the predicted probability of the class $c$ (up to a normalizing constant), $P(y_c)$ is the probability of class $c$ in the training dataset, and $P(x_i \vert y_c)$ is the probability of seeing a word $x_i$ given the class $c$. The big Pi symbol ($\prod$) means we multiply all these $P(x_i \vert y_c)$ values together for all the words $x_i$ in the text we are classifying.
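Here is a small sketch of that calculation on a toy dataset, just to show the moving parts. The training sentences, the function names, and the add-one (Laplace) smoothing are my assumptions for illustration; smoothing is there so a word never seen in a class doesn’t zero out the whole product.

```python
from collections import Counter

# Hypothetical toy training set, invented for illustration.
TRAIN = [
    ("good great film", "pos"),
    ("great fun", "pos"),
    ("bad boring film", "neg"),
]


def train(examples):
    """Count documents per class and words per class."""
    class_counts = Counter(label for _, label in examples)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in examples:
        word_counts[label].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab


def class_scores(text, class_counts, word_counts, vocab):
    """Return P(y_c) * prod_i P(x_i | y_c) for every class c."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        score = n_docs / total_docs  # the prior, P(y_c)
        n_words = sum(word_counts[label].values())
        for word in text.split():
            # P(x_i | y_c) with add-one smoothing (an assumption of this sketch)
            score *= (word_counts[label][word] + 1) / (n_words + len(vocab))
        scores[label] = score
    return scores


class_counts, word_counts, vocab = train(TRAIN)
scores = class_scores("great film", class_counts, word_counts, vocab)
print(scores)
```

The scores are not normalized, so they won’t sum to 1; dividing each by the sum of all scores turns them into probabilities.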

Another way of writing this (identical to the sklearn explanation) is:

$$\hat{y} = \arg\max_{c} P(y_c) \prod_{i} P(x_i \vert y_c)$$

where we take the class with the largest probability and use that as our class prediction. One caveat: if we have two classes, we can set the decision threshold anywhere between 0 and 1, meaning our predicted class won’t always be the one with the max value. This is related to ROC/AUC.
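The difference between taking the max and using a threshold is easy to see in a couple of lines. This is only a sketch; the function names and the example probabilities are made up.

```python
def predict_argmax(probs):
    """Pick the class with the largest predicted probability."""
    return max(probs, key=probs.get)


def predict_with_threshold(p_positive, threshold=0.5):
    """Binary case: predict 'pos' only if P(pos) reaches the threshold."""
    return "pos" if p_positive >= threshold else "neg"


probs = {"pos": 0.6, "neg": 0.4}  # hypothetical model output

print(predict_argmax(probs))                               # pos
print(predict_with_threshold(probs["pos"]))                # pos
print(predict_with_threshold(probs["pos"], threshold=0.7)) # neg
```

With the default threshold of 0.5 the two rules agree for two classes; raising the threshold to 0.7 makes the prediction ‘neg’ even though ‘pos’ has the larger probability.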

#### Small detail: the textblob sentiment dictionary

Here is the sentiment dictionary used by the textblob library. Textblob adds a bit of complexity with ‘assessments’, which are words with modifiers like ‘not’. I’m not sure where this is in the docs exactly, but the source code talks about it here:

#### Small detail: multiple entries in lookup dictionary

Scores for words with multiple entries in the dictionary are averaged. This can be verified by checking out the subjectivity of the word accurate, which has 3 versions in the lookup dictionary. The subjectivity score is 0.63333 (the average of 0.5, 0.6, and 0.8 – the 3 values for ‘accurate’ in the lookup dictionary). Here is example code you can use to verify this (I ran it in an IPython shell):
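As a quick check of the arithmetic, here is a minimal sketch that averages the three subjectivity values quoted above. The `ENTRIES` dict and `averaged_score` function are hypothetical stand-ins for the real lookup dictionary, not textblob code.

```python
from statistics import mean

# The three subjectivity values for "accurate" quoted above,
# stored in a hypothetical stand-in for the lookup dictionary.
ENTRIES = {"accurate": [0.5, 0.6, 0.8]}


def averaged_score(word, entries):
    """Average all dictionary entries for a word."""
    return mean(entries[word])


print(round(averaged_score("accurate", ENTRIES), 5))  # 0.63333
```

If you have textblob installed, `TextBlob('accurate').sentiment.subjectivity` should give you the library’s own value to compare against.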
