Sentiment Analysis, a Quick Review of Essentials

This is the first in a series of articles on evaluating sentiment scores and applying them in useful applications. Upcoming articles will show some applications of the topics introduced here. Today's topic is an introduction to Sentiment Analysis. This is not a basics session but a review session.

Sentiment Analysis can be defined in several ways, and the limits of its use have not been set. Broadly, the following points hold for sentiment analysis:

1. The sentiment itself can be processed in several ways.

2. The sentiment or opinion orientation can be found as a category.

3. The positivity and negativity of a text can be measured as a numerical score.

4. Neutrality can also be captured, either as a tag or as a measured degree with its emphasis.

5. The text under consideration can be as short as a tweet or as long as a full story.

6. Each sentiment is assigned a value, which itself can be composed in various ways.

7. A sentiment lexicon supports the unsupervised approach, but the lexicon itself can be constructed in either an unsupervised or a supervised way.

8. Sentiments can be used in other computations: in NLP, with image data, and in other text analytics roles.

9. Sentiment methods, once learned, come in handy in several applications.

10. A sentiment classifier, once trained, can give the sentiment orientation of textual components quickly, whereas looking words up in a sentiment lexicon can be clumsy and hence costly in computation time.

11. Keyword reduction can speed up sentiment evaluation.

Popular Toolkits in Python for Sentiment Analysis

NLTK ships a pretrained sentiment toolkit named Valence Aware Dictionary and sEntiment Reasoner (VADER). The VADER module provides the class SentimentIntensityAnalyzer, which bundles the sentiment-related objects and functions. Its method polarity_scores measures the sentiment scores of the text fragment sent to it. The output is a dictionary of key-value pairs; here is a sample output:

{'neg': 0.0, 'neu': 0.494, 'pos': 0.506, 'compound': 0.6249}

The output was for the string “This is an awesome story.”

The sample output above contains the following values:

  1. neg – the share of the text fragment that is negative
  2. pos – the share of the text that is positive
  3. neu – the share of the text that is neutral
  4. compound – the overall sentiment of the text, normalized to lie between -1 (most negative) and +1 (most positive)
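A call that produces output of this kind might look like the sketch below. It assumes NLTK is installed and that the `vader_lexicon` resource can be downloaded. The helper names `vader_scores` and `label_from_compound` are illustrative, and the 0.05 cut-off is the threshold commonly suggested for VADER's compound score, not something fixed by this article.

```python
def vader_scores(text):
    """Return VADER's neg/neu/pos/compound scores for a text fragment."""
    # Imported lazily so the labelling helper below works without NLTK.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
    return SentimentIntensityAnalyzer().polarity_scores(text)


def label_from_compound(compound, threshold=0.05):
    """Map a compound score to a category (the threshold is an assumption)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Example usage (requires NLTK and the VADER lexicon):
# scores = vader_scores("This is an awesome story.")
# print(scores, label_from_compound(scores["compound"]))
```

The second helper turns the numerical score into a sentiment category, which is the orientation-as-category view from the list of essentials.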

But how are these values obtained? VADER was built with the help of Amazon Mechanical Turk raters, who supplied the sentiment value for each word in its dictionary.
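To make the arithmetic concrete, here is a simplified sketch of how per-word values could combine into the four numbers of the sample output. It assumes the lexicon value of "awesome" is 3.1 (a value consistent with the sample output, not taken from this article) and that the other words are absent from the lexicon. It ignores VADER's extra heuristics for negation, capitalization, punctuation and intensifiers, and the function name `combine` is illustrative.

```python
import math

ALPHA = 15  # normalization constant used for the compound score


def combine(valences, alpha=ALPHA):
    """valences: one lexicon value per word, 0 for words not in the lexicon."""
    total_valence = sum(valences)
    # Squash the unbounded sum into the range (-1, 1).
    compound = total_valence / math.sqrt(total_valence ** 2 + alpha)

    # Each sentiment-bearing word counts for its valence plus one unit,
    # each neutral word counts for one unit.
    pos = sum(v + 1 for v in valences if v > 0)
    neg = sum(v - 1 for v in valences if v < 0)
    neu = sum(1 for v in valences if v == 0)
    total = pos + abs(neg) + neu

    return {
        "neg": round(abs(neg) / total, 3),
        "neu": round(neu / total, 3),
        "pos": round(pos / total, 3),
        "compound": round(compound, 4),
    }


# "This is an awesome story." -> this, is, an, awesome, story
# Assumed valences: only "awesome" carries sentiment.
result = combine([0, 0, 0, 3.1, 0])
print(result)
```

Under these assumptions the sketch gives neu = 4/8.1, pos = 4.1/8.1 and compound = 3.1/sqrt(3.1^2 + 15), which round to the values in the sample output.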

Sentiment Analysis – Unsupervised

The technique of obtaining sentiment attribute values for words in a given application, without providing any training to the algorithm, is referred to as the unsupervised technique of sentiment assignment. The method requires human tagging and judgement, and can be biased depending on the human evaluators involved. These methods typically amount to generating a sentiment lexicon; such lexicons can be bulky, and looking words up in them can take time.
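A lexicon-based approach can be sketched in a few lines. The lexicon below is a toy one with made-up values, standing in for a human-tagged lexicon, and the names `TOY_LEXICON`, `lexicon_score` and `lexicon_label` are hypothetical.

```python
import re

# Hypothetical word values; a real lexicon would hold thousands of entries.
TOY_LEXICON = {"awesome": 3.0, "good": 2.0, "bad": -2.5, "terrible": -2.0}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon values of the words in the text (unknown words add 0)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(tok, 0.0) for tok in tokens)


def lexicon_label(text, lexicon=TOY_LEXICON):
    """Turn the summed score into a sentiment category by its sign."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_label("This is an awesome story."))
```

No training step is involved here: all the knowledge sits in the lexicon, so the quality of the result depends on how the lexicon was tagged.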

Sentiment Analysis – Supervised

Supervised techniques work on a different footing altogether. Sentiment analysis with a supervised algorithm requires the following:

  1. Supervised algorithm - an algorithm that can learn the classification task.
  2. The classification task - is it a two-class classification into good or bad sentiment; a three-class problem with positive, negative and neutral sentiments; or a classification of an aggregate measure of sentiment that takes the text as a whole, along with scores?
  3. The training corpus - a set of pre-labelled data used to train and build the supervised model.
  4. The testing corpus - a set of tagged data that is held out of training and used to compute the accuracy of the trained model during evaluation.

Finally, evaluation can be performed on any new text fragment. The sentiment values produced by the algorithm are subject to the classification accuracy, say p%, measured on the test data.

The classification algorithm can be a supervised algorithm such as Naive Bayes or an SVM. Additionally, as the data is often sparse and high-dimensional, dimensionality reduction is often applied as well.
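The supervised workflow can be sketched with a small multinomial Naive Bayes classifier written in plain Python. The training and testing corpora below are tiny made-up examples, the function names are illustrative, and dimensionality reduction is left out to keep the sketch short.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(labelled_texts):
    """labelled_texts: list of (text, label) pairs. Returns the learned counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled_texts:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"class_counts": class_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Pick the label with the highest log posterior (add-one smoothing)."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_logp = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        logp = math.log(n_docs / total_docs)
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore words never seen in training
            logp += math.log((counts[tok] + 1) / (total_words + vocab_size))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


def accuracy(model, labelled_texts):
    correct = sum(predict(model, text) == label for text, label in labelled_texts)
    return correct / len(labelled_texts)


# Toy corpora (hypothetical data, for illustration only).
TRAIN = [
    ("an awesome and lovely story", "pos"),
    ("great plot and good acting", "pos"),
    ("a boring and terrible story", "neg"),
    ("bad plot and awful acting", "neg"),
]
TEST = [("good and lovely plot", "pos"), ("awful and boring", "neg")]

model = train_naive_bayes(TRAIN)
print(accuracy(model, TEST))
```

The accuracy computed on the held-out test pairs plays the role of the p% figure: it indicates how far predictions on new text fragments can be trusted.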

Sentiment Analysis – Semi-Supervised

Sentiment analysis can be performed with a semi-supervised technique as well. Here, part of the textual data is tagged and is used for supervised training, while the rest is used to measure the classification accuracy and to help extend the classifier's abilities.
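One common way to realize this idea is self-training, sketched below. This is only one semi-supervised technique and not necessarily the one meant above. The classifier is a deliberately simple word-vote scorer, and the data, function names and the `min_confidence` parameter are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def fit(labelled):
    """Word score = times seen in 'pos' texts minus times seen in 'neg' texts."""
    scores = Counter()
    for text, label in labelled:
        for tok in tokenize(text):
            scores[tok] += 1 if label == "pos" else -1
    return scores


def predict(scores, text):
    """Return (label, confidence); confidence is the absolute summed score."""
    total = sum(scores[tok] for tok in tokenize(text))
    return ("pos" if total >= 0 else "neg"), abs(total)


def self_train(labelled, unlabelled, min_confidence=1, max_rounds=5):
    """Repeatedly add confidently predicted unlabelled texts to the training set."""
    labelled, pool = list(labelled), list(unlabelled)
    for _ in range(max_rounds):
        scores = fit(labelled)
        confident, remaining = [], []
        for text in pool:
            label, conf = predict(scores, text)
            (confident if conf >= min_confidence else remaining).append((text, label))
        if not confident:
            break  # nothing new to learn from
        labelled += confident
        pool = [text for text, _ in remaining]
    return fit(labelled)


# Toy data (hypothetical): two tagged texts and two untagged ones.
LABELLED = [("awesome story", "pos"), ("terrible story", "neg")]
UNLABELLED = ["awesome and lovely plot", "terrible and boring plot"]

model = self_train(LABELLED, UNLABELLED)
print(predict(model, "boring film"))
```

Trained on the two tagged texts alone, the scorer has never seen "boring" and has zero confidence on "boring film"; after self-training it has picked the word up from the untagged text and labels the fragment as negative.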

These were the points for a short review of sentiment analysis. They shall come in handy in articles on this and related topics.

Published by Nidhika

