Abstract: Sentiment analysis has been a topic of much research and interest. In this article, the computational steps are worked out to show how we can have multiple centres in the sentiment classification task. This does not displace existing work but opens more avenues for research applications. We conceptualise a 5-class classification task for sentiments, what it means to look beyond three-dimensional spaces, and what additional insight this can offer. Until real applications mature, this can be taken as an ML exercise, both practical and theoretical. A major application is that new words can be assigned a sentiment without a supervised approach, using only pretrained language models.
1. Introduction
Unsupervised sentiment assignment can be thought of as an n-class classification task, where the classification itself may be supervised or unsupervised in nature. Until now, the value of n has been restricted to 2 or 3. In this article, however, we present the case in which a problem with more than two or three classes can be formulated from the given resources without the need for extra manual tagging, which in the three-class setting has required a huge amount of manual work. This does not mean that current approaches are not required; rather, it points to an additional area of research that, if analysed well, can pay dividends and lead to applications worth solving with the proposed architecture.
2. Unsupervised Sentiment Assignments
The algorithm proceeds as follows:
• 1. Input the text file in which the sentiments of some text are to be measured in its relative contexts. It can contain one word, one sentence, a page, or a whole book of text data.
• 2. Define some sentiment centres. Typically, sentiment analysis tasks define positive, negative, and neutral classes; the same three classes have been used in both supervised and unsupervised classification.
• 3. Here, as many sentiment epicentres can be created as an application requires; the current article lays emphasis on a 5-class classification.
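As a concrete illustration, the three steps above can be sketched in Python. Everything below is a minimal toy sketch: the two-dimensional vectors are invented for illustration, whereas a real run would use pretrained 300-dimensional embeddings such as GoogleNews-vectors-negative300.

```python
from math import sqrt

# Toy 2-d embeddings; a real run would use pretrained 300-d word vectors.
VEC = {
    "nice": [0.9, 0.3], "good": [0.8, 0.4], "happy": [0.6, 0.7],
    "bad": [-0.8, 0.3], "sad": [-0.6, 0.6], "neutral": [0.0, 1.0],
}
EPICENTRES = ["happy", "good", "neutral", "bad", "sad"]  # step 2: five sentiment centres

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def classify(word):
    """Step 3: compare the input word to every epicentre and pick the closest."""
    sims = {c: cosine(VEC[word], VEC[c]) for c in EPICENTRES}
    return max(sims, key=sims.get)

print(classify("nice"))  # assigned to the nearest of the five epicentres
```

With these toy vectors, "nice" lands nearest the "good" epicentre; the exact outcome depends entirely on the embeddings used.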
2.1. Typical sentiment analysis task – 3 classes
• The typical sentiment analysis task is a three-class classification problem, which can be either supervised or unsupervised.
The three classes in the typical sentiment analysis task are:
• 1. Positive Class
• 2. Negative Class
• 3. Neutral Class
2.2. Non-trivial sentiment alignments – five class sentiment orientations
The proposed sentiment analysis task is non-trivial. Its five classes, each paired with an example seed word, are:
• 1. Extreme Positive Class, happy
• 2. Positive Class, good
• 3. Negative Class, bad
• 4. Extreme Negative Class, sad
• 5. Neutral Class, neutral
These may not be the best five classes to assign, but they are a good way to start research in this area as an AI/ML exercise.
2.3. Example
• Consider the word “nice” to be assigned to one of the 5 spatially distributed classes. The input can also be a text fragment.
• Let us start with the following 5 words as centres of the distributions:
• 1. happy
• 2. good
• 3. neutral
• 4. bad
• 5. sad
• Let us call them epicentres, for now, for the sentiment classification task.
2.4. Choose the epicentres of the sentiment classification task – now a 5-class problem
• 1. happy
• 2. good
• 3. neutral
• 4. bad
• 5. sad
2.5. Inputs used
• A pretrained language model
• Here, the model used was the pretrained GoogleNews-vectors-negative300 word embeddings
3. Define the input to be classified: here, for example, it is the word “nice”
• The words most similar to “nice” are:
• Good, lovely, neat, fantastic, wonderful, terrific, great, awesome, nicer, decent
• These words can be found with Python code
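A minimal sketch of how such neighbour lists can be retrieved is given below. The toy two-dimensional vectors are invented for illustration; in practice, the neighbours above come from the pretrained GoogleNews-vectors-negative300 embeddings (for example, via gensim's `KeyedVectors.most_similar`).

```python
from math import sqrt

# Toy 2-d embeddings standing in for GoogleNews-vectors-negative300;
# the real 300-d vectors are loaded with gensim, e.g.
# KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)
VECTORS = {
    "nice":  [0.9, 0.3],
    "good":  [0.8, 0.4],
    "great": [0.85, 0.35],
    "bad":   [-0.8, 0.3],
    "awful": [-0.9, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def most_similar(word, topn=3):
    """Return the topn words ranked by cosine similarity to `word`."""
    scores = [(other, cosine(VECTORS[word], vec))
              for other, vec in VECTORS.items() if other != word]
    return sorted(scores, key=lambda p: p[1], reverse=True)[:topn]

print(most_similar("nice"))
```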
3.1. Most similar words near the positive epicentres
• The words most similar to “good” are:
• great, decent, nice, excellent, fantastic, better, solid, lousy
• The words most similar to “happy” are:
• glad, pleased, ecstatic, overjoyed, thrilled, satisfied, proud, delighted, excited
• These words can be found with Python code
3.2. Most similar words near the negative epicentres
• The words most similar to “bad” are:
• terrible, horrible, Bad, lousy, crummy, horrid, awful, dreadful, horrendous
• The words most similar to “sad” are:
• saddening, Sad, saddened, heartbreaking, disheartening, saddens_me, distressing, reminders_bobbing
• These words can be found with Python code
3.3. Epicentres to clusters
Based on these most similar words, we form a cluster around each epicentre from its best words.
Changing the underlying pretrained model will change the output.
3.4. Compute similarity of the input word to the “good” cluster
- Similarity of “nice” to the cluster centred on “good”, which contains the best words around the word “good”.
- The algorithm outputs a similarity score of 1.0, the maximum of all the values below (since “nice” itself belongs to this cluster):
- great 0.64546573
- terrific 0.6552368
- decent 0.5993332
- nice 1.0
- excellent 0.47978145
- fantastic 0.6569241
- better 0.38781166
- solid 0.42754313
- lousy 0.3887929
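The per-cluster score illustrated above, i.e. the maximum similarity of the input word to any member of the cluster, can be sketched as follows. The cluster members and toy vectors are illustrative stand-ins, not the real GoogleNews neighbours.

```python
from math import sqrt

# Toy vectors standing in for pretrained embeddings.
VEC = {
    "nice": [0.9, 0.3], "great": [0.85, 0.35], "decent": [0.7, 0.5],
    "glad": [0.6, 0.7], "pleased": [0.55, 0.75],
}

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def cluster_score(word, cluster):
    """Similarity of `word` to a cluster = max similarity to any cluster member."""
    return max(cosine(VEC[word], VEC[m]) for m in cluster)

good_cluster = ["great", "decent", "nice"]       # "nice" itself appears here
print(round(cluster_score("nice", good_cluster), 4))  # ≈ 1.0, as in the listing above
```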
3.5. Compute similarity of the input word to the “happy” cluster
- Similarity of “nice” to the cluster centred on “happy”, which contains the best words around the word “happy”.
- The algorithm outputs a similarity score of 0.5067, the maximum of all the values below:
- glad 0.5067967
- pleased 0.31258228
- ecstatic 0.3099612
- overjoyed 0.2611948
- thrilled 0.32813224
- satisfied 0.21507472
- proud 0.34908974
- delighted 0.27841687
- excited 0.35698736
3.6. Assign the input word to an epicentre
- Similarly, assignments to the other epicentres are computed.
- The word is then assigned a sentiment score depending on its similarity to each sentiment class and on how many sentiment classes there are. The word “nice” is here assigned to the class of “good”.
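Putting the pieces together, the final assignment step can be sketched as below. The clusters and toy vectors are invented stand-ins; a real run would build the clusters from the most-similar lists of the five epicentre words in the pretrained embeddings.

```python
from math import sqrt

# Toy vectors; real runs would use the pretrained GoogleNews embeddings.
VEC = {
    "nice": [0.9, 0.3],
    "great": [0.85, 0.35], "decent": [0.7, 0.5],              # around "good"
    "glad": [0.6, 0.7], "pleased": [0.55, 0.75],              # around "happy"
    "terrible": [-0.85, 0.3], "lousy": [-0.7, 0.45],          # around "bad"
    "saddened": [-0.6, 0.65], "heartbreaking": [-0.55, 0.7],  # around "sad"
    "bland": [0.0, 1.0],                                      # around "neutral"
}

CLUSTERS = {
    "good": ["great", "decent", "nice"],
    "happy": ["glad", "pleased"],
    "bad": ["terrible", "lousy"],
    "sad": ["saddened", "heartbreaking"],
    "neutral": ["bland"],
}

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def assign(word):
    """Assign `word` to the epicentre whose cluster contains its best match."""
    scores = {c: max(cosine(VEC[word], VEC[m]) for m in members)
              for c, members in CLUSTERS.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

label, score = assign("nice")
print(label, round(score, 4))  # "nice" falls into the "good" cluster here
```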
4. Conclusion and Future Work
These exercises are a good way to make progress in this field. There is much future work that will appear gradually in these technical notes; for now, take it as an AI/ML exercise. A major application is that new words can be assigned a sentiment without a supervised approach, using only pretrained language models.