Sentiment Analysis with Large Language Models as a Multi-Class Classification Problem

 

Abstract: Sentiment analysis has been a topic of much research and interest. In this article, the computational steps needed to use multiple centres in a sentiment classification task are shown and worked out. This does not replace existing work; rather, it opens further avenues for research and applications. We conceptualise a five-class classification task for sentiments: what it means to look beyond the usual three classes, and what additional insight that can give. It can be taken as a practical and theoretical ML exercise until real applications mature. A major application is that new words can be assigned a sentiment without a supervised approach, using only pretrained Large Language Models.

 

1. Introduction

Unsupervised sentiment assignment can be thought of as an n-class classification task, where the classification itself can be supervised or unsupervised in nature. Until now, the value of n has typically been restricted to 2 or 3. In this article, we present the case in which a problem with more than two or three classes can be formulated from the given resources without extra manual tagging, which for the three-class setting has required a huge amount of manual work. This is not to say that current approaches are not required; it simply points out another area of research that can pay dividends if analysed well and can lead to applications worth solving with the proposed architecture.

 

2. Unsupervised Sentiment Assignments

The algorithm proceeds as follows (a minimal sketch in Python is given after this list):

• 1. Input the text whose sentiment is to be measured in its relative context. It can be a single word, one sentence, a page, or a whole book of text data.

• 2. Define the sentiment centres. In typical sentiment analysis tasks, positive, negative, and neutral classes are defined, and the same three classes have been used in both supervised and unsupervised classification.

• 3. Here, as many sentiment epicentres can be created as an application requires; this article lays emphasis on a five-class classification.
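As a minimal sketch of these steps, assuming the gensim library and the pretrained GoogleNews-vectors-negative300 vectors (the file path below is a hypothetical local copy), each epicentre can first be represented by its anchor word alone; the cluster-based refinement of Section 3 comes later:

```python
from gensim.models import KeyedVectors

# Hypothetical local path to the pretrained GoogleNews vectors.
kv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

# The five sentiment epicentres proposed in this article.
epicentres = ["happy", "good", "neutral", "bad", "sad"]

def assign_sentiment(word):
    """Assign `word` to the epicentre whose anchor word it is most similar to."""
    scores = {centre: kv.similarity(word, centre) for centre in epicentres}
    return max(scores, key=scores.get), scores

label, scores = assign_sentiment("nice")
print(label, scores)
```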

2.1. Typical sentiment analysis task: 3 classes

• The typical sentiment analysis task is a three-class classification problem; it can be either supervised or unsupervised.

The three classes in the typical sentiment analysis task are:

• 1. Positive Class

• 2. Negative Class

• 3. Neutral Class

2.2. Non-trivial sentiment alignments

The proposed sentiment analysis task is non-trivial; its classes are given as follows:

• 1. Extreme Positive Class

• 2. Positive Class

• 3. Negative Class

• 4. Extreme Negative Class

• 5. Neutral

2.3. Five-class sentiment orientations

The proposed sentiment classes, with an anchor word for each, are:

• 1. Extreme Positive Class (anchor word: happy)

• 2. Positive Class (anchor word: good)

• 3. Negative Class (anchor word: bad)

• 4. Extreme Negative Class (anchor word: sad)

• 5. Neutral Class (anchor word: neutral)

These may not be the best five classes to assign, but they are a good way to start research in this area and to frame it as an AI/ML exercise.

2.4. Example

• Consider the word “nice”, to be assigned to one of the 5 spatially distributed classes. The input could equally be a text fragment.

• Let us start with the following 5 words as centers of the distributions:

• 1. happy

• 2. good

• 3. neutral

• 4. bad

• 5. sad

• Let us call them epicentres, for now, of the sentiment classification task.

2.5. Choose the epicentres of the sentiment classification task, now a 5-class problem

• 1. happy

• 2. good

• 3. neutral

• 4. bad

• 5. sad

2.6. Inputs used

• A pretrained LLM/embedding model.

• Here, the pretrained GoogleNews-vectors-negative300 vectors were used; one way to load them is sketched below.
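A loading sketch, assuming the gensim downloader name for these vectors (a locally downloaded GoogleNews-vectors-negative300.bin file loaded with KeyedVectors.load_word2vec_format works equally well):

```python
# Obtain the pretrained vectors via the gensim downloader.
import gensim.downloader as api

kv = api.load("word2vec-google-news-300")  # large download on first use
print(kv.vector_size)                      # 300-dimensional vectors
```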

3. Define the input to be classified: here, for example, the word “nice”

• The words similar to “nice” are:

• good, lovely, neat, fantastic, wonderful, terrific, great, awesome, nicer, decent

• These words can be found with Python code, for example as sketched below.
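A sketch of that lookup, assuming `kv` is the pretrained model loaded in Section 2.6:

```python
# Retrieve the words most similar to "nice" from the pretrained vectors.
for word, score in kv.most_similar("nice", topn=10):
    print(word, round(score, 4))
```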

3.1. Most similar words near the positive epicentres

• The words similar to “good” are:

• great, decent, nice, excellent, fantastic, better, solid, lousy

• The words similar to “happy” are:

• glad, pleased, ecstatic, overjoyed, thrilled, satisfied, proud, delighted, excited

• These words can be found with the same Python code.

3.2. Most similar words near the negative epicentres

• The words most similar to “bad” are:

• terrible, horrible, Bad, lousy, crummy, horrid, awful, dreadful, horrendous

• The words most similar to “sad” are:

• saddening, Sad, saddened, heartbreaking, disheartening, saddens_me, distressing, reminders_bobbing

• These words can be found with the same Python code.

3.3. Epicentres to clusters

Based on these most similar words, we form a cluster around each epicentre from its nearest neighbours (a sketch follows below).

Changing the underlying LLM model will change the output.
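A sketch of the cluster construction, where `kv` is the pretrained model from Section 2.6 and topn=9 is an illustrative choice matching the lists above:

```python
# Form a cluster of nearest neighbours around each epicentre.
epicentres = ["happy", "good", "neutral", "bad", "sad"]

clusters = {
    centre: [w for w, _ in kv.most_similar(centre, topn=9)]
    for centre in epicentres
}

for centre, members in clusters.items():
    print(centre, "->", members)
```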

3.4. Compute similarity of the input word to the “good” cluster

  • Similarity of “nice” to the cluster centred on “good”, which contains the best words around the word “good” (a scoring sketch follows this list).
  • The algorithm outputs 1.0 as the similarity score, the maximum of all the values below (the cluster around “good” contains “nice” itself):
  • great 0.64546573
  • terrific 0.6552368
  • decent 0.5993332
  • nice 1.0
  • excellent 0.47978145
  • fantastic 0.6569241
  • better 0.38781166
  • solid 0.42754313
  • lousy 0.3887929
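A sketch of this scoring rule, reusing `kv` and `clusters` from the earlier snippets:

```python
# Score an input word against one cluster: compute its similarity to every
# cluster member and keep the maximum.
def cluster_score(word, centre):
    return max(kv.similarity(word, m) for m in clusters[centre])

print(cluster_score("nice", "good"))   # 1.0, since "nice" lies in the "good" cluster
print(cluster_score("nice", "happy"))  # about 0.51, via "glad", as reported below
```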

3.5. Compute similarity of the input word to the “happy” cluster

  • Similarity of “nice” to the cluster centred on “happy”, which contains the best words around the word “happy”.
  • The algorithm outputs 0.5067 as the similarity score, the maximum of all the values below:
  • glad 0.5067967 
  • pleased 0.31258228 
  • ecstatic 0.3099612 
  • overjoyed 0.2611948 
  • thrilled 0.32813224 
  • satisfied 0.21507472 
  • proud 0.34908974 
  • delighted 0.27841687 
  • excited 0.35698736

3.6. Similarity to the remaining clusters and final assignment

  • Similarly, similarity scores to the other epicentres are computed.
  • The word is then assigned a sentiment class depending on its similarity to each sentiment class and on how many sentiment classes there are; the word “nice” is assigned here to the class of “good”. A sketch that puts these steps together is given below.
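A sketch of the final assignment, reusing `kv`, `clusters`, and `cluster_score` from the earlier snippets:

```python
# Score the input word against every epicentre cluster and assign it to the
# best-scoring sentiment class.
def classify(word):
    scores = {centre: cluster_score(word, centre) for centre in clusters}
    best = max(scores, key=scores.get)
    return best, scores

label, scores = classify("nice")
print(label)   # expected: "good"
print(scores)
```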

4. Conclusion and Future Work

These are exercises and a good way to progress in this field. There is much future work that will appear from time to time in these technical notes; for now, take this as an AI exercise. A major application is that new words can be assigned a sentiment without a supervised approach, using only pretrained Large Language Models.

