Sentiment Analysis of Twitter Text And Visualization Methods
Twitter has connected people to form rich online communities engaged in energetic, informative, and sometimes misleading discussions. Imagine a basketball arena full of people huddled in smaller subgroups. Each subgroup is engaged in passionate conversation, talking with each other and over each other. Some of the folks are holding calm conversations about their daily lives, others scream at each other in political protests, while in the right corner, next to the benches, a smaller group of folks leisurely recite poems. Conversations on Twitter are rich in information, emotion, and opinion, which makes tweets a superb subject for opinion mining and sentiment analysis.
Motivation:
The main goal for this blog is to perform a simple sentiment analysis to gauge the overarching sentiment (positive, negative, neutral) for a collection of tweets, and explore several visualization methods to gain additional insight from the analysis.
Here is what we will cover:
- Data: familiarization, cleanup, and helper functions.
- Sentiment Analysis: method and results review.
- User Mentions: collect all the users mentioned in tweets and assign an aggregate sentiment value.
a. Histogram Plot of top 50 mentions.
b. Wordcloud Visualization: build a wordcloud for users mentioned in tweets.
- Hashtags: collect all the hashtags used in tweets and assign an aggregate sentiment value.
a. Histogram Plot of top 50 hashtags.
b. Wordcloud Hashtags Visualization: build a wordcloud for hashtags used in tweets.
- Wordcloud Methodology: code for implementing custom colors and word frequency map.
- Relational Graph Visualization: build a relational graph for users mentioned in tweets.
- Final Discussion
Data
Data Review
Data was collected with the Twitter API and provided by one of my project classes. The topic was specifically focused on Twitter conversations about the Infrastructure Bill proposed by the Biden administration. Tweets were filtered around variations of hashtags similar to #InfrastructureBill. Keep this context in mind as we analyze the results.
Here are a few example tweets from the data set.
Data Cleanup
Each tweet was cleaned up to improve sentiment classification. Cleanup included the following procedures:
- Replace emojis (Unicode) with words. ex: “\U0001F60A” -> smiling face with smiling eyes
- Split joined words. ex: HappyDay -> happy day
- Make all words lowercase. ex: HAPPY -> happy
- Remove the hash sign from hashtags. ex: #happy -> happy
- Remove whitespace control characters such as \n and \r.
- Remove mentions. ex: @michael published a book -> published a book
- Remove websites.
- Remove numbers and punctuation.
- Remove non-English words.
- Remove single letters and empty spaces.
The following function was used to clean up tweets. It is a bit messy and long, but it did the job.
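The original function is not reproduced here, so below is a minimal sketch of what such a cleanup step could look like, using only the standard library. The names clean_tweet and replace_emojis are hypothetical, the order of the steps is my own choice (URLs and mentions are removed first so that later steps do not split them apart), and the "remove non-English words" step is left out because it needs an external word list.

```python
import re
import unicodedata


def replace_emojis(text: str) -> str:
    """Replace symbol characters with their Unicode names.

    Assumption: any character in Unicode category 'So' (Symbol, other)
    is treated as an emoji. This is a rough stand-in for a dedicated
    emoji library.
    """
    out = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            out.append(" " + unicodedata.name(ch, "").lower() + " ")
        else:
            out.append(ch)
    return "".join(out)


def clean_tweet(text: str) -> str:
    """Apply the cleanup steps listed above to a single tweet."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove websites
    text = re.sub(r"@\w+", " ", text)                   # remove mentions
    text = replace_emojis(text)                         # emoji -> words
    text = text.replace("#", " ")                       # drop the hash sign
    text = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)    # HappyDay -> Happy Day
    text = text.lower()                                 # HAPPY -> happy
    text = re.sub(r"[^a-z\s]", " ", text)               # numbers, punctuation
    # split() also discards \n, \r, and repeated spaces
    tokens = [tok for tok in text.split() if len(tok) > 1]  # single letters
    return " ".join(tokens)
```

Note that dropping single letters also removes words such as "a" and "I", which is usually acceptable for lexicon-based scoring.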
Sentiment Analysis
Sentiment analysis was performed with the NLTK Python library, which includes VADER (Valence Aware Dictionary and sEntiment Reasoner) for opinion mining. VADER is a lexicon- and rule-based model that is used here as a classifier. NLTK’s SentimentIntensityAnalyzer returns four float scores: negative, neutral, positive, and compound. The code below shows how the classifier was applied to a pandas dataframe to extract scores for each tweet.
from nltk.sentiment.vader import SentimentIntensityAnalyzer
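Only the import line of the original snippet is shown above. As a rough sketch of the remaining step, the helper below scores a sequence of tweets with any object that exposes a polarity_scores(text) method, which is the method SentimentIntensityAnalyzer provides. The helper name score_tweets and the column names clean_text and compound_score in the commented usage are my assumptions, not the original code.

```python
from typing import Dict, Iterable, List


def score_tweets(texts: Iterable[str], analyzer) -> List[Dict[str, float]]:
    """Return one score dict (neg, neu, pos, compound) per tweet.

    `analyzer` is any object with a polarity_scores(text) method, for
    example nltk.sentiment.vader.SentimentIntensityAnalyzer().
    """
    return [analyzer.polarity_scores(text) for text in texts]


# Hypothetical usage with NLTK and pandas (column names are assumed):
#   import nltk
#   import pandas as pd
#   from nltk.sentiment.vader import SentimentIntensityAnalyzer
#   nltk.download("vader_lexicon")  # one-time download of the lexicon
#   scores = score_tweets(df["clean_text"], SentimentIntensityAnalyzer())
#   scores_df = pd.DataFrame(scores, index=df.index)
#   df = df.join(scores_df.rename(columns={"compound": "compound_score"}))
```

Passing the analyzer in as an argument keeps the helper easy to test with a stand-in object and avoids rebuilding the analyzer for every tweet.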
To interpret the results, we can analyze each score and its value. For example, a tweet will have a score from 0 to 1 for each of the categories: negative, positive, and neutral. Each score can be interpreted as the proportion of the text that falls in that category, so the three scores sum to 1. Another way to interpret the results is to look at the compound score, which sums the lexicon ratings of the words in the text and normalizes the total to the range -1 to 1. It is a simple way to understand how positive or negative a tweet is. The compound score is interpreted as follows:
- negative: -1 (most negative) to -0.5 (negative)
- neutral: -0.5 to 0.5
- positive: 0.5 (positive) to 1 (most positive)
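The ranges above can be turned into a small labeling helper. This is a sketch, not the original code: the name label_compound is hypothetical, and assigning the boundary values (exactly -0.5 and 0.5) to the negative and positive classes is my assumption, since the ranges above leave it open. The cutoff is a parameter because VADER's authors typically suggest a much tighter neutral band of ±0.05.

```python
def label_compound(score: float, threshold: float = 0.5) -> str:
    """Map a compound score in [-1, 1] to a sentiment label.

    threshold=0.5 follows the ranges used in this post; pass 0.05 to
    use the cutoff commonly suggested for VADER instead.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"
```

For example, label_compound(0.1) falls in the neutral range with the default cutoff, while label_compound(0.1, threshold=0.05) is labeled positive.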
The compound score distribution over all tweets in the dataset is shown below. The y axis shows how many tweets were assigned each score. Most tweets fall within the neutral score range, with a slight skew toward positive compound scores. The positive bias in tweets is pretty encouraging given today’s heated Twitter climate.
User Mentions
Users can ‘mention’ or include other users in their tweets. For example, the tweet below mentions JoeManchinWV and SenateGOP.
The following function was used to extract mentions from tweets with Python's re (regular expression) module.
import re
from typing import List

def getTwitterMentions(text: str) -> List[str]:
    """
    twitter mentions are prefixed with an @
    """
    mentionList = re.findall(r'(?<=@)\w+', text)
    return mentionList
The sentiment score for each user mentioned in tweets was aggregated from all tweets the user was mentioned in. The resulting score can be interpreted as the general sentiment of the tweets that mention a user. For example, if user owillis was mentioned in 10 tweets, the compound scores of those 10 tweets are averaged to get the final sentiment associated with owillis.
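The aggregation described above can be sketched in a few lines of plain Python. The function names mention_sentiment and top_mentions are hypothetical, the input is assumed to be (raw tweet text, compound score) pairs, and counting a user only once per tweet is my own choice. Mentions have to be extracted from the raw text, because the cleanup step removes them.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def mention_sentiment(
    tweets: Iterable[Tuple[str, float]]
) -> Dict[str, Tuple[int, float]]:
    """Map each mentioned user to (number of tweets, mean compound score)."""
    scores = defaultdict(list)
    for text, compound in tweets:
        # set() counts a user once per tweet even if mentioned twice
        for user in set(re.findall(r"(?<=@)\w+", text)):
            scores[user].append(compound)
    return {user: (len(vals), sum(vals) / len(vals))
            for user, vals in scores.items()}


def top_mentions(
    stats: Dict[str, Tuple[int, float]], n: int = 50
) -> List[Tuple[str, Tuple[int, float]]]:
    """Return the n most frequently mentioned users, most mentioned first."""
    return sorted(stats.items(), key=lambda kv: kv[1][0], reverse=True)[:n]
```

The same approach works for hashtags by swapping in the hashtag pattern.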
Top 50 Mentions Histogram
The top 50 mentions were selected by count, that is, by how often a user was mentioned in tweets. The x axis is the number of times a user was mentioned, while the colors show the average sentiment of all tweets that the user was mentioned in.
Mentions Wordcloud Visualization
The size of each word shows how many times the word is used in a body of text, in this case, how many times each mention was used in tweets. These are the most talked about users and the general sentiment of the tweets that mention them (reference the color legend):
Hashtags
Twitter users can also include ‘hashtags’, or topics that many users are interested in. Hashtags are prefixed with a hash (#) and can be extracted from a tweet using the following code.
import re
from typing import List

def getTwitterHashtags(text: str) -> List[str]:
    """
    twitter hashtags are prefixed with an #
    """
    hashtagList = re.findall(r'#(\w+)', text)
    return hashtagList
Hashtags are great for opinion mining because these are the topics that users choose to show interest in and talk about. Hashtag sentiments were derived the same way as mentions. The compound score for a hashtag is the average of the scores of all the tweets that used that hashtag.
Top 50 Hashtag Histogram
The top 50 hashtags were selected by count, that is, by how often a hashtag was used. The x axis is the number of times a hashtag was used, while the colors show the average sentiment of all tweets that used the hashtag.
Hashtag Wordcloud Visualization
These are the most talked about topics and the general sentiment of the tweets that use these hashtags:
Wordcloud Methodology
A wordcloud can be a fun visualization method: the size of each word shows how often the word is used in a certain body of text. It is not always the best visualization method. For example, a wordcloud makes it hard to compare words if they are very similar in size or are on opposite sides of the figure. The benefits of a wordcloud are that it can be a fun visualization, is easy to interpret at a high level, and can give our minds a break from reading charts and graphs.
The default implementation of wordcloud includes text pre-processing. Specifically, it normalizes letter case before generating a word-frequency map. Since the capitalization inside hashtags such as StopHate is what keeps them readable, this is not the behavior we want. There are two things that we want to change from the default behavior.
- Add a color scheme of our choosing so we can represent sentiment rating.
- Generate our own word frequency map.
Here is the sample code that will allow us to implement the custom functionality we want:
import matplotlib.pyplot as plt
from wordcloud import WordCloud

wcObj = WordCloud(
    collocations=False,
    width=800, height=400,
    background_color='white',
    max_words=2000
)

# generate a custom map for hashtag frequency
freqData = dict(zip(df["hashtag"], df["count"]))
wcObj.generate_from_frequencies(freqData)

# display the cloud
fig = plt.figure()
plt.imshow(
    wcObj.recolor(
        color_func=generateColor,
        random_state=3
    ),
    interpolation="bilinear"
)
plt.title(title)
plt.show()
To generate custom colors, the function will be defined separately:
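def generateColor(word,
                  font_size,
                  position,
                  orientation,
                  random_state=None,
                  **kwargs) -> str:
    # map the word's average compound score to a color
    # final output needs to be a str with the format
    # "rgb(50%,92%,62%)"
    wordScore = df.loc[df["hashtag"] == word, "compound_score"].iloc[0]
    # rescale the score from [-1, 1] to [0, 1], the range the colormap expects
    r, g, b, _ = plt.cm.cool((wordScore + 1) / 2)
    # the colormap returns fractions in [0, 1]; convert them to percentages
    color = "rgb(%d%%,%d%%,%d%%)" % (r * 100, g * 100, b * 100)
    return color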
Relational Graph Visualization
The go-to relational graphs for Twitter data focus on users who mention each other. For example, if user A mentions user B, then there is an edge between users A and B. However, a typical user mention may not express much of a relationship between users. If user B is the president and user A is a radio talk show host, there may be no connection between the two other than the fact that user A has an interest in talking about user B.
For this analysis, I decided to focus on how Twitter users group other users through mentions. In other words, how common is it for certain users to be mentioned together in the same tweet? Modeling the graph this way builds a relational graph from the perspective of a Twitter topic. For example, when talking about the Infrastructure Bill, or any other topic, certain users will be mentioned together more often. The most commonly mentioned users become the main actors for that topic, and the resulting relational graph organizes the actors into different teams or clusters.
Relational graph of user mentions for a selected set of users.
Relational graph of user mentions for the top 20 users based on their degree.
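One way to build the edge list for such a co-mention graph is sketched below. It is an illustration under my own assumptions rather than the code behind the figures: the function names are hypothetical, edges are undirected and weighted by the number of tweets in which two users appear together, and usernames are compared as written, without case folding.

```python
import re
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple


def co_mention_edges(tweets: Iterable[str]) -> Counter:
    """Count how often each pair of users is mentioned in the same tweet."""
    edges = Counter()
    for text in tweets:
        # sort so that (A, B) and (B, A) are stored as the same edge
        users = sorted(set(re.findall(r"(?<=@)\w+", text)))
        edges.update(combinations(users, 2))
    return edges


def node_degrees(edges: Counter) -> Counter:
    """Degree of each user: the number of distinct users they co-occur with."""
    degree = Counter()
    for user_a, user_b in edges:
        degree[user_a] += 1
        degree[user_b] += 1
    return degree


def top_by_degree(edges: Counter, n: int = 20) -> List[Tuple[str, int]]:
    """Return the n users with the highest degree."""
    return node_degrees(edges).most_common(n)
```

The resulting edge counts can be exported as a source, target, weight table for a graph tool. Tweets that mention a single user contribute no edges, so those users do not appear in the graph.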
I used Argo Lite to plot the relational graph. You can include an interactive plot of the graph in your Jupyter or Colab notebook with the code below.
Final Discussion
Hashtag topics like StopHate are classified as negative along with other topics like LockTemAllUp. Sentiment classification of hashtags and mentions reveals the context in which certain hashtags and mentions are used, and not necessarily whether the hashtag or mention itself is a positive or negative word. Some topics that are rated as negative may be part of progressive social movements with broadly beneficial impacts. However, the conversations around these issues may be charged with negative language, which can influence the sentiment score.
Sentiment classification is difficult to perform for many reasons. Humor, sarcasm, and cultural nuances are hard to model. The margin of error is large enough that the results should be treated only as a general gauge of sentiment.
Twitter datasets are readily available on Kaggle and are easy to compile using the Twitter API. Tweets are great for exploring text analysis, opinion mining, and visualization tools. This blog presented a simple way to extract sentiments from tweets using the NLTK library, and explored several visualization methods and techniques.