Sentiment Analysis in Finance – An Overview

December 23, 2020 By Vash

In this article, we’re going to get an overview of sentiment analysis in finance.

As a quick recap, recall that Natural Language Processing (NLP) is essentially a set of computational linguistics techniques that help us gain insights from text (or indeed any other language) data.

What is Sentiment Analysis?

Sentiment analysis, at least in finance, essentially involves quantifying and exploiting sentiment – or emotions – for some sort of objective.

The objective could be for investment purposes. Or for understanding firm performance, including profitability or cash flow management, the likelihood of fraud, etc.

Ultimately though, it’s about quantifying sentiment, and exploiting it by linking it to firms.

Slide showcasing a simple definition of sentiment analysis

So it’s a case of firstly estimating the level of sentiment that a firm may have. And then linking that sentiment to other attributes or characteristics of firms. After that, we can explore whether there’s some sort of relationship between sentiment and other firm characteristics.

This holds regardless of which sentiment analysis tool you end up using.

For example, one could link the “positivity” of firms’ annual reports to firms’ profitability. This would allow us to explore whether for instance, positivity has some relationship with profitability.
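
To make that concrete, here is a minimal sketch of what “linking” positivity to profitability could look like in Python. The positivity scores and profit margins are made-up numbers, and `pearson_correlation` is just an illustrative helper name rather than something from a particular library. A correlation is also only one simple way to look for a relationship.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one positivity score and one profit margin per firm.
positivity = [0.10, 0.25, 0.40, 0.55, 0.70]
profit_margin = [0.02, 0.05, 0.04, 0.09, 0.11]

print(round(pearson_correlation(positivity, profit_margin), 3))
```

A positive value would hint that more positive firms tend to be more profitable in this toy sample. Whether anything like that holds in real data is exactly what the rest of the process is there to test.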

What is sentiment in the context of sentiment analysis?

Now, if we think about what we mean by sentiment itself, it can include things like:

  • positive sentiment,
  • negative sentiment

It could also include things like “uncertainty”, “narcissism”, “anxiety”, or “panic”, to name a few.

Essentially, it includes all of the things that you might think about when you think of the word sentiment.

It’s just that, rather than thinking about it in terms of sentiment of humans, we’re thinking about it in terms of sentiment of firms.

For example, sentiment can include the level of positivity of firms, or the level of negative tone of firms.

It could also be about the amount of uncertainty that’s being portrayed by firms. Or whether a firm’s CEO is narcissistic or humble. Or whether a firm is anxious about its future prospects, or even more broadly, whether an economy is currently facing some sort of panic.

It’s very much about the sentiment or emotion that you know and are familiar with as a human.

It’s just that we’re applying it either to the firm level context, or to an aggregate macro level context.

And rather than “feeling” that emotion / sentiment, it’s a case of assigning a sentiment score to a firm and using that in the sentiment analysis.


Related Course

This Article features concepts that are covered extensively in our course on Investment Analysis with Natural Language Processing (NLP).

If you’re interested in learning how to apply sentiment analysis for investment while working with real world data, you should definitely check out the course.


Applying Sentiment Analysis

In terms of how we actually go about using sentiment once we’ve estimated it, it’s a case of first identifying whether or not the overall sentiment matters.

If a specific type of sentiment does in fact matter, then we can use it to create some sort of trading strategy, for example.

But importantly, when we’re trying to identify whether sentiment matters, we’re not talking about whether we think sentiment matters or whether you think sentiment matters.

It’s not about personal opinions or subjective thought, or subjective debate.

It’s about gaining deeper insights by letting the data determine whether or not sentiment matters.

Data driven validation

In other words, it’s about using statistics to identify whether sentiment actually matters.

And then, if it does matter (statistically), we can then use that sentiment estimate to create a trading strategy.

For instance, if the returns of “more positive firms” are greater than those of “less positive firms”…

Then we can invest more in firms with stronger positive sentiment and short, or sell, the shares of firms with weaker positive sentiment, for example.

Slide showcasing the fundamental idea of sentiment based analysis

Strictly, if the returns of more positive firms are statistically greater than the returns of less positive firms, then we could create a trading strategy that goes long in “more positive firms” and shorts “less positive firms”.

And again, the distinction between, or the classification of “more positive sentiment” and “less positive sentiment” is ultimately a function of the assigned sentiment score for each firm.

The same would naturally hold for something like “more negative sentiment” and “less negative sentiment” classification, for example.
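
As a rough illustration of the long/short idea, the sketch below sorts firms by their sentiment score, goes long the highest scoring firms and short the lowest scoring ones, and reports the difference in average returns. The firm names, scores, returns, and the `long_short_spread` helper are all hypothetical, and equal weighting within each leg is an assumption made purely for simplicity.

```python
def long_short_spread(scores, returns, n_per_leg=2):
    """Sort firms by sentiment score and compare the two extremes.

    scores and returns are dicts keyed by firm name. Returns a tuple of
    (long_leg, short_leg, spread), where spread is the equal-weighted mean
    return of the long leg minus that of the short leg.
    """
    firms = sorted(scores, key=scores.get, reverse=True)  # most positive first
    if n_per_leg < 1 or 2 * n_per_leg > len(firms):
        raise ValueError("legs must be non-empty and must not overlap")
    long_leg = firms[:n_per_leg]
    short_leg = firms[-n_per_leg:]

    def average_return(names):
        return sum(returns[name] for name in names) / len(names)

    return long_leg, short_leg, average_return(long_leg) - average_return(short_leg)


# Hypothetical sentiment scores and subsequent returns for six firms.
scores = {"A": 0.9, "B": 0.7, "C": 0.5, "D": 0.4, "E": 0.2, "F": 0.1}
returns = {"A": 0.04, "B": 0.02, "C": 0.01, "D": 0.00, "E": -0.01, "F": -0.03}

long_leg, short_leg, spread = long_short_spread(scores, returns)
print(long_leg, short_leg, round(spread, 4))
```

Note that a positive spread in a sample like this is not enough on its own. The difference still has to be statistically meaningful before it is worth building a strategy around.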

Similarly, if you find for instance, that firms with narcissistic CEOs do better than firms with humble CEOs, then you can invest in firms that are led by narcissistic CEOs and short firms that are led by humble CEOs.

Or indeed, vice-versa, depending on what the data shows. So if, for instance, you find that humble CEOs on average tend to outperform narcissistic CEOs, then you could invest in firms that are led by humble CEOs and short firms that are led by narcissistic CEOs.

It’s not just about measuring sentiment

At this stage, don’t worry about how you actually go about measuring sentiment.

It’s just important that you understand that it’s not just about measuring sentiment.

It’s also about validating whether or not the sentiment actually matters.

And only then going into things like whether we can exploit it by creating a trading strategy.

Crucially then, the idea is to start with some sort of notion, or a premise, or an idea, or more formally, what’s called a testable hypothesis.

And then test or validate that hypothesis, to see whether or not it holds.

This approach applies to any scientific analysis, not just sentiment analysis. The best thing you can do is to start with a testable hypothesis and then test and validate whether or not that hypothesis holds.

In other words, let the hypothesis drive your sentiment analysis process.

The Fervent 5 Step Sentiment Analysis System

We think this is so important, that we actually created what we call a 5 Step Sentiment Analysis System to guide you along the way.

But we do want to be clear. Although we’ve created this 5 Step System, it’s important to know that the real world isn’t necessarily always so systematic and organised.

Things can in fact get messy.

But if we were to think about some sort of systematic approach to sentiment analysis, here’s what a 5 Step Sentiment Analysis System might look like.

Step 1: Start with a Testable Hypothesis

You want to start with perhaps the most important part – create a testable hypothesis.

If you’re not familiar with what a “testable hypothesis” is, in a nutshell, you can just think of it as a formalised version of an idea or notion that you’re looking to statistically test.

It’s just a way of formalising your beliefs or what you think might be true. And expressing it in a way that you can test empirically.
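
For instance, the idea that more positive firms outperform less positive firms could be written along the following lines. Here $\mu_{\text{high}}$ and $\mu_{\text{low}}$ are our own illustrative notation for the average returns of more positive and less positive firms.

```latex
H_0:\ \mu_{\text{high}} - \mu_{\text{low}} = 0
\qquad \text{versus} \qquad
H_1:\ \mu_{\text{high}} - \mu_{\text{low}} > 0
```

The null hypothesis $H_0$ says that sentiment makes no difference to average returns. The data then either lets us reject it in favour of $H_1$, or it doesn’t.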

Step 2: Extract Relevant Data

Once you have a testable hypothesis, you can then think about extracting relevant data.

And in the context of sentiment analysis, because we tend to work with text data for the most part, the relevant data is some sort of Corpus.

Just in case you’re not familiar with what a “Corpus” is; a Corpus is essentially the entire sample of text data that you’re going to be working with.
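
If it helps to picture it, a Corpus can be represented as nothing more than a collection of documents plus a little metadata. The sketch below is one possible layout, not a standard one; the `Document` class, its fields, and the firm names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Document:
    firm: str    # hypothetical firm identifier
    source: str  # e.g. "annual_report", "interview", "tweet"
    text: str    # the raw text itself


# The corpus is simply the full sample of text we intend to work with.
corpus = [
    Document("ACME", "annual_report", "Revenue grew strongly this year."),
    Document("ACME", "tweet", "Excited about our new product launch!"),
    Document("GLOBEX", "annual_report", "We faced significant uncertainty."),
]


def documents_for(corpus, firm):
    """Return every document in the corpus that belongs to one firm."""
    return [doc for doc in corpus if doc.firm == firm]


print(len(corpus), len(documents_for(corpus, "ACME")))  # prints: 3 2
```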

Hypothesis driven choice

And of course, to determine whether the data is relevant, you really want to go back to Step 1, and let the hypothesis drive that choice.

So for example, if your hypothesis is whether more positive firms outperform less positive firms, then the data that’s most relevant is some sort of firm level data.

It might be annual reports for instance. Or it might be interview transcripts of the management or the CEO. It might even be a tweet on Twitter, or other social media posts and updates by firms.

But the bottom line is that it’s some sort of firm level data. It wouldn’t make sense to work with macro aggregate level data if you’re trying to gain an insight into whether the positivity of firms matters.

If on the other hand, your hypothesis is something about whether or not the economy is in a state of panic, then it’s unlikely that the most relevant data is firm level data.

Because we’re now thinking about things in aggregate terms.

And so a large sample of news articles would probably be a better and more relevant source of insight than firm level annual reports, for instance.

You might also argue that general / generic social media posts may also make a fairly good data source.

For example, you could use some sort of opinion mining / text mining to gain a proxy for the macroeconomic effect via the social media posts.
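
One simple way such a proxy might be built is to average the sentiment of individual posts by day. The sketch below assumes each post has already been given a score somehow; the dates, scores, and the `daily_average_sentiment` helper are made up for illustration.

```python
from collections import defaultdict


def daily_average_sentiment(scored_posts):
    """Average per-post sentiment scores by date.

    scored_posts is an iterable of (date_string, score) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])  # date -> [running sum, count]
    for date, score in scored_posts:
        totals[date][0] += score
        totals[date][1] += 1
    return {date: total / count for date, (total, count) in totals.items()}


# Hypothetical, pre-scored social media posts.
posts = [
    ("2020-03-09", -0.4),
    ("2020-03-09", -0.2),
    ("2020-03-10", 0.1),
]

for date, average in sorted(daily_average_sentiment(posts).items()):
    print(date, round(average, 2))
```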

Thus again, the key takeaway is that you want to let the hypothesis drive the choice of data that you work with.

Step 3: Clean Your Data

Now, once you have your relevant data, it’s a case of cleaning that text data.

And the importance of this particular step really cannot be emphasised enough.

There’s a term called “GIGO“, which is quite important in computer science / data science. And that stands for Garbage In Garbage Out.

So GIGO or Garbage In Garbage Out is essentially saying, if your data is garbage – if your data is rubbish; if the data’s not clean, or the data is not usable; or the data is fundamentally flawed – then it doesn’t matter how good your model is.

And it doesn’t matter how sophisticated your sentiment model is.

The results from your text analysis will almost certainly be garbage.

In other words, if the data is garbage, then the results are going to be garbage as well.

If the input is garbage, then the output is garbage. Garbage In Garbage Out. GIGO.

And so again, we really can’t stress enough the importance of cleaning the text data for any sort of text analysis.

An additional benefit of cleaning the text data is that you really get to know and fully understand the data that you’re working with.

This in turn will allow you to conduct better, and richer text analytics. And gain much deeper insights from your data.

That’s something which is imperative to conducting any sort of half decent analysis.
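
What “cleaning” involves depends heavily on the data, but a very basic pass might look something like the sketch below. The specific choices here (stripping HTML tags, lowercasing, and keeping letters only) are assumptions for illustration; dropping numbers, for example, may well be the wrong call for your hypothesis. `clean_text` is a hypothetical helper name.

```python
import html
import re


def clean_text(raw):
    """Apply a few basic, illustrative cleaning steps to one document."""
    text = html.unescape(raw)              # turn entities like "&amp;" into "&"
    text = re.sub(r"<[^>]+>", " ", text)   # replace HTML tags with a space
    text = text.lower()                    # normalise case
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    return re.sub(r"\s+", " ", text).strip()  # collapse repeated whitespace


print(clean_text("<p>Revenue GREW 12% &amp; costs fell.</p>"))
# prints: revenue grew costs fell
```

It is worth eyeballing the output of every step on real documents, since that is precisely how you get to know your data.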

Okay. Now you’ve created your testable hypothesis. You’ve let the hypothesis drive the choice of data that you work with. And then you’ve obsessively cleaned the data.

The next thing you can do is perhaps the most fun part. And in the context of sentiment analysis, it’s actually estimating the sentiment.

Step 4: Estimate Sentiment

It’s at this stage where you can actually quantify things like the positivity of firms, or negativity, or uncertainty, or indeed any other type of sentiment or emotion.

How do you actually estimate sentiment? Answering that is well worth an entire article.

Or in fact, several videos as part of a robust course.

We have both for you though.

This article discusses sentiment analysis in finance and talks about the two approaches to estimating sentiment including:

  • machine learning technique, and
  • sentiment lexicon / dictionary based approach
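
To give a flavour of the dictionary based approach, here is a deliberately tiny sketch. The word lists are hypothetical and far too small to be useful (real work would rely on an established, finance-specific lexicon), and the scoring rule of positive minus negative words over total words is just one common convention, assumed here for illustration.

```python
import re

# Tiny, made-up word lists for illustration only.
POSITIVE = {"growth", "strong", "improved", "profit", "gain"}
NEGATIVE = {"loss", "decline", "weak", "impairment", "litigation"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words.

    Returns 0.0 for text that contains no words at all.
    """
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    positive_count = sum(word in POSITIVE for word in words)
    negative_count = sum(word in NEGATIVE for word in words)
    return (positive_count - negative_count) / len(words)


print(round(sentiment_score("Strong growth offset a small loss"), 3))
```

In this example there are two positive words and one negative word out of six, so the score is positive. Applying the same function to every document for a firm gives you a firm level sentiment score.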

And our course on Investment Analysis with Natural Language Processing (NLP) shows you how to estimate sentiment from scratch. And a whole lot more, too.

Now, once you’ve got this estimate of sentiment for a firm, or a set of firms, or indeed the aggregate economy, you finally test and validate the original hypothesis.

Step 5: Test & Validate the Hypothesis

Now that you’ve got a measure of sentiment, you can empirically or statistically test and validate whether or not that measure of sentiment matters.

And once you’ve validated the sentiment measure, then – and only then – should you proceed to creating a trading or investment strategy.
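
As one hedged example of what “statistically test” could mean here, the sketch below computes Welch’s t-statistic for the difference in average returns between two groups of firms. The returns are made-up numbers and `welch_t_statistic` is an illustrative helper. In practice you would also want a p-value, and likely a more careful research design than a single two-sample comparison.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(sample_a, sample_b):
    """Welch's t-statistic for the difference in means of two samples."""
    if len(sample_a) < 2 or len(sample_b) < 2:
        raise ValueError("each sample needs at least 2 observations")
    # Standard error of the difference, allowing for unequal variances.
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    if standard_error == 0:
        raise ValueError("t-statistic is undefined when both samples are constant")
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical returns for "more positive" and "less positive" firms.
more_positive = [0.03, 0.01, 0.04, 0.02, 0.05]
less_positive = [0.00, -0.01, 0.02, 0.01, -0.02]

print(round(welch_t_statistic(more_positive, less_positive), 2))
```

For this toy sample the statistic works out to about 3. The larger it is in absolute terms, the harder it becomes to argue that the difference in returns is down to chance alone.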

Wrapping Up

Slide showcasing the Fervent 5 Step Sentiment Analysis Process

If any part of this article is not quite clear, especially the part about the five step process to conducting sentiment analysis, then please do read it again.

For now, though, just a quick summary.

We learned that sentiment analysis, at least in finance, essentially involves quantifying and then exploiting sentiment for some sort of investment objective.

The fundamental idea of sentiment analysis is to start with a testable hypothesis. A hypothesis on whether or not some type of sentiment matters. Then statistically test and validate that core hypothesis, before moving on to exploit it in a trading or investment strategy.

Lastly, of course, we talked about the 5 Step Sentiment Analysis Process.

It’s a systematic approach, or the ideal approach, you could use to conduct sentiment analysis in a rigorous and robust manner.

Importantly though, do remember that the real world isn’t quite so systematic or organized.

The real world is in fact chaotic. And that’s the beauty of working in the real world. That’s the beauty of working with real world data.

It’s chaotic, it’s messy, it’s exciting.

There’s a lot of uncertainty; there’s a lot of unknowns.

And you really do want to embrace and enjoy the chaos. But at the same time, you want to have some sort of order.

Thus, although conducting real-world analysis is quite messy, it’s important to have some sort of idea as to where exactly you are in the overall sentiment analysis process. Because otherwise, it’s akin to just running around like headless chickens!


Filed Under: Finance, NLP for Finance
