
Fervent - Learn with Distinction

Rigorous Courses, Backed by Research, Taught with Simplicity.

Introduction to NLP for Finance

December 9, 2020 By Vash

In this article, we’re going to get an introductory overview of Natural Language Processing, or NLP, for Finance.

So let’s get into it.

What is Natural Language Processing (NLP)?

Firstly, what is NLP?

Well, ultimately it’s just a set of techniques which help us gain insights from text data.

Or for that matter, from any other type of language data; for instance, voice.

Slide showcasing what NLP for Finance is

Ultimately, the idea is to use this set of techniques to try and gain insights – preferably actionable insights – or to try and gain value from language data.

Or indeed, from unstructured data in general.

And for the most part in Finance, at least today, when we think about language data, we typically work with text data.

But it wasn’t always like this in finance.

NLP for Finance – A Brief History

Historically, academics and practitioners in finance have largely relied on numerical data for investment analysis.

And this ranges from something as simple as ratios to more advanced portfolio optimisation techniques.

But the idea is, regardless of which aspect of finance you look at, be it investment analysis, be it financial modelling or financial analysis, or capital budgeting…

Regardless of which concepts or areas you look at… for the most part, people have worked with structured numerical data.




Text Data in Finance

Now this wasn’t because we didn’t have a lot of text data / unstructured data in finance. Far from it.

In fact, finance has so much text data, that few fields can actually compete with that sort of volume.

Predominantly relying on numerical data instead of text data was largely because analysing these large volumes of text data was extremely time consuming and cumbersome.

Large sizes of unstructured content

To give you just a minuscule idea of the sheer scale of text data that’s available in finance…

Back in 2015, the Wall Street Journal reported that the average annual report or 10-K had about 42,000 words in 2013.

That was up from roughly 30,000 words in 2000.

To put this in perspective, consider the Sarbanes-Oxley Act of 2002, a really massive piece of legislation that came about as a result of scandals like Enron and WorldCom and all the other corporate scandals of the dot-com era.

Well, that massive piece of legislation had approximately 32,000 words!

Annual reports, which are something that firms have to publish every single year, had about 42,000 words on average, at least back in 2013.

And they don’t appear to be getting any shorter today.

Importantly, of course, if you’re thinking 42,000 words is not all that much, remember that this is just an average.

So you’ll find plenty of annual reports that have hundreds of thousands of words.

And of course you will find some annual reports that have tens of thousands of words.

But the point is that this is for a single annual report.

And firms listed on the financial market / stock market need to publish these annual reports every single year!

So just take a single firm and say you’re looking at 10 years’ worth of data, with an average of 42,000 words per report.

Well, you have 420,000 words to analyse now.

So good luck if you’re doing that manually!

I wouldn’t be keen, and quite frankly, very few people would be.

And this is why until fairly recently, these really massive volumes of text data in finance, which have potentially so much value in them, were just left untouched.
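
To make the back-of-the-envelope arithmetic above concrete, here is a minimal Python sketch. The 42,000-word average and the 10-year horizon come from the example above; the reading speed of 250 words per minute and the `count_words` helper are purely illustrative assumptions, not figures from any source.

```python
import re

# Figures taken from the example in the article.
AVG_WORDS_PER_REPORT = 42_000
YEARS = 10

# Assumption for illustration only: a reading speed of 250 words per minute.
ASSUMED_WORDS_PER_MINUTE = 250


def count_words(text: str) -> int:
    """Hypothetical helper: count alphanumeric tokens in a piece of text."""
    return len(re.findall(r"[A-Za-z0-9']+", text))


total_words = AVG_WORDS_PER_REPORT * YEARS
reading_hours = total_words / ASSUMED_WORDS_PER_MINUTE / 60

print(f"Words to analyse for one firm: {total_words:,}")
print(f"Rough reading time at the assumed speed: {reading_hours:.0f} hours")
```

And that is for a single firm. In practice you would run something like `count_words` over the actual text of each filing rather than rely on an average.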

Technical Jargon

Of course, the size isn’t the only factor that meant people weren’t analysing these reports.

For instance, the CFO of GE, Jeffrey Bornstein, was taken aback by the sheer size of their own annual report!

Their annual report was about 110,000 words long. And he himself suggested that not a single retail investor on earth could get through it, let alone understand it.

And in terms of this latter part… this “understanding these annual reports”; that’s ultimately because annual reports tend to have a lot of technical jargon that not a lot of people actually understand.

And this is not limited to just retail investors.

Although mutual fund managers and hedge fund managers and pension fund managers may not openly admit it…

Not all of them necessarily understand what all these annual reports are on about.

Because sometimes they just have terms that one might not have come across.


Why use NLP for Finance?

The point is, academics and practitioners didn’t really work with text data in finance, despite there being so much of it. That was partly because of the technical jargon involved, but largely because of the sheer size of the data.

Which meant of course, analysing all of this text data manually was simply not feasible.

Fortunately, though, thanks to major advancements in NLP technology, particularly in computational linguistics, it’s now significantly easier to analyse insanely large volumes of text data, the so-called “Big Data”.

But it’s about more than just analysing this text data. It’s ultimately about gaining actionable insights or value from that text data.

Slide showcasing why using NLP for Finance makes sense

Applications of NLP for Finance

And if we think about the applications of NLP for Finance… they’re fairly extensive.

They’re certainly increasing.

And I think, with time, they’re only going to get bigger and better.

Specifically though, while the applications of NLP for Finance are fairly wide in their scope, we think we can broadly categorise them into three different types.

Applications in Context

The first of these is Context.

This is about using NLP techniques to try and gain context from text data in finance.

For example, it’s a case of using Topic Modelling algorithms to try and establish the context of news articles or firm announcements, business descriptions, annual reports, and a whole host of other “Big Data” or “Big Text Data” in Finance.

It’s a case of using these machine learning / artificial intelligence algorithms in unsupervised settings to try and establish the themes or topics that are being discussed or talked about in these various different kinds of text data.

So that’s context.
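
To give a feel for what an unsupervised topic model does, here is a minimal sketch of Latent Dirichlet Allocation (LDA) fitted with a toy collapsed Gibbs sampler, using only the Python standard library. The headlines, the stopword list, the hyperparameters and the `lda_gibbs` function are all hypothetical and invented for illustration. In practice you would use a dedicated library and far more text; on a corpus this tiny, the topics it finds will be noisy.

```python
import random
import re
from collections import Counter

# Hypothetical toy headlines, invented purely for illustration.
HEADLINES = [
    "Central bank raises interest rates as inflation climbs",
    "Inflation data pushes central bank towards higher rates",
    "Retailer posts record revenue and lifts dividend",
    "Strong revenue growth lets retailer raise its dividend",
]
STOPWORDS = {"as", "and", "its", "lets", "towards"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop a few stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def lda_gibbs(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenised documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Randomly assign a topic to every word occurrence, then count.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment, then resample the topic.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Most frequent words per topic, and smoothed topic shares per document.
    top_words = [
        [w for w, c in topic_word[t].most_common(3) if c > 0]
        for t in range(n_topics)
    ]
    proportions = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    return top_words, proportions


top_words, proportions = lda_gibbs([tokenize(h) for h in HEADLINES])
for t, words in enumerate(top_words):
    print(f"Topic {t}: {words}")
for headline, shares in zip(HEADLINES, proportions):
    print([round(s, 2) for s in shares], headline)
```

Notice that the algorithm is never told what the themes are. It only sees which words tend to appear together, and it is up to us to interpret the resulting word groups as, say, “monetary policy” or “company results”.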

Applications in Compliance

Then there’s Regulatory Compliance, which focuses on things like detecting insider trading or detecting and preventing fraud within the financial services / financial industry in particular.

And it’s doing so using unique sets of data; for instance, emails or indeed chat transcripts inside firms.

Generally speaking, NLP applications in regulatory compliance will require internal unstructured content rather than external content like earnings call transcripts, for example.

Applications in Quantitative Analysis

And lastly, there’s the case of NLP application in Quantitative Analysis.

For instance, one major NLP application involves creating trading strategies using “sentiment analysis”.

This involves firstly estimating the sentiment that firms may display, using unstructured data like annual reports, earnings calls transcripts, social media posts, etc.

And then using that sentiment to create trading strategies.
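
The two steps above (estimate sentiment, then act on it) can be sketched with a simple dictionary-based approach. This is only one of many ways to estimate sentiment, and everything specific here is an assumption made for illustration: the tiny word lists, the scoring formula, the `0.2` threshold and the `sentiment_score` and `signal` functions are hypothetical and do not represent a tested trading strategy.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
# Real work would use an established, finance-specific dictionary.
POSITIVE = {"growth", "profit", "strong", "improved", "record"}
NEGATIVE = {"loss", "decline", "weak", "litigation", "impairment"}


def sentiment_score(text):
    """Return (positive - negative) / (positive + negative), in [-1, 1]."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def signal(score, threshold=0.2):
    """Map a sentiment score to a toy trading signal (assumed threshold)."""
    if score > threshold:
        return "long"
    if score < -threshold:
        return "short"
    return "neutral"


# Hypothetical snippets, invented for illustration.
snippets = [
    "Record profit and strong growth across all segments",
    "Impairment loss and weak demand led to a decline in margins",
    "The annual general meeting will be held on Tuesday",
]
for text in snippets:
    score = sentiment_score(text)
    print(f"{score:+.2f} {signal(score):8s} {text}")
```

A real strategy would, at the very least, need to test whether signals like these actually relate to subsequent returns before anyone trades on them.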

Your biggest takeaway from this article should be that Natural Language Processing (NLP) allows us to really leverage the power of text data and work on interesting problems in Finance.



Filed Under: Finance, NLP for Finance

