NLP in Finance for Banking and Finance Professionals

Karthik Shiraly
August 21, 2023
Annual filing (Source: SEC)

Modern developments in artificial intelligence (AI), deep learning, and natural language processing (NLP), such as the advent of large language models like GPT-4 and ChatGPT, are extremely useful in the finance industry to increase revenues, automate processes, and streamline workflows. Understand more from this deep dive on the applications of NLP in finance.

NLP in Finance Document Processing and Automation

NLP can be very useful for a variety of roles and tasks in the finance sector. A few examples include:

  • Financial analysts: A financial analyst can use NLP tools on unstructured data from annual filings and earnings calls to help businesses make decisions about investments, mergers and acquisitions, stock market trends, and more.
  • Accountants: Accountants can use NLP for automated invoice processing and matching.
  • Investment bankers: Investment bankers can use NLP on financial reports and prospectuses to help businesses either raise capital through IPOs or invest in another business.
  • Risk managers: Risk managers can use NLP to identify events, news, or regulatory changes that may expose the organization to financial, market, operational, and other risks.

In the following sections, we explain in depth how you can apply NLP in finance.

Use Case #1: Financial Document Summarization

In financial services, analysts regularly have to read long-form financial reports running into hundreds of pages to gain insights. For example, each annual report of Silicon Valley Bank, a bank that failed in March 2023, contains about 200 pages of explanations and tables. Reading them page by page takes a lot of time and effort. Perhaps, if the information had been more accessible and queryable, external analysts would have been able to notice an increased probability of bankruptcy and send out warnings.

For such long-form financial documents, you can use modern NLP models to summarize all the important information while retaining key details verbatim, all within a few seconds. This mix of abstractive and extractive summarization is called blended summarization and is very useful for any use case where key details must be kept unchanged.

Long-Form Document Summarization Using GPT

The biggest limitation of the GPT models is the cap on the number of tokens they can process in a single request. GPT-3 models were limited to about 4,000 sub-word tokens, while GPT-4 models support 8,000 to 32,000 tokens. Any document that exceeds those limits must be broken into smaller pieces (known as chunks) and sent to GPT separately.

But doing so creates new problems, like loss of context or repeated information, which lead to inaccurate summaries. To solve these issues, we use a custom GPT-based pipeline (shown below) built around components like a chunking algorithm and prompt optimization.

Long form summarization pipeline from Width.ai

Let's understand its special components and capabilities.

1) Chunking Algorithm

The chunking algorithm's job is to split long documents with minimal loss of context while retaining their fine-grained, specific details. That means the pipeline should not split at clauses or paragraphs that supply important context to their neighbors. At the same time, the amount of context carried along must be kept in check because the longer the input prompts get, the higher the risk of the model ignoring parts of the context and producing poor summaries.

So given a document, the pipeline decides the optimum balanced size for each chunk and, using custom logic, decides how much context from the earlier chunk must be included. Next, it builds a chunk-specific custom prompt for GPT that has been found to produce a better-quality summary.
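The idea of carrying context from one chunk into the next can be illustrated with a minimal Python sketch. This is not the pipeline's actual custom logic, which the article does not describe in detail. It assumes paragraphs are separated by blank lines, measures size in words rather than tokens, and uses hypothetical names (`chunk_paragraphs`, `max_words`, `overlap_paragraphs`).

```python
def chunk_paragraphs(text, max_words=300, overlap_paragraphs=1):
    """Group whole paragraphs into chunks and attach trailing context.

    Assumption: paragraphs are separated by blank lines and size is
    approximated by word count instead of model tokens.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]

    # First pass: pack paragraphs into groups without splitting any paragraph.
    groups, current, count = [], [], 0
    for para in paragraphs:
        n_words = len(para.split())
        if current and count + n_words > max_words:
            groups.append(current)
            current, count = [], 0
        current.append(para)
        count += n_words
    if current:
        groups.append(current)

    # Second pass: give each chunk the last paragraph(s) of the previous one
    # as read-only context, kept separate from the text to be summarized.
    chunks = []
    for i, group in enumerate(groups):
        context = []
        if i > 0 and overlap_paragraphs > 0:
            context = groups[i - 1][-overlap_paragraphs:]
        chunks.append({"context": context, "body": group})
    return chunks
```

Keeping the carried-over context separate from the body lets the prompt tell the model which text to summarize and which text is only background, which helps avoid repeated information in the combined summary.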

2) Dynamic Prompt Optimization

Dynamic prompt optimization from Width.ai

The GPT models produce better results when we prefix the actual input chunk in the prompt with examples of input text and ideal summaries. This is called few-shot learning. Additionally, if the prompt examples are also made relevant to the input text, the generated summaries are of much better quality.

The task of pulling in the most relevant examples for each input text is called prompt optimization. For that, we maintain a database of gold reference prompt examples and summaries. Given an input text, we use a semantic similarity algorithm to pick the most relevant prompts and summaries and inject them in the prompt dynamically. This ensures that the few-shot examples in the prompt are highly relevant to the input text and summarization task.
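The selection step can be sketched in a few lines of Python. To keep the example self-contained, it scores relevance with simple word overlap (Jaccard similarity) as a stand-in for a real semantic similarity model. The example database, field names, and prompt layout are all illustrative assumptions, not the pipeline's actual format.

```python
import re

# Hypothetical "gold" examples; a real database would hold many more.
EXAMPLES = [
    {"text": "Net interest income rose as deposit costs fell.",
     "summary": "Net interest income improved on cheaper deposits."},
    {"text": "The board approved a share buyback program.",
     "summary": "A buyback was approved."},
    {"text": "Invoice totals were reconciled against purchase orders.",
     "summary": "Invoices were reconciled."},
]


def tokens(text):
    """Lowercased set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-overlap score in [0, 1]; a stand-in for semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_few_shot_prompt(chunk, examples=EXAMPLES, k=2):
    """Prefix the chunk with the k most similar examples."""
    query = tokens(chunk)
    ranked = sorted(examples,
                    key=lambda ex: jaccard(query, tokens(ex["text"])),
                    reverse=True)
    parts = [f"Text: {ex['text']}\nSummary: {ex['summary']}" for ex in ranked[:k]]
    parts.append(f"Text: {chunk}\nSummary:")
    return "\n\n".join(parts)
```

In practice the overlap score would be replaced by similarity between sentence embeddings, but the flow of ranking the stored examples and prefixing the best ones to the input stays the same.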

3) Blended Summarization

Blended summarization is achieved by pairing each input chunk with dynamically selected prompt examples that show the GPT model where we want abstractive summaries and where we want extractive ones. A database of prompts specific to finance or banking is maintained for this purpose.

4) Chunk Combinator

Combinator model for GPT-4 from Width.ai

An output algorithm recombines the summaries generated for each chunk such that they don't feel choppy to a reader but instead feel contiguous with good flow.

Use Case #2: Information Extraction from Financial Documents

Finance, accounting, and banking professionals often need information from complex documents to make decisions on investments, loan approvals, trade reports, and other financial operations. Such information includes:

  • Key entities like company names, company officers' names, addresses, and similar
  • Financial details like revenue or profit numbers
  • Financial data in tabular format, like quarterly earnings
  • Key phrases and conditions

Let's see some practical applications.

Example 1 — Information from Certificate of Incorporation Documents

Finance professionals may need to analyze Certificate of Incorporation (COI) documents to understand a company's legal structure, capitalization, and key business details for investment, evaluating initial public offerings, or due diligence prior to a merger or acquisition. Reviewing such documents can also help finance professionals to ensure compliance with laws and regulations.

The important information in such documents includes:

  • Details about a company's legal structure
  • Information about its authorized capital, types of shares, and shareholding patterns
  • Details about the board of directors and officers
  • Information like company name, date of incorporation, and registered addresses required to establish a company's legality for loan approvals
  • Details necessary for due diligence checks and third-party vetting, such as beneficial ownership information

Such information can be identified and stored in a database using automated information extraction pipelines that accept digital formats or scans of these documents and accurately extract all the essential information using AI, computer vision, and NLP. We'll explain how it works in a later section.

Example 2 — Information from Invoices and Purchase Orders

Invoice extraction with custom positional OCR framework

Another common application is information extraction from invoices and purchase orders so that accounting professionals can either automate or optimize tasks like:

  • Invoice reviews and approvals
  • Three-way invoice matching
  • Data entry into accounting and enterprise resource planning (ERP) systems
  • Invoice analytics
  • Fraud detection
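Once line items have been extracted, a task like three-way matching reduces to comparing structured records. Three-way matching checks an invoice against the purchase order and the goods receipt. The following is a minimal sketch under stated assumptions: the `Line` record, the SKU-keyed receipt dictionary, and the price tolerance are hypothetical and not taken from any specific ERP system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """One line item as extracted from a purchase order or an invoice."""
    sku: str
    quantity: int
    unit_price: float


def three_way_match(po_lines, received_qty, invoice_lines, price_tolerance=0.01):
    """Return a list of discrepancies; an empty list means the invoice matches.

    po_lines, invoice_lines: lists of Line
    received_qty: dict mapping SKU to the quantity on the goods receipt
    """
    ordered = {line.sku: line for line in po_lines}
    issues = []
    for inv in invoice_lines:
        po = ordered.get(inv.sku)
        if po is None:
            issues.append(f"{inv.sku}: not on the purchase order")
            continue
        if abs(inv.unit_price - po.unit_price) > price_tolerance:
            issues.append(f"{inv.sku}: unit price differs from the purchase order")
        if inv.quantity > po.quantity:
            issues.append(f"{inv.sku}: billed quantity exceeds ordered quantity")
        if inv.quantity > received_qty.get(inv.sku, 0):
            issues.append(f"{inv.sku}: billed quantity exceeds received quantity")
    return issues
```

Invoices that return an empty list could be approved automatically, while the others are routed to an accountant with the listed discrepancies attached.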

Example 3 — Loan Application Automation

Text fields recognized and drawn around in the W-2

A third application is the automation of your business workflows, such as loan approvals. Such automated pipelines deliver enormous savings on manual labor and costs. They also inject some neutrality in the evaluation process to help prevent fraud and collusion.

How Automated Information Extraction Works

Document processing pipeline

Intelligent document processing pipelines enable your business to reliably automate the understanding of complex documents and extraction of information from them instead of using manual labor or unreliable semi-automated, template-based approaches. Regardless of the process you're trying to automate, these pipelines all work the same way with the following five stages.

1) Data Capture

Raw documents in formats such as PDF or common image formats are received, stored, and pre-processed to prepare them for the rest of the pipeline.

2) Document Understanding

This is the primary stage where all the information extraction happens, with the following steps:

  • Text extraction: Text content is extracted using format-specific techniques. For images, either optical character recognition (OCR) or an OCR-free technique like the Document Understanding Transformer (Donut) is used to locate and read the text.
  • Document classification and layout analysis: The content and locations of the text are used to identify the type of the document.
  • Information extraction: Document fields and their values in the text are identified with the help of language models or large language models like GPT-4. Named entity recognition identifies people, companies, locations, and similar information. Domain-specific key phrases may also be identified by using the ability of language models to find semantically similar phrases.
  • Fine-tuning on custom document data: Domain-specific entities are identified by fine-tuning the model on custom document data. For example, a person mentioned in a loan application may be identified and labeled as the beneficial owner.
  • Additional classification: Any domain-specific NLP techniques like sentiment analysis or opinion analysis can also be applied.

3) Information Validation and Evaluation

The extracted information is passed through rule engines that check the fields based on business policy rules. If it passes all the rules, it may be passed to more sophisticated statistical models. For example, a loan application may be sent to a loan default risk model or interest projection model to estimate its associated business value and risk.
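A rule engine of this kind can be as simple as a list of named checks applied to the extracted fields. The sketch below is illustrative only: the field names and the policy rules, including the five-times-income cap, are hypothetical assumptions rather than real lending policy.

```python
# Each rule is a (description, check) pair; a check returns True when the
# application satisfies the rule. All rules here are hypothetical examples.
RULES = [
    ("applicant name is present",
     lambda app: bool((app.get("applicant_name") or "").strip())),
    ("loan amount is positive",
     lambda app: app.get("loan_amount", 0) > 0),
    ("annual income is positive",
     lambda app: app.get("annual_income", 0) > 0),
    ("loan is at most 5x annual income",
     lambda app: app.get("loan_amount", 0) <= 5 * app.get("annual_income", 0)),
]


def validate(application, rules=RULES):
    """Return the descriptions of all rules the application fails."""
    return [name for name, check in rules if not check(application)]


def next_step(application):
    """Route the application: only clean ones go on to statistical models."""
    failures = validate(application)
    return "send to risk model" if not failures else "return for correction"
```

Keeping the rules as data makes it easy for a compliance or credit team to review them and to add new ones without touching the routing logic.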

4) Information Storage

The extracted information is stored in external systems like databases or ERP.

5) Business Process Integrations

The extracted information is supplied to, or fetched by, other business workflows and acted upon. For example, based on loan default risk models or other business data analytics, loan applications are prioritized and sent for manual reviews and approvals by loan officers.

Use Case #3: Financial Risk Assessment

Regulatory alerts and advisories (Source: FinCEN)

Regulatory compliance is one of the most challenging tasks in finance and banking. Non-compliance or mistakes in compliance carry risks like civil penalties, criminal prosecution, and damage to reputation. But compliance isn't simple. The difficulties include:

  • A mind-boggling set of complex federal and state regulations: A financial institution in the U.S. must always comply with a huge array of regulations like the Bank Secrecy Act anti-money laundering rules, the Foreign Corrupt Practices Act, the Sarbanes-Oxley Act, and perhaps at least a dozen more laws, regulations, or sanctions orders.
  • A large number of regulators: The number of regulators and departments is equally impressive, including the Department of the Treasury and its units like the Financial Crimes Enforcement Network (FinCEN) and the Office of Foreign Assets Control (OFAC), the Department of Labor, the Department of Commerce, the Department of Justice, the Federal Reserve, the Federal Deposit Insurance Corporation, and many more.
  • Reams of textual data: The documents that compliance officers must read include the United States Code, the Code of Federal Regulations, the state codes of the states they operate in, frequent bulletins or guidance documents from the regulatory agencies, case law, and more.

From a business perspective, compliance is often seen as a cost center with few benefits. So achieving full compliance with minimal cost and effort is a desired goal of all businesses. But its inherent and emergent complexities make it a difficult goal to achieve in practice.

The volume of documents being put out and constantly changed means that there is always niggling uncertainty over becoming inadvertently non-compliant simply due to ignorance about some minor change in some document. So compliance departments have no choice but to read them all.

AI and NLP can help with that and reduce compliance costs by reducing the time and labor expended on document reading and comprehension. In this section, we'll explain how NLP pipelines can automatically deduce whether a regulatory change is relevant to your business and notify compliance officers about the change in real time. Once notified, they can use question-answering chatbots to seek answers to complex questions about the regulations, a use case we cover later on.

How Automated Regulatory Relevancy Checking Works

Example of GPT-4 evaluating complex financial regulations against business details pulled from ERP

The main question — "Does this clause of this regulation apply to my business?" — can be treated as a semantic similarity problem.

On one side is your business with various aspects like:

  • Your portfolio of financial products and services, possibly very dynamic
  • The jurisdictions in which you conduct business
  • The demographics of your customers or clients
  • Your assets and liabilities

On the other side are the laws, regulations, and guidelines of all those jurisdictions with a complex set of conditions to decide who is regulated, when, and how.

So the semantic similarity problem is to take those complex regulatory conditions and match them against the aspects of your business. Most of these aspects will be in your ERP. A secondary semantic similarity problem is to detect whether a regulatory clause has changed between the last time it was fetched and now.

One thing GPT models are very good at is intelligently judging semantic similarity without requiring any hardcoded rules or thresholds like in traditional NLP. We can basically supply the text of the regulatory clauses along with the business aspect values from your ERP and ask GPT if anything matches. Every match is a regulation that's potentially relevant to your business. You can then dig deeper with the help of a question-answering chatbot (covered in the next section).
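As a rough illustration, the prompt for such a relevance check might be assembled as follows. This sketch only builds the prompt text and parses a reply; it makes no API call. The profile fields, the prompt wording, and the RELEVANT / NOT RELEVANT answer format are assumptions made for this example.

```python
def build_relevance_prompt(clause_text, business_profile):
    """Combine a regulatory clause with business aspects pulled from the ERP.

    business_profile: dict mapping an aspect name to a list of values,
    for example {"jurisdictions": ["State X"]} (hypothetical structure).
    """
    profile_lines = "\n".join(
        f"- {aspect}: {', '.join(values)}"
        for aspect, values in business_profile.items()
    )
    return (
        "You are assisting a compliance officer.\n"
        "Business profile:\n" + profile_lines + "\n\n"
        "Regulatory clause:\n" + clause_text.strip() + "\n\n"
        "Question: Does this clause potentially apply to the business "
        "described above? Answer RELEVANT or NOT RELEVANT, then give a "
        "one-sentence reason."
    )


def parse_verdict(reply):
    """Interpret the model's reply under the answer format requested above."""
    return reply.strip().upper().startswith("RELEVANT")
```

Asking for a fixed leading keyword keeps the reply easy to parse, while the one-sentence reason gives the compliance officer something to verify.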

Similarly, GPT is used to detect changes between previous regulatory text and current text to find deltas. If there's a delta, task frameworks like LangChain can be integrated with the GPT model to notify relevant compliance officers in relevant jurisdictions.
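Before asking GPT whether a change is meaningful, a cheap textual diff can first establish whether anything changed at all. The sketch below uses Python's standard `difflib` module for that pre-filter. It is a swapped-in technique for illustration, not the GPT-based comparison described above, and the function names are hypothetical.

```python
import difflib


def changed_lines(previous_text, current_text):
    """Return removed ('- ') and added ('+ ') lines between two versions."""
    diff = difflib.ndiff(previous_text.splitlines(), current_text.splitlines())
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


def needs_review(previous_text, current_text):
    """True when the clause text differs and should be sent on for analysis."""
    return bool(changed_lines(previous_text, current_text))
```

Only clauses for which `needs_review` is true would then be passed to the model, together with the changed lines, to judge whether the delta is material and whom to notify.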

Use Case #4: Custom GPT-4 Chatbots for Finance and Banking Documents Question-Answering

In this section, we explain how GPT chatbots are used for helping employees understand complex financial documents.

The Problem

Example banking regulation (Source: CFR)

Finance professionals often have to read long, dense documents like compliance regulations, annual reports, financial analyses, and investor reports. Likewise, banking professionals have to wade through documents like investment reports, compliance regulations, and corporate loan applications.

Reading them requires concentration and attention to detail because a lot of important information and answers are scattered over hundreds of pages. These professionals may have specific questions in mind, but the structures of those documents force them to expend a lot of brain cycles and time (and time is money) on reading and comprehension to find the answers.

GPT Chatbots Solve it Efficiently

Chatbots streamline this considerably by digesting thousands of pages of documents in seconds and providing accurate, natural-language answers to complex questions about those documents.

Chatbots are not new, but users trust them only when they can give accurate and complete answers to complex questions every time. That's always been a sticking point with traditional chatbots, even those like Google's Dialogflow driven by older AI.

But chatbots backed by the awesome power of large language models like GPT-4 are a different breed altogether. Trained on vast quantities of real-world documents, they achieve unparalleled levels of semantic comprehension, abstraction, and accuracy.

Business benefits of using GPT chatbots include:

  • Huge savings in effort, time, and money
  • Highly accurate answers to questions with attention given to every trivial condition that the user has mentioned in their questions
  • Reduced risk of missing important and relevant information
  • Proficiency in multiple languages, which is essential for multinational companies
  • Always up-to-date information, which is critical in dynamic areas like regulation enforcement
  • Better work-life balance for finance and banking employees

GPT-Based Banking Chatbot in Action

The example below shows one of our banking chatbots in action answering complex questions with natural language answers:

Customer facing bank chatbot from Width.ai

GPT Chatbot vs. Dialogflow or BlenderBot

Frameworks like Google Dialogflow and Meta's BlenderBot are around to help create AI chatbots. But they have the following drawbacks compared to custom GPT chatbots:

  • Limited flexibility: Dialogflow expects you to plan the interactions beforehand and design structured querying using forms. But when professionals are looking for answers in complex documents, the interactions may be unpredictable and open-ended. Are they looking for specific one-word answers? Are they looking for an abstract, conceptual understanding of something? Will the conversations take multiple unpredictable turns because the user has many questions? All these are out-of-the-box capabilities of GPT chatbots but not Dialogflow.
  • Limited conversational abilities: GPT has better conversational abilities than the others. Conversational ability includes human-like talk as well as the ability to retain and use context and details from previous messages.
  • Limited multilingual support: GPT has officially demonstrated proficiency in dozens of languages while anecdotal reports on social media claim proficiency in hundreds. In contrast, social proof for multilingual proficiency in the other frameworks is not as visible and may turn out to be a risk for multinational businesses.

Width.ai Custom GPT Chatbot Architecture

A custom architecture to turn a GPT chatbot into one capable of understanding complex internal business documents and answering complex questions is shown below:

Width.ai dynamic chatbot architecture

We'll walk you through the high-level steps that go into readying your information desk chatbot.

1) Information Extraction From Your Documents

First, we collect all the useful documents like regulations, FAQs, and knowledge base articles that contain useful information for your employees. This information isn't specific to a user but something that's applicable to everyone, like in this example:

Example bank information content for chatbot

Potential questions and informational facts are extracted from such content using manual annotation or web scraping. They go into a knowledge bank and provide the details and context required for the answers.

2) Prompt Optimization and Placeholder Variables

Prompts are key to getting the most out of GPT. When we build GPT chatbots, we must frame the relevant details and questions in particular ways for GPT to interpret them correctly. We can provide a few examples of ideal prompts and answers (few-shot learning) to GPT so that it can dynamically figure out what's expected of it based on the patterns in the examples.

We do this by maintaining a database of gold-standard prompts and answers. When a customer query is received, we dynamically select the most relevant examples from that database and prefix them to the customer's query before asking GPT. This helps GPT interpret the query correctly and return high-quality responses.

Placeholder variables are another important aspect of this phase. Instead of hardcoding details and links, we train GPT to output placeholders. These variables are replaced later with country-specific or department-specific information to provide personalized answers. For example, currencies and policy pages may be different in each country. So we ask GPT to generate placeholders for them instead of hardcoding a currency or page link.

In the example below, GPT outputs a placeholder for the link to a pricing page which will be replaced later in the pipeline:

GPT outputs a placeholder variable
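The later replacement step can be sketched with a small substitution function. The double-brace placeholder syntax, the variable names, and the example URL are assumptions made for illustration, since the exact placeholder format is not specified here.

```python
import re

# Assumed placeholder syntax: {{variable_name}}
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_placeholders(answer, values):
    """Replace known placeholders with country- or department-specific values.

    Unknown placeholders are left untouched so they can be spotted and
    reviewed rather than silently dropped.
    """
    def replace(match):
        return values.get(match.group(1), match.group(0))

    return PLACEHOLDER.sub(replace, answer)


# Hypothetical per-country settings.
UK_VALUES = {
    "currency": "GBP",
    "pricing_page_url": "https://example.com/uk/pricing",
}
```

For example, `fill_placeholders("Fees are listed in {{currency}} at {{pricing_page_url}}.", UK_VALUES)` produces a UK-specific answer, and the same GPT output can be reused for another country by passing a different dictionary.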

3) Fine-tuning GPT

In addition to few-shot examples and prompt optimization, another step that can potentially improve the quality of answers is fine-tuning the GPT model by supplying a dataset of questions and answers. Fine-tuning essentially creates a custom GPT model, stored on OpenAI's systems, that's available only to your company. It's a good approach if the nature of information, prompt syntax, and answer formats are very different and domain-specific compared to the standard text generated by GPT.

4) Information Retrieval With Vector Embeddings

The essential idea here is that, given a customer query, we look up our database of extracted questions and answers, find the question that is most similar to the customer query, and return the answer associated with that question.

To implement this idea of finding the most similar question, we convert all questions and queries to numerical forms called embeddings. They are essentially vectors that encode various linguistic and contextual information as numbers. Once the text is converted to vectors, we can use mathematical techniques like cosine similarity to find a question vector that is similar to a customer query vector. The answer associated with that question vector is then the most relevant answer to the customer's query as well.
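The lookup itself is straightforward once vectors exist. The sketch below computes cosine similarity in plain Python and returns the best-matching entry. The three-dimensional vectors are made-up toy values standing in for real embeddings, and the knowledge bank entries are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query_vector, knowledge_bank):
    """Return the entry whose question vector is closest to the query."""
    return max(
        knowledge_bank,
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
    )


# Toy knowledge bank; real vectors would come from an embedding model
# and have hundreds of dimensions.
KNOWLEDGE_BANK = [
    {"question": "How do I reset my online banking password?",
     "answer": "Use the reset link on the login page.",
     "vector": [0.9, 0.1, 0.0]},
    {"question": "What are the wire transfer fees?",
     "answer": "Fees are listed on the pricing page.",
     "vector": [0.0, 0.2, 0.9]},
]
```

A query embedded as `[0.1, 0.1, 0.8]` would retrieve the wire transfer entry, because its vector points in nearly the same direction.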

To convert questions and queries to vectors, we use a model called Sentence-BERT from the sentence-transformers library. It provides excellent results for such similarity tasks. We can further fine-tune it on our domain-specific question-answer datasets to improve semantic relevance.

5) Vector Database

In the previous step, there are likely to be thousands or even millions of questions and answers. So a system that can store millions of vectors and calculate similarities quickly is necessary. Such systems are called vector databases or vector search engines. Pinecone, a managed vector database, and FAISS, an open-source similarity search library, are some popular options.

Unleash the Power of Large Language Models and NLP in Finance

In this article, we showed how large language models can streamline a variety of tasks in finance and banking. While these models are already very capable, the ability to build custom models and pipelines from them boosts their capabilities even more. Here at Width.ai, we have years of expertise in customizing and fine-tuning large language models and other machine-learning algorithms for multiple industries.

Contact us to find out how we can help bring the power of large language models to your business!