Automate Your Invoice Processing With Invoice Data Capture Software

Karthik Shiraly
August 26, 2022

Your business purchases goods and services from multiple vendors and receives their invoices. Unfortunately, most supplier invoices are designed for people and not software systems, posing several problems to your business operations and decision-making.

In this article, you’ll learn about the challenges of manual and template-based invoice processing and the tremendous advantages of automated invoice data capture over them. You’ll get to know all the features of Width.ai’s invoice data capture software and find out how your business can deploy our software in just six easy steps.

Challenges in Manual Invoice Processing

Manual invoice processing involves your internal invoice team entering data into your accounting system by hand, checking it for problems, confirming details with other departments, and having senior accounting staff or management review it. This standard process has a number of drawbacks that can cause long-term problems, especially for small businesses.

  1. High human resources and effort: The manual approach requires a large support staff for data entry, multiple rounds of verification, and coordination. Naturally, this has cascading effects on other roles like management, HR, and payroll.
  2. High costs and poor ROI: Copying information from an invoice into a system adds little value relative to what you pay to have it done. Many small businesses have their accountant process invoices, which can be even more expensive at average US hourly rates.
Computing cost per invoice with a manual pipeline.
  3. Time-consuming: When you face business challenges like demand fluctuations and agile competitors, manual data entry and verification slow down your decision-making.
  4. Error-prone: Manual invoice processing is prone to many kinds of errors: typos, mistaking date and currency formats of other countries, misunderstanding handwritten text, and more.
  5. Risk of souring relationships with vendors: The above challenges can cause additional problems like late payments or payment disputes with vendors and sour relationships with them.
  6. Cannot handle many formats: Vendor invoices take many forms: paper received through direct mail or fax, email attachments, digital files like PDF or image files, special formats like XML or electronic data interchange (EDI), or forms on invoicing portals for vendors. Handling each type manually using ad hoc processes can lead to coordination problems and errors.
  7. Requires physical storage: Paper invoices require resources for physical storage and organization, reducing available space and requiring more manpower.
  8. Complicates coordination: In some industries, multiple departments coordinate to verify and process invoices. Manual processing means paper invoices or digital files have to be passed around physically or by email instead of everybody seeing the same invoice through an online portal.
  9. Invoicing practices are slow: Manually conducting critical invoicing practices (like matching with purchase orders, bookkeeping, and ledger reconciliation) takes an unacceptably long time.
  10. Cannot detect mistakes and fraud quickly: Manually verifying and correlating information across multiple invoices is neither easy nor efficient for people. As a result, both unintentional mistakes and deliberate fraud can go undetected for a long time and cause problems later on during financial audits.
  11. Does not scale: All these challenges make for an approach that just can’t scale beyond a few vendors.
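
The cost of a manual pipeline can be estimated per invoice with simple arithmetic. The sketch below is only an illustration: the hourly rate, the minutes per invoice, and the rework rate are hypothetical placeholders, not figures from any study.

```python
def manual_cost_per_invoice(hourly_rate, minutes_per_invoice, rework_rate=0.0):
    """Estimate the labor cost of processing one invoice by hand.

    rework_rate is the fraction of invoices that need a second pass because
    of an error (a hypothetical parameter used for illustration).
    """
    base_cost = hourly_rate * (minutes_per_invoice / 60)
    # Each reworked invoice costs roughly one more pass of the same effort.
    return base_cost * (1 + rework_rate)


# Illustrative inputs only: $30/hour, 12 minutes per invoice, 10% rework.
cost = manual_cost_per_invoice(hourly_rate=30.0, minutes_per_invoice=12, rework_rate=0.10)
print(f"Estimated cost per invoice: ${cost:.2f}")
```

Plugging in your own staffing numbers gives a baseline to compare against any automated option.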

Challenges in Template-Based Invoice Capture

Template-based OCR labeling (Source: Label Studio project)

Given these problems with manual processing, most businesses use semi-automated invoice capture that combines invoice digitization, optical character recognition (OCR), and invoice templates.

Because invoices vary so widely in layouts and positions, you need a configurable way to make the software associate the text in an invoice with suitable invoice fields. Invoice templates solve this to an extent. An invoice template is a set of positional information and parsing rules that enable the software to transform the unstructured text from an invoice into structured data if that invoice matches that template. Some invoicing applications provide visual editors to create these templates easily.
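
To make the idea of a template concrete, here is a minimal sketch of positional rules plus parsing rules applied to OCR output. The `FieldRule` structure, the coordinates, and the regular expressions are assumptions made for illustration; they do not describe any particular product's template format.

```python
import re
from dataclasses import dataclass


@dataclass
class FieldRule:
    """One template entry: where to look on the page and how to parse it."""
    name: str
    box: tuple    # (x_min, y_min, x_max, y_max) in page coordinates
    pattern: str  # regular expression with one capture group


def apply_template(words, rules):
    """Extract fields from OCR output using a fixed-position template.

    words: list of (text, x, y) tuples, as an OCR step might produce.
    """
    result = {}
    for rule in rules:
        x0, y0, x1, y1 = rule.box
        # Keep only the words whose position falls inside the rule's box.
        region = " ".join(t for t, x, y in words if x0 <= x <= x1 and y0 <= y <= y1)
        match = re.search(rule.pattern, region)
        result[rule.name] = match.group(1) if match else None
    return result


template = [
    FieldRule("invoice_number", (400, 0, 600, 50), r"Invoice\s*#\s*(\S+)"),
    FieldRule("total", (400, 700, 600, 760), r"Total\s*\$?([\d.,]+)"),
]

ocr_words = [("Invoice", 410, 20), ("#", 470, 20), ("INV-1001", 490, 20),
             ("Total", 420, 730), ("$1,250.00", 480, 730)]
extracted = apply_template(ocr_words, template)
print(extracted)

# The same text shifted 100 units down the page no longer matches the boxes.
shifted = [(t, x, y + 100) for t, x, y in ocr_words]
extracted_shifted = apply_template(shifted, template)
print(extracted_shifted)
```

The second call returns `None` for both fields, which shows why a layout that differs even slightly usually needs its own template.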

Unfortunately, this approach also has several challenges that can’t be fully solved, only mitigated with custom post-processing software and frequent manual intervention.

1. Cannot Handle a Large Number of Invoice Layouts

Given the enormous variance in invoice layouts out there, the template approach just doesn’t scale well beyond a point. Businesses that get hundreds of invoices from different vendors daily face scaling challenges:

  • Because these templates are not very flexible with their positioning and parsing rules, a new template is needed for every unique invoice layout.
  • Maintaining, searching, and naming hundreds of templates becomes a logistical nightmare.
  • As the template count increases to cover more varied invoices, the overall extraction accuracy goes down.
  • The effort required for creating invoice templates should be comparable to that for simple data entry. Otherwise, some teams may simply decide that it’s easier to process some invoices manually.

As the number of templates grows, the accuracy of template- and OCR-based approaches falls short of what business use cases require.

2. Low Tolerance for Noisy Data

Invoices in the real world can contain lots of noise that confuses the template approach:

  • Variations in positions of invoice fields
  • Text that is not useful
  • Perspective and lighting issues in invoice images
  • Handwritten text
  • Colors

Even something as simple as moving the invoice date value relative to its label (the key) can create problems in template-based processing. These OCR and template architectures do not learn the relationship between a key and its value, which makes it very difficult to pair the two when their proximity or order changes.

3. No Deep Understanding of Text and Fields

The template approach does not allow for a deep understanding of the relationships between the invoice text, positions, and fields. As a result, follow-up tasks that require such understanding, like invoice matching, also suffer from correctness problems.

4. Prone to Errors

Older OCR approaches use only visual features like contours for text recognition. As a result, typos and wrong numerical values are common. This problem is worse when processing paper invoices and handwritten text.

Due to the unpredictable nature of these errors, businesses have to recheck everything manually and are forced to employ support staff in this approach too.

5. Does Not Support Automated Mapping to Ledger Codes

These semi-automated approaches are not intelligent enough to automate mandatory practices like mapping invoice items to general ledger codes.

6. Invoice Matching Is Not Reliable

Because they lack semantic understanding of the data, such approaches cannot match purchase orders, invoices, and receipts reliably. Manual intervention is necessary. 

7. Employment and Other Costs Are Still High

The semi-automated approaches reduce some expenses compared to manual processing. But those savings are lost again when hiring support staff for other tasks like template creation and manual verification of values.

8. Only Slightly Better at Detecting Mistakes and Fraud

Semi-automated approaches improve the chances of detecting mistakes and fraud across invoices. But only slightly, because they don’t understand the invoices semantically and can’t match data across invoices as people can.

9. Limited Customization Support

Lacking any semantic understanding of invoices, semi-automated approaches just can’t support a high level of customization beyond invoice templates and some settings.

7 Business Benefits of Automated Invoice Processing

What is the state of invoice processing out there? Consider these alarming statistics:

  • 40% of invoices are still paper-based. 
  • 75%+ of respondents said their accounting departments received at least one suspicious invoice.
  • 58% said they experienced vendor fraud.
  • 71% of accounting professionals said a lack of visibility in the invoicing workflow left them vulnerable to cyberattacks.
  • 44% of finance professionals said a lack of automation will harm their companies’ futures.

Fully automated invoice data capture is the solution to all these challenges. From the same study, companies with fully automated invoicing “were 63% more likely than average to feel very confident in their fraud prevention” and 72% of finance leaders said, “it would make their companies less susceptible to ransomware attacks and other issues.” 

Fully automated invoice data capture uses machine learning to accurately identify invoice elements such as prices, names, quantities, products, and so on in any invoice regardless of its layout and other specifics.

It extracts that information as structured data that can be sent to downstream systems like enterprise resource planning (ERP), customer relationship management (CRM), and internal databases. Automated invoice data capture brings tremendous benefits to your business.
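
As an illustration of what "structured data" can look like, the sketch below models one invoice and serializes it to JSON for a downstream system. The field names and sample values are hypothetical and are not the actual output schema of any product.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float


@dataclass
class Invoice:
    vendor: str
    invoice_number: str
    invoice_date: str  # ISO 8601 date string
    currency: str
    line_items: list = field(default_factory=list)

    def total(self):
        """Sum of quantity times unit price over all line items."""
        return round(sum(i.quantity * i.unit_price for i in self.line_items), 2)


invoice = Invoice(
    vendor="Acme Supplies",
    invoice_number="INV-1001",
    invoice_date="2022-08-26",
    currency="USD",
    line_items=[LineItem("Printer paper", 10, 4.5), LineItem("Toner", 2, 60.0)],
)

# asdict() converts nested dataclasses too, so line items become plain dicts.
payload = asdict(invoice)
payload["total"] = invoice.total()
print(json.dumps(payload, indent=2))
```

A flat, predictable structure like this is what makes it practical to load invoice data into an ERP, CRM, or database without manual re-keying.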

1. Slash Invoice Processing Costs and Increase ROI

Unlike both manual and template-based approaches that impose high employment and overtime costs, automated invoice capture software requires minimal support staff and human intervention. Over time, even that reduces as the software automatically learns your invoicing needs. Costs on other downstream tasks are also reduced. 

Automation brings a very high ROI. It reduces labor costs for invoicing by a third. Plus, 74% of chief financial officers (CFOs) at firms with annual revenues between $1.5 billion and $2 billion felt that digitization improved their balance sheets.

2. Get Highly Accurate Results

Unlike traditional OCR, this approach combines visual features and natural language models using deep learning for true entity understanding. It can correctly identify invoice fields, amounts, quantities, and dates based on context and values. It is largely immune to typos and misidentification of characters. The high data accuracy brings significant downstream benefits to your business:

  • Reduces the risks of financial losses from wrong amounts.
  • Avoids problems during financial audits.
  • Reduces the chances of disputes, late payment penalties, and legal actions from vendors.
  • Enables your business to keep an accurate record of its cash flows and finances at all times.

3. Automatically Detect Mistakes and Fraud

Semantic understanding, high accuracy, and automated invoice matching enable real-time detection of any incorrect entries, duplicate entries, and deliberate fraud in the invoices. This is something even manual processing by experts cannot do efficiently. This benefits your business by avoiding financial losses due to fraud. You’ll also avoid problems and penalties during financial audits, which are particularly problematic for your business reputation if you’re a publicly listed company or undergoing due diligence for an acquisition.
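
Once invoice data is structured, a duplicate check becomes a simple lookup. The sketch below is a deliberately simplified stand-in, not the detection method described above: it flags invoices that share a normalized vendor name, invoice number, and total. The record layout is assumed for illustration.

```python
def find_duplicates(invoices):
    """Return pairs of invoice ids that share vendor, number, and total."""
    seen = {}
    duplicates = []
    for inv in invoices:
        # Normalize casing, spacing, and hyphens so trivial variations match.
        key = (
            inv["vendor"].strip().lower(),
            inv["invoice_number"].replace("-", "").replace(" ", "").upper(),
            round(inv["total"], 2),
        )
        if key in seen:
            duplicates.append((seen[key], inv["id"]))
        else:
            seen[key] = inv["id"]
    return duplicates


invoices = [
    {"id": 1, "vendor": "Acme Supplies", "invoice_number": "INV-1001", "total": 165.0},
    {"id": 2, "vendor": "acme supplies ", "invoice_number": "inv 1001", "total": 165.00},
    {"id": 3, "vendor": "Globex", "invoice_number": "G-77", "total": 12500.0},
]
duplicate_pairs = find_duplicates(invoices)
print(duplicate_pairs)
```

Exact-key matching like this only catches the simplest cases; detecting altered amounts or lookalike vendor names needs fuzzier comparison.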

4. Process a Wide Range of Invoice Variations Without Invoice Templates

Instead of relying on rigid rules and template matching like template-based approaches, this approach understands the semantics of invoice data the way people do. It can handle invoices having any layout, lighting conditions, formatting, and other variations without requiring any invoice templates at all. As a business:

  • You don’t need to assign staff for creating and maintaining invoice templates.
  • You don’t have to impose any invoice format rules on your vendors, which should make them happy and improve your vendor relationships.
  • You’ll experience a reduction in invoice rejections due to formatting mistakes. This will improve your standing with your vendors and avoid payment penalties or misunderstandings.

5. Increase Processing Efficiency and Reduce Time

Automated invoice capture that matches human-level understanding can streamline every step of the process. It can accurately extract invoice data from even the most complex invoices within seconds. It significantly reduces time, effort, and money compared to manual data entry and double-checking.

6. Integrate Seamlessly With All Your Invoicing Practices

Semantic capturing of invoice data enables you to automate many of your downstream business practices too:

  • Automatically map vendors to their account numbers and line items to correct general ledger codes without any manual effort.
  • Intelligently match invoices to associated purchase orders and receipts without human intervention.
  • Automatically send important and high-amount invoices for management approval.

7. Get Real-Time Financial Data for Your Business Intelligence, Procurement, and Marketing Teams

These teams are critical to your data-driven decision-making. With automated invoice capture, they can get up-to-date figures on cash flow, lifetime value, and the financial state of your business in real time. In one study, 95% of CFOs said automation played a very important role in maintaining healthy balance sheets. Moreover, unlike with manual or template-based approaches, this information is both reliable and up to date. It can help you make decisions about pricing, vendor contracts, and deals based on the latest reliable data.

Features of Our Automated Invoice Data Capture Software

Width.ai’s automated invoice data capture software brings all the above benefits to your business and comes with compelling features.

Handles Any Invoice Layout Without Invoice Templates

The software uses state-of-the-art deep learning to understand invoices the way people do. This deep, human-like, semantic understanding of invoices enables it to process any invoice layout and extract data accurately. We’ve used the same underlying deep learning pipeline for information extraction in legal document cover sheets and supported more than 50 different layouts.

Fully Automated

Our software provides fully automated, hands-off invoice data capture that automatically fetches invoices from multiple sources, processes them in bulk, extracts their data accurately with minimal human monitoring, and exports the data to multiple formats or systems. Your invoice can go from uploaded to processed with extracted data in a matter of seconds.

Efficient and Scalable

Our software processes even complex multi-page invoices with handwritten text in seconds with astonishing accuracy. It can process millions of invoices every day.

If you have an archive of historical invoices (even handwritten or typewritten ones), we can provision additional cloud resources to process them quickly. Historical invoices can help your business intelligence teams detect long-term trends.

Supports a Large Number of Invoice Formats

Our software supports a large number of invoice formats:

  • Digital invoices from popular invoice management systems: We support built-in integrations for any layout from popular invoice management systems like FreshBooks, QuickBooks, Zoho Books, Xero, and Pilot. The software can fetch invoices using their APIs and process them without any configuration.
  • Digital invoices from accounting systems: It supports invoices from accounting systems like SAP FICO using their APIs.
  • Digital formats: It can process invoice files stored in digital formats like PDF, XML, EDI, and image formats (including low resolution and noisy photos). It supports periodic fetching of files from your internal network storage and cloud storage services like AWS S3.
  • Emailed invoices: It can monitor mailboxes for invoices received as email attachments in one of the supported digital formats. It periodically fetches emails and processes any attached invoices.
  • Paper invoices: It can process paper invoices received by postal mail or fax and digitized using scanners or smartphone cameras.
  • Handwritten invoices: It can recognize handwritten text using visual features and language models to accurately guess what’s written just like people do. Any handwritten details, corrections, or additions are automatically processed and included in the relevant invoice field.

Supports a Large Number of Export Formats and Systems

The software can automatically push the captured invoice data to your ERP or accounting system (like SAP FICO), CRM, database, or email.

It can also export to multiple digital formats like PDF, Excel, JavaScript Object Notation (JSON), CSV, and many more.

You can set up custom export workflows that select invoices using custom criteria (like invoice amounts or vendor names) and export the extracted data to multiple destinations or approval workflows.
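
A custom export workflow boils down to a selection rule plus a destination. The sketch below, using only Python's standard library, filters invoices with a caller-supplied criterion and writes the matches as CSV. The function name, the record layout, and the 10,000 threshold are hypothetical.

```python
import csv
import io


def export_csv(invoices, criteria, columns):
    """Write the invoices that satisfy `criteria` to a CSV string."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not in `columns`.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for inv in invoices:
        if criteria(inv):
            writer.writerow(inv)
    return buffer.getvalue()


invoices = [
    {"vendor": "Acme Supplies", "invoice_number": "INV-1001", "total": 165.0},
    {"vendor": "Globex", "invoice_number": "G-77", "total": 12500.0},
]

# Hypothetical criterion: only invoices above 10,000 go to this export.
high_value_csv = export_csv(
    invoices,
    lambda inv: inv["total"] > 10_000,
    ["vendor", "invoice_number", "total"],
)
print(high_value_csv)
```

The same criterion function could just as well route matching invoices to an approval queue instead of a file.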

Highly Customizable

The entire pipeline is fully customizable. It can be fine-tuned for your specific invoices to further increase accuracy.

Supports Automated Invoice Matching

In addition to invoices, our software also understands purchase orders and receipts semantically. It can do three-way matching between purchase orders, invoices, and receipts automatically.
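
The logic of three-way matching can be sketched once all three documents are available as structured data. The example below assumes line items keyed by SKU and a small price tolerance. It is a simplified illustration, not the matching method used by the software.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    matched: bool
    issues: list


def three_way_match(po, invoice, receipt, price_tolerance=0.01):
    """Compare billed items against what was ordered and what was received.

    po and invoice map SKU -> (quantity, unit_price); receipt maps
    SKU -> quantity received. These shapes are assumptions for illustration.
    """
    issues = []
    for sku, (inv_qty, inv_price) in invoice.items():
        if sku not in po:
            issues.append(f"{sku}: billed but not ordered")
            continue
        po_qty, po_price = po[sku]
        if abs(inv_price - po_price) > price_tolerance:
            issues.append(f"{sku}: price {inv_price} differs from PO price {po_price}")
        received = receipt.get(sku, 0)
        if inv_qty > received:
            issues.append(f"{sku}: billed {inv_qty} but received {received}")
        if inv_qty > po_qty:
            issues.append(f"{sku}: billed {inv_qty} but ordered {po_qty}")
    return MatchResult(matched=not issues, issues=issues)


po = {"PAPER-A4": (10, 4.50), "TONER-X": (2, 60.00)}
invoice = {"PAPER-A4": (10, 4.50), "TONER-X": (3, 60.00)}
receipt = {"PAPER-A4": 10, "TONER-X": 2}

result = three_way_match(po, invoice, receipt)
print(result.matched)
for issue in result.issues:
    print(issue)
```

In practice the hard part is lining up items that are described differently on each document, which is where semantic understanding matters; the comparison itself is straightforward.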

Supports Management Approval Workflows

Our software supports routing invoices for management approval based on invoice amounts, vendor names, or other criteria.

Deploys Very Accurate Field Value Matching

Our software builds a deep understanding of the relationships between invoice text, positions, semantics, and extracted fields. It can identify fields and values accurately even when field labels are missing. It has built-in support for more than 50 common invoice fields.

Captures Additional Information

In addition to the common fields, your business may want to extract other important information from invoices. For example:

  • If a vendor specifies payment terms different from the standard "net 30" (i.e., full payment in 30 days), those terms should be reported to your accounting department to avoid late payment penalties.
  • If you're a government agency, you may expect your vendors to follow special rules for their invoice numbers, and you may want to automatically validate them.

Our software has features like language comprehension and post-processing actions to handle such special needs. For example, its natural language capabilities enable it to identify a sentence containing payment terms and file it against the payment terms field.
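
As a rough illustration of the payment-terms example, the sketch below uses a plain regular expression to find a "net N" phrase and flag terms that differ from net 30. A regular expression is a much cruder tool than the language comprehension described above and is used here only to keep the example self-contained.

```python
import re

# Matches phrases such as "net 30", "Net 15", or "NET45".
NET_TERMS = re.compile(r"\bnet\s*(\d{1,3})\b", re.IGNORECASE)


def find_payment_terms(text, standard_days=30):
    """Return (days, is_nonstandard) for the first 'net N' phrase, or None."""
    match = NET_TERMS.search(text)
    if match is None:
        return None
    days = int(match.group(1))
    return days, days != standard_days


terms_short = find_payment_terms("Payment due net 15 days from invoice date.")
terms_standard = find_payment_terms("Terms: Net30")
terms_missing = find_payment_terms("Thank you for your business.")
print(terms_short, terms_standard, terms_missing)
```

A non-standard result such as net 15 is the kind of value that would be filed against the payment terms field and surfaced to accounting.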

Captures Custom Fields

Our software supports over 50 of the most common invoice fields right out of the box. But it also allows you to add the custom fields you want. We do all the fine-tuning necessary to add these new fields to your pipeline.

The custom fields we have added using these techniques include:

  • Mapping items to appropriate general ledger codes using text classification
  • Classifying paper invoices by industry and vendor
  • W-9 forms
  • Due dates
  • Different addresses and emails
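
To show the shape of the ledger-code mapping task, here is a toy classifier that scores each line item description by keyword overlap. The ledger codes and keywords are invented for this example, and keyword matching is only a stand-in for the trained text classification mentioned in the list above.

```python
# Hypothetical general ledger codes and keywords, for illustration only.
GL_KEYWORDS = {
    "6100-Office Supplies": {"paper", "toner", "pens", "stapler"},
    "6200-Software": {"license", "subscription", "saas"},
    "6300-Travel": {"flight", "hotel", "taxi"},
}


def classify_line_item(description, default="9999-Unclassified"):
    """Pick the ledger code whose keywords overlap the description most."""
    tokens = set(description.lower().split())
    best_code, best_score = default, 0
    for code, keywords in GL_KEYWORDS.items():
        score = len(tokens & keywords)
        if score > best_score:
            best_code, best_score = code, score
    return best_code


for text in ("Printer paper and toner", "Annual software license renewal", "Consulting fee"):
    print(text, "->", classify_line_item(text))
```

Items that match no keyword fall back to a default code, which is a natural place to hand off to a human reviewer.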

Automatically Applies Customizations to Any Invoice

Customizations done on one vendor's invoice can be applied to all invoices, or a subset of invoices, across any vendor. Our system automatically searches for semantically similar information in every invoice (including old invoices) to populate the correct fields.

Includes Alerting Features

Our system can send status and progress alerts to your staff through:

  • Email
  • Slack
  • Jira
  • CRM
  • PagerDuty

Deploy Our Invoice Data Capture Software in 6 Easy Steps

Your business can start using our feature-rich invoice software in just six easy steps.

Step 1: Plan the Integration Into Your Business Workflows

For a seamless transition of your business processes to our invoice processing software, we first assess aspects of your current invoicing workflows, like:

  • What are the sources and formats of your invoices? Do you get any paper invoices, perhaps by fax or mail? Do you need invoice scanning software?
  • Do you have invoices in an invoice management service like QuickBooks?
  • Which invoice fields should be extracted?
  • To which formats should the captured data be exported? Do you want the data sent to your ERP, CRM, or database?

These details help us plan the integration and deployment of our software for your business. For example:

  • For a large enterprise with lots of general ledger codes, we configure our software’s language-understanding capabilities to automatically identify the correct code for each line item.
  • If you have thousands of historical invoices, we deploy additional cloud resources to capture them quickly.

Step 2: Set Up Our Invoice Capture Software to Read Your Invoices

With the assessment done and a plan ready, the next step is configuring the software to handle your invoice sources and formats.

Digital Invoices

For invoices from FreshBooks, QuickBooks, Zoho Books, Xero, Pilot, or SAP FICO, add them as sources and provide your authentication credentials.

S3 Stored Invoices

Connect your AWS S3 storage to process invoices automatically.

Emailed Invoices

For invoices received as email attachments, configure the software with authentication credentials and monitoring intervals for relevant mailboxes. The software periodically fetches emails and processes any attached invoice that’s in a supported format.
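
The attachment-filtering part of this step can be sketched with Python's standard `email` package. Fetching messages from a mailbox is left out here, and the list of supported extensions is an assumption for illustration.

```python
from email.message import EmailMessage

# Assumed set of supported attachment types, for illustration only.
SUPPORTED = {".pdf", ".xml", ".png", ".jpg", ".jpeg"}


def invoice_attachments(message):
    """Yield (filename, content_bytes) for attachments in a supported format."""
    for part in message.iter_attachments():
        filename = part.get_filename() or ""
        dot = filename.rfind(".")
        if dot != -1 and filename[dot:].lower() in SUPPORTED:
            # decode=True reverses the transfer encoding (e.g. base64).
            yield filename, part.get_payload(decode=True)


# Build a sample message in memory instead of connecting to a mail server.
msg = EmailMessage()
msg["Subject"] = "Invoice INV-1001"
msg.set_content("Please find the invoice attached.")
msg.add_attachment(b"%PDF-1.4 sample", maintype="application",
                   subtype="pdf", filename="inv-1001.pdf")
msg.add_attachment(b"notes", maintype="application",
                   subtype="octet-stream", filename="notes.txt")

found = list(invoice_attachments(msg))
print([name for name, _ in found])
```

Only the PDF is returned; the unsupported `notes.txt` attachment is skipped.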

Paper and Handwritten Invoices

For paper invoices received by mail or fax, set up an invoice scanning pipeline using scanners or smartphone cameras to digitize them to PDF or image formats. The software can handle low resolution and noisy photos too. Store the digitized files in your internal network storage or cloud storage like AWS S3. The software automatically fetches new files from those locations periodically and processes them.
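
Periodic fetching from a storage location amounts to listing the files and skipping the ones already handled. The sketch below does this for a local or network folder using only the standard library. The function name and the extension list are hypothetical, and a cloud bucket would need its provider's client library instead.

```python
import tempfile
from pathlib import Path


def new_invoice_files(folder, already_seen, extensions=(".pdf", ".png", ".jpg")):
    """Return supported files in `folder` not returned by an earlier call."""
    fresh = []
    for path in sorted(Path(folder).iterdir()):
        if (path.is_file()
                and path.suffix.lower() in extensions
                and path.name not in already_seen):
            already_seen.add(path.name)  # remember it for the next poll
            fresh.append(path)
    return fresh


# Demonstrate two polling rounds against a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "scan-001.pdf").write_bytes(b"%PDF-1.4")
    Path(folder, "readme.txt").write_text("not an invoice")
    seen = set()
    first_poll = [p.name for p in new_invoice_files(folder, seen)]
    second_poll = [p.name for p in new_invoice_files(folder, seen)]

print(first_poll, second_poll)
```

A scheduler would call the function at each monitoring interval and hand any new files to the capture step.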

Step 3: Tune Our Software on Your Invoices

The abilities to understand any kind of invoice layout, extract data accurately, and add custom fields are all indispensable in excellent invoice data capture software, and ours has all of them.

Right out of the box, our system understands every invoice layout produced by popular invoicing software like FreshBooks and QuickBooks. Our built-in models also understand a wide variety of other layouts with common invoice fields, but your business may receive unique invoices or need custom fields. For such cases, we fine-tune our built-in deep learning models on your actual invoice and receipt samples. This is a big deal as it enables us to fully customize our state-of-the-art architecture into a customer-specific pipeline for your unique business requirements.

Step 4: Configure Captured Invoice Storage

Our software can export the captured invoice data to a variety of formats, storage locations, third-party software, and external workflows. You can set up custom workflows that select invoices based on custom criteria and export their extracted data to multiple destinations or approval workflows.

Step 5: Start Capturing Your Invoices

Once you have configured the software and fine-tuned it, it's ready to capture your invoice data in bulk with minimal manual intervention.

Under the hood, the software uses a state-of-the-art deep learning architecture. It’s trained on thousands of invoices from popular invoice management systems like FreshBooks and QuickBooks. The training enables it to detect text, recognize the characters, and map invoice elements to appropriate fields. To do this, the system learns from the visual and linguistic characteristics of these elements, like:

  • Semantic meanings
  • Positions relative to each other
  • Positions on the page

We clone our latest model and fine-tune it on the invoices you provide in the third step to get a model that's customized to your invoices. Your preferences, like custom fields, help us refine this custom model even more to identify the exact data you want accurately.

This model scans each invoice for visual and linguistic characteristics. The combinations of characteristics help it identify an element as an address, a purchase order, an invoice number, and so on.

The result of this phase is a set of fields and their values for each invoice. This extracted data is forwarded to export pipelines for generating reports in various formats or for export to an accounting system.

Step 6: Measure and Monitor the Software

Metrics like accuracy, precision, recall, and F1 scores enable you to evaluate the effectiveness of your fine-tuning. Confidence scores for each processed invoice enable you to pinpoint problematic invoices and fine-tune the software on them.
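
For reference, precision, recall, and F1 can be computed per invoice by comparing extracted field values with hand-checked ones. The sketch below assumes each is a simple dictionary of field names to values. The 0.8 confidence threshold is an arbitrary illustrative choice.

```python
def field_metrics(predicted, expected):
    """Precision, recall, and F1 over extracted (field, value) pairs."""
    pred, gold = set(predicted.items()), set(expected.items())
    true_pos = len(pred & gold)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def low_confidence(invoices, threshold=0.8):
    """Ids of invoices whose confidence score falls below the threshold."""
    return [inv["id"] for inv in invoices if inv["confidence"] < threshold]


predicted = {"invoice_number": "INV-1001", "total": "165.00", "due_date": "2022-09-25"}
expected = {"invoice_number": "INV-1001", "total": "165.00",
            "due_date": "2022-09-26", "vendor": "Acme Supplies"}

precision, recall, f1 = field_metrics(predicted, expected)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

flagged = low_confidence([{"id": "a", "confidence": 0.95}, {"id": "b", "confidence": 0.62}])
print(flagged)
```

Here two of the three extracted fields are correct and one expected field is missing, so precision is higher than recall; the flagged invoice is a candidate for review and for the next round of fine-tuning.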

Alerts enable your employees to monitor progress in real time and to know about any problems that crop up during processing. Our software supports alerting through email, Slack, Jira, your CRM, and PagerDuty.

Automate the Boring and Time-Consuming Work in Your Business With Invoice Data Capture Software

You have seen how our invoice data extraction software brings an array of incredible benefits and features to your business. For any other special requirements, we bring expertise in developing accurate information extraction systems using the latest artificial intelligence and deep learning innovations. We can customize our invoice capture software to your specific business requirements and use cases. Contact us for a demo of our invoice data capture software.
