
Integrating Automated Invoice Processing OCR Into Your Business Instantly

Matt Payne
·
March 10, 2022
Automate your invoice processing workflow and remove manual data entry with our custom-built invoice processing OCR software.


Invoice processing OCR (optical character recognition) software allows you to automate the process of handling an invoice or receipt: extracting key data, inputting the data into the proper database locations, and sending documents wherever they need to go. Old-school automated systems use rule-based engines and pattern matching to try to extract the required data, or they only work with invoices that match a specific set of templates.


As you can imagine, rule-based engines, open source OCR tools, pattern matching, and exact invoice formats do not allow you to scale your automated invoice processing OCR technology at all! Any time the invoice format or image quality changes (common with receipts), we have to make adjustments to our system. Although these old-school systems are better than manual invoice processing, the automated invoice processing systems available today have been around for a while and don’t get the job done.
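To make the scaling problem concrete, here is a minimal sketch of a pattern-matching extractor. The two patterns and both sample invoices are made up for illustration; they are not taken from any real system. The rules work on the layout they were written for and return nothing on a second layout that carries the same information.

```python
import re

# Hypothetical rules written for one vendor's layout:
# "Invoice #: 12345" and "Total: $1,234.56".
INVOICE_NO = re.compile(r"Invoice #:\s*(\d+)")
TOTAL = re.compile(r"Total:\s*\$([\d,]+\.\d{2})")


def extract_with_rules(text: str) -> dict:
    """Return the fields the fixed patterns can find; missing fields are None."""
    number = INVOICE_NO.search(text)
    total = TOTAL.search(text)
    return {
        "invoice_number": number.group(1) if number else None,
        "total": total.group(1) if total else None,
    }


# The layout the rules were written for.
vendor_a = "Invoice #: 10432\nTotal: $1,250.00"
# Same information, different wording and currency placement.
vendor_b = "Inv. No 10432\nAmount due 1,250.00 USD"

result_a = extract_with_rules(vendor_a)
result_b = extract_with_rules(vendor_b)
print(result_a)
print(result_b)
```

Every new vendor layout means another pattern to write and maintain, which is the adjustment cost described above.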


Accuracy of Leading Products on the Market

The accuracy that the most popular OCR tools achieve is not good enough for a business setting. The most accurate OCR API from the large providers is Google Vision, coming in at just 80% accuracy. From there it’s a quick drop-off to 65% for Microsoft Cognitive Services and an abysmal 21% for AWS Rekognition (Source).

Marketplace accuracy of leading OCR products.

On top of that, you need to invest in custom training and deep learning architecture to improve the accuracy for your specific use case. Because automated invoice processing involves not just extracting the text from the document but also understanding entities and key data points, we’re going to need something more than the available OCR models.


Even if we can extract the text with open source OCR, we have no way of knowing which raw text belongs to which entity.

Low Customization From Available Products

Depending on your invoice processing use case, custom steps need to be built around the OCR and deep learning models. These steps let you automate the business processes specific to your needs, and they do so without affecting the performance of the deep learning and OCR models. The models themselves are not built to handle business processes, so more software development is required to reach a workflow that works for you.


You’ll want an automated invoice processing workflow that can:


  1. Provide the same or better accuracy as manual human data extraction.
  2. Work with a wide range of invoice formats, receipts, or purchase orders without losing accuracy.
  3. Provide accuracy and confidence metrics to allow you to use human intervention when needed.
  4. Structure the extracted data from the OCR in a data format that works with your business systems.
  5. Provide automated monitoring and notifications that keep you in the loop.
  6. Allow you to add data to the model over time to continuously improve the accuracy for your specific use case.
  7. Be deployed in the cloud and run instantly when a new document or receipt is received.
  8. Give you the flexibility to add new data entities and fields that matter to you. 
  9. Process invoices in seconds.
  10. Remove manual data entry for invoice data.

Width.ai Automated Invoice Processing OCR With Deep Learning

We’ve built a custom deep-learning-based pipeline that allows you to automate your receipt and invoice processing OCR instantly on a huge range of formats with the highest accuracy available. This pipeline becomes a module in your workflow that lets you customize exactly how you go from input document to extracted and stored data. Don’t worry about digitization, standardization, or anything else that slows down your workflow. We allow you to go from unstructured documents to extracted data with labeled entities instantly with our state-of-the-art algorithms.

From electronic invoices to extracted data instantly. 


Our invoice processing OCR extracts everything from any format and uses deep learning to connect the text to specific fields.


Receipts, Paper Invoices, Purchase Orders - All supported with state of the art accuracy


Our deep learning models work out of the box for a huge range of invoice and receipt formats and have been specifically trained on invoice examples from the 10 leading invoice processing companies, including:


  1. Quickbooks
  2. Zoho Books
  3. Xero
  4. Pilot
  5. Freshbooks


By focusing on the most popular invoice types and real examples from businesses we’ve been able to build a system that produces high accuracy right out of the box and allows enough flexibility in its architecture to be adjusted for specific business use cases. Our deep learning expertise mixed with an understanding of company SLAs allows us to perfectly design a flexible solution that fits right into business workflows. 


Worried Your Invoices Are Different?


Worried your exact invoices or receipts won’t work with the default invoice processing OCR, or that the information is too hard to extract from the documents (bad handwriting, poor lighting, etc.)? Do you have custom fields that aren’t supported by out-of-the-box solutions? Given the number of possible use cases, this is pretty common.


Full Customization Past Generalized Models

Width.ai will fully customize the default invoice processing models by fine-tuning them on your actual data. By showing the models your exact invoices and the fields you care about, you can steer them towards your specific use case and boost the default state-of-the-art accuracy through the roof! This fine-tuning is pretty standard, and it’s a customization we fully recommend to help you reach the highest accuracy.


Tailor your solution for specific vendor invoices. 

Add Your Fields Instantly

While our model supports over 50 of the most common fields right out of the box, we’re well aware that many use cases have extra data fields to cover. Through our fine-tuning process you can add any number of fields that show up in your documents. These fields are added to the model output in a few easy steps and can be extracted in JSON or table format.
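As a rough illustration of the two output shapes, the sketch below takes a JSON record and renders it as a small CSV table. The field names, including the custom `po_reference` field, are hypothetical and are not the actual model output schema.

```python
import csv
import io
import json

# Hypothetical extraction output; the field names are illustrative only.
extracted_json = """
{
  "invoice_number": "10432",
  "vendor_name": "Acme Supplies",
  "invoice_date": "2022-03-01",
  "total": "1250.00",
  "po_reference": "PO-7781"
}
"""


def to_table(fields: dict) -> str:
    """Render one invoice's fields as a two-line CSV table (header + row)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fields), lineterminator="\n")
    writer.writeheader()
    writer.writerow(fields)
    return buffer.getvalue()


fields = json.loads(extracted_json)
table = to_table(fields)
print(table)
```

Because the header row is built from whatever keys are present, a newly added field shows up as a new column without changing the conversion code.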


Our models go way past simple text extraction and OCR for invoice field extraction. Deep learning models added to our pipeline allow us to add reasoning and entity relationships to the equation to extract deeper fields and information. Some of the fields we’ve extracted in a custom setting include:

1. Question and answer pairs in an invoice.

2. Multiple languages in the same document.

3. Handwritten instructions

4. Address

5. Emails

6. Dates and times

7. Document summaries (like this)

8. Classification of paper invoices


Custom Integrations To Focus on Automation

Eliminating manual data entry and reducing costs are by far the most valuable benefits of invoice processing software with OCR technology. The ability to automatically extract the information you care about from these documents gives you hundreds of paid hours back, with the same accuracy as humans or higher.


We’ve built custom integrations to the tools you currently use for this process so that you can fully automate each step. Grabbing the documents, scanning them in, extracting the data, storing the data, monitoring and alerting - all covered in a pipeline that runs in a few seconds, not minutes.


5 Easy Steps To Integrate OCR Invoice Processing

Here’s a 5-step guide to integrating this system into your business so you can start automating your manual invoicing and document management processes.


Initial Setup - Use Case Understanding & Workflow Design

OCR Invoice Processing Pipeline


The first step in integrating OCR invoice processing is to understand the different requirements for your use case and how they fit together. This can usually be done without a full understanding of what the actual invoice processing module looks like, as it’s more important to understand input and output requirements such as:

1. What fields you need to extract

2. How your specific system needs the output (Your CRM, JSON, DB, ERP System)

3. What alerts and confidence metrics you care about

4. How many invoices you process per month

5. How many historical invoices you have (Can be zero!)


By understanding what goes into the system and what you expect out of it, it’s much easier to game-plan the part in the middle! Sometimes this process is as simple as putting on paper the manual steps you currently work through and how each one of them can be automated. For instance, manually plugging the product information from an invoice into Quickbooks becomes extracting fields with Width.ai and automatically updating your backend Quickbooks via API.
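Here is a minimal sketch of that last step: turning extracted fields into a payload for an accounting system. Everything in it is an assumption made for illustration. The extracted field names are hypothetical, and the payload shape is a generic stand-in, not the Quickbooks API schema; a real integration would follow the target system’s API documentation.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: Decimal


def build_payload(extracted: dict) -> dict:
    """Map extracted invoice fields to a generic accounting-system payload.

    The payload shape is made up for illustration only.
    """
    items = [
        LineItem(i["description"], int(i["quantity"]), Decimal(i["unit_price"]))
        for i in extracted["line_items"]
    ]
    # Decimal avoids float rounding errors when adding up money.
    total = sum((it.quantity * it.unit_price for it in items), Decimal("0"))
    return {
        "vendor": extracted["vendor_name"],
        "reference": extracted["invoice_number"],
        "lines": [
            {"name": it.description, "qty": it.quantity, "rate": str(it.unit_price)}
            for it in items
        ],
        "total": str(total),
    }


extracted = {
    "vendor_name": "Acme Supplies",
    "invoice_number": "10432",
    "line_items": [
        {"description": "Printer paper", "quantity": "10", "unit_price": "4.50"},
        {"description": "Toner", "quantity": "2", "unit_price": "60.00"},
    ],
}
payload = build_payload(extracted)
print(payload)
```

Writing the mapping down like this doubles as a checklist of which fields the extraction step has to deliver.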


Invoice Capture & Input Processing

Deciding how you want to pass your invoices or receipts into the system, based on how you currently store them, is a huge part of the workflow design. Although the actual invoice processing OCR models process a single document at a time, the software can be deployed to batch documents and run much higher volumes at once.

How we intend to pass documents into the processing system. 
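One simple way to picture batching is a helper that splits a list of stored documents into fixed-size groups before they are handed to the processing step. This is a generic sketch, not the deployed batching logic, and the file names are placeholders.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(paths: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most batch_size document paths."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(paths)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical file names standing in for an inbox or storage bucket listing.
documents = [f"invoice_{n:03d}.pdf" for n in range(1, 8)]
batches = list(batched(documents, batch_size=3))
for batch in batches:
    print(batch)
```

Seven documents with a batch size of three give two full batches and one final batch with the leftover file.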

Output Processing Of Invoice Data

Output processing may seem like a part of the process that isn’t particularly complicated or important: outputs just get passed to our target system, right? While it can be as simple as creating a CSV and storing it in a database, oftentimes businesses want an output processing system that provides much more than that.

Exporting data based on your system. This includes both the extracted fields and any metrics or notifications

Width.ai Invoice Processing Module Setup

Now that you’ve got a high-level overview of what you’re looking for the system to accomplish, what goes into it, and what you need out of it, we can take a look at how to structure the invoice processing. First, let’s lay out a few important requirements to note:

1. Do we have fields that are not supported by the default state of the art architecture?

2. Should we fine-tune the architecture to improve the accuracy for our specific use case?

3. Width.ai uses cloud-based architecture to deploy your invoice processing models with high runtime speeds.

4. Do we have past examples of invoices or receipts that we can use to quickly boost results?
5. Are we leveraging Width.ai’s custom NLP pipelines in our use case?

Understanding where you want to go with these helps ensure that the production process achieves the best results. 


Set Up Output Processing

Keeping up with your invoice processing software is way more than just monitoring if the system is online. Modern deep learning based systems offer way more insight into every piece of what is happening 24/7 in your system. You can quickly integrate alerts, confidence scores, and notifications into:


  1. Email
  2. Slack
  3. Jira
  4. CRMs
  5. Pagerduty


Alerts

Receive real-time alerts for anything from system downtime to low-accuracy results. Alerts can be integrated into a number of different systems to help you stay on top of your OCR invoice processing.


Confidence Scores

NLP model confidence scores and accuracy.

Use confidence scores built in-house to help you understand how computer vision and natural language processing models behave in production. These scores estimate confidence based on known entity reasoning, which far outperforms raw OCR confidence.


Notifications

Get notifications sent right to Slack or email to let you know documents were processed successfully.
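To show how alerts, confidence scores, and notifications can fit together, here is a minimal routing sketch. The thresholds, field names, and action labels are all hypothetical; real values depend on your accuracy requirements and on how the confidence scores are calibrated.

```python
from typing import Dict, List

# Hypothetical thresholds; tune them to your own accuracy requirements.
REVIEW_THRESHOLD = 0.90
ALERT_THRESHOLD = 0.60


def route_document(field_confidences: Dict[str, float]) -> Dict[str, object]:
    """Pick an action for a processed document from per-field confidence.

    Assumes at least one field score is present.
    """
    low_fields: List[str] = sorted(
        name for name, score in field_confidences.items() if score < REVIEW_THRESHOLD
    )
    lowest = min(field_confidences.values())
    if lowest < ALERT_THRESHOLD:
        action = "alert"  # e.g. page someone or open a ticket
    elif low_fields:
        action = "human_review"  # queue the flagged fields for a person to check
    else:
        action = "notify_success"  # e.g. post a success message to a channel
    return {"action": action, "fields_to_check": low_fields}


print(route_document({"total": 0.99, "invoice_date": 0.97}))
print(route_document({"total": 0.99, "invoice_date": 0.82}))
print(route_document({"total": 0.41, "invoice_date": 0.97}))
```

The returned action label is what an integration layer would map to email, Slack, or a ticketing tool.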


Deploy Invoice Processing Solution

Our cloud infrastructure allows you to run documents through in seconds, not minutes. Deploy our default out-of-the-box model or a custom solution behind an API that lets you access the workflow from any web application or backend system.


Deployed alongside the OCR invoice processing solution is an optimization system that automatically improves your pipeline over time as it sees more data. This allows the deep learning models to become more attuned to your specific use case and to keep improving.


Reap The Benefits

You’ve deployed invoice data capture software that allows you to remove manual data entry and reduce costs by over 50%. Once your system is up and running, it’s even easier to make changes such as adding fields or new integrations for input or output.


Quickbooks Invoice Template



Why Width.ai?

The level of customization we offer for our invoice processing software, combined with the accuracy our models provide, gives you results you would have to chase with other systems. Raw OCR-based systems don’t give you the accuracy you need to extract data automatically, and prebuilt solutions won’t provide the customization that real production systems require. Our accuracy at all stages of the process outperforms not only open source tools but also other prebuilt solutions that don’t allow for customization.

Are these accuracies really enough?


Set Up A Demo

Set up a demo to see how you can use Width.ai invoice processing!