Cut Invoice Processing Time to Just 3 Seconds With This Invoice OCR Tool
iscover an invoice OCR tool that will revolutionize the way you handle invoices. There’s no human intervention needed & a dramatically lower per-invoice cost.
GPT-3 is one of the most versatile and transformative components that you can include in your framework, application or service. However, sensational headlines have obscured its wide range of capabilities since its launch. Let’s take a look at the ways that companies and researchers are achieving real-world results with GPT-3, and examine the untapped potential of this 'celebrity AI'.
The interpretive powers of the GPT-3 autoregressive language model stirred up popular tech headlines when it was unveiled by OpenAI in 2020. It could apparently compose feature articles, write poetry, and even talk to the dead.
GPT-3 has perhaps been a victim of its own publicity; it's a vanguard product in the NLP space, but widespread public interest in the possibility of an effective AGI (which GPT-3 is not), combined with tech media's determination to coopt any new AI product into its annual round of febrile headlines, has left GPT-3 rather misunderstood.
Subsequently, its core capabilities have inspired a slew of startups across a range of sectors.
Here we'll take a look at the usable scope of GPT-3 for business purposes, and at some of the companies that have taken up the vanguard in this respect.
First, let's a look at the strengths and weaknesses of the various methods by which GPT-3 answers a prompt.
Models available for transformations in GPT-3 include Davinci, Curie, Babbage and Ada, each of which have different capabilities in terms of speed, quality of output and suitability for specific tasks.
For instance, Davinci is the most sophisticated of the available models, and is most likely to produce usable output that's more complex, and that explores (or at least appears to explore) higher-level domain thinking and analysis (later we'll also take a look at Davinci Instruct, which is capable of following more specific commands regarding the formatting and domain-specificity of its output).
However, some of the leaner and less computationally demanding models are more than adequate for simpler prompts, saving latency and API request costs. For instance, novelist Andrew Mayne has found that much of the most wide-spread knowledge available to GPT-3 is accessible across lower-level models than Davinci:
Treated as a straightforward 'global oracle', GPT-3 is subject to the same inaccuracies and inexactitudes as the publicly available content that it was trained on, and the depth and truth of its responses on any subject is in proportion to the subject's representation (and misrepresentation) on the internet, and in the standard datasets that informs it.
For instance, Davinci knows a little about Charles Dickens' output, but will give up if you second-guess the first response:
Answering the same question 'Who wrote the novel Bleak House?', the other models seemed less well-read, even on multiple attempts:
But within more narrow domains, usually away from subjective arts and culture topics, GPT-3 proves extraordinarily adept.
The GPT-3 model was developed through unsupervised training, wherein it had to discover, rationalize and categorize vast volumes of data without being explicitly told what any of it 'meant'. In the course of training it was necessary for the model to distil labels from the available entities – such as picture captions and descriptions – that accompanied the content, or that had some kind of discoverable relationship with it.
Where a subject has a rigid and agreed taxonomy and hierarchy (such as mathematics), it's much easier to be certain of the relationships between sub-entities in that domain, and to create generative rules that are accurate.
Programming falls into this category, and GPT-3 is very good at it.
Considering that even professional coders may have eccentric solutions to programming challenges, and that the development community frequently comes into conflict regarding the correct answer to a problem, it's not clear why GPT-3 errs so little when writing code — unless, perhaps, that it might be the only coder who took the time to comprehensively RTFM.
It's also possible that in the lonely course of its training, the model has assigned higher weight to inbound links to 'best answer'-rated posts at sites such as StackOverflow, in much the same way that a search engine's algorithm will calculate authority from frequent citations.
However, GPT-3 is often not the only contributor to those cases where it has won headlines for its programming prowess.
In the case of Sharif Shameen's Debuild, which leverages the OpenAI model to provide instantaneous app development from text input, the user-facing framework incorporates GTP-3, but was also specifically and intensively trained against content from GitHub and StackOverflow.
Here, GPT-3 handles the interpretive and transformational aspect of NLP requests, but will avoid wide-ranging answers in favor of focused, domain-limited responses that are (apparently) guided by the weights of a separately trained model.
There is a well-established market for any product or process that can explain legacy code – live software that may now have multiple dependencies across version-locked VMs, and which in some cases may have roots as far back as the 1970s.
Technical debt of this nature can force companies to retain otherwise unproductive staffers who have become the gatekeepers of arcane but essential systems, or necessitate massive consultation bills if the originating developers should die or depart the company without adequately documenting their sprawling projects.
The Replit Code Oracle uses GPT-3 to explain what complex code will actually do.
The company, which is based in San Francisco and raised $20 million USD in series A funding in February 2021, has even recruited GPT-3 to explain Replit's intent to 'rewrite all the rules of software development' by making natural language the default interface to code generation.
Since Replit has not released details of any co-training that contributes to its code-explanation algorithm, I tested the Replit challenge in the above image in the basic OpenAI GPT-3 playground, and found that the Davinci and Babbage engines can calculate a correct answer by default, without additional domain-specific training:
In developing a complex code interpreter with GPT-3, it is important to remember that where a problem has not been directly assimilated from dataset material at training time, GPT-3 would likely fall back to a text analysis and semantic deconstruction of the written challenge, in the absence of examples in its database that match a particular text pattern.
However, since GPT-3 is very capable of deconstructing query and request logic based on similar taxonomies in a programming language, its 'logical guesses' are well-informed, and likely to be correct.
Research into text-to-SQL predates the advent of GPT-3; but since standard SQL query language is in itself designed to conform to natural language, GPT-3 is very good at reproducing functional SQL requests:
More complex SQL queries are far less human-readable, and nested conditional phrases make mistakes more likely when humans are writing out the commands. Due to the inflexibility of the SQL language, and its restricted taxonomy, GPT-3 has assimilated SQL brilliantly.
New York-based analytics startup SeekWell has adopted GPT-3 as an SQL 'translation' layer.
The company has noted that the Davinci instruct model performs best at SQL queries, since it allows the user to provide specific guidance, such as 'Only respond in correct SQL syntax'.
Currently in closed alpha, OthersideAI's email generation system uses GPT-3 to generate email responses based on bullet-points that the user provides.
Entitled 'Quick Response', the synthetic replies take into account the content and context of other emails in the chain, but seem to transform the responses primarily based on GPT-3's understanding of business email style.
The AI can infer the temporal tense from brief prompts, for instance turning 'good to meet' into 'It was good to meet you'.
However, it’s not clear whether Quick Response could draw on earlier emails that contain meeting details to develop a response such as 'It was good to meet you for coffee Tuesday at Starbucks on 1st St.', or whether it understands how mechanistic and data-stuffed a response like that would seem, and would favor a conversational style instead.
Quick Response is posited as the first stage in a multi-AI communications framework wherein, effectively, two people may exchange information and ideas programmatically and without generating the text content themselves. It's analogous, arguably, to two business entities communicating through their lawyers instead of directly.
At the time of writing, the wait list for the OthersideAI email generation API stands at 15,000+.
Other email-based GPT-3 tools have arisen: Lean To provides a dedicated user interface for AI-based email generation.
When setting up a new project (i.e. an email to single or multiple participants), you can choose the objective of the email, even if it is just 'getting a response', and can import contacts via a CSV file, or add them manually.
As is often the case, the GPT-3-based interface requires a lot of prior information. In this case I chose to write to a Stanford Vision Lab professor, and had to do a fair bit of research in order to fill in all the necessary fields.
The results are unconvincing, and generic enough to likely trigger spam filters, while subsequent attempts do not improve the quality. Despite having put research leads into the contact details (such as a website for the recipient), there's no evidence that any of this information was used.
Bar the odd missing comma, the basic GPT-3 playground does a far better job via the Davinci instruct engine:
The lack of client integration in Lean To is also a hindrance; though the email can be sent through Lean To's own SendGrid system, this would make replies problematic, and increase the chances of the message being automatically flagged as spam. Therefore it's better to just copy and paste the generated output into your own email environment.
Lean To's wizard-driven interface attempts to deconstruct the semantically important parts of a GPT-3 prompt and reassemble it into the 'ideal' prompt (which is not shown to the user), rather than writing it naturally; but 'raw' GPT-3 does a better job of this deconstruction by itself, so long as you choose the right engine and phrase the prompt succinctly.
Ultimately, the value of a GPT-3-derived email generation system is in being able to use it directly in an email client, since, as we've seen, it is easy enough to appropriately prompt GPT-3 in any testbed and ask it to generate an email.
Once GPT-3 has inferred the semantic intent of a text, it can rephrase that content in a number of ways, including ELI5. This is very useful for locating the devil in the details of verbose EULAs and general legal documentation:
GPT-3 has a mode called 'summarize for a 2nd grader' that makes a broadly fair assessment, for instance, of the dense EULA for the videogame Far Cry:
Despite the context of 'high-precision language', legal terminology often contains ambiguous terms that seem designed to produce a 'chilling effect' without specifying what a term signifies, or what a breach of it might entail. Since such clauses are designed to be tested in court, neither a person nor an AI can hope to extract their exact meaning, because, in the context of the document, that has yet to be established.
In the above case, GPT-3 has sided with the cynics, and assumes that paid upgrades will be necessary in order to continue to use later versions of the videogame (rather than any sequels to it). However, GPT-3 has not correctly understood that this probably applies to cases where the existing game is subsequently ported to next-gen platforms, and must be paid for all over again, because the original terminology ('certain hardware') is very vague.
The GPT-3 summary also states that the user can't play the game on any other computer, which is usually not true. Here GPT-3 seems to have mistaken the intent of 'non-transferable' (which means that the user cannot re-sell the license to someone else, and not that the user cannot de-authorize the game on one device and re-authorize it afterwards on another).
In fairness, a lot of actual people would struggle to define the meaning of this term, in this context.
It seems likely that a GPT-3-based framework for decrypting legal jargon would benefit from cross-training against a dedicated dataset of legal terms, where manual labeling could eliminate some of these ambiguities, or at least give GPT-3 enough confidence to declare that certain legal terms may not be clear in their intent, and to outline some possible meanings for them, based on historical examples.
Though the AI-generated content sector was already populated with platforms such as Articoolo, Article Forge and Phrasee, GPT-3's text-generation and semantic capabilities inspired a slew of new AI content startups around the time of its launch, and since.
Some of the more notable names include CopySmith, WriteSonic, Headlime and PepperType. Though we don’t have space here to trial all of the new GPT-3 content platforms, we'll shortly look at a couple of the most popular.
In a world where an inappropriate or inconsiderate tweet can end a career or tank a company's stock, there's no realistic prospect that short-form social media output from GPT-3 will ever become entirely programmatic.
However, this more succinct kind of auto-generated social media output is what the current crop of companies are really pushing, perhaps because GPT-3 can produce very apposite results in such a short window of attention.
In the long term, the value of GPT-3 copy generation frameworks may lie partly in the (nascent) potential to write more substantial works, such as coherent blog posts (or at least, in providing better article-spinning transformations than are currently available); in producing corrected or alternative versions of shorter, human-generated social media snippets; and in generating correct English language short-form output for users that have a different native language.
Founded in October 2020, CopySmith leverages GPT-3 to provide a suite of templates to massage existing copy and generate new text-based material from limited prompts.
CopySmith offers text-completion, rewriting and generation services for a range of contexts, including:
…and many more besides. However, in practice nearly all of these possible use cases are minor variations of four basic NLP capabilities in GPT-3:
CopySmith also has a blog post generator, which is currently in beta – but you'll need to 'get out and push' quite a lot:
As with CopySmith's rivals, it's not possible to prompt the blog writer with 'Write me an article about this widget company'; rather, the module focuses on user input, and needs a meaty minimum of 120 words to get started, as well as the initial sentence of the subsequent three paragraphs. However, it is possible to generate these requisites in other CopySmith modules, and then input them at this stage.
I filled in the initial paragraph without giving it any real information about the product, besides the brand being advertised and some general information about the target age-ranges for various products, and tested the blog-creation interface:
The texture of the post is consistently upbeat, as one would expect, but the module struggles for continuity of theme, and invents product features and purposes out of whole cloth. Even within the individual paragraphs, where one would expect reasonable continuity, the text struggles to stay within its own proposed themes, and the net effect is incoherent.
CopySmith is planning to develop SEO principles into its GPT-3 output, as well as paired training on industry-specific topics, which may help some of the 'focus' issues in longer sections of text output.
UK-based Writesonic offers a very similar range of services to CopySmith. The beta article writer, like CopySmith's, requires hand-written prefacing text, between 100-150 words.
Initially, the system requires a target topic:
Subsequently you'll receive ten article ideas:
After this, it's time – inevitably – to provide a detailed corpus of words from which the article will be manufactured:
Having chosen one, you can then choose from ten proposed article outlines, where every two sections cost one credit:
Finally, you'll get a completed article:
Given that the text prompt was incredibly generic, and designed to challenge the AI, Writesonic has produced a surprisingly coherent article from the draft outline, working productively through the header topics to an industry-standard closing paragraph, including a call-to-action.
WriteSonic gives you 10 free generation tokens, but just one attempt at an article is likely to burn through all of them, and so it is very difficult to give the product a second chance if it disappoints.
Though the GPT-3 article writing system AI Writer claims that the content-theft sentinel CopyScape identifies 94.47% of its articles as unique, CopyScape could not find any matches for the final article texts generated from my experiments with CopySmith and WriteSonic.
One word of (anecdotal) warning: when GPT-3 includes interview quotes in its article-style output (i.e., when I have asked for text, but not specifically for quotes or for interview material), I've found, in my two months of experimenting, that not even one of those quotes was available on the internet.
Since GPT-3 is trained on public-facing documents that are likely to be indexed in major search engines, this has led me to believe that although GPT-3 can find apposite interview material, it may currently be reluctant to quote a source verbatim.
I tested this theory in the GPT-3 playground, asking Davinci Instruct 'What did Einstein say about dice?', expecting to receive the famous quote 'God does not play dice with the universe.'
An exact-text search on Google indicates that either Einstein never said any of these things, or that if he did, the internet doesn't know about it. These are, at best, paraphrases — a core function of GPT-3.
Further submissions were unhelpful:
This would be less of a hazard if GPT-3 was consistently wrong about quotes; however, it does get many famous sayings correct:
Watch out as well for persistent session variables in a GPT-3 consultation:
Here GPT-3 is 'on a roll' – in an effort to maintain continuity and return improved results as the session develops, it has trained on my previous prompts, and failed to notice that I am now looking for Einstein quotes, and not (as with the previous prompt) for Winston Churchill quotes.
Shortly after the release of API access, New York-based product designer Jordan Singer developed a GPT-3 plugin capable of passing instructions to the proprietary vector graphics editor Figma.
Called Designer, the plugin communicates with the Figma work space, converting descriptions into vector representations of a described shape. Among its other programming capabilities, GPT-3 has assimilated the postscript language, which represents geometry by systematically describing it in a mathematical space. Designer converts plain text instructions such as 'draw a red circle 500px' into a corresponding postscript command, and then wraps it as a JSON to pass it to Figma.
GPT-3 is capable of generating ideas for company startups, with minimal prompting.
In December of 2020 one outlet published a list of genuine startups, mixed in with fictitious startups suggested by GPT-3 (which had been briefly trained on suitable real-life examples in the course of an API session), challenging readers to distinguish them.
Though the best results were included in the feature, the unedited list of GPT-3 suggestions is available on Google Docs, together with the 'real world' examples that informed them. Many of the ideas are imitative of real proposals, while others lack any particular 'angle'.
The IdeasAI site allows subscribers to receive GPT-3 startup ideas, which are generated daily, in an email. However, once again, the pitches are frequently vague, and some of them are simply random excerpts from related web articles.
Nonetheless, it’s reasonable to assume that dedicated training and more targeted data sampling may hold considerable potential for generating more innovative and off-beat startup concepts across a number of sectors.
The kind of domain understanding that GPT-3 offers makes it possible to perform 'elliptical' searches, where even if none of the words or phrases in a query are found, their intent can be discerned, and relevant results obtained.
Algolia's suite of commercial search tools incorporates a number of machine learning technologies, including GPT-3, to facilitate semantic search on websites, or on any corpus of information that can be indexed.
There's nothing new about semantic search in itself – In a 2013 update to its search algorithm, entitled Hummingbird, Google began to leverage its own semantic Knowledge Graph, which had debuted the previous year, and which operates similarly to GPT-3 in this aspect.
Though the search giant did not entirely abandon keyword/keyphrase frequency as an index of relevance (and has not abandoned them since), it was clear that the future of search might lie in generalized domain understanding, where a lexicon of related terms and images would constitute a 'cloud' of essential information on a topic; and not in merely parsing text and counting occurrence frequencies.
What is new is that accessible and increasingly affordable machine learning infrastructure makes it possible for smaller players to implement truly effective semantic search, either for the purpose of developing new generalized search engines based on web-crawling, or for more specific functions, such as recommender engines.
The above movie and TV recommendation system utilizes GPT-3 and metadata from the IMDB.
The system can also identify a movie from a description that may contain few or no keywords in the target entry:
For entry-level startups, the democratizing power of AI-driven semantic search via GPT-3 is perhaps one of the most exciting aspects of OpenAI's language model.
GPT-3 is a powerful tool that, intelligently applied, can help create genuinely useful and innovative new products and services that were not financially or technologically feasible before.
Though its core strength lies in its power to manipulate domain knowledge (without entirely abstracting it), some of the critical publicity it has received since launch has instead focused on what it does not yet do well, such as generating long and coherent treatises with minimal supervision or input.
However, that's a far broader challenge for the NLP community, and one that's ongoing; nor did GPT-3 ever advertise itself as a novelist or long-form feature writer, though it does have some nascent ability in these fields.
Plugging a raw GPT-3 API call into a new product will work best if your service concentrates on those aspects that GPT-3 can perform well without additional training, such as programmatic analysis. However, there's no real value proposition in discovering and re-packaging GPT-3's native capabilities, since a) it's trivial for any competing product to do likewise, and b) native calls can still be unpredictable, even when querying in a limited domain.
The real potential of GPT-3 is as an adjunct or enabling technology for proprietary systems that train GPT-3 against other dedicated datasets that support your product — as evidenced by the best of some of the services that we've taken a look at in this article.
iscover an invoice OCR tool that will revolutionize the way you handle invoices. There’s no human intervention needed & a dramatically lower per-invoice cost.
Instead of invoice matching taking upwards of a week, it could take mere seconds with the proper automation solution. Learn more here.
Manual and template-based invoicing are riddled with low accuracy and required human intervention. Learn how to systematically eliminate these issues with the right invoice data capture software.
A complete walkthrough guide on how to use visual search in ecommerce stores to create more sales and real examples of companies already using it.
Automating the extraction of data from invoices can reduce the stress of your accountants by finding inaccuracies, digitizing paper invoices, and more.
How you can use machine learning based data matching to compare data features in a scalable architecture for deduping, record merging, and operational efficiency
Learn how lifetime value or LTV prediction can improve your marketing strategies. Then, discover the best statistical & machine learning models for your predictions.
A deep understanding of how we use gpt-3 and other NLP processes to build flexible chatbot architectures that can handle negotiation, multiple conversation turns, and multiple sales tactics to increase conversions.
The popular HR company O.C. Tanner, which has been in business since 1927 and has over 1500 employees, was looking to research and design two GPT-3 software products to be used as internal tools with their clients. GPT-3 based products can be difficult to outline and design given the sheer lack of publicly available information around optimizing and improving these systems to a production level.
We’ll compare Tableau vs QlikView in terms of popularity, integrations, ease of use, performance, security, customization, and more.
With a context-aware recommender system, you can plan ways to recreate some of the contextual conditions that persuade them to buy more from you.
We’re going to walk through building a production level twitter sentiment analysis classifier using GPT-3 with the popular tweet dataset Sentiment140.
Find out how machine learning in medical imaging is transforming the healthcare world and making it more efficient with three use cases.
Discover ways that machine learning in health care informatics has become indispensable. Review the results of two case studies and consider two key challenges.
Accelerate your growth by pivoting key areas of your business to AI. Your business outcomes will be achieved quicker & you’ll see benefits you didn’t plan for.
We built a GPT-3 based software solution to automate raw data processing and data classification. Our model handles keyword extraction, named entity recognition, text classification | Case Study
We built a custom GPT-3 pipeline for key topic extraction for an asset management company that can be used across the financial domain | Case Study
How you can use GPT-3 to create higher order product categorization and product tagging from your ecommerce listings, and how you can create a powerful product taxonomy system with ai.
5 ways you can use product matching software in ecommerce to create real value that raises your sales metrics and improves your workflow operations.
Data mining and machine learning in cybersecurity enable businesses to ensure an acceptable level of data security 24/7 in highly dynamic IT environments. Learn how data security is getting increasingly automated.
Product recognition software has tremendous potential to improve your profits and slash your costs in your retail business. Find out just how useful it is.
Big data has evolved from hype to a crucial part of scaling your organization in every modern industry. Learn more about how big data is transforming organizations and providing business impacts.
Learn how natural language processing can benefit everybody involved in education from individual students and teachers to entire universities and mass testing agencies.
Here’s how automated data capture systems can benefit your business in some key ways and some real-life examples of what it looks like in practice.
Use these power ai and machine learning tools to create business intelligence in your marketing that pushes your business understanding and analytics past your competition.
We built a custom ML pipeline to automate information extraction and fine tuned it for the legal document domain.
In this practical guide, you'll get to know the principles, architectures, and technologies used for building a data lake implementation.
Find out how machine learning in biology is accelerating research and innovation in the areas of cancer treatment, medical devices, and more.
An enterprise data warehouse (EDW) is a repository of big data for an enterprise. It’s almost exclusive to business and houses a very specific type of data.
Save yourself the hassle of manually importing and processing data with intelligent document processing. Learn all the details of how it works here.
Dlib is a versatile and well-diffused facial recognition library, with perhaps an ideal balance of resource usage, accuracy and latency, suited for real-time face recognition in mobile app development. It's becoming a common and possibly even essential library in the facial recognition landscape, and, even in the face of more recent contenders, is a strong candidate for your computer vision and facial recognition or detection framework.
Learn how to utilize machine learning to get a higher customer retention rate with this step-by-step guide to a churn prediction model.
Machine learning algorithms are helping the oil and gas industry cut costs and improve efficiency. We'll show you how.
We’ll show you the difference between machine learning vs. data mining so you know how to implement them in your organization.
Here’s why you should use deep learning algorithms in your business, along with some real-world examples to help you see the potential.
Beam search is an algorithm used in many NLP and speech recognition models as a final decision making layer to choose the best output given target variables like maximum probability or next output character.
Best Place For was looking for an image recognition based software solution that could be used to detect and identify different food dishes, drinks, and menu items in images sourced from blogs and Instagram. The images would be pulled from restaurant locations on Instagram and different menu items would be identified in the images. This software solution has to be able to handle high and low quality images and still perform at the highest production level, while accounting for runtime as well as accuracy.
Deep learning recommendation system architectures make use of multiple simpler approaches in order to remediate the shortcomings of any single approach to extracting, transforming and vectorizing a large corpus of data into a useful recommendation for an end user.
Let's take a look at how you can use spaCy, a state of the art natural language processing tool, to build custom software tools for your business that increase ROI and give you data insights your competitors wish they had.
The landscape for AI in ecommerce has changed a lot recently. Some of the most popular products and approaches have been compromised or undermined in a very short time by a new global impetus for privacy reform, and by the way that the COVID-19 pandemic has transformed the nature of retail.
Extremely High ROI Computer Vision Applications Examples Across Different Industries
Building Data Capture Services To Collect High ROI Business Data With Machine Learning and AI
Software packages and Inventory Data tools that you definitely need for all automated warehouse solutions
Inventory automation with computer vision - how to use computer vision in online retail to automate backend inventory processes