A Deep Guide to Text-Guided Open-Vocabulary Segmentation
Discover the power of text-guided open-vocabulary segmentation using large language models like GPT-4 & ChatGPT for automating image and video processing tasks.
Large language models like OpenAI's generative pretrained transformers (for example, GPT-4) are finding new uses every day in various industries. Because they are trained on huge volumes of real-world text data, these models are highly capable at natural language processing (NLP) and text-generation tasks. Out of the box, they can both interpret and generate many kinds of high-quality sales content. In this article, find out how to use GPT-based NLP in sales to increase your conversions and customers.
How can GPT NLP techniques help people in sales roles? Here's a preview of the possibilities:
Solo entrepreneurs and small business owners can also improve their conversions using these GPT NLP techniques. In the next sections, we dive deeper into some use cases to help your sales professionals understand GPT's capabilities in depth.
Let's start by using GPT chatbots as shopping assistants for e-commerce use cases.
A great shopping experience is when your customers can find exactly what they have in mind. Most shops have a search box and search filters for the purpose. However, this typical search interface makes for some inconvenient and inefficient user experiences, as explained below.
Search filters give customers fine-grained control to drill down their choices. Just look at the number of filters Amazon shows for a customer looking for jeans:
But making decisions with so many checkboxes, sliders, color selectors, and "See more" links is just plain exhausting. Changing your selections or removing a filter means even more steps. The fine-grained control it gives is at the cost of convenience.
The filtered search experience on mobile devices is worse because of the small screen sizes and related problems like fat finger syndrome, which can lead to unintended selections while tapping or scrolling.
During holiday gifting seasons, your customers are likely to buy a larger number of products than normal. They may have additional constraints on their purchases, such as the total price or the range of acceptable delivery dates.
Ideally, your user interface should let your customers specify such concerns too. However, the very nature of user interfaces makes it difficult to implement such special considerations.
Such inconveniences increase search abandonment — your customers simply stop searching and close your website out of frustration. According to a recent survey commissioned by Google Cloud, $2 trillion is lost to search abandonment and 82% of customers go on to avoid websites where they've abandoned searches.
You can easily overcome all these drawbacks by using GPT-3/GPT-4 chatbots. They enable you to simulate a shopping assistant that a customer can talk to naturally and work through complex or abstract shopping ideas.
The chatbot conducts a human-like conversation with your customer. It asks guiding questions to find out what the customer has in mind and progressively shortlists products based on the answers. With the patience and attentiveness of a truly human assistant, the chatbot obeys all the criteria a customer gives, no matter how trivial.
Your customers can chat with it either by typing or by talking into their mobile device. For the latter, speech-to-text services, like OpenAI's Whisper, convert customer speech to text for better experiences on mobile devices, where typing lots of text can be clumsy and error-prone.
In the example below, a GPT shopping chatbot guides a customer looking for jeans. The chatbot's questions are in blue, and the customer's replies are in black.
Notice that the customer is able to quickly specify product-level and cart-level criteria for multiple items. Later, the chatbot upsells the customer, increasing the sale beyond their planned budget.
The image below shows another of our shopping chatbots in action:
The customer asks the bot for suggestions to go with these jeans. The bot provides a helpful answer in natural language just like a human assistant.
The chatbot example above shows some remarkable capabilities like:
All this is possible thanks to our fine-tuned GPT-3/GPT-4 models and our pipeline to integrate downstream data into GPT output. The pipeline architecture looks like this:
Let's take a closer look at what each component does.
This component manages the overall conversation. It examines your message or answer and decides whether the query requires some dynamic information from downstream. If not, it routes the query to the non-dynamic GPT model. Otherwise, it routes the query to the subsystem that handles dynamic information like product details, availability, and prices.
The same model also handles queries for aspects that don't change often, such as general information about product categories or payment options, and generates responses that don't require any downstream information.
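The routing decision described above can be sketched in a few lines. This is only an illustration under stated assumptions: the keyword heuristic stands in for whatever classifier actually makes the decision, and the names `DYNAMIC_KEYWORDS`, `needs_dynamic_info`, `answer_static`, `answer_dynamic`, and `route_query` are hypothetical.

```python
from typing import Callable

# Hypothetical trigger words; a real dispatcher would more likely use a
# trained classifier or a GPT prompt to make this decision.
DYNAMIC_KEYWORDS = {"stock", "price", "available", "availability", "discount", "delivery"}


def needs_dynamic_info(query: str) -> bool:
    """Return True if the query appears to need live inventory data."""
    words = {w.strip("?.,!").lower() for w in query.split()}
    return bool(words & DYNAMIC_KEYWORDS)


def answer_static(query: str) -> str:
    # Placeholder for the GPT model that answers from general knowledge.
    return f"[static model] {query}"


def answer_dynamic(query: str) -> str:
    # Placeholder for the subsystem that injects live product data.
    return f"[dynamic model] {query}"


def route_query(query: str) -> str:
    """Send the query to the dynamic or the static handler."""
    handler: Callable[[str], str] = (
        answer_dynamic if needs_dynamic_info(query) else answer_static
    )
    return handler(query)
```

Keeping the decision in one small function makes it easy to swap the heuristic for a learned classifier later without touching either handler.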
This model handles user queries that require dynamic information or downstream APIs by building suitable answers and including relevant dynamic information from your inventory database. Finding the most relevant information for a given query is done with help from the next two components. The idea here is to be able to answer queries such as “how many red shirts do you have in stock?” or “do you sell cotton Dior sweatshirts?” with a real-time answer based on your current inventory.
When the chatbot gets a query, the Q&A GPT model must find the products whose details and other real-time context — like prices and discount campaigns — match. These details aren't part of the GPT model and shouldn't be either, due to their dynamic nature. Instead, we inject all these details dynamically in the prompt as context and few-shot examples.
We have empirically observed that if the context and few-shot examples are highly semantically relevant to the query, the quality of GPT's answers also improves drastically. To fetch only the most semantically relevant details, we use a Sentence BERT (SBERT) model. It encodes all these product details and other real-time context documents as embeddings and stores them in a production-grade vector database. A typical shopping site can produce millions to billions of embeddings for its inventory.
When the query is received, we use SBERT to calculate the query embedding. The vector database then uses cosine similarity to find the stored embeddings that are most semantically relevant to the query embedding. These matching details and real-time context are inserted into the prompt before the user query and sent to the GPT model.
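The last step, placing the retrieved details before the user query, might look roughly like the sketch below. It assumes the vector database lookup is available as a function that returns the most relevant documents in order; `fake_retrieve` is a canned stand-in for that lookup, and `build_prompt`, the `max_chars` budget, and the prompt layout are illustrative choices rather than the actual pipeline.

```python
from typing import Callable

# Signature assumed for the vector database lookup: (query, k) -> documents.
Retriever = Callable[[str, int], list[str]]


def build_prompt(query: str, retrieve: Retriever, k: int = 3, max_chars: int = 500) -> str:
    """Insert retrieved context before the user query, within a size budget."""
    lines: list[str] = []
    used = 0
    for doc in retrieve(query, k):
        if used + len(doc) > max_chars:
            break  # stop once the context budget is exhausted
        lines.append(f"- {doc}")
        used += len(doc)
    context = "\n".join(lines)
    return f"Context:\n{context}\n\nCustomer: {query}\nAssistant:"


def fake_retrieve(query: str, k: int) -> list[str]:
    """Stand-in for the vector database lookup; returns canned results."""
    docs = [
        "Red cotton shirt: 14 in stock, $25",
        "Red linen shirt: 2 in stock, $40",
        "Blue slim-fit jeans: 3 in stock, $60",
    ]
    return docs[:k]
```

Calling `build_prompt("How many red shirts do you have in stock?", fake_retrieve, k=2)` yields a prompt whose context lines precede the customer's question, which is the ordering described above.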
The embeddings are stored in a production-grade vector database like Pinecone. Its role is to match the embeddings for the query and the products to find the product details and real-time context that are most semantically similar to the query. This database is an index of the current data in the AWS RDS instance.
Customers find that the conversational interfaces of shopping assistant chatbots are far more convenient. They help reduce buyer frustration, choice overload, search abandonment, and cart abandonment.
They can also increase conversions and convince customers to spend more by suggesting better choices in the vicinity of the customers' budgets.
In the sectors of banking, insurance, healthcare, and government, information for existing and potential customers can be complex. People approach such services with specific information needs in mind. To help them, companies publish content like frequently asked questions (FAQs) and knowledge base articles.
However, that content may not always provide direct answers or may use different phrasing than what people ask. Plus, many people won't read through the provided content to find answers. Moreover, many questions may involve complex banking processes that don't have easy answers and require details from multiple sources to figure out. For example, a customer who wants a suspicious credit card transaction reviewed and reversed may not be able to work out the process on their own. Information desks where a customer can speak with a human assistant are useful, but they are not feasible in all locations or at all times. Unanswered questions may result in lost conversions and sales. To avoid that, companies can use question-answering (QA) pipelines in chatbots that are available 24/7 and trained to find the most relevant answers in the content.
While custom deep learning models for QA are available, they require a lot of training data to achieve high quality. In contrast, GPT is already pretrained on reams of real-world data, making it a much more capable repository of information for multiple domains. Minimal fine-tuning on your company-specific content is sufficient to get high-quality answers from GPT.
In the example below, a GPT chatbot for a bank answers a potential customer's specific information need accurately and succinctly, saving the customer from wading through pages of content:
The pipeline for an information desk chatbot is the same as the one above for shopping chatbots but the data is different. We'll walk you through the high-level steps that go into readying your information desk chatbot.
First, we collect all the useful static content like your FAQs, knowledge base, and website pages that contain general information relevant to your customers. This information isn't specific to a user or their account, but something that's applicable to everyone, like in this example:
Potential questions and informational facts are extracted from such content using manual annotation or web scraping, and stored in a knowledge base database for answering queries.
Prompts are key to getting the most out of GPT. When we build GPT chatbots, we must frame the relevant details and questions in particular ways for GPT to interpret them correctly. We must provide a few examples of ideal prompts and answers (few-shot learning) to GPT so that it can dynamically figure out what's expected of it based on the patterns in the examples.
We do this by maintaining a database of gold-standard prompts and answers. When a customer query is received, we dynamically choose the most relevant examples from that database and prefix them to the customer's query before asking GPT. This helps GPT interpret the query correctly and return the expected response.
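A minimal sketch of this dynamic example selection follows. It makes several assumptions: the word-overlap score is a crude stand-in for the embedding similarity a real system would use, the `Q:`/`A:` layout is just one possible prompt format, and `Example`, `select_examples`, `few_shot_prompt`, and the `BANK` entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Example:
    prompt: str
    answer: str


def overlap(a: str, b: str) -> float:
    """Jaccard word overlap; a crude stand-in for embedding similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def select_examples(query: str, bank: list[Example], k: int = 2) -> list[Example]:
    """Pick the k gold-standard examples closest to the query."""
    return sorted(bank, key=lambda e: overlap(query, e.prompt), reverse=True)[:k]


def few_shot_prompt(query: str, bank: list[Example], k: int = 2) -> str:
    """Prefix the selected examples to the customer's query."""
    shots = "\n\n".join(
        f"Q: {e.prompt}\nA: {e.answer}" for e in select_examples(query, bank, k)
    )
    return f"{shots}\n\nQ: {query}\nA:"


# Hypothetical gold-standard examples.
BANK = [
    Example("how do i open a savings account", "You can open one online or at a branch."),
    Example("what is the fee for a wire transfer", "Wire transfers cost a flat fee."),
    Example("how do i close my savings account", "Visit a branch with a photo ID."),
]
```

For the query "how do i open an account", the two account-related examples are selected and the wire-transfer example is left out, so the prompt stays short and on topic.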
Another important aspect in this phase is that instead of hardcoding details and links, we train GPT to output placeholder variables. These variables are replaced later with customer-specific information to provide personalized answers. For example, currencies and policy pages may be different in each country. So we ask GPT to generate placeholders for them instead of hardcoding a currency or page link.
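The replacement step can be sketched as a simple substitution pass over the model's output. The `{{NAME}}` placeholder syntax, the `fill_placeholders` function, and the sample values here are assumptions made for illustration; the article does not specify the actual format.

```python
import re

# Assumed placeholder syntax: {{NAME}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_placeholders(text: str, values: dict[str, str]) -> str:
    """Replace {{NAME}} tokens with customer-specific values.

    Unknown placeholders are left untouched so they can be spotted in review.
    """
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), text)


# Hypothetical model output and per-country values.
gpt_output = "Plans start at 10 {{CURRENCY}} per month. See {{PRICING_URL}} for details."
values = {"CURRENCY": "EUR", "PRICING_URL": "https://example.com/de/pricing"}
```

Leaving unknown placeholders in place, rather than deleting them, is a deliberate choice in this sketch: a visible `{{NAME}}` in a draft answer is easier to catch than a silently missing link.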
In the example below, GPT outputs a placeholder for the link to a pricing page which will be replaced later in the pipeline:
In addition to few-shot examples and prompt optimization, another step that can potentially improve the quality of answers is fine-tuning the GPT model by supplying a dataset of questions and answers. Fine-tuning essentially creates a custom GPT model, stored on OpenAI's systems, that's available only to your company. It's a good approach if the nature of information, prompt syntax, and answer formats are very different and domain-specific compared to the standard text generated by GPT.
The essential idea here is that, given a customer query, we look up our database of extracted questions and answers, find the question that is most similar to the customer query, and return the answer associated with that question.
To implement this idea of finding the most similar question, we convert all questions and queries to math forms called embeddings. They are essentially vectors that encode various linguistic and contextual information as numbers. Once converted to vectors, we can use math techniques like cosine similarity to find a question vector that is similar to a customer query vector. The answer associated with that question vector is then the most relevant answer to the customer's query as well.
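To make the cosine similarity step concrete, here is a small worked sketch. The three-dimensional vectors are made up for readability (real sentence embeddings have hundreds of dimensions), and the `FAQ` entries and the `best_answer` function are hypothetical.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """cos(u, v) = (u . v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up question vectors paired with their stored answers.
FAQ = {
    "How do I dispute a card transaction?": ([0.9, 0.1, 0.0], "File a dispute from the Cards page."),
    "What are your branch opening hours?": ([0.0, 0.2, 0.9], "Branches open 9am to 5pm on weekdays."),
}


def best_answer(query_vec: list[float]) -> str:
    """Return the answer whose question vector is closest to the query."""
    question = max(FAQ, key=lambda q: cosine_similarity(query_vec, FAQ[q][0]))
    return FAQ[question][1]
```

A query vector such as `[0.8, 0.2, 0.1]` points in nearly the same direction as the dispute question's vector, so `best_answer` returns the dispute answer even though the magnitudes differ.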
For converting questions and queries to vectors, we use a model called Sentence-BERT from the sentence transformers library. It provides excellent results for such similarity tasks. This architecture is also fine-tunable with real similarity pairs which allows us to boost the accuracy for specific use cases.
In the previous step, there are likely to be thousands or even millions of questions and answers. So a system that can store millions of vectors and calculate similarities quickly is necessary. Such systems are called vector databases, and Pinecone and FAISS are some popular options.
If you're in an information-intensive industry, these chatbots offer multiple business benefits over traditional information desks:
Let's shift our focus now from customers to sales professionals. How can our GPT systems help your sales and business development professionals improve their outcomes and productivity?
Instead of sending generic emails to your leads, your outreach will get better responses if you can personalize them for each lead. Beyond greeting your leads by name, GPT can personalize the email's communication style to suit the lead's personality. Sources of personalized information include their personal and company LinkedIn profiles, product and service offerings on their websites, their financial reports, and other information.
Conceptually, the GPT pipeline here is the same as the one above for information desk chatbots. However, the content here comes from the sources mentioned above and its output is a personalized email message generated by GPT like in the example below:
In the example above, the GPT pipeline examines a prospect's online store, finds specific similarities in products, and sends a personalized email about them. The example also serves as a learning template. When given a list of hundreds of prospects and their websites, the pipeline can spit out hundreds of similar emails in seconds while automatically customizing details like "living room collection" and "furniture store" with products and success stories suited to each business.
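The batch step can be pictured as filling one instruction template per prospect before each prompt is sent to GPT. The sketch below is a simplified assumption of how that might be wired up; the `Prospect` fields, the `TEMPLATE` wording, and `build_email_prompts` are all hypothetical, and the details would come from the scraped sources mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    business_type: str
    featured_collection: str


# Hypothetical instruction template sent to GPT for each prospect.
TEMPLATE = (
    "Write a short outreach email to {name}, who runs a {business_type}. "
    "Mention their {featured_collection} and one relevant success story."
)


def build_email_prompts(prospects: list[Prospect]) -> list[str]:
    """Create one GPT prompt per prospect from the shared template."""
    return [
        TEMPLATE.format(
            name=p.name,
            business_type=p.business_type,
            featured_collection=p.featured_collection,
        )
        for p in prospects
    ]
```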
You can use similar GPT pipelines in every step of your sales process:
The pipeline consists of two stages: transcription and summarization.
In the transcription stage, a transcription service like OpenAI Whisper converts your call's audio to a high-quality, low-error text conversation with features like speaker name recognition, automatic speaker labeling, multilingual speech recognition, filler sound removal, and code switching support (i.e., transcribing even if speakers switch between different languages in the same sentence).
In the summarization stage, GPT takes the call transcript and summarizes it. It automatically identifies key points and keeps them in the summary. Other sentences, like pleasantries, are discarded. We explore some important aspects of summarization implementation in more depth below.
Sales calls can go on for hours and feature multiple speakers, resulting in transcripts with hundreds of thousands of words. That's a problem because, for all their great communication skills, GPT models can only process a limited amount of text at once: GPT-3 had a prompt limit of around 4,000 tokens, while GPT-4's limit ranges from 8,000 to 32,000 tokens depending on the model variant. These are not word limits but subword-token limits, so the actual word limits are lower (OpenAI's rule of thumb for English is roughly three words for every four tokens). How can we summarize long call transcripts under such limitations?
For that, we use custom chunking algorithms to break up the transcript into smaller pieces and process them individually using GPT. To not lose any essential context, chunking must be done carefully, using techniques like topic extraction on each section and prefacing the chunk with the extracted topic.
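One simple way to chunk is sketched below, under clearly stated assumptions: it groups whole speaker turns under a word budget (a rough proxy for a token budget), and the topic extractor and summarizer are passed in as plain functions standing in for GPT calls. The names `chunk_transcript` and `summarize_call` are hypothetical, and this is not the custom algorithm used in the actual pipeline.

```python
from typing import Callable


def chunk_transcript(turns: list[str], max_words: int) -> list[list[str]]:
    """Group speaker turns into chunks of at most max_words words.

    Turns are never split, so a single turn longer than max_words
    becomes its own oversized chunk.
    """
    chunks: list[list[str]] = []
    current: list[str] = []
    count = 0
    for turn in turns:
        n = len(turn.split())
        if current and count + n > max_words:
            chunks.append(current)
            current, count = [], 0
        current.append(turn)
        count += n
    if current:
        chunks.append(current)
    return chunks


def summarize_call(
    turns: list[str],
    max_words: int,
    extract_topic: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Summarize each chunk with its topic prepended, then join the results."""
    partials = []
    for chunk in chunk_transcript(turns, max_words):
        text = "\n".join(chunk)
        partials.append(summarize(f"Topic: {extract_topic(text)}\n{text}"))
    return "\n".join(partials)
```

Splitting on turn boundaries keeps each speaker's statement intact, and the topic line gives the model some of the context that the chunk boundary would otherwise remove.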
But key points and action items aren't the only details you can obtain. In the next section, you'll see how you can get deeper information about your calls.
In addition to summaries and action items, GPT pipelines enable you to dig out deeper information from your sales calls like:
For example, from this example business call transcript, GPT can extract the following information.
It identifies the speakers named in the transcript:
It identifies tasks and assignees in the transcript:
It can report the percentage spoken by each speaker:
It can classify the sentiment of each sentence. Salespeople can then focus on the parts that were perceived negatively:
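Of the analyses listed above, the speaker share is simple enough to compute directly from a labeled transcript without a language model. The sketch below assumes every line has the form `Speaker: utterance` and measures share by word count; `talk_share` is a hypothetical helper, and a real pipeline might measure speaking time instead.

```python
from collections import Counter


def talk_share(transcript: str) -> dict[str, float]:
    """Return each speaker's share of spoken words, as a percentage.

    Assumes every line looks like 'Speaker: utterance'.
    """
    words: Counter = Counter()
    for line in transcript.splitlines():
        speaker, sep, utterance = line.partition(":")
        if sep:  # skip lines without a speaker label
            words[speaker.strip()] += len(utterance.split())
    total = sum(words.values())
    return {s: round(100 * n / total, 1) for s, n in words.items()} if total else {}
```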
Large language models like GPT-3, GPT-4, GPT-J, and LLaMA are paradigm shifts in language processing capabilities. The potential benefits for sales roles in productivity and metrics are massive. Learn how you can unlock their awesome powers for your specific sales strategies. Contact us today.