Building a GPT-3 Intent Classification Model for Conversational AI
Let’s take a look at what intent classification is in conversational AI and how you can build a GPT-3 intent classification model for conversational AI and chatbot pipelines.
Understanding the intent of a user’s query is a key part of being able to kick off downstream operations in a dynamic chatbot. These downstream processes can be an embedding-based vector search, a code process, or anything else that can’t be handled by the model we actually use for chat functionality.
Intent classification just means we are using a text classification model trained to recognize the intent of a user and map it to our operations. In non-dynamic chatbots, meaning chatbots where the knowledge can be trained into a model and does not change often (marketing best practices, company documents, generic conversation), intent classification has less value, since we don’t need as many downstream processes.
But for chatbots where users can kick off downstream processes based on what they say or the action they seem to be asking for, being able to classify these actions is mandatory for production-level systems. We need to understand whether our base conversational model can handle the user’s request or query, or whether a downstream process should retrieve some information to respond. How does intent classification work, and how is it used in a chatbot? Let’s look at an example.
An ecommerce chatbot that helps users find products when they aren’t totally sure what they’re looking for is a great example of a dynamic chatbot that requires intent classification. Our conversational model can respond to general questions about the store, the company, or any other information that does not change often (non-dynamic). But for data that is constantly changing, such as product data, it wouldn’t make sense to train a model on that data, as we’d have to retrain every time the data changes.
Instead, what we can do is train a model to classify the intent of the user: respond directly if our conversational model can handle it, or recognize that they’re looking for a product and kick off a downstream process to handle the product search. This works for any number of downstream processes we want to offer as options, and just requires us to create more classes.
We’re going to leverage GPT-3 as our natural language understanding model for classifying user inputs and helping our downstream conversational model. In some use cases, the conversational model and the intent classification model are the same: one model is trained to generate a response for conversational intents, and to generate a “tag” for other intents that tells downstream processes to start.
For this example use case, we’re going to separate the conversational model and intent classification model. User inputs or queries will be passed into the classification model, any processes required will occur, and then our conversational model will take the results and generate a response. We’ll use an ecommerce support chatbot that handles a few classes as an example.
Our dataset has a very simple structure, as most text classification datasets do. We’re going to use a few classes for our different customer intent user queries:
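As a stand-in for the class list, imagine our ecommerce support bot handles a few hypothetical intents, each with a tag the model will learn to generate:

```
<<Product Search>>   user is looking for a product
<<Order Status>>     user is asking about an existing order
<<Human Support>>    user needs a human agent
<<Conversation>>     general chat the conversational model can handle
```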
From there we want to annotate the dataset with our different intent classes. Our labels are simply what we want the model to generate when it decides a user query belongs to a specific class. This annotation can be done manually or sped up through few-shot learning. For the few-shot approach, we create a GPT-3 prompt with a few examples of how to complete the task, run the rest of the dataset through the model, and then review the outputs. This process is much quicker than annotating the entire dataset by hand, and it gets faster each time you add reviewed examples to the few-shot prompt.
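Here’s a minimal sketch of what that few-shot annotation prompt could look like with the legacy openai Python client (pre-1.0). The class tags, example queries, and model choice are all illustrative assumptions, not taken from the original dataset:

```python
import openai  # legacy openai-python (pre-1.0) client

openai.api_key = "YOUR_OPENAI_API_KEY"

# A few hand-labeled examples show the model how to complete the task;
# we then append an unlabeled query and let the model fill in the tag.
FEW_SHOT_PROMPT = """Classify the user query into one of: <<Product Search>>, <<Order Status>>, <<Human Support>>, <<Conversation>>

Query: where is my package from last week
Intent: <<Order Status>>

Query: do you have running shoes under $100
Intent: <<Product Search>>

Query: i want to talk to a real person
Intent: <<Human Support>>

Query: {query}
Intent:"""

response = openai.Completion.create(
    model="text-davinci-002",  # a large base model works well for annotation
    prompt=FEW_SHOT_PROMPT.format(query="can i return these jeans"),
    max_tokens=10,
    temperature=0,   # deterministic outputs for labeling
    stop=["\n"],     # the tag ends the line
)
print(response["choices"][0]["text"].strip())
```

Run over the unlabeled rows, this gives you draft labels that a human can review far faster than labeling from scratch.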
There is no hard and fast rule for the number of fine-tuning examples you should use for your intent classification model. It depends on the size and pretraining of the LLM, the quality of the data, the complexity of the user queries, and the amount of overfitting you are willing to tolerate during fine-tuning. In general, it is better to have more examples rather than fewer, as long as the training examples are high quality and representative of the real user queries your system will see. However, keep in mind that adding more data will not necessarily improve the model if the data is not relevant to customer interactions. It is also important to have similar numbers of examples per class to prevent class imbalance. I’d recommend iterating: collect data, annotate it, fine-tune GPT-3, evaluate model and data accuracy, and then decide whether to run another iteration.
Make sure you have a stop sequence that shows the model what the end of a sequence looks like. This should be a token that does not show up in your completion. Each completion should start with whitespace as well.
Now that we have our dataset ready to go, we can fine-tune our model to classify these intents. We’ll be walking through this at the command line, with a bit of Python as well. You’ll first need to install the openai package and set your API key as an environment variable called OPENAI_API_KEY.
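For reference, that setup could look like this (the key value is a placeholder):

```bash
pip install openai
export OPENAI_API_KEY="sk-your-key-here"  # placeholder; use your own secret key
```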
The training data must be in a JSONL file format with the prompt and completion input/output structure. The prompt is the user query and the completion is our classification. More advanced systems can include a bit of text alongside the classification to tell the user what we’re doing on our side, something like “Give me one moment while I look this up <<Human Support>>”. OpenAI also has an easy-to-use CLI data preparation tool that takes your input data in different formats and converts it to JSONL.
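Here’s a minimal sketch of what that JSONL could look like, reusing the hypothetical tags from above. Note that each completion starts with whitespace and ends with a “\n” stop sequence, per the formatting notes above; the “ ->” separator at the end of each prompt is just one common convention (the preparation tool below will suggest one for you):

```
{"prompt": "where is my package from last week ->", "completion": " <<Order Status>>\n"}
{"prompt": "do you have running shoes under $100 ->", "completion": " <<Product Search>>\n"}
{"prompt": "i need help from a real person ->", "completion": " Give me one moment while I look this up <<Human Support>>\n"}
```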
This tool accepts different file types such as CSV, TSV, XLSX, JSON, and JSONL, and suggests changes to make your data compatible with fine-tuning!
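A minimal sketch of running it, assuming your labeled data lives in a hypothetical ecommerce_intents.csv:

```bash
# Analyzes the file, suggests fixes (separators, whitespace, stop sequences),
# and writes a fine-tuning-ready ecommerce_intents_prepared.jsonl
openai tools fine_tunes.prepare_data -f ecommerce_intents.csv
```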
Now we can create a fine-tuned model! The model you choose should depend on your constraints around cost and the runtime you care about for your chatbot. Remember, this is just one piece of the entire pipeline, not the only model or service you’ll be using. Here are a few things to consider (a command sketch for creating the fine-tune follows this list):
1. Davinci is the most accurate model when comparing base forms (just the pretrained models). I recommend using it if you don’t have a lot of training samples overall, or don’t have a lot of examples per class. The equation of which model to use completely changes once you have a lot of data. Fine-tuning Ada will be much cheaper and faster, and once fine-tuned it could compete with Davinci in your use case. All rules about model size and accuracy go out the door when fine-tuning and honing in on a specific use case (I proved this here).
2. The prompts used for fine-tuning are not at the same level of complexity as what’s required for prompt-based LLM interactions. You do not want a few-shot learning prompt in your fine-tuning data like the one you might use when interacting with the base models.
3. Classification is a fairly easy task for most LLMs, which means the models require less data to get “familiar” with the task you’re trying to accomplish. I’d focus on data variance coverage and showing these models the range of inputs you expect to see. That is much more valuable to me than trying to hammer in accuracy on a set with a smaller data variance.
4. Do not rely only on quality inputs. In my experience, most chatbot users write short fragments with poor grammar, and statements rather than well-formed questions with punctuation. Keep this level of data variance in your samples so your model learns a stronger correlation to real user inputs. Only after fine-tuning the model and seeing low accuracy would I go back and add more well-formed statements.
5. Remove junk characters. This should be standard in any NLP problem, but it is extra important with our shorter-form inputs.
6. Your training samples should be vetted by humans. Ensure the labels are correct and that your classes make sense; the model is going to treat these as gold-standard examples.
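As referenced above, here’s a sketch of kicking off the fine-tune with the legacy OpenAI CLI, using the prepared file from earlier and Ada as the base model (both are illustrative choices, not prescriptions):

```bash
# Uploads the training file and starts a fine-tune job on the ada base model
openai api fine_tunes.create -t ecommerce_intents_prepared.jsonl -m ada

# List jobs to check status and grab the fine-tuned model name when it finishes
openai api fine_tunes.list
```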
Take a look at the OpenAI fine-tuning guide for further information on the fine-tuning process.
With your intent classification model in place, you’ve now got one of the pieces needed to build a full production GPT-3 chatbot. Intent classifiers are one of the most valuable parts of the equation, as they kick off downstream processes and help our conversational model better understand where we are in the conversation. We use these exact models as part of our production-grade GPT-3 chatbots.
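To make the routing concrete, here’s a minimal sketch of how the classifier could sit in front of the conversational model at inference time. The fine-tuned model name, tags, and downstream handlers are all hypothetical placeholders:

```python
import openai  # legacy openai-python (pre-1.0) client

openai.api_key = "YOUR_OPENAI_API_KEY"

# Hypothetical stand-ins for the real downstream processes and chat model.
def run_product_search(query: str) -> str:
    return f"[product search results for: {query}]"

def escalate_to_agent(query: str) -> str:
    return "Give me one moment while I look this up."

def generate_chat_response(query: str) -> str:
    return "[response from the conversational model]"

def classify_intent(user_query: str) -> str:
    """Ask the fine-tuned classifier for an intent tag like <<Product Search>>."""
    response = openai.Completion.create(
        model="ada:ft-your-org-2023-01-01",  # hypothetical fine-tuned model name
        prompt=user_query + " ->",           # same separator used in training
        max_tokens=10,
        temperature=0,
        stop=["\n"],                         # the stop sequence from training
    )
    return response["choices"][0]["text"].strip()

def handle_query(user_query: str) -> str:
    intent = classify_intent(user_query)
    if intent == "<<Product Search>>":
        return run_product_search(user_query)
    if intent == "<<Human Support>>":
        return escalate_to_agent(user_query)
    # Anything else falls through to the base conversational model.
    return generate_chat_response(user_query)

print(handle_query("do you have running shoes under $100"))
```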
The use of customizable, in-domain chatbots continues to grow as more companies look to make assistance easier for the users of their services.
Sales-driven chatbots can account for 30% of a store's sales. (Source)
Abandoned cart chatbots can boost revenue by 25%. (Source: Chatbots Mag)
Intent classification will continue to be the foundation of chatbots, no matter the platform, as long as we continue to push the boundaries of what we ask these pipelines to do. As we ask chatbots to perform downstream tasks beyond generic conversation, more and more will depend on the accuracy of these systems.
Now that you have a custom intent detection model with GPT-3, you may be wondering how to make it more efficient and accurate. A few techniques help here: active learning, transfer learning, and hyperparameter tuning.

Active learning uses feedback to improve your model based on user actions. By having users input their queries into the bot and manually providing feedback, you can use this data to keep refining your model and making it more accurate. Transfer learning means starting from a model that has been pre-trained on a large dataset and fine-tuning it, which achieves better performance with less data. Finally, hyperparameter tuning optimizes your model by changing the parameters used in the training process; small changes to the learning rate, batch size, and other parameters can improve accuracy.

With the right combination of these techniques and data, you can create a model that is efficient, accurate, and capable of handling a wide range of user queries.
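For hyperparameter tuning specifically, the legacy fine-tunes CLI exposes a few of these knobs directly. A sketch, with arbitrary starting values rather than recommendations:

```bash
openai api fine_tunes.create \
  -t ecommerce_intents_prepared.jsonl \
  -m ada \
  --n_epochs 4 \
  --batch_size 8 \
  --learning_rate_multiplier 0.1
```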
Width.ai builds custom natural language processing software (like chatbots!) for companies looking to leverage models to automate business processes or expand product capabilities. We’ve built chatbots for sales, ecommerce, and for automating coaching. Let’s set up some time to chat about how a chatbot fits with your business or how intent classification can help your chatbot.