A Deep Guide to Text-Guided Open-Vocabulary Segmentation
Discover the power of text-guided open-vocabulary segmentation using large language models like GPT-4 & ChatGPT for automating image and video processing tasks.
Automatically categorizing our ecommerce products with relevant tags and product data points is a powerful automation tool. It not only saves us time and manual effort, but also creates a taxonomy system that shortens the path from a customer reaching our store to finding what they are looking for.
The goal is to give customers the quickest and easiest path to what they are looking for, which reduces lost customers. This research report found that when customers search for a product they want, 47% give up after just one attempt and only 23% try 3 or more times.
By using GPT-3 to create our product tags and categories we allow the model to make decisions on what tags and categories match to what products based on slightly different criteria than outdated past systems would use.
We’ll break down how to build this system and show real examples of GPT-3 working on real product listings.
Product taxonomy is simply the process of organizing our ecommerce products into categories and tags that give us a system to get customers to the exact product they are looking for quicker. This includes creating categories, tags, attributes and more to create a hierarchy for similar products. Improving the time to sale number is a huge deal for ecommerce brands and the customer experience for users who come to our store.
A subprocess in the general idea of product taxonomy, categorization is the process of grouping these products with tags and attributes. We can think of this as the first step in the general taxonomy, where we use product categorization to create a system that takes advantage of what we’ve created here. GPT-3 will allow us to build an automated pipeline to do this effectively and use what we’ve generated in taxonomy.
Our focus starts with how we can use GPT-3 to generate product categories for us. We’ll use very simple product data that you already have - your product listing! We’ll take the product title, product description, and other information to create tags and categories. Let’s start with our first example product.
We’re going to create our categories and tags from this women's wool coat from Nordstrom. We’ve got a title, description, and details section.
We set up our product information in GPT-3 like this:
We’re going to start with zero shot learning. Zero shot learning means using GPT-3 and the prompt alone to generate output, with no prior examples to use as reference. For any given task this is the rawest result GPT-3 will give you back, based only on the model's understanding of the task. The output will change based on how you refine the different parameters that GPT-3 offers, which you can learn more about here. Without giving away too much information about our current processes, we asked GPT-3 to give us some categories and tags for the above listing, without having seen any examples of what we expect.
This is pretty good! Through some prompt optimization and parameter tuning we were able to generate some pretty nice product categories and tags. There are a couple of things to note about the output that are very important when you’re looking to get into product taxonomy.
If you actually look at this product listing on Nordstrom, you’ll see these are the main categories that this product falls under, which validates our results.
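To make the zero shot setup concrete, here is a minimal sketch of how a listing could be assembled into a prompt that contains instructions and the product data but no examples. The function name, the instruction wording, and the field labels are all assumptions made for illustration; they are not the prompt used in this article, and the listing values are placeholders.

```python
def build_zero_shot_prompt(title: str, description: str, details: list[str]) -> str:
    """Assemble a zero shot prompt: an instruction plus the listing, no examples."""
    detail_lines = "\n".join(f"- {d}" for d in details)
    return (
        # Hypothetical instruction wording, not the article's actual prompt.
        "Generate product categories and product tags for the listing below.\n\n"
        f"Title: {title}\n"
        f"Description: {description}\n"
        f"Details:\n{detail_lines}\n\n"
        "Product categories:"
    )


# Placeholder listing values; substitute the real title, description, and details.
prompt = build_zero_shot_prompt(
    title="Women's wool coat",
    description="<product description from the listing>",
    details=["<detail 1>", "<detail 2>"],
)
print(prompt)
```

Ending the prompt on the "Product categories:" label is one common way to nudge a completion model to continue in the format you want; the resulting string would then be sent to whichever completion endpoint you use.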
Through some prompt and hyperparameter tweaks we can get the model to give us more refined information about the product and use them in tags. This is still just zero shot learning! We haven’t even shown the model examples of what correct output would be.
Now that we’ve seen what we can get to with zero shot learning, let's take a look at what we can do when we give GPT-3 some examples of what we want our output to be. You might wonder why we even want to do that if zero shot learning can reach good results fairly easily. By showing GPT-3 some examples of output that we like, we can push the model beyond simple keyword extraction and toward features that it has to infer from the listing.
Remember, GPT-3 tries to follow the instructions you give it, and uses any prior examples to interpret what you mean. If our examples are very keyword heavy and just extract a lot of product describing features (like the “Product tags” example above), then the model will assume it should do that for all future text generations. What we want to do is show the model some examples that include inferred categories and tags, which require the model to not only extract information from the product listing but also create new categories for us.
We’re going to add a few examples of product listings from the same site and generate categories and tags like we did before.
We’ll use output similar to what we’ve seen, where we generate product categories and product tags. This time, however, we’re going to add whatever season best fits the product. The model will have to learn from the information given to choose the correct season. There will be no mention of the season in the product listing, so GPT-3 will have to learn how to reach that decision.
As we can see, GPT-3 was able to learn how to deterministically create output based on the input text and a few previous examples. Not only did the model learn how to pick a season for the clothing item, but it also decided which keywords we care about using as tags and categories.
Understanding what fine tuning and prompt optimization do to our output tags and categories is critical to making enhancements and moving the model's generation logic towards what we are looking for. Here are some quick points about the different parameters and prompt of GPT-3.
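A few shot prompt of this kind can be sketched as a list of labeled example listings followed by the new, unlabeled listing. Everything below is an illustrative assumption: the `LabeledListing` structure, the field labels, and the two example products are hypothetical placeholders, not the examples used in this article.

```python
from dataclasses import dataclass


@dataclass
class LabeledListing:
    listing: str
    categories: list[str]
    tags: list[str]
    season: str  # inferred label; the listing text itself never states it


def format_example(ex: LabeledListing) -> str:
    """Render one labeled example in the same layout the model should continue."""
    return (
        f"Listing: {ex.listing}\n"
        f"Product categories: {', '.join(ex.categories)}\n"
        f"Product tags: {', '.join(ex.tags)}\n"
        f"Season: {ex.season}\n"
    )


def build_few_shot_prompt(examples: list[LabeledListing], new_listing: str) -> str:
    """Concatenate the labeled examples, then leave the new listing open-ended."""
    shots = "\n".join(format_example(e) for e in examples)
    return f"{shots}\nListing: {new_listing}\nProduct categories:"


# Hypothetical examples for illustration only.
examples = [
    LabeledListing("Linen drawstring shorts", ["Women", "Shorts"], ["linen", "drawstring"], "Summer"),
    LabeledListing("Quilted down parka", ["Men", "Coats"], ["down", "quilted", "hooded"], "Winter"),
]
prompt = build_few_shot_prompt(examples, "<new product listing text>")
print(prompt)
```

Because every example ends with a "Season:" line, the model is shown that a season is expected even though none of the listings mention one.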
- Length of product listing
- Variance in “type” of products used as examples. If you use only examples of women's scarves and try an iPad listing the model will struggle.
- Source of product data
- Contextual similarity in examples to the runtime ecommerce product
It’s no secret that potential customers being able to find exactly what they want to buy in the least amount of time and clicks is directly correlated to conversion rates. Research shows that only 23% of users try a search more than 2 times before giving up. The focus for product taxonomy has to be structuring our product pages and search results to give results that lead to the best conversion rates.
Accounting for the human element of website navigation and layout is a huge part of building a taxonomy plan that leads to increases in traffic and sales. Let’s take into account what we just learned about creating tags and categories, along with our product listing relative to these newly created attributes.
Taxonomy is normally split into two categories, hierarchical and faceted. Hierarchical taxonomy is the standard tree structure that you think of when you think of how products can be broken down into categories and subcategories. Faceted taxonomy is a structure where the product is broken down into attributes, or “facets”, of the product. This allows customers to find what they are looking for without knowing the specific name of the product, as long as they know the features that make it up.
Our GPT-3 based model can produce all of these tags for both hierarchical and faceted taxonomies, and allows us to take it a step further with the deterministic outputs we looked at above.
If we wanted to create the example product hierarchy above and produce tags for it, we can do so by simply adjusting the GPT-3 model we saw before. Our initial version actually produced gender categories, and we can use the same deterministic approach to generalize tags into a clothing category.
Facets are even easier to cover, considering that we are already producing tags that are attributes of the product. If you wanted the model to be more focused on just extracting every keyword attribute, you could turn the temperature of the model to 0 (which makes the model more argmax focused).
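The “argmax focused” effect of temperature can be illustrated with the standard temperature-scaled softmax: dividing the scores by a smaller temperature concentrates probability on the highest-scoring token, and in the limit of 0 only that token is chosen. This is a generic sketch of that formula with made-up scores, not a description of how any particular API implements sampling internally.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature == 0:
        # Limit case: all probability mass goes to the highest score (argmax).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate next tokens
for t in (1.0, 0.5, 0.1, 0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Running this shows the first token's probability rising as the temperature drops, which is why a low temperature suits extraction-style tasks where you want the same tags every time.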
If we already have predefined categories that our products need to fall into, we can use GPT-3 in a different way to cluster similar products. Using the GPT-3 search API we can run a query across all of our products in a database and return them in order based on semantic similarity. There are two main options we have here:
The search api is tuned in a similar fashion to the model we saw before but the results are very different.
The documents (in our example these are our products) are returned with a score that says how similar each one is to the query. We’re not generating any text, so we don’t get any back like we did before. Most of the optimization and tuning is limited to the hyperparameters and the language we use in the query. Testing accuracy is a pretty straightforward process where we compare the top-scoring document to the one we expected on a test set.
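The rank-then-check workflow described here can be sketched without any API access. In the sketch below a simple bag-of-words cosine similarity stands in for the semantic score; that is a deliberate simplification and an assumption of this example, since a real setup would get its scores from a semantic search model. The product names and test queries are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in 'embedding': lowercase word counts (not a semantic model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_documents(query: str, documents: list[str]) -> list[tuple[str, float]]:
    """Return (document, score) pairs ordered from most to least similar."""
    q = embed(query)
    scored = [(doc, cosine(q, embed(doc))) for doc in documents]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def top1_accuracy(test_set: list[tuple[str, str]], documents: list[str]) -> float:
    """Fraction of queries whose top-ranked document is the expected one."""
    hits = sum(rank_documents(q, documents)[0][0] == expected for q, expected in test_set)
    return hits / len(test_set)


# Hypothetical products and (query, expected product) test pairs.
products = ["women's wool coat", "men's swim trunks", "leather ankle boots"]
test_set = [
    ("wool coat for winter", "women's wool coat"),
    ("swim trunks", "men's swim trunks"),
]
print(rank_documents("wool coat for winter", products))
print(top1_accuracy(test_set, products))
```

Swapping `cosine(q, embed(doc))` for scores returned by a real search model leaves the ranking and accuracy code unchanged, which keeps the test harness independent of the scoring backend.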
If we want to further cluster all our products together based on the similarity results of either process we can use a clustering algorithm such as K-Means to group them together. Now we have an understanding of what products are similar based on:
The goal of product taxonomy is to increase sales and conversion rates. Your product listings need to make sense given the searches they come up for. It’ll be hard to get a potential customer to click through to a product if the description shown in the search results is very different than what they had in mind. Related to this, depending on how your search engine is set up, you have to make sure that the tags assigned to a product make sense for the search results they lead to. Just because you apply a “swimwear” tag to men's swim trunks doesn’t mean you want them to show up when someone searches “bikini swimwear” or “women's swimwear”.
This also falls into the idea of understanding your target user for a given product and the route they take to get there. You want to understand what type of taxonomy best fits your users and study how they interact with your site. See how they navigate around your site, how much time they spend on reading product descriptions vs attributes, and mine product data around their search bar behaviors.
Product taxonomy requires constant tweaks and optimizations as your potential customers change their behavior and as you change your site infrastructure. Making it easy to test the different pieces of your taxonomy system, and creating evaluation metrics that map directly to outcomes, is the best way to stay on top of your taxonomy and quickly make adjustments.
We’ve focused on a machine learning based approach to taxonomy so far, and the most successful ML products can easily be evaluated and optimized over time to hold a standard of accuracy. Building a test architecture for what we’ve looked at so far isn’t difficult, but does require you to put in the time and effort. With every part of your taxonomy system easily testable, you’ll feel much more confident in the scalability and longevity of the powerful machine learning you’ve put in place.
Testing your system is more than just the machine learning models you’ve created to optimize and automate your process. You’ll need to test how changes to your taxonomy structure change conversion rates and interaction metrics among different groups of buyers. Here’s a quick example:
Building a direct and heuristic keyword model is a great way to understand not only how similar two products are, but how similar they are contextually. This model produces heuristic keywords that are contextually similar to the text. The idea is to use those as test searches in our ecommerce store and see if we still reach products that make sense for what we want to buy.
When looking to test our GPT-3 model’s output there are a bunch of different tools we can use based on the route we take. Let’s say we have this given output and we want to compare it to an expected output that we consider correct.
Fuzzy search allows us to compare two strings to each other for similarity and allow for some changes and typos. We can set max substitutions, deletions, insertions, and distance to tweak how similar we want the outputs to be. This is a great way to compare individual tags from our expected output to tags and categories.
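As a rough illustration of comparing produced tags to expected tags with a tolerance for typos, the sketch below uses a plain Levenshtein edit distance with a single `max_distance` budget. This is a simplified stand-in chosen so the example needs no third-party packages: it does not set separate limits for substitutions, deletions, and insertions the way a dedicated fuzzy search library can. The tag lists are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def tags_match(produced: str, expected: str, max_distance: int = 1) -> bool:
    """Treat two tags as equal if they are within max_distance edits."""
    return edit_distance(produced.lower().strip(), expected.lower().strip()) <= max_distance


def score_tags(produced_tags: list[str], expected_tags: list[str], max_distance: int = 1) -> float:
    """Fraction of expected tags that have a fuzzy match among the produced tags."""
    matched = [e for e in expected_tags
               if any(tags_match(p, e, max_distance) for p in produced_tags)]
    return len(matched) / len(expected_tags)


# Hypothetical model output versus the tags we consider correct.
produced = ["Wool Coat", "outerwer", "womens"]
expected = ["wool coat", "outerwear", "women's"]
print(score_tags(produced, expected))
```

Keeping the budget small matters for short tags: “coat” and “boot” are only two edits apart, so a looser threshold would start counting unrelated products as matches.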
spaCy's lemmatization is something we’ll reference throughout this section. We can transform each produced product categorization keyword or tag into the lemma form of the word before comparing it to our expected output. Words like “are” or “them” are inflected forms, and if the expected answer contains the lemma form of a produced word, we probably want to mark the result as correct. Using this on top of keyword checks and fuzzy searches is a great way to remove false negatives.
Sense2vec is our favorite tool to use for ecommerce product categorization testing, as well as something we use across other NLP domains. This algorithm allows us to query for contextually similar keywords based on an input keyword and a part of speech. Not only does this cover different “versions” of the same word, but also contextually similar words that we might consider to be the same. I recommend setting the baseline similarity score pretty high, given that product names are already pretty close to begin with. In other domains such as direct and heuristic keyword extraction we normally set the similarity score much lower.
The most similar results are ranked from highest to lowest. My suggestion for ecommerce products is to use a keyword check on the results and only grab ones that contain a keyword from our original. For this example, we would use “wool” and “coat” and grab from here. The more adjectives our keyword contains the more refined we have to be in the results to make sure the results are actually similar to what we searched for.
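The keyword check suggested here can be sketched as a small filter over ranked results. The sketch assumes results arrive as `(key, score)` pairs with keys shaped like `wool_coats|NOUN` (phrase with underscores, then a part-of-speech suffix); that shape, the function name, and the 0.8 threshold are assumptions for illustration, and the result values below are placeholders rather than real model output.

```python
def filter_similar(results: list[tuple[str, float]],
                   original_keyword: str,
                   min_score: float = 0.8) -> list[tuple[str, float]]:
    """Keep results above min_score that share at least one word with the original keyword."""
    required = set(original_keyword.lower().split())
    kept = []
    for key, score in results:
        # Strip the part-of-speech suffix and turn underscores back into spaces.
        phrase = key.rsplit("|", 1)[0].replace("_", " ").lower()
        if score >= min_score and required & set(phrase.split()):
            kept.append((phrase, score))
    return kept


# Placeholder results in ranked order; the scores are invented for this example.
results = [
    ("wool_coats|NOUN", 0.91),
    ("pea_coat|NOUN", 0.88),
    ("cashmere_sweater|NOUN", 0.84),
    ("wool_blend|NOUN", 0.72),
]
print(filter_similar(results, "wool coat"))
```

With these placeholder values the sweater is dropped because it shares no word with “wool coat”, and the wool blend is dropped because it falls under the score threshold, which mirrors the two checks described above.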
If we refine our input keyword down too much there’s a chance sense2vec has never seen that keyword before. Our query isn’t even in the model. Best practices would say to spend time really understanding the average contextual similarity between these different product data points.
We can also use s2v to compare two different product tags or categories.
Keeping your structure simple and logical is the best way to ensure not only that your ecommerce taxonomy works, but that changes and optimizations are easy to make. When making decisions on sub-groups of categories, always go with broad and shallow over super narrow with few products in each. Let the tags and search engine algorithm work to put the right products in front of the right searches. When it comes to categories, try to keep those as broad as possible, with tags being what is used to rank products for various searches. This keeps everything organized, but allows for variance when users search your store. Avoid the “Other” category at all costs. Nobody shops in the other category; they just leave your site.
Product taxonomy is mostly seen as an internal business management operation focused around product categorization and building hierarchies. However, best practices also include an understanding of how these changes and optimizations affect how our products show up in search engines like Google. On top of that, we need to be aware of what external search results map to our different pages and what that means for the user experience of those customers.