A Deep Guide to Text-Guided Open-Vocabulary Segmentation
Discover the power of text-guided open-vocabulary segmentation using large language models like GPT-4 & ChatGPT for automating image and video processing tasks.
Natural language processing (NLP) is the study of how software and algorithms can understand and manipulate natural language in speech or text. The field has been around for more than 50 years, and it has drawn growing interest from deep learning and data science developers who want to build models that produce, manipulate, and understand text.
NLP started out with a focus on statistics and classical linguistic models, but the field has turned a corner and deep neural networks are now the core of most NLP systems. Many large research groups have focused on building large models already trained on billions of words (like GPT-3), while many other tools ship with a few smaller pre-trained models and let you easily and efficiently train the model for your own use case (BERT/spaCy/TensorFlow). One of the most important jobs of NLP is processing and deriving insights from unstructured text data for other ML use cases, a task that can be daunting without it.
spaCy is an open source library for enterprise-grade NLP in Python. It stands out from competitors such as NLTK because of its cutting-edge capabilities and because it is designed specifically for production use. The tool is well suited to answering questions based on text, like “What companies are mentioned in this article?”, “Which paragraphs are similar to each other?”, or “Where do I add my credit card?”, as well as performing tasks like text classification. spaCy comes with pre-trained models for many languages that work well out of the box, and it also lets you train custom models on your own data to optimize for your specific use case.
spaCy uses a processing pipeline to move from raw text to the final “Doc” object. The tokenizer creates the Doc, and pipeline components such as a tagger and parser then act on it in order. You can also add statistical models and pre-trained weights for different tasks, and use either built-in or custom components.
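To make the pipeline idea concrete, here is a minimal sketch. It assumes spaCy v3 is installed and uses a blank English pipeline (tokenizer only) plus the built-in rule-based `sentencizer`, so no trained model is required; the example sentence is made up for illustration.

```python
import spacy

# A blank English pipeline contains only the rule-based tokenizer,
# so no trained model download is needed for this sketch.
nlp = spacy.blank("en")

# Add a built-in, rule-based component that marks sentence boundaries.
nlp.add_pipe("sentencizer")

# Calling nlp() runs the tokenizer, then every component in order.
doc = nlp("spaCy turns raw text into a Doc. Each component then acts on it.")

print(nlp.pipe_names)                      # components that run after the tokenizer
print([token.text for token in doc][:5])   # first few tokens of the Doc
print([sent.text for sent in doc.sents])   # sentences set by the sentencizer
```

A trained pipeline works the same way, only with more components (tagger, parser, NER) listed in `nlp.pipe_names`.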
With spaCy’s wide range of capabilities, you can build simple or complex business-oriented software tools that help you increase product ROI, assist with customer service, or reduce manual workflows, saving you a ton of money. Let’s first take a look at a few important capabilities of spaCy in a little more detail, then dive into some business tools to build.
A combination of transformer-based pipelines and a statistical NLP model can be used to predict the part of speech of each word in a sentence, a task known as part-of-speech tagging. For instance, the model understands that words like “looking” or “buying” are verbs, while “apple” is a noun. The tagger also uses context clues in the surrounding text. For example, a word following “the” is most likely a noun.
Named entity recognition is used to recognize various types of real-world objects in paragraphs or documents, such as personal names, companies, countries, and types of items. The method classifies named entities in unstructured text into these categories. Many models are trained on billions of lines of text and usually need a little more tuning for your use case, but they do achieve over 90% accuracy on most test datasets.
Word vectors are multi-dimensional representations of words, and we can compare these vectors to understand how closely related two words are in meaning and usage. You might have already heard of word vectors, as they were popularized by the neural network based algorithm word2vec. Graphically, these vectors look similar to anything else you’ve seen vectorized, like images. The higher dimensional representations of similar words (for example: queen, princess, lady) will cluster together, which shows that the vector space captures which words are similar. We’ll see this concept used quite a bit in our business projects with things like recommendation systems or automation tools.
Let’s dive deeper into similarity between words and representations, as this will be important to understand. The concept of similarity between words is extremely complicated, for a bunch of reasons. Words can be related to each other in a lot of different ways, so a simple similarity score can be difficult to apply to generalized tools. One person might consider the word “nurse” similar to “dentist” as both refer to workers in the medical field. On the other hand, another application might consider them not similar, given they are different professions. Part of this difficulty in generalizing a similarity model can be reduced by tuning specifically for the application, as well as by changing the way words are evaluated. Instead of evaluating based on the average of the vectors, you can include the order of the words to enhance the understanding of meaning.
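The comparison described above can be sketched with plain cosine similarity. The three-dimensional vectors below are made-up toy values, not real word2vec or spaCy vectors, and the helper names are hypothetical; real word vectors have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def average_vector(vectors):
    """Average several word vectors into one phrase vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Toy vectors chosen by hand for illustration only.
toy_vectors = {
    "queen":    [0.90, 0.80, 0.10],
    "princess": [0.85, 0.75, 0.20],
    "nurse":    [0.20, 0.30, 0.80],
    "dentist":  [0.10, 0.20, 0.90],
}

royal = cosine_similarity(toy_vectors["queen"], toy_vectors["princess"])
mixed = cosine_similarity(toy_vectors["queen"], toy_vectors["dentist"])
medical = cosine_similarity(toy_vectors["nurse"], toy_vectors["dentist"])
print(f"queen/princess: {royal:.2f}, queen/dentist: {mixed:.2f}, nurse/dentist: {medical:.2f}")

# Averaging ignores word order: both orderings give the same phrase vector.
order_lost = (
    average_vector([toy_vectors["nurse"], toy_vectors["dentist"]])
    == average_vector([toy_vectors["dentist"], toy_vectors["nurse"]])
)
print(order_lost)
```

The last check shows why averaged vectors cannot tell two phrases apart when they contain the same words in a different order, which is the limitation that order-aware models address.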
spaCy lets you use transfer learning and multi-task learning workflows built on pretrained models like BERT to improve the accuracy of your pipeline. These techniques let you import learned knowledge from other tools directly into your pipeline, so your custom model can generalize better.
spaCy has a tool called the Matcher that lets you implement rule-based matching on tokens and dig deeper into the relationships between tokens by looking at things like the surrounding tokens or plural forms of words. We can also create patterns that combine multiple rules, such as: 1. a token must be all lowercase, 2. a token whose flag IS_PUNCT is True, 3. a token whose lowercase form must match the word “hello”. The Matcher is very useful for custom models that need to parse out a lot of what is seen in unstructured text.
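A minimal sketch of such a pattern, assuming spaCy v3 and a blank English pipeline (the attributes used here are lexical, so no trained model is needed). The pattern name `HELLO_WORLD` and the sample text are illustrative choices.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# One dict per token: lowercase form "hello", then any punctuation,
# then a token whose lowercase form is "world".
pattern = [
    {"LOWER": "hello"},
    {"IS_PUNCT": True},
    {"LOWER": "world"},
]
matcher.add("HELLO_WORLD", [pattern])

doc = nlp("Hello, world! hello world")

# Each match is (match_id, start_token, end_token).
matches = [doc[start:end].text for _, start, end in matcher(doc)]
print(matches)
```

Only the first phrase matches, because the second "hello world" has no punctuation token between the two words.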
Inflectional morphology is the process of adding morphological features to a lemma to create the surface form of a word, for instance changing “read” to “reading”. The change does not alter the original word’s part of speech, but it does let us use the word with a different grammatical meaning. We will use the reverse of this process to understand the context of words, along with useful features like mood, tense, and verb form.
This spaCy feature lets us assign morphological features to a token and its lemma through a rule-based approach, similar to what we talked about earlier with rule-based matching. Using the token text and part-of-speech tags, we can add or remove morphological features.
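One way to sketch this rule-based assignment is with spaCy v3's `attribute_ruler` component, which sets token attributes whenever a Matcher-style pattern fires. The single rule below, including the chosen feature string, is an illustrative assumption rather than a rule shipped with any spaCy model.

```python
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("attribute_ruler")

# When a token's lowercase form is "reading", assign its lemma,
# coarse part of speech, and morphological features.
ruler.add(
    patterns=[[{"LOWER": "reading"}]],
    attrs={"LEMMA": "read", "POS": "VERB", "MORPH": "Tense=Pres|VerbForm=Part"},
)

doc = nlp("She is reading the report")
token = doc[2]
print(token.text, token.lemma_, token.pos_, token.morph)
```

In a trained pipeline the same component is typically used to patch or override what the statistical tagger predicted for specific tokens.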
Let’s take a look at some of the state-of-the-art tools people have built specifically with spaCy, as well as some general tools we can build that perform better with spaCy as the main component instead of other natural language processing models.
One of the most widely known use cases for NLP systems is chatbots. Chatbot use across industries has accelerated quite a bit in the past year due to COVID-19. The chatbot industry is expected to keep growing quickly, with the sector projected to be valued at $9.4 billion by 2024, up from $2.6 billion this year.
Industrial-strength chatbots are great at managing customer relations, with tools ranging from customer support chat boxes on websites or Facebook to automated call centers that try to solve problems without human interaction. In recent years, however, chatbots have also been put to a different use: directly increasing ROI and conversion rates for companies selling things online and offline. We’ll look at how to accomplish both of these, but let’s first look at a spaCy tool that gets us there.
ChatterBot is a spaCy-based conversational dialog system built in Python. This library lets us build and train chatbots easily with its conversation-based training style. Every time we enter a statement, the library saves both the text and the response it returned as training data. As we continue to add more input statements, the accuracy of the responses increases along with the number of stored statements.
ChatterBot is language independent, meaning it can be trained to speak any language. An untrained instance of ChatterBot starts off with essentially no knowledge of how to respond to questions or how to communicate. This makes ChatterBot very easy to train for a specific business use case, but difficult to generalize if that is what we were looking for.
Building customer support tools with chatbots is one of the fastest ways to reach a level of automation in your business, for anything from online retailers to SaaS platforms. These chatbots are trained to answer questions related to your products or platform like “how do I add more users to my plan?” or “how many emails per month come with the silver tier?”. They can be trained on your FAQs or simply your most asked questions, and can answer questions in many different formats and tones of voice. Many newer models can suggest relevant answers before the customer finishes typing, as well as frequently asked questions that are close to theirs (going on the silver tier question above, the bot could also answer - “Silver tier allows you to have 4 users to send those emails”).
These customer service chatbots help reduce the workload of customer service reps and info@ email response teams that deal with countless inquiries per day. This lets you staff fewer people at the same time, saving you time and resources. Salespeople can get back to generating leads instead of having to help with customer support off and on, which you often see at small startups. Currently bots are very good at answering simple questions and augmenting these interactions to keep the customer moving toward the answer without human input. Companies are seeing the benefits of using these bots to reduce workload, with Amazon implementing a huge customer service chatbot system in early 2020.
Chatbots are being used more and more on the offense in many large companies, as a clear way to increase ROI and conversions with new or existing customers. Companies such as IKEA are gathering customer feedback automatically using these bots, by asking simple yet targeted questions about products and service. This lets you optimize the parts of your sales funnel or product that your customers care about, without sending spammy emails or asking them to go to another link and fill out a form (which they won’t do). These chatbots are simple and elegant, and they get answered much more often than emails or forms.
Chatbots are great at assisting confused (and likely to leave) customers through a buying process or funnel. They can offer quick assistance for simple questions that are asked frequently, or they can offer to answer questions the user might have based strictly on how they interact with the page. Tools like Heatrr.ai pair well with a chatbot: they give you behavioral analytics on what users do on a page, like a long-form sales letter, and can help the chatbot ask questions that keep the user moving through.
Tip: Named entity recognition can be useful here, as it allows us to look for specific product names or SKUs that the user is struggling with.
Sentiment analysis using spaCy is a great way to collect insight-rich information about your products or brand from a ton of different sources like emails, social media, and product reviews. The information you extract from raw text for sentiment analysis can help you predict future customer trends, as well as adjust your brand right now based on how customers feel about you and your products. Huge companies like Intel, Twitter and IBM are using sentiment analysis right now to analyze huge chunks of data.
Using spaCy’s large English language model, we can train a model to analyze social media posts or tweets to understand what people are saying about our product or, even better, about our competitors’ products and what they don’t like about them. spaCy makes it easy to build and train this model with its pipeline design, and we only have to tune the model for our company specifics. Taking the competitor example, not only can we figure out whether people are unhappy with what they got from the competitor (and maybe offer them a discount if they try our product), but we can also extract and analyze certain words that we might want to think about in regard to our own product. If we use rule-based matching to extract phrases like “cheap”, “low quality”, or “fell apart”, we can evaluate the business decisions we are making and make improvements.
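The phrase extraction step can be sketched with spaCy v3's `PhraseMatcher`, here on a blank English pipeline. This covers only the rule-based matching, not a trained sentiment model; the sample posts and the `find_complaints` helper are hypothetical.

```python
import spacy
from spacy.matcher import PhraseMatcher

nlp = spacy.blank("en")

# attr="LOWER" makes the phrase matching case-insensitive.
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
complaint_terms = ["cheap", "low quality", "fell apart"]
matcher.add("COMPLAINT", [nlp.make_doc(term) for term in complaint_terms])

def find_complaints(text):
    """Return the complaint phrases found in a post, in order of appearance."""
    doc = nlp.make_doc(text)
    matches = sorted(matcher(doc), key=lambda match: match[1])
    return [doc[start:end].text.lower() for _, start, end in matches]

# Made-up posts for illustration.
posts = [
    "Bought it last week and it already Fell Apart. Low quality all around.",
    "Great product, totally worth it",
]
for post in posts:
    print(find_complaints(post))
```

Counting these phrases per product or per competitor over time gives a simple signal to track alongside a trained classifier.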
spaCy also lets us tune our model to look for tweets or posts about our system or website, so we can respond to outages faster. Say we have a model looking for negative tweets that mention our support account: we can find website outages faster and make fixes for our customers. Imagine how nice it would be to get one of these alerts the second your users are frustrated: “system glitched and it used 2 email credits instead of one, WTF @companysupport. Disappointing - might have to make the switch!”. Not only would you be able to manually respond to this user and fix the error, but if you have also implemented a chatbot you could respond with that as well.
Finding negative posts from your competitors’ customers to improve your product quality isn’t the only way you can use this valuable spaCy model. Building a database of people who are unsatisfied with a competitor is a great way to find super warm leads for your product. These people are already using a product they need, and they are unhappy with their current situation. You can offer them a small discount and high-quality customer service to join your platform, and these potential customers convert at a very high rate. Adding rule-based matches such as “done with them” or “new product” can help you find even hotter leads that could be some of the easiest conversions you’ve ever had.
Named entity recognition (NER) is one of the most interesting out-of-the-box tools spaCy provides. The ability to recognize things like people, companies, prices, and products in text can be quite useful. spaCy-based tools like NeuroNER let us build very powerful systems using spaCy and neural networks.
spaCy’s NER capabilities are pretty good right out of the box, but with tuning on data specific to your business they can become incredibly accurate. Plenty of work has been done on extracting entities from more formal text like news articles or blog posts, but informal text like emails or text messages can be a little different. These types of text often don’t follow grammatical rules and can contain lots of errors, and they aren’t typically written for a large general audience the way news articles are.
Building a tool that automatically extracts things like email address, phone number, name, company, and prices from emails lets us automatically add new leads or prospects to our CRM and automate part of our sales flow that might otherwise be forgotten. Many data points, like discussed prices or product quantities, don’t automatically get saved in CRM systems. Deeper text analysis can also add a short one-liner that gives us context for what the emails are about. Trained on pretty basic email datasets with the addition of a dictionary in the training (to help generalize with more words), these models have reached over 90.7% accuracy, which lets you automate this part of your lead generation.
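A rule-based starting point for this kind of extraction can be sketched with spaCy v3's token attributes `LIKE_EMAIL` and `LIKE_NUM`. This is not the trained model described above; names and companies would need a trained NER component. The `extract_lead_fields` helper and the sample email are hypothetical.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# A single token that looks like an email address.
matcher.add("EMAIL", [[{"LIKE_EMAIL": True}]])
# A dollar sign followed by a number-like token, e.g. "$4,500".
matcher.add("PRICE", [[{"ORTH": "$"}, {"LIKE_NUM": True}]])

def extract_lead_fields(text):
    """Collect email addresses and dollar amounts found in an email body."""
    doc = nlp(text)
    fields = {"EMAIL": [], "PRICE": []}
    for match_id, start, end in matcher(doc):
        label = nlp.vocab.strings[match_id]
        fields[label].append(doc[start:end].text)
    return fields

# Made-up email text for illustration.
email_body = "Hi, you can reach me at jane@example.com about the $4,500 quote we discussed"
lead = extract_lead_fields(email_body)
print(lead)
```

The resulting dictionary could then be mapped onto whatever lead fields your own system expects.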
Building datasets based on competitors for research or for machine learning models can be a very long process. It means digging through product pages and recording prices, SKUs, personal names, and so much more. You can also scrape this data automatically, but it won’t be labeled: you’ll end up grabbing everything your web scraper sees and having to label things later.
Using NER, we can decide on the named entities (or labels) we care about and scrape just the important stuff. If we train our spaCy model on information relevant to what we want, we can automatically scrape everything we need, with the entities as the labels. This makes market research or model training extremely fast and lets us continue to grab data, which is almost always valuable when it’s niched down like this.
What's awesome about spaCy is that we can create our own custom entity labels when we tune an existing model. We can create that SKU label when we train, and now our web scraper will look for SKUs and understand what they look like. This can be very useful when we’re scraping very oddball things. For example, if we wanted to find the names of sports card sets (they use keys like N28 to describe the year and company), we can do that, even though the plain model wouldn’t know what those mean.
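As a lightweight illustration of a custom label, the sketch below uses spaCy v3's rule-based `entity_ruler` instead of training a statistical model, which is a simpler technique than the tuning described above. The `SKU` label and the regular expression (one capital letter followed by two or three digits) are assumptions made for this example.

```python
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# Hypothetical rule: a token made of one capital letter plus 2-3 digits is a SKU.
ruler.add_patterns([
    {"label": "SKU", "pattern": [{"TEXT": {"REGEX": r"^[A-Z]\d{2,3}$"}}]},
])

doc = nlp("Looking for cards from the N28 set and the N162 set")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

Rule-based entities like these can also be used to bootstrap labeled examples for training a statistical NER model on the same label later.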
Interested in seeing what NLP models can do for your startup or small business? Want to start collecting insights that give you ROI your competition would beg to have? Let's talk about how we can use AI and NLP to increase conversions and find potential high-value customers hiding in plain sight.