Improve Your Product Catalog Optimization with AI in 7 Easy Steps

Matt Payne · September 7, 2021

Optimizing your product data and product taxonomy is one of the best ways to help customers make informed decisions about your products. When a potential customer reaches a product page, the worst thing that can happen to your conversion rate is that they feel a disconnect between the product data they see (pictures, descriptions, model numbers, and sizes) and the information they need in order to trust the product.

Customers also want the product they have in mind to be easy to find in your catalog. One of the easiest ways to lose a customer searching for a standing desk is to fill their search results with regular desks, even if the titles are extremely close. Customers are rarely willing to search hard in your store before going to a competitor: 47% of users give up their search after just one attempt.

Let’s look at a few problems that lead to poorly optimized product catalogs and how you can leverage AI-based product catalog optimization to automatically perform product information management tasks with high accuracy.

Here are a few definitions to get us started.

What is Product Catalog Management?

Product catalog management is an internal process focused on organizing and ensuring product data is updated and accurate.

The goal is a system, standardized across all sales channels, that lets you sort, retrieve, and update your catalog by fields such as product title, description, price, SKU, and vendor, even when those fields vary widely across products or categories.

This becomes especially important for large ecommerce retailers with multiple vendors and for multi-seller marketplaces, because the input product data can vary quite a bit from seller to seller, which leads to issues when standardizing internally.
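As a rough illustration of what standardizing seller data can look like, here is a minimal Python sketch. The `CatalogRecord` schema, the `FIELD_ALIASES` table, and the vendor field names are all hypothetical and chosen only for the example; they are not taken from Pumice.ai or any particular PIM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogRecord:
    """Hypothetical internal schema shared by every sales channel."""
    sku: str
    title: str
    description: str
    price: float
    vendor: str


# Assumed examples of how different sellers might name the same field.
FIELD_ALIASES = {
    "sku": ("sku", "SKU", "item_id"),
    "title": ("title", "name", "product_name"),
    "description": ("description", "desc", "details"),
    "price": ("price", "unit_price", "cost"),
}


def normalize(raw: dict, vendor: str) -> CatalogRecord:
    """Map one vendor row onto the standard schema."""

    def pick(field: str) -> str:
        # Return the first non-empty value found under any known alias.
        for alias in FIELD_ALIASES[field]:
            value = str(raw.get(alias, "")).strip()
            if value:
                return value
        raise KeyError(f"{vendor}: missing required field '{field}'")

    price_text = pick("price").replace("$", "").replace(",", "")
    return CatalogRecord(
        sku=pick("sku").upper(),
        title=" ".join(pick("title").split()),  # collapse stray whitespace
        description=" ".join(pick("description").split()),
        price=float(price_text),
        vendor=vendor,
    )


row_a = {"SKU": "sd-100", "name": "  Standing   Desk ",
         "desc": "Adjustable height desk.", "unit_price": "$1,299.00"}
row_b = {"item_id": "SD-200", "title": "Office Desk",
         "details": "Fixed height desk.", "cost": 349}

records = [normalize(row_a, "Vendor A"), normalize(row_b, "Vendor B")]
for record in records:
    print(record)
```

Raising an error on a missing field, rather than silently filling a default, keeps incomplete seller data out of the standardized catalog until someone reviews it.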

What is Product Catalog Optimization?

Product catalog optimization is an important product information task focused on improving the shopping experience for potential customers, with the goal of raising conversion rates and buyer trust. Once you’ve captured the interest of potential customers you have to take advantage of it, especially when 79% of your visitors will head to a competitor’s site for the same product if they can’t easily find what they’re looking for.

The benefits of product catalog optimization go beyond on-site conversions and customer trust. Optimizing your catalog can also bring in more organic traffic through better search engine optimization (SEO) results. Only 0.78% of Google searchers click on a result from the second page, and the #1 ranking page gets 49% of all search traffic.

Here are a number of ways companies are currently optimizing their product catalog.

Categorize Products to Fit Product Taxonomy

Letting potential customers find the exact products they’re looking for in a search is one of the best ways to keep them from getting frustrated and leaving your site without buying. If a user is browsing a “Fresh Food” section looking for bread products that only exist in “Baking”, they won’t find what they want in a timely manner. The bread products may be categorized correctly, but the placement doesn’t align with buyer intent, and that leads to a reduction in sales.

A well-organized product taxonomy helps us structure the product catalog and its available categories in a way that leads to the least amount of “leakage” in buyer intent. Making sure the tree structure makes sense and placing products in the right categories are both ways to better structure the catalog. It’s no wonder that Forrester found that poorly architected sites sell 50% less than organized sites with a solid product taxonomy.
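To make the bread example concrete, here is a small sketch of a taxonomy tree as a nested dictionary, with a check for whether a category sits under the section where shoppers expect it. The tree contents and the helper names (`leaf_paths`, `find_paths`, `is_under`) are assumptions made up for illustration.

```python
# Hypothetical taxonomy: each key is a category, each value its children.
TAXONOMY = {
    "Grocery": {
        "Fresh Food": {"Produce": {}, "Dairy": {}},
        "Baking": {"Bread": {}, "Flour": {}},
    }
}


def leaf_paths(tree, prefix=()):
    """Yield every root-to-leaf path in a nested-dict taxonomy."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


def find_paths(tree, category):
    """Return all leaf paths that end in the given category name."""
    return [p for p in leaf_paths(tree) if p[-1].lower() == category.lower()]


def is_under(tree, category, expected_parent):
    """True if the category appears somewhere below expected_parent."""
    return any(expected_parent in path[:-1]
               for path in find_paths(tree, category))


print(find_paths(TAXONOMY, "Bread"))
print(is_under(TAXONOMY, "Bread", "Fresh Food"))
```

A `False` result for "Bread" under "Fresh Food" is the kind of mismatch between the tree and buyer intent described above; one possible response is to cross-list the category in both sections.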

Clean & Accurate Data

No potential buyer wants to view a product and see a title or model number that doesn’t match what they see in the image. They also don’t want to read a description that leaves them more confused because it’s too short or unclear. Earning the trust of your buyers is a key part of building a strong brand, and data integrity is the place to start.

Making sure your product data is complete and descriptive is a huge part of product catalog optimization. Customers want to see product titles and descriptions that include key information about the product and answer any questions they might have. Product images should show the full range of product capabilities, designs, and sizing information if necessary.
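A simple completeness audit is one way to catch these problems before a customer does. The sketch below is only an assumption about what such a check might look like: the required field names and the 20-word minimum are arbitrary placeholders, not recommended values.

```python
# Hypothetical required fields and threshold; adjust to your own catalog.
REQUIRED_FIELDS = ("title", "description", "model_number", "image_url")
MIN_DESCRIPTION_WORDS = 20


def audit_record(record: dict) -> list[str]:
    """Return a list of human-readable data quality issues for one product."""
    issues = []
    for field in REQUIRED_FIELDS:
        # Treat missing keys, None, and whitespace-only values as missing.
        if not str(record.get(field) or "").strip():
            issues.append(f"missing {field}")

    description = str(record.get("description") or "").strip()
    if description and len(description.split()) < MIN_DESCRIPTION_WORDS:
        issues.append("description too short")
    return issues


sample = {
    "title": "Standing Desk",
    "description": "Adjustable desk.",
    "model_number": "",
    "image_url": "https://example.com/desk.jpg",
}
print(audit_record(sample))
```

A check like this only covers completeness. Whether a title or model number actually matches the product image still needs a human reviewer or an image-aware model.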

SEO Optimized Product Data

Product descriptions and titles are two key parts of how search engines evaluate a product page’s relevance to keywords. Removing duplicate or overlapping content across products helps ensure Google ranks you in front of your target audience.
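One way to find duplicate or overlapping descriptions is to compare every pair of products and flag the ones that are nearly identical. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the 0.9 threshold and the sample SKUs are assumptions for the example, and this says nothing about how a search engine itself measures duplication.

```python
from difflib import SequenceMatcher
from itertools import combinations


def overlapping_descriptions(products: dict[str, str], threshold: float = 0.9):
    """Return (sku_a, sku_b, ratio) for description pairs at or above threshold."""
    flagged = []
    for (sku_a, text_a), (sku_b, text_b) in combinations(products.items(), 2):
        ratio = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        if ratio >= threshold:
            flagged.append((sku_a, sku_b, round(ratio, 2)))
    return flagged


descriptions = {
    "SD-100": "Adjustable standing desk with electric lift and memory presets.",
    "SD-101": "Adjustable standing desk with electric lift and memory presets.",
    "WB-300": "Insulated stainless steel water bottle, 750 ml.",
}
flagged = overlapping_descriptions(descriptions)
print(flagged)
```

Comparing every pair grows quadratically with catalog size, so for a large catalog you would likely compare only products within the same category.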

The Problem with Current Product Catalog Optimization (Manual)

The standard process of product catalog optimization comes with a number of drawbacks that get worse as the complexity of your online retail business grows. Most of them stem from the busy work and manual processes behind many of the tasks that make for successful product cataloging.

The required human effort

Constantly optimizing and updating your product catalog and taxonomy requires a level of manual effort that normally needs full teams devoted to the various tasks. These industry-specific tasks require a level of business understanding that makes the roles much more expensive to hire for. Employees need a strong understanding of product categorization, quality product data, and even SEO to complete these tasks efficiently and effectively.

Costs rise as your company grows

As you grow your product catalog and expand your categories, these optimization tasks become more challenging and take more time. Optimizing the catalog system for 5,000 products and 100 categories is far easier than for 10,000 products and 200 categories. The increased volume means a larger team, and the increased complexity means more time per task.


Requires an intersection of different teams

Catalog optimization requires knowledge of many different marketing strategies and conversion focused user experience design. The intersection of these different skills often requires different teams to work together to constantly keep up with the catalog optimization process. 

Pulls team members away from other tasks

Many small ecommerce companies don’t have dedicated teams for product cataloging and taxonomy. Key team members get pulled away from core business processes and into these disruptive tasks. By the time a team member clears dedicated time for product cataloging, the issues might have gone on for too long.

What if we could reduce the resources required for product catalog optimization and solve these issues? Let’s look at how AI can be used.

Automatically Optimize Your Catalog with AI in 7 Steps

AI and machine learning (ML) allow us to automate a huge portion of the manual tasks required for product catalog and taxonomy optimization. These ML algorithms learn relationships between product data, product categories, and high-converting product data fields to automate many of the above tasks. Many of these tasks can be accomplished with 95% accuracy.

The reduction of manual labor at scale is the largest benefit of automation. Many of these AI catalog optimization tasks take just a few seconds to run, whereas a human operator might take a few minutes. Catalogs with 5,000 or 50,000 products take nearly the same amount of time to process and no longer take days to complete. Added complexity no longer exponentially increases the hours required either: fitting a taxonomy tree that is 5 layers deep isn’t noticeably slower than fitting one that is 3 layers deep, whereas human effort increases with the complexity.

We’ve built a product information enhancement tool called Pumice.ai that lets you automate product data processes with AI and complete product catalog optimization 15x faster than manual effort. Let’s look at how you can get started in just a few steps.

Step 1. Gather Product Data

First, gather any product information required for the catalog optimization task. Pumice.ai processes require only a product title and description, but they can leverage other fields such as price and metadata to increase accuracy. Depending on the process you want to complete, you might need one of these combinations of product information:

1. Relevant product data that is already optimized for the catalog task, and new product records that need to be optimized.

2. Product data alongside the corresponding metadata and categories.

3. Only the new product records needed for the given task.
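Whichever combination you need, it helps to confirm that every record carries at least a title and a description before moving on. Here is a minimal sketch using Python's standard `csv` module; the column names and sample rows are assumptions for the example, not a required Pumice.ai format.

```python
import csv
import io

# Assumed column names; the only hard requirement described above is that
# each product has a title and a description.
REQUIRED_COLUMNS = {"title", "description"}


def load_products(csv_text: str) -> list[dict]:
    """Parse CSV text and keep only rows with a title and a description."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")

    rows = []
    for row in reader:
        # Strip whitespace; short rows yield None values, which become "".
        clean = {key: (value or "").strip()
                 for key, value in row.items() if key is not None}
        if clean["title"] and clean["description"]:
            rows.append(clean)
    return rows


SAMPLE = """title,description,price
Standing Desk,Height-adjustable desk with electric lift,499.00
,Row with no title is skipped,10.00
Office Chair,Ergonomic chair with lumbar support,
"""

products = load_products(SAMPLE)
print(len(products), "usable records")
```

Optional fields such as price are kept when present and left empty otherwise, since they only serve to improve accuracy.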

Step 2. Gather Product Catalog Data

Alongside product information, many catalog optimization tasks require data about categories, metadata, and sales channels. This data often serves as the “goal” side of an operation, the target that product information is matched against. How much of it you need depends on which of the pipelines below you use.

Step 3. Upload to Pumice.ai or Connect APIs

Pumice.ai allows you to add your data to our platform through an API connection to the pipelines or via a CSV upload. You’ll be asked to put your data into the format accepted by the endpoints you’re using. Enterprise customers with custom integrations can use their own upload types in the dashboard, such as direct database access and process automation.

Now that we’ve got our data set up, let’s take a look at a few of the baseline operations available in Pumice.ai. These endpoints can be used as building blocks for different catalog processes when combined. We also build other models into the dashboard for customers seeking the use of other tools Width.ai has built as custom products.

Step 4. Automatically Fit Your Products To Categories and Taxonomy Trees

Our dynamic API endpoint automatically fits product data into a given taxonomy tree or category list without needing existing matches. This machine learning pipeline finds the best fit for the given data without being trained specifically on a given taxonomy tree or set of categories. That lets you quickly change your available categories or try new trees without the hassle of collecting data, labeling, and retraining an entire pipeline.

Best fit product categories and product tags based on product data


The dynamic API endpoint is trained on millions of relationships between product information and product categories to learn the underlying relationship between the two. This best-fit behavior greatly reduces manual work, because a change to the tree structure can otherwise require refitting the entire product catalog.
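To show the shape of a best-fit operation, here is a deliberately simple stand-in that scores each category path by word overlap with the product text. This is not how the Pumice.ai endpoint works (that is a trained ML pipeline); the function names, the naive plural stripping, and the sample categories are all assumptions for illustration.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word set with a naive plural strip (desks -> desk)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words}


def best_fit(title: str, description: str,
             category_paths: list[str]) -> tuple[str, float]:
    """Return the category path sharing the most words with the product."""
    if not category_paths:
        raise ValueError("category_paths must not be empty")
    product = tokens(f"{title} {description}")
    scored = []
    for path in category_paths:
        category = tokens(path)
        # Fraction of the category's words that appear in the product text.
        score = len(product & category) / len(category) if category else 0.0
        scored.append((score, path))
    score, path = max(scored)
    return path, score


CATEGORIES = [
    "Furniture > Desks > Standing Desks",
    "Furniture > Desks > Writing Desks",
    "Furniture > Chairs > Office Chairs",
]

path, score = best_fit(
    "Electric Standing Desk",
    "Height adjustable desk with memory presets.",
    CATEGORIES,
)
print(path, round(score, 2))
```

Because nothing here is trained on a specific tree, swapping in a new `CATEGORIES` list needs no relabeling or retraining, which is the property the paragraph above describes. A learned model replaces the word-overlap score with something far more robust.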

Step 5. Compare Product Data for Similarity

Product similarity is a powerful endpoint that compares two product records to each other for underlying similarity. This information can be used alongside categories, taxonomy, or metadata to understand how to process new product records into your catalog. Unlike with the dynamic API endpoint above, you can best fit products on the fly based on how your current product information is already fitted.
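As a simplified picture of comparing two product records, the sketch below computes cosine similarity over word counts from each product's title and description. It is a basic bag-of-words stand-in, not the NLP model behind the endpoint, and the function names and sample products are assumptions for the example.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def product_similarity(p1: dict, p2: dict) -> float:
    """Compare two product records on title plus description."""
    text_1 = f"{p1['title']} {p1['description']}"
    text_2 = f"{p2['title']} {p2['description']}"
    return cosine_similarity(vectorize(text_1), vectorize(text_2))


desk_a = {"title": "Electric Standing Desk",
          "description": "Height adjustable standing desk"}
desk_b = {"title": "Manual Standing Desk",
          "description": "Crank adjustable standing desk"}
bottle = {"title": "Water Bottle",
          "description": "Stainless steel bottle"}

print(round(product_similarity(desk_a, desk_b), 2))
print(round(product_similarity(desk_a, bottle), 2))
```

One way to use a score like this is to give a new product the category of its most similar existing product, so placement follows how the catalog is already organized.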

Step 6. Generate Metadata

GPT-3 generated product tags

The non-dynamic API endpoint is a generative natural language processing (NLP) pipeline that generates relevant product categories and metadata when given a product title and description. The models are trained on data from ecommerce industry leaders, which allows you to leverage the same mappings their taxonomists use!

Step 7. Integrate New Data into Your Product Catalog Management Tool

Once you’ve generated the data for your catalog optimization process, you can integrate it back into your PIM or centralized product data repository manually or through a custom integration. These custom integrations add another level of business automation and remove more human-in-the-loop steps. Our most popular integrations are:

- Shopify

- WooCommerce

- Google Sheets

- Jasper PIM

- Salsify

- Sales Layer PIM

- Your centralized product data storage system
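If you integrate manually, the core of the work is merging generated fields back into existing records by SKU. The sketch below is a generic illustration with hypothetical field names and data; it does not use any of the integrations listed above. By default it fills only empty fields, so values your team has already curated are not overwritten.

```python
def merge_generated(catalog: dict[str, dict], generated: dict[str, dict],
                    overwrite: bool = False) -> dict[str, dict]:
    """Return a copy of the catalog with generated fields merged in by SKU."""
    merged = {sku: dict(record) for sku, record in catalog.items()}
    for sku, fields in generated.items():
        if sku not in merged:
            continue  # unknown SKU: leave it for manual review
        for key, value in fields.items():
            # Fill empty fields; replace existing ones only when asked to.
            if overwrite or not merged[sku].get(key):
                merged[sku][key] = value
    return merged


catalog = {
    "SD-100": {"title": "Standing Desk", "category": "", "tags": []},
    "OC-200": {"title": "Office Chair", "category": "Furniture > Chairs",
               "tags": ["ergonomic"]},
}
generated = {
    "SD-100": {"category": "Furniture > Desks",
               "tags": ["standing", "adjustable"]},
    "OC-200": {"category": "Furniture > Seating"},
    "XX-999": {"category": "Unknown"},
}

merged = merge_generated(catalog, generated)
for sku, record in merged.items():
    print(sku, record)
```

Returning a new dictionary instead of editing the catalog in place makes it easy to review the differences before pushing anything to your PIM.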

Start automating your product catalog management today

Pumice.ai is a PIM enhancement platform that leverages our proven NLP and AI tools to automate product-related backend processes in one easy dashboard. To take the baseline performance even further, you can leverage the development services of Width.ai to build custom integrations and models directly into the platform. Contact us to learn more about how you can start using our AI models today!