Product Data Wrangles

The digital bottom line: Customers only buy products that are easy to discover and evaluate. As such, rich, reliable product data is foundational to digital channels. But producing “digital ready” product data is difficult, especially on an industrial scale.

Ecommerce and product marketing require excellent product data.


Our Product Data Wrangles automate data cleaning and enrichment so that you can meet this new standard. Together, we can make product data work effectively across all of your channels.

Onboard New Product Data

Product data often comes from suppliers and customers. Efficiently onboarding this data is critical to product information management. Machine learning and automation can map poorly formatted files to your standard schema with little to no manual intervention.

Clean and Enrich Attributes

Product data is mostly generated via data entry. As such, it’s rife with inconsistencies and errors.  Cleaning and enriching corrects errors, standardizes attributes and values to your vocabulary, and fills in missing data via classification and cross-referencing.

Automate the Entire Process

Data work at scale is only possible with machine learning and automation. Legacy automation approaches rely on human-specified rules. Now, semantic machine learning techniques such as Natural Language Processing (NLP) can automate product data work at any volume. In fact, the more data the better!

The Solution – Product Data Wrangles

Product Data Wrangles are API-based capabilities that can be tailored to your business and deployed within days. Our Wrangles plug easily into any data onboarding workflow, cleaning workflow, or ecommerce system. They can even be used inside of a spreadsheet! All of our deployment options are designed to quickly enhance data richness and reliability with minimal changes to your processes.

Extract and Standardize

Product data cleaning starts with extraction and standardization. Extraction means separating out individual pieces of information that often end up in overloaded description fields. Standardization is the translation of inconsistent labels, terms and categories into your managed corporate vocabulary and standards. Our Wrangles “mine and align” all available information from raw data, preparing it for more advanced enrichment.
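
To make the two steps concrete, here is a minimal sketch in Python. It is a toy illustration only, not how the Wrangles work internally: it uses a regular expression and small lookup tables where a production system would use trained models. The function name, the field names, and both vocabularies are hypothetical.

```python
import re

# Hypothetical managed vocabulary: maps inconsistent terms to standard values.
UNIT_VOCAB = {
    "in": "in", "inch": "in", "inches": "in", '"': "in",
    "mm": "mm", "millimeter": "mm", "millimeters": "mm",
}
MATERIAL_VOCAB = {
    "ss": "Stainless Steel",
    "stainless": "Stainless Steel",
    "stainless steel": "Stainless Steel",
    "brass": "Brass",
}

# A number followed by a unit; longer unit spellings are listed first.
SIZE_PATTERN = re.compile(
    r'(\d+(?:\.\d+)?)\s*(inches|inch|in|millimeters|millimeter|mm|")(?![a-z])'
)


def extract_and_standardize(description: str) -> dict:
    """Pull size and material out of a free-text description (toy example)."""
    text = description.lower()
    result = {}

    # Extraction: separate the size from the overloaded description field.
    size = SIZE_PATTERN.search(text)
    if size:
        result["size_value"] = float(size.group(1))
        # Standardization: translate the unit into the managed vocabulary.
        result["size_unit"] = UNIT_VOCAB[size.group(2)]

    # Check longer material terms first so "stainless steel" wins over "ss".
    for term in sorted(MATERIAL_VOCAB, key=len, reverse=True):
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            result["material"] = MATERIAL_VOCAB[term]
            break

    return result
```

For example, calling `extract_and_standardize("Ball Valve SS 2 inch")` separates the size and material from the description and reports them in the standard vocabulary.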

Classify and Categorize

Product classification is more important than ever. It’s crucial to search, it drives analytical insights, and it is compulsory for logistics. As product lines expand and product experts retire, automation is the only way to excel at categorization. Classification is a strength of Natural Language Processing, and our Wrangles show it. They can classify your products into any number of hierarchical and flat schemas. Training models on your specific products and categories is fast and extremely effective. In fact, our models consistently outperform status quo approaches at a small fraction of the time and expense.

Match Related Products

Product similarity is extremely valuable but difficult to assess accurately. Our Matching Wrangles use semantic information and numerical properties to quantify similarity. Similarity scores enable deduping, upsell/cross-sell tagging, SKU spec matching, and item cross-referencing at accuracy levels far beyond rules and “fuzzy” matching.
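
As a rough illustration of combining text and numerical properties into one similarity score, consider the Python sketch below. It is not the Matching Wrangles' method: token overlap stands in for real semantic comparison, and the function names, record layout, and `text_weight` parameter are all assumptions made for the example.

```python
def token_similarity(a: str, b: str) -> float:
    """Jaccard overlap of word tokens; a crude stand-in for semantic similarity."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def numeric_similarity(x: float, y: float) -> float:
    """1.0 for equal values, falling toward 0.0 as the relative gap grows."""
    if x == y:
        return 1.0
    return max(0.0, 1.0 - abs(x - y) / max(abs(x), abs(y)))


def product_similarity(p: dict, q: dict, text_weight: float = 0.5) -> float:
    """Blend name similarity with the average similarity of shared numeric specs."""
    text_score = token_similarity(p["name"], q["name"])
    shared = [key for key in p["specs"] if key in q["specs"]]
    if not shared:
        # No comparable numeric properties: fall back to text alone.
        return text_score
    numeric_score = sum(
        numeric_similarity(p["specs"][key], q["specs"][key]) for key in shared
    ) / len(shared)
    return text_weight * text_score + (1.0 - text_weight) * numeric_score


# Hypothetical product records.
valve_a = {"name": "steel ball valve", "specs": {"size_in": 2.0}}
valve_b = {"name": "steel ball valve", "specs": {"size_in": 2.0}}
valve_c = {"name": "brass gate valve", "specs": {"size_in": 4.0}}

duplicate_score = product_similarity(valve_a, valve_b)
related_score = product_similarity(valve_a, valve_c)
```

A score near 1.0 flags a likely duplicate, while mid-range scores can feed cross-reference or cross-sell review. Where to set those thresholds depends on your data.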

Product Information Management (PIM) Integration

One of the biggest challenges to PIM adoption is mapping disparate inbound data to a specific PIM configuration. Our Wrangles produce PIM-ready product data mapped to your specific PIM schema. Once the data is prepared, upserting it via files or APIs is a breeze.

Contact us today. And if you are a distributor, please see how these Wrangles address key distributor data requirements.