Product Data Glossary: 50+ Terms Defined

This product data glossary covers the 50+ terms you will meet most often in product information management, ecommerce, and supplier data work.

Ben Adams

Founder

Each entry has a one-sentence definition followed by a short explanation. Where a term has a dedicated article on this site, the entry links through. The glossary is updated twice a year as new terms become standard.

A

API (Application Programming Interface)

Defined interface for two software systems to communicate, exchange data, and trigger actions in each other.

For product data, APIs are how a PIM pushes records to a commerce platform, how a supplier feed updates pricing in real time, or how a marketplace pulls listings on demand. APIs are more reliable than file-based feeds because they handle errors, retries, and authentication in a standard way. Most modern ecommerce integrations are API-first; older systems still depend on scheduled feeds. See the integrations overview.

Attribute

A specific property of a product, such as colour, size, voltage, or material.

Attributes are the building blocks of product data. A drill might have attributes for voltage, chuck size, weight, and battery type. Some attributes are universal across all products (brand, weight); others are category-specific (chuck size for drills, thread count for sheets). A good catalogue uses a consistent set of attributes per category, so customers can filter and compare products on the same terms.

Attribute extraction

The process of pulling structured attribute values out of unstructured sources like PDFs, images, or websites.

A spec sheet might say "12V Lithium-Ion, 1.5Ah, 0.9kg" in a single sentence. Attribute extraction reads that and returns voltage=12V, battery type=Lithium-Ion, capacity=1.5Ah, weight=0.9kg. Modern AI extraction handles formats that older OCR-plus-regex approaches struggle with. See the overview of AI product data extraction.
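To make the target output concrete, here is a minimal sketch of the older rules-based approach applied to the spec line above. The pattern table, field names, and the `extract_attributes` helper are all hypothetical, and this is a regex illustration rather than the AI extraction the entry describes; it only works because the input is short and well behaved.

```python
import re

# Hypothetical patterns for one tidy spec line; real datasheets vary far more.
PATTERNS = {
    "voltage": re.compile(r"(\d+(?:\.\d+)?)\s*V\b"),
    "capacity": re.compile(r"(\d+(?:\.\d+)?)\s*Ah\b"),
    "weight": re.compile(r"(\d+(?:\.\d+)?)\s*kg\b"),
    "battery_type": re.compile(r"(Lithium-Ion|NiCd|NiMH)", re.IGNORECASE),
}
# Units re-attached to numeric matches; battery_type has no unit.
UNITS = {"voltage": "V", "capacity": "Ah", "weight": "kg"}


def extract_attributes(text: str) -> dict:
    """Return the first match for each known attribute, if any."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(1) + UNITS.get(name, "")
    return found


attrs = extract_attributes("12V Lithium-Ion, 1.5Ah, 0.9kg")
```

Every new unit, spelling, or layout needs another pattern, which is the maintenance burden that pushes teams towards model-based extraction.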

Attribute normalisation

Standardising attribute values so equivalent data is represented in a single accepted form.

One supplier writes "12V", another writes "12 volts", a third writes "twelve volt". Without normalisation, your filters break and your duplicate detection fails. Normalisation maps every variant to one canonical value, usually defined in the taxonomy. It is one of the least glamorous and most valuable jobs in product data enrichment.
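A small sketch of what that mapping can look like for the voltage variants above. The `normalise_voltage` function and the word-to-number table are assumptions for illustration; a production rule set would normally live alongside the taxonomy and cover many more cases.

```python
import re
from typing import Optional

# Hypothetical lookup for spelled-out numbers; extend as needed.
WORD_NUMBERS = {"twelve": "12", "eighteen": "18"}

VOLTAGE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*-?\s*(?:v|volt|volts)")


def normalise_voltage(raw: str) -> Optional[str]:
    """Map supplier spellings to one canonical form such as '12V'."""
    text = raw.strip().lower()
    for word, digits in WORD_NUMBERS.items():
        text = text.replace(word, digits)
    match = VOLTAGE_PATTERN.fullmatch(text)
    # Unrecognised values return None so they can be routed to review.
    return f"{match.group(1)}V" if match else None


variants = ["12V", "12 volts", "twelve volt", "12-volt", "12 V"]
canonical = {normalise_voltage(v) for v in variants}
```

Returning `None` for anything unrecognised, rather than guessing, keeps bad values out of the filters and gives reviewers a clear queue.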

B

BMEcat

A German-origin XML standard for exchanging product catalogues between suppliers and buyers, common in industrial and B2B distribution.

BMEcat structures catalogue data in a predictable way: products, categories, prices, attributes, and media each have defined fields. Many European distributors require their suppliers to send BMEcat files. Reading a BMEcat file is straightforward; producing one from messy source data is harder, which is where extraction tooling earns its place.

Bulk onboarding

Loading large numbers of supplier products into a system in a single batch, rather than one record at a time.

Bulk onboarding usually means hundreds or thousands of SKUs at once, often from a spreadsheet, BMEcat file, or supplier feed. The challenge is rarely the loading itself: it is making sure the records arrive with consistent attributes, valid categories, and complete data. See the guide to onboarding suppliers faster.

C

Catalogue completeness

The percentage of required attributes that have valid values across your product catalogue.

If a category needs 30 attributes per product and only 24 are filled in, that category is 80% complete. Most teams measure completeness per category and per channel, because Shopify might require fewer fields than a Mirakl marketplace. Completeness is the single most useful data quality KPI, more practical than abstract "data quality" measures.
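The arithmetic is simple enough to sketch. The `completeness` function and the attribute names below are hypothetical, and the sketch treats any non-empty value as filled; a stricter version would also check each value against validation rules before counting it.

```python
def completeness(record: dict, required: list) -> float:
    """Percentage of required attributes that have a non-empty value."""
    filled = sum(1 for name in required if record.get(name) not in (None, ""))
    return 100 * filled / len(required)


# The example from the text: 30 required attributes, 24 filled in.
required = [f"attr_{i}" for i in range(1, 31)]
record = {name: "value" for name in required[:24]}
score = completeness(record, required)  # 80.0
```

Running the same function with a different `required` list per channel gives the per-channel view described above.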

Category

A grouping of related products within your taxonomy, typically arranged in a hierarchy.

"Power tools > Cordless drills > Combi drills" is a three-level category path. Categories drive site navigation, filters, and reporting. Most catalogues struggle when the same product fits several categories, or when supplier categories do not match your own. Mapping supplier categories to your own taxonomy is part of product data management for distributors.

Channel

A destination where your product data is published, such as Shopify, Amazon, eBay, a marketplace, or a B2B portal.

Each channel has its own rules: required fields, image dimensions, taxonomy, character limits. A product record that is complete for Shopify may fail Amazon validation. Channel-specific feeds let you keep one master record and tailor outputs per channel, rather than maintaining separate catalogues for each.

Classification

Assigning a product to a category in a taxonomy or standard.

Classification can be manual (a person picks the category) or automated (a model suggests a category based on the product’s name and attributes). Standards like ETIM and UNSPSC define the categories themselves, so classification often means mapping a product to an external code as well as your internal taxonomy.

Cleansing (data)

Fixing errors, inconsistencies, and gaps in product data.

Cleansing covers a wide range of work: removing duplicates, correcting typos, normalising units, filling missing fields, stripping HTML from descriptions. It is tedious manual work, and slow at scale. Most teams now combine rules-based cleansing for predictable issues with AI for harder cases.

Content generation

Producing product descriptions, titles, marketing copy, or specifications, often with AI.

Content generation uses a product’s structured attributes plus a brand voice prompt to draft descriptions, bullet points, or feature lists. Done well, it produces consistent copy at scale. Done badly, it produces generic AI text that all reads the same. See AI content generation.

Country of origin

A field on a product record stating the country where the product was manufactured.

Country of origin is required for customs declarations, tariff classifications, and many marketplace listings. The value should be a recognised country code (typically ISO 3166-1) rather than free text. Suppliers often supply it inconsistently ("Made in China", "PRC", "China", "CN" all appearing for the same goods), which makes it a frequent normalisation target.

D

DAM (Digital Asset Management)

A system for storing, organising, and distributing media files like images, videos, and documents.

DAMs handle the binary side of product data: photos, swatches, lifestyle shots, video, and PDFs. They track metadata, control access, and feed images to channels. A DAM is not the same as a PIM. Some platforms combine the two, but most retailers run a separate DAM. See PIM vs DAM vs MDM.

Data governance

The set of rules, ownership, and processes that control how product data is created, updated, and approved.

Governance answers practical questions: who owns the colour attribute? Who can change a category? Who signs off a new SKU? Without governance, two teams update the same record in different ways and the catalogue rots. Governance documents tend to be boring, which is why projects skip them and then regret it later.

Data quality

A measure of how fit-for-purpose your product data is, usually broken into completeness, accuracy, consistency, and timeliness.

Data quality is not one number. A catalogue might be 95% complete but full of inaccurate values, or 100% accurate but missing half the required fields. Most teams pick three or four sub-metrics and track each separately. See product data quality.

Datasheet

A document, usually PDF, that lists a product's specifications, dimensions, and technical properties.

Datasheets are the canonical source of truth for technical attributes. They are also where most enrichment work starts: the manufacturer's datasheet contains the values that need to end up in your PIM. Pulling that data out reliably is one of the harder problems in product data extraction.

Digital shelf

The aggregate of all the places customers can find and buy your products online.

The digital shelf includes your own site, marketplaces, retailers’ sites, and search results. "Digital shelf analytics" tools track how your product appears across these places: ranking, content quality, share of search, price competitiveness. The phrase is more common in CPG and consumer brands than in B2B distribution.

DTC

Direct-to-consumer, a sales model where a brand sells to end customers without going through a retailer or distributor.

DTC brands typically run on platforms like Shopify and own the customer relationship end to end. Their product data needs differ from those of B2B distributors: fewer SKUs, more lifestyle imagery, more emphasis on brand voice in copy, less reliance on technical standards.

E

EAN (European Article Number)

A 13-digit barcode standard managed by GS1, used to identify retail products across most of the world.

EAN and UPC (the 12-digit US equivalent) are both subsets of GTIN. The terms are often used interchangeably in conversation, but a system that imports product data needs to handle each format correctly. Most retailers and marketplaces require an EAN or GTIN to list a product.

ERP (Enterprise Resource Planning)

A business system that manages finance, inventory, purchasing, and operations for an organisation.

ERPs (SAP, NetSuite, Microsoft Dynamics, Sage) usually hold the master record for cost, stock level, and supplier. Product data work often starts with extracting basic SKU and pricing data from the ERP and then enriching it elsewhere, because ERPs are not designed to hold marketing content or media. See the integrations overview.

ETIM

A classification standard for technical products, widely used in electrical, plumbing, HVAC, and construction supply.

ETIM defines product classes (for example, EC000123 for circular saw) and the features expected for each class. It is maintained by ETIM International. Major distributors require supplier data in ETIM format. See what is ETIM and the ETIM classification overview.

F

Faceted search

A search interface where customers narrow results using filters on attributes like price, brand, voltage, or material.

Faceted search relies on having clean, consistent attribute values. A "voltage" filter only works if voltages are normalised. Sites with poor data either have no filters or have filters with hundreds of duplicate values ("12V", "12 volts", "12-volt"). See faceted search for distributors.

Feed

A structured file (CSV, XML, JSON) sent from one system to another to update product data.

A daily Google Shopping feed, a weekly supplier price feed, an hourly stock feed: these are all common patterns. Feeds are fast and well understood, but brittle. A column added by the supplier without warning breaks the import. Most teams move from feeds to APIs as their integration matures.
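One cheap defence against that brittleness is to check the header row before importing anything. The sketch below assumes a CSV feed and a hypothetical agreed column list (`EXPECTED_COLUMNS`); it reports differences rather than failing silently.

```python
import csv
import io

# Hypothetical column layout agreed with the supplier.
EXPECTED_COLUMNS = ["sku", "price", "stock"]


def check_header(feed_text: str) -> dict:
    """Compare the feed's header row with the agreed columns."""
    header = next(csv.reader(io.StringIO(feed_text)), [])
    return {
        "missing": [c for c in EXPECTED_COLUMNS if c not in header],
        "unexpected": [c for c in header if c not in EXPECTED_COLUMNS],
    }


# A supplier adds a column without warning.
report = check_header("sku,price,promo_flag,stock\nA1,9.99,Y,4\n")
```

Reading rows by column name (for example with `csv.DictReader`) rather than by position also means an extra column like `promo_flag` does not shift the others.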

Fitment data

Information about which vehicles, machines, or equipment a part fits, common in automotive and industrial supply.

A brake pad might fit 240 specific car models across three years. Fitment data is structured (year, make, model, trim, engine), volumes are large, and accuracy is essential. Customers shop primarily by fitment ("does this fit my car?"), so bad fitment data is a direct sales killer.

G

GDSN (Global Data Synchronisation Network)

A network of certified data pools that lets trading partners share standardised product data, governed by GS1.

GDSN is the backbone of supplier-to-retailer data exchange in grocery, pharmacy, and some hardlines categories. A supplier publishes their data once to a data pool, and authorised retailers pull it down. GDSN solves part of the supplier data problem but only for industries that have adopted it.

GS1

A global standards body that maintains barcodes, GTINs, and product identification standards.

GS1 issues GTINs (Global Trade Item Numbers, often called barcodes), defines GDSN for sharing product data between trading partners, and runs national member organisations in most countries. If you sell through retailers or many marketplaces, you will need GS1-compliant identifiers.

GTIN (Global Trade Item Number)

A globally unique product identifier defined by GS1, covering the UPC, EAN, and JAN formats as well as ISBNs encoded as 13-digit barcodes.

GTINs come in 8, 12, 13, and 14-digit lengths depending on the format. Marketplaces, search engines, and retailers use the GTIN as the universal product key. Missing or incorrect GTINs are one of the most common reasons product feeds fail validation.
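Many of those validation failures can be caught before a feed is sent, because the last digit of a GTIN is a check digit. The sketch below uses the standard GS1 mod-10 calculation (weights of 3 and 1 alternating from the right of the body); the function name `is_valid_gtin` is an assumption. It confirms a code is well formed, not that it has actually been issued to a product.

```python
def is_valid_gtin(code: str) -> bool:
    """Check length and GS1 mod-10 check digit for a GTIN-8/12/13/14."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    # Working from the right of the body, weights alternate 3, 1, 3, 1, ...
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


ean_ok = is_valid_gtin("4006381333931")   # commonly cited EAN-13 example
upc_ok = is_valid_gtin("036000291452")    # commonly cited UPC-A example
typo = is_valid_gtin("4006381333932")     # last digit changed
```

Store GTINs as text, not numbers: spreadsheet tools that treat them as integers drop leading zeros and turn a valid UPC into an 11-digit failure.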

H

Hallucination (in AI)

An AI model producing a confident answer that is factually wrong.

In product data, hallucinations look like invented spec values, made-up part numbers, or descriptions that mention features the product does not have. Good extraction systems include verification steps to catch these: confidence scores, source citations, and human review of low-confidence outputs. See product data extraction accuracy.

HS code (Harmonised System code)

A six-to-ten-digit code that classifies products for customs and international trade.

HS codes are required on shipping documentation and customs declarations. The first six digits are standardised globally; countries add further digits for tariff specifics. International distributors and exporters need accurate HS codes per SKU, and getting them wrong means duties, delays, and penalties.
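Because the first six digits are shared internationally, it can help to store the code in parts. The sketch below is illustrative only: `split_hs_code` is a hypothetical helper, and the sample code is a placeholder rather than a real classification.

```python
def split_hs_code(code: str) -> dict:
    """Split an HS-based code into its global and national parts."""
    digits = code.replace(".", "").replace(" ", "")
    if not (digits.isascii() and digits.isdigit() and 6 <= len(digits) <= 10):
        raise ValueError(f"expected 6 to 10 digits, got {code!r}")
    return {
        "chapter": digits[:2],          # first two digits
        "heading": digits[:4],          # first four digits
        "subheading": digits[:6],       # the globally standardised part
        "national_suffix": digits[6:],  # country-specific digits, may be empty
    }


parts = split_hs_code("1234.56.78.90")  # placeholder digits, not a real code
```

Keeping the six-digit subheading separate makes it easier to reuse one classification across destination countries and add the national digits per market.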

Headless commerce

A commerce architecture where the front-end (storefront) is decoupled from the back-end (commerce engine, PIM, OMS).

Headless setups let teams build custom storefronts and channels while keeping the commerce logic central. Product data flows through APIs rather than being tied to a specific theme or template. Headless is the H in MACH architecture.

Hero image

The main, primary image shown on a product detail page or listing.

Hero images are usually the highest-quality shot, taken on a clean background, showing the full product. Standards vary: some marketplaces require a pure white background; some require a specific aspect ratio. Getting the hero image right matters more than getting the rest of the gallery right.

I

Image metadata

The structured information attached to a product image, such as alt text, file name, dimensions, and what the image shows.

Image metadata drives accessibility (alt text), SEO (file names, alt tags), and channel rules (some marketplaces reject images without specific metadata). Most catalogues have some metadata for some images and nothing consistent across the board.

Integration

A connection between two systems that lets data move automatically between them.

Integrations may be API-based (real-time), feed-based (scheduled), or webhook-based (event-driven). For product data, common integrations connect a PIM to a commerce platform, a DAM to a PIM, or a supplier portal to an ERP. See the integrations overview.

J

JSON-LD

A format for embedding structured data in web pages, used by Google to power rich search results.

JSON-LD blocks describe a page's content in machine-readable form (Product, Offer, AggregateRating, Review) using the schema.org vocabulary. Search engines read these blocks to display rich snippets like star ratings, prices, and stock status in search results. Good product data feeds JSON-LD directly.
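As a rough illustration of product data feeding JSON-LD, the sketch below builds a schema.org Product block with a nested Offer from a product record. The record's field names and the `product_jsonld` helper are hypothetical; the schema.org property names (`name`, `sku`, `gtin13`, `offers`, `price`, `priceCurrency`, `availability`) are the real vocabulary.

```python
import json


def product_jsonld(record: dict) -> str:
    """Render a product record as a schema.org Product JSON-LD string."""
    availability = "InStock" if record["in_stock"] else "OutOfStock"
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": record["name"],
        "sku": record["sku"],
        "gtin13": record["gtin13"],
        "offers": {
            "@type": "Offer",
            "price": record["price"],
            "priceCurrency": record["currency"],
            "availability": f"https://schema.org/{availability}",
        },
    }
    return json.dumps(data, indent=2)


# Hypothetical record; the GTIN is a commonly cited example value.
markup = product_jsonld({
    "name": "12V Combi Drill",
    "sku": "DRILL-12V",
    "gtin13": "4006381333931",
    "price": "89.99",
    "currency": "GBP",
    "in_stock": True,
})
```

The resulting string goes inside a `script` tag of type `application/ld+json` on the product page. If the underlying record is wrong, the markup will be confidently wrong too.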

K

Kit

A group of products sold together as a single SKU, often pre-assembled or co-packaged.

Kits are common in industrial supply (a tool kit), trade distribution (a fixings pack), and retail (a gift set). The kit has its own SKU, price, and stock; the components may or may not be separately sellable. Modelling kits in a PIM is harder than modelling single products or variants, because a kit references other products as part of its definition.

L

Lead time

The elapsed time between an order being placed and the product being available to ship.

Lead time appears on PDPs as "available in 3 days", "2-week lead time", or "made to order". Customers buy on lead time as much as on price, especially in B2B. The data lives in ERPs and supplier feeds, but increasingly customers expect to see it on the product page directly.

Line card

A document showing which brands or product lines a distributor carries.

Line cards are typically used in sales: a rep walks into a customer with a one-pager listing the manufacturers they represent. The underlying data (which manufacturers, which categories, which territories) sits in the distributor’s catalogue and is one of the harder parts to keep tidy.

M

MACH architecture

An approach to commerce technology built on Microservices, API-first systems, Cloud-native, and Headless principles.

MACH replaces monolithic platforms with composable best-of-breed components. The selling point is flexibility: swap one component without rebuilding the stack. The trade-off is integration complexity. PIM and DAM systems are usually MACH-friendly because they are already separate from the commerce engine.

Mapping

Translating data from one structure or vocabulary into another.

Common mapping jobs: supplier categories to your taxonomy, your attributes to a marketplace’s required fields, ETIM classes to your internal codes. Mapping is rules-driven for predictable cases and increasingly AI-assisted for the long tail of edge cases.

Marketplace

A platform where multiple sellers list products, such as Amazon, eBay, Mirakl-powered marketplaces, or Faire.

Marketplaces have their own taxonomy, content rules, and required attributes. Sellers either submit feeds or use marketplace tools, but in either case the data has to match the marketplace’s requirements exactly. Failed listings are usually a content-quality problem, not a system problem.

Master data

The core, agreed record for a product (or any entity) used as the source of truth across systems.

Master data is what everything else syncs from. If your PIM is the master for product data, the website pulls from there, the ERP pulls from there, the marketplace feed builds from there. Identifying what is master and what is derived is one of the early decisions in any data programme.

MDM (Master Data Management)

A discipline (and the systems that support it) for managing master data across an organisation.

MDM covers products, customers, suppliers, locations, and other core entities. A PIM is one specific kind of master data system, focused only on products. Most large organisations use MDM tools for non-product entities and a PIM for products. See PIM vs DAM vs MDM.

Metafield

A custom field on a product (or other object) in Shopify and similar platforms.

Standard fields cover title, description, price, and so on. Metafields hold everything else: technical specs, country of origin, warranty period, fitment data. Used properly, metafields turn Shopify from a simple catalogue into a structured product database. Used badly, they sprawl into thousands of inconsistent values.

Migration (data)

Moving product data from one system to another, often as part of a replatform or a version upgrade.

Migrations are rarely just data moves. Field structures differ, validation rules differ, taxonomies differ. The work is in mapping and cleaning, not in copying. Underestimating a migration is the most reliable way to blow a project budget. See ETIM 8 vs ETIM 9.

MPN (Manufacturer Part Number)

A unique identifier assigned by the manufacturer to a specific product or component.

Where GTIN identifies the saleable product to retailers, MPN identifies the part to engineers, distributors, and trade buyers. A drill might have one GTIN and one MPN; an industrial component might have only an MPN. Marketplaces increasingly require MPN alongside GTIN to disambiguate listings.

N

Normalisation

Standardising values so equivalent data is represented the same way.

"12V", "12 volts", "12-volt", and "12 V" all mean the same thing. Normalisation picks one form and converts the others. Without it, filters break, duplicates multiply, and reporting becomes unreliable. Normalisation rules live in the taxonomy and apply at ingest, edit, or export.

O

OMS (Order Management System)

A system that handles order capture, fulfilment, allocation, and post-purchase activity.

OMSs sit between the storefront and the warehouse, often integrating with ERPs, WMS, and shipping carriers. Product data flows from the PIM into the OMS so that order confirmations, picking lists, and packing slips reference the correct attributes (size, colour, variant).

Onboarding

The process of bringing a new supplier, product range, or product into your catalogue.

Supplier onboarding covers everything from collecting the first data file to live SKUs on the website. The bottleneck is usually not setting the supplier up as a vendor: it is getting their product data into a usable shape. See supplier onboarding software.

P

PDP (Product Detail Page)

The page on a website that shows a single product, with its description, images, specs, price, and add-to-cart button.

The PDP is the conversion-critical page in ecommerce. Product data quality shows up directly: missing images, sparse descriptions, unfilled specs, and broken filters all hurt PDP performance. Optimising PDPs without fixing the data feeding them rarely works.

PIM (Product Information Management)

A system for storing, managing, and distributing product data across channels.

PIMs hold product records, attributes, categories, descriptions, and links to media. They are the backbone of multi-channel catalogues. A PIM does not generate or extract data on its own; it stores what you put in it. See product data enrichment vs PIM.

Product data

The structured and unstructured information that describes a product: identifiers, attributes, descriptions, media, classifications, and pricing.

Product data is the asset that drives every commerce channel. Bad product data is not a back-office problem; it directly affects search ranking, conversion, returns, and customer trust. Most catalogues have far worse product data than the team realises until they measure it.

Product data enrichment

Adding missing attributes, improving descriptions, and bringing supplier data up to channel-ready quality.

Enrichment is the work between "the supplier sent us a spreadsheet" and "this product is live with full content on the website". It is increasingly AI-assisted: extraction from PDFs, classification from titles, attribute completion from datasheets. See the product data enrichment overview.

Product record

The full set of fields, attributes, media, and classifications that describe a single product in a system.

A product record might have 30 fields in a small DTC catalogue or 200 in a technical distribution catalogue. The record sits in the PIM and gets pushed out to channels in different shapes for each one.

Q

Quality metrics (data)

The specific measures used to track product data quality, such as completeness, accuracy, consistency, and timeliness.

Quality metrics turn an abstract goal ("better data") into something trackable. Common metrics include the percentage of required fields filled, the percentage of values matching the controlled vocabulary, and the percentage of records updated in the last N days. See product data quality.

R

Rich snippet

An enhanced search result on Google that includes structured information like star ratings, price, or stock status.

Rich snippets are powered by structured data on the page, usually JSON-LD. For products, the most common enhancements are price, availability, ratings, and review counts. Earning rich snippets requires both correct markup and accurate underlying data: either one missing breaks the result.

RoHS

A European directive restricting hazardous substances (lead, mercury, cadmium, and others) in electrical and electronic equipment.

RoHS compliance is a yes/no flag at minimum, sometimes accompanied by a declaration document. Distributors and retailers need RoHS data on every relevant SKU; missing it can block a sale to public sector or large enterprise buyers. The data usually lives on the manufacturer's datasheet and has to be extracted to land in the PIM. See product data management for distributors.

S

Schema

The structure that defines what fields a product record can have, what type each field is, and what values are allowed.

Your schema might say "voltage is a number, between 1 and 1000, optional for category X, required for category Y". Without a schema, anything goes, and "anything goes" is how a catalogue ends up with three formats for voltage. Schema-first thinking is one of the most useful habits in product data work.
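The voltage rule above can be written down as data plus a small checker. This is a sketch under assumptions: the `NumberField` class and `validate` function are hypothetical, and "X" and "Y" are the placeholder categories from the example, not real category codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberField:
    name: str
    minimum: float
    maximum: float
    required_in: frozenset  # categories where the field is mandatory


# "voltage is a number, between 1 and 1000, optional for X, required for Y"
VOLTAGE = NumberField("voltage", 1, 1000, frozenset({"Y"}))


def validate(field: NumberField, record: dict, category: str) -> list:
    """Return a list of problems; an empty list means the value is accepted."""
    value = record.get(field.name)
    if value is None:
        if category in field.required_in:
            return [f"{field.name} is required for category {category}"]
        return []
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return [f"{field.name} must be a number"]
    if not field.minimum <= value <= field.maximum:
        return [f"{field.name} must be between {field.minimum} and {field.maximum}"]
    return []


problems = validate(VOLTAGE, {"voltage": "12V"}, "Y")
```

Note that "12V" is rejected: with a numeric schema the unit belongs in the field definition, not in the value, which is what stops three formats for voltage from creeping in.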

SEO

Search Engine Optimisation, the practice of structuring content so it ranks in Google and other search engines.

For product data, SEO shows up in PDP titles, descriptions, image alt text, structured data (JSON-LD), and internal linking. Good attribute data feeds rich snippets and faceted search, both of which support SEO. Product data and SEO are more entangled than most teams treat them.

SKU

Stock Keeping Unit, a unique identifier for a sellable product variant.

A T-shirt in three sizes and four colours is twelve SKUs, even though it is one product. SKUs are usually internal: a GTIN or barcode is the universal identifier. Catalogue size is most often measured in SKUs because that is what you ship, sell, and stock.
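The T-shirt arithmetic is a Cartesian product of the variant options. The sketch below uses a made-up SKU pattern (`TSHIRT-<size>-<COLOUR>`); real SKU schemes are whatever your business has agreed.

```python
from itertools import product

sizes = ["S", "M", "L"]
colours = ["black", "white", "navy", "red"]

# One SKU per size/colour combination: 3 x 4 = 12.
skus = [f"TSHIRT-{size}-{colour.upper()}" for size, colour in product(sizes, colours)]
```

Adding a fourth size takes the count from 12 to 16, which is why variant-heavy ranges inflate SKU counts so quickly.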

Specification

A defined property of a product with a name, value, and often a unit.

Specs are the technical attributes that engineers and trade buyers care about: voltage, torque, IP rating, dimensions. Spec data lives in datasheets, PDFs, and supplier feeds, and getting it into structured form is one of the harder enrichment jobs. See product data extraction.

Standards

Industry-defined formats and vocabularies for product data, such as ETIM, GS1, UNSPSC, and BMEcat.

Standards exist so suppliers, distributors, and buyers can exchange data without negotiating a custom format every time. Adoption varies by sector: ETIM is strong in European technical products, UNSPSC is strong in procurement, GS1 is universal for retail.

Supplier

The manufacturer, brand, or distributor that provides products to your business.

Suppliers send product data in many forms: spreadsheets, PDFs, BMEcat files, supplier portal entries, marketing assets. Quality varies widely between suppliers, which is why most enrichment work happens after supplier data arrives, not before.

Supplier portal

A web-based interface where suppliers submit and update product data themselves.

Portals replace the email-and-spreadsheet pattern with a structured form: required fields, validation, status tracking. Done well, they reduce back-and-forth and shorten time to list. Done badly, suppliers ignore them. See supplier portal vs email onboarding.

Syndication

Distributing product data to multiple channels, retailers, or partners from a single source.

Syndication is what a PIM does on its outbound side: take the master record, transform it into each channel’s required shape, and push it. Syndication tools handle scheduling, validation per channel, and retries when a channel rejects a record.

T

Taxonomy

The hierarchical system of categories used to organise products.

Taxonomies vary widely. A small DTC brand might have 20 categories. A large distributor might have 5,000. Taxonomies drive navigation, filters, search, and reporting. Changing a taxonomy is painful, so most teams over-invest in setting it up early. See taxonomy management.

Template

A pre-defined file structure suppliers use to submit product data.

Templates are the traditional answer to messy supplier data: send suppliers a spreadsheet with the fields you need. They work in theory and rarely in practice, because suppliers ignore them, edit them, or fill them in inconsistently. See supplier data templates for why.

Time to list

The elapsed time from receiving supplier data to having a live, sellable SKU on a channel.

Time to list is the operational KPI for onboarding. Traditional teams measure it in weeks; modern AI-assisted teams measure it in days or hours. Tracking it forces you to identify where the slow steps are, which is usually one or two specific handoffs.

U

UDI (Unique Device Identification)

A US FDA system for uniquely identifying medical devices, with similar systems in the EU and elsewhere.

UDI requires structured device data with specific identifiers, hierarchies, and attributes, submitted to the regulator's database (GUDID in the US, EUDAMED in the EU). Medical distributors and manufacturers carry the compliance burden. The data overlaps heavily with general product data work but has its own rules and audit requirements.

UNSPSC

The United Nations Standard Products and Services Code, a global classification system used heavily in procurement.

UNSPSC organises products and services into a four-level hierarchy: segment, family, class, commodity. Public sector buyers and large enterprise procurement teams often require UNSPSC codes on supplier data so that spend can be reported consistently across categories.
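In the common eight-digit form, each level of the hierarchy adds two digits, so a commodity code can be rolled up to its class, family, and segment for reporting. The sketch below assumes that eight-digit form; `unspsc_levels` is a hypothetical helper and the sample code is a placeholder, not a real commodity.

```python
def unspsc_levels(code: str) -> dict:
    """Roll an eight-digit UNSPSC code up through its four levels."""
    if not (code.isascii() and code.isdigit() and len(code) == 8):
        raise ValueError(f"expected eight digits, got {code!r}")
    return {
        "segment": code[:2] + "000000",
        "family": code[:4] + "0000",
        "class": code[:6] + "00",
        "commodity": code,
    }


levels = unspsc_levels("12345678")  # placeholder digits, not a real code
```

Grouping spend by `segment` or `family` gives the consistent cross-category reporting that procurement teams ask for.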

V

Validation

Checking that data meets defined rules before it is accepted into a system.

Validation can be simple ("voltage is required") or complex ("voltage must be in the controlled list, and must match the values allowed for this category"). Most data quality problems trace back to weak or absent validation at ingestion. See product data quality.

Variant

A specific version of a product, differing by attributes like size, colour, or finish.

A T-shirt in five sizes and three colours has 15 variants. Variants share most of the parent product’s data (description, brand, category) but differ on a few specific attributes. Variant modelling differs across platforms: Shopify, Akeneo, and Magento each handle it slightly differently.

© 2026 SKU Launch Ltd. All rights reserved.