Step 1: Automated matching
Each master data record is compared field by field with other relevant data records. Each comparison yields a score, which reaches 100% when all fields are identical. For each supplier record that is checked, this method produces several suggestions with varying scores. Suggestions with a strong match (for example, 90% or higher) can be accepted automatically and linked to the reference data record. The remaining suggestions of acceptable quality (for example, those with a score between 80% and 90%) flow into step 2.
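The scoring and threshold logic described above can be sketched as follows. This is a minimal illustration, not the DataCategorizer implementation: the field names, the weights, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions, and the 90% and 80% thresholds are simply the example values from the text.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; they sum to 100 so that two identical
# records score exactly 100%.
FIELD_WEIGHTS = {"name": 40, "street": 20, "city": 20, "postal_code": 20}

AUTO_ACCEPT = 90.0   # example threshold from the text
REVIEW_FLOOR = 80.0  # example lower bound for manual review


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1], ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(record: dict, reference: dict) -> float:
    """Weighted field-by-field score in percent (0 to 100)."""
    return sum(
        weight * field_similarity(record.get(field, ""), reference.get(field, ""))
        for field, weight in FIELD_WEIGHTS.items()
    )


def triage(score: float) -> str:
    """Route a suggestion according to its score."""
    if score >= AUTO_ACCEPT:
        return "auto_accept"    # linked to the reference record automatically
    if score >= REVIEW_FLOOR:
        return "manual_review"  # passed on to step 2
    return "no_suggestion"
```

In practice the weights and thresholds would be tuned per data set; the values here only mirror the examples in the text.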
Step 2: Scanning and browsing
A data operator at Orpheus (or at the client) reviews the data records whose suggestions are of acceptable quality (e.g. a confidence rate of 80% to 90%) for possible matches. The operator either selects the most plausible suggestion or rejects the suggestions without making a match. This semi-automated matching significantly increases the hit rate from step 1 with relatively little work. As with the 80-20 rule, a small amount of manual effort resolves a large share of the remaining records, so the second step is well worth running.
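A review queue for this step might look like the sketch below. The function names, the tuple layout `(record_id, reference_id, score)`, and the exact band boundaries are assumptions made for illustration; the text only gives 80% to 90% as an example range.

```python
from collections import defaultdict


def build_review_queue(suggestions, low=80.0, high=90.0):
    """Group suggestions scoring in [low, high) by record, best score first."""
    queue = defaultdict(list)
    for record_id, reference_id, score in suggestions:
        if low <= score < high:
            queue[record_id].append((reference_id, score))
    for candidates in queue.values():
        candidates.sort(key=lambda c: c[1], reverse=True)
    return dict(queue)


def apply_decision(links, record_id, chosen_reference_id):
    """Store the operator's choice; None means every suggestion was rejected."""
    if chosen_reference_id is not None:
        links[record_id] = chosen_reference_id
    return links
```

The operator sees the candidates for each record in descending score order and either confirms one of them or rejects them all, in which case no link is stored.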
This consolidation is a two-step process. Step 1 compares supplier data records in pairs to identify and group any duplicates. Step 2 attempts to assign subsidiaries (i.e. children) to their respective parent companies.
DataCategorizer can apply different comparison strategies to identify duplicates. Suppliers can, for example, be treated as the same if their names and addresses match. Alternatively, the records can be compared using their Data Universal Numbering System (DUNS) numbers. The figure on the right illustrates these two variants. Users are not limited to these two variants, however: comparisons can use any number of the available attributes.
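The two comparison strategies can be expressed as interchangeable key functions, as in the sketch below. The record layout (`id`, `name`, `address`, `duns`) and the normalization rules are assumptions for illustration. Exact key matching is used here for brevity; it stands in for whatever comparison the product actually applies.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def name_address_key(record: dict) -> tuple:
    """Strategy 1: suppliers are the same if name and address match."""
    return (normalize(record["name"]), normalize(record["address"]))


def duns_key(record: dict) -> str:
    """Strategy 2: suppliers are the same if their DUNS numbers match."""
    return record["duns"].replace("-", "").strip()


def group_duplicates(records, key_func):
    """Return lists of record ids that share a key (groups of 2 or more)."""
    groups = defaultdict(list)
    for record in records:
        groups[key_func(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]
```

Any other attribute, or combination of attributes, can be plugged in by passing a different key function to `group_duplicates`.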
Creating master data hierarchies
This process consolidates different forms or spellings of a supplier or material into groups (clusters) and arranges them into multi-level, tree-like structures.
This process creates, for example, parent-child relationships among suppliers. A reliable match, however, is not always possible without further information from outside the system. Many companies that are subsidiaries of a corporate group have completely different names. ATRADA AG, for example, is a subsidiary of TELKOM AG. Neither the name nor the address of either company, however, provides any clue about this relationship. The only option here is to check a list of subsidiaries, for example by researching on the internet or contracting an expert in this field.
Parts and services can also be consolidated into equivalence classes. Product families of certain vendors include, for example, different models of computers, furniture, or vehicles. Arranging these parts into hierarchies supplements the classification based on a standard schema (e.g. eCl@ss or UN/SPSC). The benefits of analyzing this parallel dimension, however, rarely justify the considerable time and effort involved.
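Once a subsidiary list has been researched, the parent-child assignment itself is straightforward. The sketch below assumes the list is available as a simple child-to-parent mapping; the function names are hypothetical, and the only entry is the example pair from the text.

```python
from collections import defaultdict

# Externally researched relationships (child -> parent); the single
# entry is the example from the text.
PARENT_OF = {"ATRADA AG": "TELKOM AG"}


def ultimate_parent(company: str, parent_of: dict) -> str:
    """Follow parent links upwards until a company without a parent is reached."""
    seen = set()
    while company in parent_of:
        if company in seen:
            raise ValueError(f"cyclic ownership data involving {company!r}")
        seen.add(company)
        company = parent_of[company]
    return company


def build_hierarchy(companies, parent_of: dict) -> dict:
    """Map each parent to the list of its direct children."""
    children = defaultdict(list)
    for company in companies:
        parent = parent_of.get(company)
        if parent is not None:
            children[parent].append(company)
    return dict(children)
```

The cycle check guards against inconsistent research data, where two companies are each recorded as the other's parent.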
Creating master data hierarchies and processing automatic suggestions manually
The consolidation and clustering algorithms of DataCategorizer are designed to automatically sort, consolidate, and cluster as many master data records as possible. From the client application, users can validate the server’s suggestions, manually arrange individual data records in the hierarchy, and examine the data records that could not be matched automatically.