Data Cleansing & Categorization

Use Artificial Intelligence in procurement

Perfect data quality lays the foundation for successful spend controls.

Wouldn’t it be great to have instant access to current, valid, reliable numbers or to know the bundled spend volume for individual suppliers and categories – across all business departments and subsidiaries? Our software provides the transparency and reliable numbers that you need to make that dream a reality.

Data management in procurement

Why is perfect data quality essential for stringent spend management and controls?

Each successful spend is the result of a long, arduous process. Most large, international corporations, for example, start off by identifying the bundling potential. This means recognizing similar goods and services across all divisions, departments, and companies in order to harmonize them as much as possible.

The main challenge in procurement is cleansing and consolidating this information, which stems from many different sources. Once this data is comparable, it can be presented transparently across the enterprise in a common procurement language (e.g. eClass or UNSPSC).

The complexity of a project like this, however, should not be underestimated. The source data comes from many different systems, each with its own language, classification system, taxonomy, and quality standards (see figure). Numerous data errors and price/volume outliers also need to be identified and cleaned up.

Once the potential material and service bundles have been identified and specified, the next steps are consolidation and clustering. These clusters produce standardization options and larger volumes which reduce the price per unit. In anticipation, procurement professionals inform potential suppliers, request offers from the most qualified of them, and enter several rounds of negotiation with the two or three best suppliers.

Once a contract has been signed, many strategic procurement departments view the successful spend as a sure thing. Experience shows, however, that this is often not the case. A negotiated umbrella contract is no guarantee that all internal customers will follow its terms. It also does not ensure that the negotiated purchase request volume suffices over a longer period of time. It is not uncommon for suppliers to ignore the negotiated prices and processes. This is one reason why companies often do not achieve the savings that they anticipated.

group structure

DataCategorizer comprises the modules Data Cleansing, Data Consolidation/Clustering, and Data Classification.

Orpheus changes all that to ensure that companies achieve their anticipated long-term savings. With its products SpendControl and DataCategorizer and a modern spend management infrastructure, Orpheus measures successful spends and savings.

One key to achieving savings potential is creating transparency across the entire spend volume, which requires spend data that is comparable across all divisions and companies. This goes far beyond order analysis to provide a detailed view of all invoices and payments to major suppliers.


Which employees are buying which goods and services under what conditions worldwide? This question is decisive, but the data quality in most information systems is too poor to provide any insights. Sophisticated data cleansing processes (e.g. detecting price and volume outliers, incorrect exchange rate conversions, or wrong units of quantity) are necessary to find the answers.
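
To illustrate what detecting a price outlier can look like, here is a minimal sketch in Python. It is not the DataCategorizer implementation: the function name flag_price_outliers, the threshold of 3.5, and the sample prices are assumptions chosen for illustration. It uses the modified z-score, which is based on the median and is therefore not distorted by the outlier itself.

```python
from statistics import median


def flag_price_outliers(unit_prices, threshold=3.5):
    """Return the indices of prices whose modified z-score exceeds threshold.

    Hypothetical helper for illustration only. The modified z-score is
    0.6745 * |price - median| / MAD, where MAD is the median absolute
    deviation from the median.
    """
    med = median(unit_prices)
    mad = median(abs(p - med) for p in unit_prices)
    if mad == 0:
        # At least half of the prices are identical; no usable scale.
        return []
    return [
        i
        for i, p in enumerate(unit_prices)
        if 0.6745 * abs(p - med) / mad > threshold
    ]


# Invented unit prices for one material; the last entry could be a
# unit-of-quantity error (price per 100 pieces booked as price per piece).
prices = [10.2, 9.8, 10.0, 10.5, 9.9, 1020.0]
outliers = flag_price_outliers(prices)
print(outliers)
```

A flagged record is a candidate for review, not proof of an error: the same check would also flag a legitimate one-off rush order.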


If the master data is gathered from different (heterogeneous) data sources, it must be consolidated, clustered, and brought into a hierarchical structure. Supplier duplicates can then be detected and similar suppliers can be grouped. Clustering similar material master data is a prerequisite for standardization.
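
As a rough illustration of supplier duplicate detection, the following Python sketch normalizes supplier names and groups those that are nearly identical. It is an assumption-laden example, not the method used by DataCategorizer: the list of legal forms, the similarity threshold of 0.85, and the supplier names are all invented, and string similarity from the standard library stands in for whatever matching a production system would use.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, incomplete list of legal-form tokens to ignore when comparing.
LEGAL_FORMS = {"gmbh", "ag", "inc", "ltd", "llc", "co", "kg", "corp"}


def normalize(name):
    """Lowercase, drop punctuation, and remove legal-form tokens."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)


def group_suppliers(names, min_ratio=0.85):
    """Greedily group names whose normalized forms are similar.

    Each name is compared with the first member of every existing group
    and joins the first group that is similar enough.
    """
    groups = []
    for name in names:
        key = normalize(name)
        for group in groups:
            ratio = SequenceMatcher(None, key, normalize(group[0])).ratio()
            if ratio >= min_ratio:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Invented supplier master data from different source systems.
suppliers = [
    "Acme GmbH",
    "ACME Gmbh & Co. KG",
    "Bolt Industries Ltd",
    "Bolt Industrie",
]
groups = group_suppliers(suppliers)
print(groups)
```

Each resulting group is a candidate for a common parent node in the supplier hierarchy. Whether the members are true duplicates or separate legal entities of the same corporate group still has to be decided by someone who knows the supplier.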


Master data clustering is one criterion for perfect transparency. Another is classifying as many invoices, orders, etc. as possible. Orpheus DataCategorizer uses semi-automated classification methods to assign these and other types of transactional data to harmonized spend categories. This ensures comparability and transparency in strategic procurement.
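
The idea behind semi-automated classification can be sketched in a few lines of Python: clear cases are assigned automatically, and unclear ones are routed to a person. Everything in this example is hypothetical, including the category names, the keyword rules, and the classify function; real projects would map to a standard such as eClass or UNSPSC and would likely use a far richer model than keyword matching.

```python
# Hypothetical keyword rules per spend category (not real eClass/UNSPSC codes).
RULES = {
    "IT Hardware": {"laptop", "notebook", "monitor", "keyboard"},
    "Office Supplies": {"paper", "toner", "pen", "stapler"},
    "Travel": {"flight", "hotel", "train", "taxi"},
}


def classify(description, min_hits=1):
    """Return a category name, or None if the line needs manual review."""
    words = set(description.lower().split())
    scores = {cat: len(words & keywords) for cat, keywords in RULES.items()}
    ranked = sorted(scores.values(), reverse=True)
    # No keyword matched, or two categories tie: leave it to a person.
    if ranked[0] < min_hits or ranked[0] == ranked[1]:
        return None
    return max(scores, key=scores.get)


# Invented invoice line descriptions.
lines = [
    "Laptop 14 inch with keyboard",
    "Toner for hotel printer",
    "Consulting services Q3",
]
results = {line: classify(line) for line in lines}
print(results)
```

In this sketch, the first line is classified automatically, while the second (a tie between two categories) and the third (no match) land in the manual review queue. Decisions made during review can then be fed back into the rules.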
