The classification of procurement data, in other words of purchase vouchers such as invoices and orders, is the second most time-consuming step in the implementation of an efficient Spend Management system.
Voucher classification as a foundation for more transparency
At the same time, it is the most important step when implementing such a system with regard to the quality of the results and their acceptance by the buyers as users of procurement controlling systems. Especially in indirect purchasing, with the so-called “indirect categories” such as all types of services, often neither material nor product group information is available. For these categories, a (partially) automated classification into procurement categories or standards such as eCl@ss or UNSPSC is particularly important.
A significant goal, both for manual and for partially automated voucher classification, is the derivation of rules or patterns that can be used later, when a sustainable Spend Analysis solution is introduced, as an initialization or training set for highly automated processes if required.
Data classification in 3 steps
Step 1: A rough classification of the purchasing volume
Rough classification is used to allocate the entire procurement volume to the first and second level of a (standard) classification schema such as eCl@ss or UNSPSC with the aid of supplier, account, material and material group rules. To be more precise: the invoice and order line items are allocated to rough sourcing categories. The third or even finer levels are applied only where detailed statements are possible without significant additional work. The diagram below illustrates this principle.
The data volume underlying rough categorization can be very high. In order to generate more manageable data packets, the data can be split by the first letter of the supplier name, so that the person working on the categorization processes all the invoices and orders of one supplier together.
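The rule types named above can be sketched in Python as a simple precedence lookup. This is only a minimal illustration and not how DataCategorizer works internally: the field names (`supplier`, `gl_account`, `material`, `material_group`), the rule tables, the category labels and the precedence order are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical rule tables: attribute value -> rough category (level 1 or 2).
# The labels are illustrative placeholders, not real eCl@ss or UNSPSC codes.
RULES = {
    "material": {"M-1001": "Building supplies"},
    "material_group": {"MG-20": "Building supplies"},
    "supplier": {"Clean & Co": "Industrial cleaning"},
    "gl_account": {"6300": "Tradesmen's services"},
}

# Assumed precedence: the most specific attribute wins.
PRECEDENCE = ["material", "material_group", "supplier", "gl_account"]


def rough_classify(line_item: dict) -> Optional[str]:
    """Return the rough category of an invoice or order line item, or None."""
    for attribute in PRECEDENCE:
        value = line_item.get(attribute)
        category = RULES[attribute].get(value)
        if category is not None:
            return category
    return None  # no rule matched: left for manual classification


items = [
    {"supplier": "Clean & Co", "gl_account": "6300", "material": None},
    {"supplier": "Unknown Ltd", "gl_account": "6300", "material": None},
    {"supplier": "Unknown Ltd", "gl_account": "9999", "material": None},
]
results = [rough_classify(item) for item in items]
```

Line items for which no rule fires return `None` and would remain in the pool for manual work.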
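Splitting the data into packets by the first letter of the supplier name could look roughly like this. The record layout (a dictionary with a `supplier` key) and the `#` bucket for empty names are assumptions for the sketch.

```python
from collections import defaultdict


def split_into_packets(line_items):
    """Group line items by the first letter of the supplier name."""
    packets = defaultdict(list)
    for item in line_items:
        name = item["supplier"].strip()
        # Records without a usable supplier name go to a catch-all packet.
        key = name[0].upper() if name else "#"
        packets[key].append(item)
    return dict(packets)


items = [
    {"supplier": "Alpha GmbH"},
    {"supplier": "acme Ltd"},
    {"supplier": "Beta AG"},
    {"supplier": "  "},
]
packets = split_into_packets(items)
```

All records of one supplier end up in the same packet, so one person sees the supplier's invoices and orders together.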
Software tools that simplify the filtering and sorting of data sets are used to support the categorization process. The diagram below shows Orpheus’s content management software, DataCategorizer, as an example.
- The real company names have been anonymized here. The first table column represents the first dimension, the company or location, in the Spend Management system.
- The next column of the tool holds the second dimension, the supplier (i.e. the supplier name).
- The third and last dimension is the procurement group or sourcing category. The staff member responsible for the classification completes this column based on the information shown further to the right of the screen:
- purchase order text (ORDER_POS) and
- invoice text (INVOICE_POS)
With modern classification tools, such as the one in our example, categories are pre-allocated by rule-based processes or artificial intelligence in order to make the decision easier for the employee.
The general ledger account and the invoice amount are further helpful criteria; they could not be shown in the screenshot due to a lack of space.
By glancing at the tool, users can surmise the challenges that arise and need to be mastered in categorization projects such as these. Even though only a selection of the suppliers is shown, the overall volume in this example is greater than 260 million Euros and consists of over 100,000 data sets.
More importantly, many of these data records have no order text. These are invoices without any order reference. The classification of such an invoice line must then be based on the invoice text, the invoice amount and the supplier. When a filter is applied, the software shows which categories the supplier has supplied to date. In many cases this reduces the number of possible categories.
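The supplier filter described above can be approximated with a frequency count over the records that are already classified. The following sketch assumes a list of dictionaries with `supplier` and `category` keys; it is not the filter logic of the tool itself.

```python
from collections import Counter


def candidate_categories(history, supplier):
    """List the categories a supplier has supplied so far, most frequent first."""
    counts = Counter(
        row["category"]
        for row in history
        if row["supplier"] == supplier and row["category"] is not None
    )
    return [category for category, _ in counts.most_common()]


history = [
    {"supplier": "X", "category": "Industrial cleaning"},
    {"supplier": "X", "category": "Industrial cleaning"},
    {"supplier": "X", "category": "Waste management"},
    {"supplier": "Y", "category": "Building supplies"},
    {"supplier": "X", "category": None},  # not yet classified
]
candidates = candidate_categories(history, "X")
```

For an invoice line without order text, the classifier then only has to choose among the categories in `candidates`.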
With an allocation of approx. 10% of the procurement data, a categorization level of 80% can usually be achieved
The program sorts the invoice line items by amount. Experience shows that approx. 80% of the relevant procurement volume can be categorized by allocating approx. 10% of the data sets. The user of the program does not have to allocate every entry individually, but can assign a category to many data sets at once by applying rules that correspond to a pattern. The program supports the user in this.
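The effect of sorting by amount can be checked with a short calculation: sort the line items in descending order and count how many are needed to reach the target share of the volume. The amounts below are made up for illustration.

```python
def records_needed(amounts, target_share=0.8):
    """Count the largest line items needed to cover target_share of the volume."""
    total = sum(amounts)
    running = 0.0
    for count, amount in enumerate(sorted(amounts, reverse=True), start=1):
        running += amount
        if running >= target_share * total:
            return count
    return len(amounts)


# Illustrative data: a few large invoices and many small ones.
amounts = [600, 250, 50] + [5] * 20
needed = records_needed(amounts)
share_of_records = needed / len(amounts)
```

In this made-up example two of 23 line items already cover more than 80% of the volume, which mirrors the pattern described above.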
Please note: It should be taken into account that not all criteria combinations are useful.
- The operator can force all data sets that share a material master, a commodity group, a general ledger account or a supplier to be allocated to a particular category.
- It is also possible to specify that all data sets matching a combination of several attributes are classified together. These decisions, once made, can be saved as rules and used later on for automated classifications.
- Lastly, the operator has the possibility of searching through and classifying the database by means of tag rules.
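The three kinds of rules listed above (forced allocation by one attribute, attribute combinations, and tag rules) could be stored and replayed roughly as follows. The rule format, the field names and the JSON persistence are assumptions for this sketch, not the storage format of any particular tool.

```python
import json

# Hypothetical saved rules; each rule assigns exactly one category.
rules = [
    # Forced allocation: every data set of one supplier goes to one category.
    {"match": {"supplier": "Clean & Co"}, "category": "Industrial cleaning"},
    # Combination rule: several attributes must match together.
    {"match": {"supplier": "Build AG", "gl_account": "6300"},
     "category": "Tradesmen's services"},
    # Tag rule: a keyword must occur in the order or invoice text.
    {"tag": "insulation", "category": "Building supplies and insulation"},
]


def rule_applies(rule, record):
    """Check the attribute conditions and the optional tag of one rule."""
    attributes_match = all(
        record.get(key) == value for key, value in rule.get("match", {}).items()
    )
    text = f"{record.get('order_text') or ''} {record.get('invoice_text') or ''}".lower()
    tag_matches = rule.get("tag", "").lower() in text  # an empty tag always matches
    return attributes_match and tag_matches


def classify(record, rule_list):
    """Return the category of the first matching rule, or None."""
    for rule in rule_list:
        if rule_applies(rule, record):
            return rule["category"]
    return None


saved = json.dumps(rules)      # persist the decisions as rules
reloaded = json.loads(saved)   # reuse them in a later automated run
```

Because the rules are plain data, a later automated run only needs to load them and call `classify` for each new record.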
Compilations, like the one depicted in our example, are distributed among various experts within the scope of rough classification, before being fed back into the database of the Spend Analysis system.
Step 2: Detailed classification of selected categories
The depth of an allocation to eCl@ss or UNSPSC level 2 is not sufficient for the targeted processing of specific procurement categories or commodity groups, for example when negotiating new framework agreements. More detailed specifications are needed for this, so that the strategic buyers can use them as a basis for tender specifications with exact product descriptions, precise indications of quantity and price benchmarks.
The basis of detailed classification is the result of rough classification. The quality of this step essentially determines the acceptance of the evaluation at the end of the process.
The process used to make the classification more detailed is outlined in the diagram below. It highlights commodity groups, which are the aim of procurement initiatives, as an example:
- Tradesmen’s services,
- Building supplies and insulation,
- as well as waste management and industrial cleaning.
Meanwhile, partially automated processes, which aim for a finer allocation based on the rough categories, are being used for detailed classification. The respective processes are rule-based and text-based algorithms (text mining) derived from artificial intelligence (AI).
Please note: It should be taken into account that the existing knowledge and rule bases are at present in part very sector-specific, especially with regard to direct material. The proposals can nevertheless be used, although they need a further plausibility check for a Spend Management project. In all cases, it is important to involve local buyers in order to capture the relevant “expert knowledge”. The existing proposals of the knowledge bases can definitely serve as a basis for discussion here.
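As a very reduced stand-in for such text-based processes, the following sketch refines a rough category by counting keyword overlaps with the order or invoice text. It is plain keyword matching, not the text mining or AI methods mentioned above, and the knowledge base contents are invented for the example.

```python
import re

# Hypothetical knowledge base: rough category -> detailed category -> keywords.
KNOWLEDGE_BASE = {
    "Building supplies and insulation": {
        "Insulation material": {"insulation", "mineral", "wool"},
        "Cement and mortar": {"cement", "mortar", "concrete"},
    },
}


def tokenize(text):
    """Lower-case the text and return its alphabetic tokens as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def detailed_classify(rough_category, text):
    """Pick the detailed category with the most keyword hits, or None."""
    tokens = tokenize(text)
    best, best_score = None, 0
    for detailed, keywords in KNOWLEDGE_BASE.get(rough_category, {}).items():
        score = len(tokens & keywords)
        if score > best_score:
            best, best_score = detailed, score
    return best


proposal = detailed_classify(
    "Building supplies and insulation", "Mineral wool insulation 100mm"
)
```

The result is only a proposal; as noted in the text, it still needs a plausibility check by the buyers.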
Savings effects of approx. 8-20% attainable
It is advisable to ascertain the pertinent suppliers, as well as the most important products and services in each category from the procurement experts. One then searches for these in the Spend Cube and separates specific data parcels. The category specialists subsequently try to discern recurring patterns and to classify the data in detail based on the information available (order and invoice text, commodity groups, general ledger account categorization, etc.).
Several validation rounds, in which category experts from purchasing review the classification results in order to gradually increase the classification quality, are normal in this process. The average volume throughput (categorized procurement volume in Euros per hour) is lower than for rough categorization, since the quality requirements of detailed classification are considerably higher. Careful processing is nevertheless worthwhile for larger categories, because cost savings of between 8 and 20% can be reached with newly negotiated framework agreements.
In their interviews and validation rounds with the buyers, the category experts of the detailed classification have the job of gathering and formalizing rules so that the knowledge is secured sustainably.
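The two figures mentioned above, the volume throughput and the 8 to 20% savings range, are simple arithmetic. The category volume and the hours in this sketch are invented numbers used only to show the calculation.

```python
def savings_range_eur(category_volume_eur, low_percent=8, high_percent=20):
    """Return the (low, high) savings estimate for a category volume."""
    return (
        category_volume_eur * low_percent / 100,
        category_volume_eur * high_percent / 100,
    )


def throughput_eur_per_hour(categorized_volume_eur, hours_worked):
    """Categorized procurement volume in Euros per hour of classification work."""
    return categorized_volume_eur / hours_worked


# Illustrative figures, not taken from a real project.
low, high = savings_range_eur(5_000_000)
throughput = throughput_eur_per_hour(2_000_000, 40)
```

For a category with a volume of 5 million Euros, the range quoted in the text would correspond to savings between 400,000 and 1 million Euros.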
The objective is to establish and to regularly expand a company-specific knowledge base for the categories. With the help of this course of action, the specific terminology can be translated into numerous languages, so that knowledge, once obtained, can be transferred to other subsidiaries or cultural environments within the corporation.
The workshops and interviews with the corporation’s procurement experts are crucial for initializing the automated classification tools used later on as well as possible. To make this possible, the quality of the original data must be impeccable. Only then can a high level of acceptance be ensured for the further steps towards increasing automation.
Step 3: Plausibility check of the classification and automation
After the raw data has been consolidated, the cleaned-up data is sent to the various corporations for inspection, so that further data quality improvements can be made. It is crucial for the acceptance of the entire procurement controlling system that the buyers responsible test the results, and improve them if needed, before publication. The purchasing analyses and reports give insight into the performance of the procurement organizations; that is why the purchasing managers should always be involved. For the inspection of the results, a data specialist extracts evaluations from each company’s procurement database that underlies procurement controlling, and distributes these to the contact persons. Care should be taken that this data is provided in a standard tool and in a readable format, in other words with a suitable level of detail. Microsoft Excel data sheets in which the lists have drop-down menus have proven the most suitable for this.
The following evaluations are examples of testing procedures or validations:
- Comprehensive annual invoice and order volume: The buyer should check whether the calculated overall volume of all invoices and orders within the corporation matches the expectations based on experience.
- Annual turnover of the 50 largest suppliers: As a rule, the corporations’ buyers are familiar with their top suppliers. On the list of the Top 50 suppliers, it needs to be clarified whether they are actually situated at the assumed position and whether the volumes are plausible.
- Random samples from data classified in detail: Extracting only the line items that make up approx. 80% of the invoice volume significantly reduces the number of data sets to be checked. The buyers check whether the proposed classifications are correct, and reclassify them if required.
- Establishing rules for the most important data patterns: Part of the test should be a workshop with a data specialist, so that the most important rules in each corporation and category are understood. This makes it possible to transfer buyer knowledge to the classification databases for data clean up at a later time.
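The first two evaluations in this list can be produced with a few lines of code. The sketch below assumes line items as dictionaries with `supplier` and `amount` keys; in practice the data would come from the procurement database.

```python
from collections import defaultdict


def total_volume(line_items):
    """Overall invoice and order volume across all line items."""
    return sum(item["amount"] for item in line_items)


def top_suppliers(line_items, n=50):
    """Return the n suppliers with the highest turnover, largest first."""
    turnover = defaultdict(float)
    for item in line_items:
        turnover[item["supplier"]] += item["amount"]
    return sorted(turnover.items(), key=lambda pair: pair[1], reverse=True)[:n]


# Illustrative line items.
line_items = [
    {"supplier": "A", "amount": 100},
    {"supplier": "B", "amount": 300},
    {"supplier": "A", "amount": 250},
    {"supplier": "C", "amount": 50},
]
volume = total_volume(line_items)
top_two = top_suppliers(line_items, n=2)
```

The buyers can then compare the total and the ranking with what they expect from experience.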
The diagram below depicts an example of an Excel file as mentioned above. It lists the suppliers, the order and invoice text, as well as a proposed categorization. Beyond that, it shows the cumulative invoice amount.
The buyers return the corrected Excel file or data extraction to the transparency team at the end of the processing period. This team, in turn, incorporates the data into the Spend Analytics database, thereby transferring intended changes all the way to the voucher level. The established rules reach the knowledge base, where they are then available for regularly executed clean ups at a later time.
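Writing the buyers’ corrections back to the voucher level can be sketched as follows. To keep the example free of third-party libraries, it assumes the returned file has been exported as CSV with the hypothetical columns `voucher_id` and `category`; reading an actual Excel file would require an additional library.

```python
import csv
import io


def apply_corrections(vouchers, corrections_csv):
    """Overwrite voucher categories with the buyers' corrections.

    Returns the number of vouchers whose category actually changed.
    """
    reader = csv.DictReader(io.StringIO(corrections_csv))
    # Rows with an empty category are treated as "no correction".
    corrected = {row["voucher_id"]: row["category"] for row in reader if row["category"]}
    changed = 0
    for voucher in vouchers:
        new_category = corrected.get(voucher["voucher_id"])
        if new_category and new_category != voucher["category"]:
            voucher["category"] = new_category
            changed += 1
    return changed


vouchers = [
    {"voucher_id": "V1", "category": "Industrial cleaning"},
    {"voucher_id": "V2", "category": "Building supplies"},
]
returned_file = "voucher_id,category\nV1,Waste management\nV2,\n"
changed = apply_corrections(vouchers, returned_file)
```

Only vouchers with a non-empty, different category in the returned file are updated.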
A high degree of automation after training steps 1-3
As soon as Steps 1-3 have been initially run through, the text mining algorithms and classification rules have been “learned”. Normally, approx. 50-70% of the vouchers are categorized within the three steps of the training phase. The remainder is done fully automatically. The same applies to the purchasing vouchers (invoices, orders) that will now be added on a monthly basis. Their categorization will also be fully automated.
We offer you a holistic solution for increased transparency, potential analyses and measuring your procurement success regarding indirect purchasing with our DataCategorizer, SpendControl and InitiativeTracker modules.