Orpheus - Data Cleansing

ERP and accounting systems rarely deliver error-free master and transactional data that you can import into BI systems without problems or time-consuming ETL processing. Customer or supplier names often appear in multiple variations that you need to harmonize, and data records often contain incorrect entries or spelling mistakes that you need to correct in the text fields. Incorrect entries or transformation errors often produce outliers that later make entire analyses difficult to understand. These outliers can, for example, distort the scale of a chart so much that it becomes impossible to interpret, or leave gaps in your numbers that you want to interpolate.

To address these and similar challenges, Orpheus has developed data cleansing modules to harmonize, clean, change and consolidate master and transactional data. The cleansing methods vary for strings, complete texts, dates or numbers. Best of all, business professionals with no technical skills can use the software, which runs without complicated ETL tools. Users simply clean the data before the ETL process begins, so they import clean data instead of noisy, error-prone data.

Prior to the data cleansing process, a data profiling tool is responsible for:

  • Recognizing data types and categorizing them as nominal, ordinal and metric data
  • Calculating and visualizing data volumes, intervals and statistical measures
  • Highlighting potential outliers and gaps in your data
  • Calculating and selecting value clusters
  • Analyzing data for correlations
  • ...and much more
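
As an illustration of what such profiling involves, the following Python sketch summarizes a single column. It is not Orpheus code: the function name `profile_column`, the simplified split into only nominal and metric data, and the common 1.5 × IQR rule for flagging outliers are all assumptions made for this example.

```python
from statistics import mean, median, quantiles

def profile_column(values):
    """Summarize one column: inferred kind, gaps, basic statistics, IQR outliers."""
    present = [v for v in values if v is not None]
    gaps = len(values) - len(present)  # missing entries are represented as None
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if not numeric or len(present) < 2:
        # Simplification: everything non-numeric is treated as nominal here.
        return {"kind": "nominal", "gaps": gaps, "distinct": len(set(present))}
    q1, _, q3 = quantiles(present, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread  # assumed outlier rule
    return {
        "kind": "metric",
        "gaps": gaps,
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
        "outliers": [v for v in present if v < low or v > high],
    }

# Hypothetical invoice amounts with one gap and one suspicious entry.
amounts = [120, 135, None, 128, 131, 9800, 126]
profile = profile_column(amounts)
```

For the sample column, the profile reports one gap and flags 9800 as a potential outlier, which is the kind of finding a user would then review in the cleansing step.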

Orpheus Data Cleansing offers the following functions for nominal and ordinal data (e.g. strings, texts):

  • Break down strings and words into tokens or word parts based on rules
  • Find-and-replace rules and groups of rules
  • Find-and-delete rules and groups of rules
  • Identify and delete attributes with strong correlations
  • Visualize your data including before/after results
  • Create different scenarios for rules before they change your data
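
The tokenizing and find-and-replace functions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule group `LEGAL_FORM_RULES`, the functions `tokenize`, `apply_rules` and `preview`, and the sample company names are hypothetical and do not reflect how Orpheus represents rules internally.

```python
import re

# Hypothetical rule group: (pattern, replacement) pairs applied in order.
LEGAL_FORM_RULES = [
    (r"\bincorporated\b", "inc"),
    (r"\bcorporation\b", "corp"),
    (r"\blimited\b", "ltd"),
]

def tokenize(text):
    """Split a string into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def apply_rules(text, rules):
    """Normalize a string by tokenizing it, then applying each rule in order."""
    result = " ".join(tokenize(text))
    for pattern, replacement in rules:
        result = re.sub(pattern, replacement, result)
    return result

def preview(values, rules):
    """Return (before, after) pairs without modifying the input values."""
    return [(value, apply_rules(value, rules)) for value in values]

# Hypothetical spelling variants of one supplier name.
names = ["ACME Incorporated", "Acme, Inc.", "acme inc"]
pairs = preview(names, LEGAL_FORM_RULES)
harmonized = {after for _, after in pairs}
```

Because `preview` returns before/after pairs and leaves `names` untouched, a rule group can be inspected before it is applied, in the spirit of the scenario function listed above.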

The following functions are available for numbers:

  • Change value (individual value)
  • Adapt intervals and (if desired) delete affected data records
  • Identify, delete or change outliers
  • Identify and fill data gaps
  • Identify and delete measures or attributes with strong correlations
  • Visualize your data including before/after results
  • Create different scenarios for rules before they change your data
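
For numbers, adapting an interval and filling the resulting gaps might look like the following Python sketch. The function names `mask_outside_interval` and `fill_gaps`, the choice of linear interpolation, and the sample series are assumptions for illustration; the source does not specify which interpolation method Orpheus uses.

```python
def mask_outside_interval(values, low, high):
    """Replace values outside [low, high] with None so they become gaps."""
    return [v if v is not None and low <= v <= high else None for v in values]

def fill_gaps(values):
    """Fill interior None entries by linear interpolation between known neighbors.

    Leading and trailing gaps have only one neighbor and are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

# Hypothetical monthly figures with one outlier and one missing value.
series = [100, 110, 9800, 130, None, 150]
masked = mask_outside_interval(series, 0, 1000)
cleaned = fill_gaps(masked)
```

Here the outlier is first turned into a gap and then interpolated together with the original missing value, so `series` itself stays unchanged and the before/after results can be compared.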

The following delete operations are supported:

  • Delete columns/attributes or mark as deleted
  • Delete individual data records or mark as deleted
  • Delete clusters or mark as deleted
  • Simulate rules before they change the data
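
The difference between deleting and marking as deleted, and the idea of simulating a rule first, can be illustrated with a short Python sketch. The `deleted` flag, the functions `mark_deleted` and `simulate`, and the sample records are hypothetical and only show one possible way to implement such behavior.

```python
import copy

def mark_deleted(records, predicate):
    """Flag matching records as deleted instead of removing them."""
    for record in records:
        if predicate(record):
            record["deleted"] = True
    return records

def simulate(records, rule):
    """Run a rule on a deep copy and report how many records it would change."""
    trial = rule(copy.deepcopy(records))
    changed = sum(1 for before, after in zip(records, trial) if before != after)
    return trial, changed

# Hypothetical records; the rule marks negative amounts as deleted.
records = [
    {"id": 1, "amount": 120},
    {"id": 2, "amount": -5},
    {"id": 3, "amount": 131},
]

def negative_amount_rule(rows):
    return mark_deleted(rows, lambda row: row["amount"] < 0)

trial, changed = simulate(records, negative_amount_rule)
```

The simulation works on a copy, so the original records remain untouched until the user decides to apply the rule for real.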