Course Description

In today’s data-driven business environment, bad data can lead to bad decisions. Automating data cleansing is therefore an essential step toward providing clean, trusted data for timely business decisions.

This course helps individuals recognize, identify, and automate the cleansing of data quality problems.

Course Outline

The Cost of Bad Data

The course starts with specific horror stories and continues with common statistics on the impacts of bad data. The group discusses their own bad data stories. This sets the stage for the importance of data quality.

Data Quality Taxonomy

Learners are introduced to a taxonomy for understanding the types of data quality problems. Hands-on interaction with data sets will help solidify the connection between these terms and common data quality problems.

Data Quality Metrics

Methods to measure data quality will be reviewed.

Open Source and Commercial Tools

Review available open source options. Examine analyst reports on commercial data quality tools, and understand what is being offered by some of the leading commercial vendors. A group discussion of learners' experiences with these tools follows.

Data Profiling

Discussing profiling methods and reviewing a profiling report will emphasize what can be discovered by data profiling. Using a profiling tool in a hands-on lab, learners will profile data and identify potential problems.

Data Cleansing

From basic standardization to data enhancement, we will first discuss methods for data cleansing and then implement some of these methods in a hands-on lab.

Monitoring Data Quality

Keeping data clean is generally a continual process. After discussing methods for tracking automated data cleansing routines, learners will implement a report for monitoring data quality in a hands-on lab.

Learner Outcomes

At the end of this program, learners will be able to: 

  • Adapt these approaches to recognize and automate remediation of common data quality problems.
  • Maximize the effectiveness of their data governance by helping select and utilize tools for data quality.


Classroom: Instructor Led
9:00AM to 4:00PM
Oct 08, 2019
Course Fee(s)
DATA77 non-credit $695.00
Section Notes

The enrollment deadline is Tuesday, October 1st at 5 PM. After this date, please call 314-935-4444 to register.

Parking, lunch and refreshments are provided.


A full refund will be given when a registrant cancels more than five business days prior to the start of the class. Cancellations received within five business days of the start of the class and no-shows will be billed in full. Another person may be substituted at any time at no additional charge.
