Course Description

Kimball indicated in 2009 that, “In today’s environment, most organizations should use a vendor-supplied ETL tool as a general rule.” Most organizations now recognize that the ROI of commercial ETL tools far surpasses that of hand coding.

This course applies Kimball’s reasoning to big data, allowing one to see why commercial Big Data Fabrics surpass hand coding for big data jobs.

Course Outline

The Cost of Hand Coding

Our discussion starts with common statistics on the negative impacts of hand coding. This sets the stage for the importance of commercial tools that support continuous integration and continuous development in a big data environment.

Big Data Fabric Taxonomy

Learners are introduced to the common components and architecture of a Big Data Fabric. Hands-on interaction with data sets will help solidify the connection between these terms and solving big data problems.

Open Source and Commercial Tools

Review available open source options. Examine analyst reports on commercial Big Data Fabric tools, and understand what some of the leading commercial vendors offer. The session closes with a group discussion of learners' experiences with these tools.

Big Data Integration

Review of big data methods for dealing with batch and streaming data from course Data 107. Examine how these methods are implemented in hands-on exercises using Talend.

Big Data Quality

Review of methods for dealing with data quality problems from course Data 77. Examine how these methods are implemented in a big data environment with hands-on exercises using Talend.

Machine Learning with Big Data

Review of data science modeling methods, including some from courses Data 99 and Data 98. Hands-on exercises will work with the models that are available as components in Talend.

Learner Outcomes

At the end of this program, learners will be able to:

  • Adapt these approaches to recognize and deal effectively with common big data integration and big data quality problems.
  • Implement existing machine learning components in a big data environment.
  • Maximize the effectiveness of their big data platform by helping select and utilize a commercial big data fabric.

Recommendations

While not required, those who have taken DATA77 and DATA99 will gain more insights from this course.

Enroll Now - Select a section to enroll in

Type: Classroom: Instructor Led
Days: T, W
Time: 5:30PM to 9:00PM
Dates: Oct 08, 2019 to Oct 09, 2019
Contact Hours: 6.0
Delivery Options: Classroom
Course Fee(s): Tuition non-credit $695.00
Section Notes

The enrollment deadline is Tuesday, October 1, 2019 at 5 PM. After this date, please call 314-935-4444 to register.

Parking, lunch and refreshments are provided.

CANCELLATION POLICY

A full refund will be given when a registrant cancels more than five business days prior to the start of the class. Cancellations received within five business days of the start of the class and no-shows will be billed in full. Another person may be substituted at any time at no additional charge.
