FREE DM Review Site Registration!
Sign-up today and access DM Review on the Web!

Your FREE registration entitles you to:

FREE email newsletters

FREE access to all DM Review content

FREE access to web seminars, resource portals, our white paper library and more!

   

Could you please let me know the advantages of ETL tools versus traditional programming?

Q:

Question: We are planning to build a new warehouse for our reporting purposes. We have two options to load the data into the warehouse: using traditional mainframe COBOL or SQL programs for ETL or using ETL tools to do the ETL job. Even though we believe that going through ETL tools is the right approach, we are hard pressed by some technology groups asking us for the advantages using ETL tools as they are quite costly. Could you please let me know the advantages of ETL tools versus traditional programming?

 

A:

Sid Adelman's Answer: The question of whether to use an ETL tool or use standard programming should be based on the following: Are the source databases an ERP? Databases in the ERPs are usually arcane and complex and the ETL tools usually already have mappings to these products and so the difficult mappings to the source have already been done for you. Are you expecting to significant changes to the source data? If so, an ETL tool makes it much easier to maintain the ETL process. Will you be integrating data from multiple sources? If so, an ETL tool makes it much easier. What's your schedule? An ETL could (should) shorten the time to implement. How much transformation is required? The ETL tool will significantly help in the transformation effort. If you can't say yes to any of these questions, you are probably better off using standard programming.


Chuck Kelley's Answer:
Using the same logic that the technology groups are asking, you can say, "Why do we use a database? I can write all that in assembly language code, and it will be cheaper than buying a database and a third generation language." I am not sure that it holds much bite if this is their only argument. However, here are a couple of other points: Meta data - Most ETL tools automate the meta data acquisition process. You have to do everything manually with COBOL and SQL and it is hard enough to get programmers to document that! Ease of maintenance - Someone can look at the ETL code and maintain it without as long of a learning curve than reading COBOL and SQL (especially the SQL that I write!).

Les Barbusinski's Answer: It's often difficult to sell the need for an ETL tool at the beginning of a data warehouse project. All management sees is the steep purchase price. Savings are tough to quantify, and they don't usually accrue until the second or third release of the data warehouse. Nevertheless, the benefits are real. Here's a partial list:

Productivity

  • Coding ETL scripts using a metadata-driven graphical tool with built-in data cleansing and transformation functions is generally faster than hand coding.
  • Mappings, extract rules, cleansing rules, transformation rules, aggregation logic and loading rules are generally handled as separate objects in an ETL tool. This means that you can change one object in an ETL "string" without affecting the other objects. For example, you can change the loading logic for a particular target table (say, from direct insert to generating a flat loader file) without affecting the cleansing and/or transformation logic for that table. This compartmentalization eases maintenance, and reduces the need for retesting.
  • Objects in an ETL tool (e.g., transformation rules) can be reused.
  • ETL tools facilitate impact analysis when modifying or enhancing a data warehouse.
  • Many ETL tools provide "adapters" for extracting data from packaged ERP applications (such as SAP or PeopleSoft), as well as software packages for "verticals" (e.g, banking, manufacturing, distribution, etc.).
  • Some ETL tools have meta data repositories which can integrate with the metadata in your BI tool.

Methodology

  • ETL tools impose a certain level of structure, rigor, and consistency in your development approach. Unless you have strictly enforced standards and/or well-designed script templates and sub-routine library in your shop, hand-coding your ETL scripts can result in a lot of hard-to-maintain "spaghetti" code.

Documentation

  • The meta data trapped by an ETL tool graphically documents source and target database structures, mappings (a.k.a. "data genealogy"), cleansing rules (a.k.a. "business rules") and transformation rules. Such documentation is invaluable during impact analysis, or when bringing new team members up-to-speed on an on-going DW project.
Of course, you can achieve much the same results by hand coding your ETL scripts in PL/SQL, C++, DTS, etc., but your results may not be consistent (i.e. you're team is only as good as your weakest developer). Hope this helps.
   

Sid Adelman is a principal in Sid Adelman & Associates, an organization specializing in planning and implementing data warehouses, in data warehouse and BI assessments, and in establishing effective data architectures and strategies. He is a regular speaker at DW conferences. Adelman chairs the "Ask the Experts" column on www.dmreview.com. He is a frequent contributor to journals that focus on data warehousing. He co-authored Data Warehouse Project Management and is the principal author on Impossible Data Warehouse Situations with Solutions from the Experts and Data Strategy. He can be reached at (818) 783-9634 or visit his Web site at www.sidadelman.com.

Chuck Kelley is an internationally known expert in database and data warehousing technology. He has 30 years of experience in designing and implementing operational/production systems and data warehouses. Kelley has worked in some facet of the design and implementation phase of more than 50 data warehouses and data marts. He also teaches seminars, co-authored four books on data warehousing and has been published in many trade magazines on database technology, data warehousing and enterprise data strategies. He can be contacted at chuckkelley@usa.net.

Les Barbusinski is vice president of technology and co-founder of Digital Symmetry, LLC, a consulting firm that specializes in the design and development of data warehousing and business intelligence solutions. He has more than 20 years of experience in data warehouse and operational systems development and provides hands-on expertise in data warehouse design, development and project management. Les can be reached at dwexpert@dsym.com.

For more information on related topics, visit the following channels:



Industry Vendors