Please give some suggestions to prepare an estimation effort for our ETL process.

  • DM Review Online, October 1, 2007

Question: I need to prepare an effort estimate for our ETL process. Is it right to calculate the effort based only on the number of sources, targets and transformations? If not, what other criteria should we follow when preparing the estimate? Also, please let me know what percentage to allow for complexity level and data quality issues. It would be great if anyone could post a sample ETL effort estimation template for reference.

Sid Adelman's Answer: In your question you've identified some very important determinants of ETL effort. Here are a few more thoughts you need to consider:

  1. How well documented are the source files?
  2. How familiar are the ETL developers with the source data?
  3. How clean does the data need to be? We often find that not all data has the same data quality requirements.
  4. How familiar are the ETL developers with the ETL tool?
  5. Will the ETL developers be assigned full time to the project?
  6. How well is the project being managed?
  7. How much data has to go through the ETL process? Very large amounts of data result in challenges (read: problems) that take time and effort to correct.
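
The question asks for a sample template. None of the answers supply one, so the following Python sketch is only an illustration of how a count-based starting figure might be adjusted for factors like those listed above. Every number in it (the days per source, target and transformation, and each multiplier) is a hypothetical placeholder, not a figure from the respondents; calibrate them against your own completed projects.

```python
from dataclasses import dataclass, field


@dataclass
class EtlEstimate:
    """Count-based effort estimate adjusted by risk multipliers.

    All default values are hypothetical placeholders for illustration.
    """

    sources: int
    targets: int
    transformations: int
    # Hypothetical base effort, in person-days per item.
    days_per_source: float = 2.0
    days_per_target: float = 1.5
    days_per_transformation: float = 1.0
    # Multipliers keyed by factor name; 1.0 means "no adjustment".
    factors: dict[str, float] = field(default_factory=dict)

    def base_days(self) -> float:
        """Effort from the counts alone, before any adjustment."""
        return (
            self.sources * self.days_per_source
            + self.targets * self.days_per_target
            + self.transformations * self.days_per_transformation
        )

    def adjusted_days(self) -> float:
        """Base effort scaled by every recorded factor."""
        total = self.base_days()
        for multiplier in self.factors.values():
            total *= multiplier
        return total


# Example with invented counts and invented multipliers.
estimate = EtlEstimate(
    sources=4,
    targets=2,
    transformations=10,
    factors={
        "poorly_documented_sources": 1.25,
        "developers_new_to_etl_tool": 1.5,
    },
)
print(f"Base: {estimate.base_days()} days")
print(f"Adjusted: {estimate.adjusted_days()} days")
```

Keeping each factor as a named multiplier makes the assumptions visible to whoever reviews the estimate, so they can be challenged one at a time.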

Clay Rehm's Answer: I don't know where the technique of estimating from the number of sources, targets and transformations came from. It simply misses so many other factors, such as:

  • What is the skill level of the programming resources?
  • How available are the resources (how many other projects are they working on, and how much time off will they take for sickness and vacations)?
  • Who is doing the testing?
  • Who is writing the test cases?
  • How well were the test cases written?
  • How well were the requirements written?
  • Is the scope still changing?
  • What is the level of data quality?
  • What is the level of understanding of the data?

My suggestion is to perform some research before providing estimates. This can be done by:

  • Reviewing how the data will be used,
  • Reviewing the data in the data sources by writing queries and manually looking at the data, and
  • Performing some simple pseudocoding of the solution first.
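
Reviewing source data by writing queries can start very small. The sketch below assumes a SQLite source and uses a hypothetical customers table invented for the example; the helper name profile_column is also an assumption. It counts rows, nulls and distinct values for one column, which is often enough to surface quality problems before an estimate is given.

```python
import sqlite3


def profile_column(conn: sqlite3.Connection, table: str, column: str) -> dict:
    """Return row count, null count and distinct count for one column.

    Table and column names cannot be bound as SQL parameters, so they are
    quoted and interpolated; pass only trusted identifiers.
    """
    query = (
        f'SELECT COUNT(*), COUNT("{column}"), COUNT(DISTINCT "{column}") '
        f'FROM "{table}"'
    )
    total, non_null, distinct = conn.execute(query).fetchone()
    return {"rows": total, "nulls": total - non_null, "distinct": distinct}


# Hypothetical sample data standing in for a real source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "US"), (2, "us"), (3, None), (4, "DE")],
)

profile = profile_column(conn, "customers", "country")
print(profile)
```

In this invented data the country column has one missing value and two spellings of the same country ("US" and "us"), the kind of finding that changes how much cleansing effort the estimate needs to cover.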
