Please give some suggestions to prepare an estimation effort for our ETL process.
Question: I need to prepare an effort estimate for our ETL process. Is it right to calculate the effort based only on the number of sources, targets and transformations? If not, what other criteria should we consider when preparing the estimate? Also, please let me know how much weight (as a percentage) complexity level and data quality issues should be given. It would be great if anyone could post a sample ETL effort estimation template for reference.
Sid Adelman's Answer: In your question you've identified some very important determinants of ETL effort. Here are a few more thoughts you need to consider:
- How well documented are the source files?
- How familiar are the ETL developers with the source data?
- How clean does the data need to be? We often find that not all data has the same data quality requirements.
- How familiar are the ETL developers with the ETL tool?
- Will the ETL developers be assigned full time to the project?
- How well is the project being managed?
- How much data has to go through the ETL process? Very large amounts of data result in challenges (read problems) that take time and effort to correct.
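One way to fold factors like these into a count-based estimate is to start from sources, targets and transformations and then scale the result. The sketch below is only an illustration under stated assumptions: the `Mapping` class, the per-item day rates and every multiplier in `ADJUSTMENTS` are hypothetical placeholders, not figures from either answer, and would need calibrating against your own past projects.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    """One ETL mapping, described only by the counts the question mentions."""
    name: str
    sources: int
    targets: int
    transformations: int


# Hypothetical multipliers (assumptions, not published figures).
ADJUSTMENTS = {
    "poor_source_documentation": 1.25,
    "developers_new_to_source_data": 1.20,
    "developers_new_to_etl_tool": 1.30,
    "strict_data_quality": 1.25,
    "very_large_volume": 1.15,
}


def base_days(m, per_source=1.0, per_target=1.0, per_transformation=0.5):
    """Count-based effort in person-days; the day rates are placeholders."""
    return (m.sources * per_source
            + m.targets * per_target
            + m.transformations * per_transformation)


def estimate(mappings, active_factors, allocation=1.0):
    """Return (effort_days, calendar_days).

    Each active factor scales the count-based effort. `allocation` is the
    fraction of their time developers give to the project, so part-time
    staffing stretches calendar time without changing effort.
    """
    effort = sum(base_days(m) for m in mappings)
    for factor in active_factors:
        effort *= ADJUSTMENTS[factor]
    return effort, effort / allocation


if __name__ == "__main__":
    mappings = [Mapping("customer_dim", sources=2, targets=1, transformations=4)]
    effort, calendar = estimate(
        mappings, ["developers_new_to_etl_tool"], allocation=0.5
    )
    print(f"effort: {effort:.1f} person-days, calendar: {calendar:.1f} days")
```

Multiplying the factors is a design choice made for simplicity; an additive percentage model would work just as well, and the questions about project management and documentation quality are hard to reduce to a single number at all.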
Clay Rehm's Answer: I don't know where the technique of estimating from the number of sources, targets and transformations came from. It simply misses many other factors, such as:
- What is the skill level of the programming resources?
- Availability of resources (how many other projects are they working on, other time off for sickness and vacations)?
- Who is doing the testing?
- Who is writing the test cases?
- How well were the test cases written?
- How well were the requirements written?
- Is the scope still changing?
- What is the level of data quality?
- What is the level of understanding of the data?
My suggestion is to perform some research before providing estimates. This can be done by:
- Reviewing how the data will be used,
- Reviewing the data in the data sources by writing queries and manually inspecting the results, and
- Pseudo-coding a simple version of the solution first.
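The query-based review of source data could start with a few profiling queries. The sketch below assumes a source reachable through Python's built-in `sqlite3` module; the table `customer_src` and its columns are hypothetical, and the same SQL pattern would need adapting to whatever database driver your sources actually use.

```python
import sqlite3


def profile_column(conn, table, column):
    """Return row count, NULL count and distinct non-NULL count for a column."""
    # Identifiers cannot be bound as query parameters, so only pass
    # trusted table and column names here.
    sql = (
        f'SELECT COUNT(*), '
        f'SUM(CASE WHEN "{column}" IS NULL THEN 1 ELSE 0 END), '
        f'COUNT(DISTINCT "{column}") '
        f'FROM "{table}"'
    )
    total, nulls, distinct = conn.execute(sql).fetchone()
    # SUM returns NULL on an empty table, so fall back to 0.
    return {"rows": total, "nulls": nulls or 0, "distinct": distinct}


def sample_rows(conn, table, limit=5):
    """Fetch a few rows for manual inspection."""
    return conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,)).fetchall()


if __name__ == "__main__":
    # Hypothetical in-memory source table used only for demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_src (id INTEGER, country TEXT)")
    conn.executemany(
        "INSERT INTO customer_src VALUES (?, ?)",
        [(1, "US"), (2, None), (3, "US"), (4, "DE")],
    )
    print(profile_column(conn, "customer_src", "country"))
    print(sample_rows(conn, "customer_src", limit=2))
```

High NULL counts or unexpectedly low distinct counts found this way are early signs that the data quality portion of the estimate needs more room.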