DW Project Management: Using Agile Software Development Frameworks for Data Mart Development
As a practitioner of the bottom-up development methodology, advocated by Ralph Kimball and my colleague Pieter Mimno, I am involved daily in the creation of enterprise data warehouse applications using an iterative bottom-up approach. However, as a "software development geek" with a long history managing many forms of application development, I believe it is time to apply the already proven agile development frameworks used by other forms of software development to our data mart development toolkits.
Iterative, bottom-up development should not be just a traditional waterfall process model done to a fixed time-box. Instead, a new approach is needed to achieve ROI on data warehouse projects more quickly, thereby keeping our customers (business sponsors) more actively engaged in our process. In addition, our development process needs to become measurable, reliable and consistent to continue achieving that ROI beyond just the first prototype or mart.
Agile Software Development Frameworks
Agile development is a software development approach that "cycles" through the development phases, from gathering requirements to delivering functionality into a working release. Two widely used agile development frameworks are the Microsoft Solutions Framework (MSF) and Extreme Programming (XP). Both are similar in their goals and implementation, with wide application to data warehouse projects.
To get us started, I have culled the following common principles from MSF and XP, which we can directly apply to our data warehouse projects:
- Software is intellectual property applied to meet a business need, and the more interactivity between people, the better the intellect. Using a shared vision and small teams, agile development strives to involve as many people as possible in the quest to best solve the business requirement.
- Make frequent releases. Agile development strives to deliver small units of functionality that make good business sense. Frequent releases improve your relationship with your customer, especially if they occur on a regular basis.
- Relentlessly manage scope. Meeting a fixed release schedule will not happen unless you actively manage the resource triangle. The resource triangle 1 is the three-way combination of requirements, time and resources. Any change to one leg of the triangle (misunderstood requirement, less time or fewer people) requires a corresponding change to at least one other leg. Agile development teams triage daily, often with customer involvement, making daily decisions about scope items.
- Create a multi-release framework. Agile development stresses that there must be a master plan and a supporting architecture. Use releases to add more customer functionality, not to constantly rework what was done in the past.
Can We Apply This to Data Mart Development?
Sure. The ultimate goal of any bottom-up development project should be to roll out new data mart functionality on a regular and rapid basis with a high degree of conformance to what was already there.
By adopting specific practices from MSF and XP, we can easily facilitate the bottom-up, frequent release approach and, even more importantly, change our project team culture and associated behaviors to create better, more customer-focused applications than with the traditional waterfall approach.
Over many years of practice and corresponding mistakes, I have culled ideas from these frameworks into a process I use on my own BI development teams. As we go through this process, I'll point out major MSF and XP constructs and supply end notes where you will find a full description of each. As a frame of reference, this example assumes we have completed our JAD sessions to capture end-user requirements and created an initial data model supporting those requirements. You will find JAD, with its focus on business requirements and high-level design, very compatible with this iterative process.
Step 1 - Create and Prioritize the Stories
Taken from XP, stories 2 describe a user requirement just enough so that a rough scope of effort can be provided. Don't confuse a story with a requirements document; a story is a succinct problem statement typically no more than three sentences long. For each story, meet as a team (include all disciplines) and provide an initial assessment of how the story could be developed along with a rough scope of effort.
With an initial scope of effort established, you can now work with your customers and stakeholders to create a multi-release plan.3 This document, advocated by MSF, enhances and externally communicates your project team's decisions about what must be done now versus what can be deferred until later. It also provides a vision to the team beyond just the current release. Be sure to include a quantifiable business result in your multi-release plan. An example would be, "Improve call center productivity by 20 percent in 12 months." Use this quantifiable result to focus your multi-release plan, and the subsequent changes to it, as the primary business objective.
MSF also advocates using scouts.4 Scouts work ahead looking at the state of anticipated dependencies, working in advance with customers or laying out methods or procedures for the next release. They often work to validate and update the multi-release plan.
Step 2 - Create the Release Plan
Release plans document the goals for the next customer deliverable and provide a baseline for any future tradeoffs required to balance the resource triangle. The plan should also include the number of cycles you will execute for the release. Cycles are individual iterations 5 designed to either produce a set of milestones or stabilize the release. You will want to define two types of cycles: development cycles and stabilization cycles. Development cycles are longer duration, 10-15 days maximum, designed for development of major stories. Stabilization cycles are much shorter duration cycles, 5-10 days maximum, designed for bug fixes and fine-tuning from customer feedback. Based on your release timeline, you should be able to define a fixed number of each cycle type to achieve your release date.
For example, say you want to release a specific submart in 60 days, using a 10-day development cycle and a 5-day stabilization cycle. You could plan four development cycles and four stabilization cycles for a total of 60 days. To help emphasize the goal of each cycle to your project team, give each cycle a progressive name (Dev 1, Dev 2, Stab 1, Stab 2).
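The arithmetic in this example can be checked with a small script. The following is only a minimal sketch: the `Cycle` class and `plan_cycles` function are hypothetical names, and listing all development cycles ahead of the stabilization cycles is an assumption made for illustration, not a rule stated in MSF or XP.

```python
from dataclasses import dataclass


@dataclass
class Cycle:
    name: str  # progressive name such as "Dev 1" or "Stab 2"
    kind: str  # "development" or "stabilization"
    days: int  # days allotted to the cycle


def plan_cycles(dev_count, dev_days, stab_count, stab_days):
    """Return the cycles for one release (hypothetical helper).

    Assumption: development cycles are listed first, then
    stabilization cycles. Reorder to suit your own release plan.
    """
    cycles = [Cycle(f"Dev {i}", "development", dev_days)
              for i in range(1, dev_count + 1)]
    cycles += [Cycle(f"Stab {i}", "stabilization", stab_days)
               for i in range(1, stab_count + 1)]
    return cycles


# The 60-day submart example: four 10-day development cycles
# and four 5-day stabilization cycles.
plan = plan_cycles(dev_count=4, dev_days=10, stab_count=4, stab_days=5)
total_days = sum(c.days for c in plan)

print([c.name for c in plan])
print(total_days)  # 4 * 10 + 4 * 5 = 60
```

Changing the cycle lengths or counts shows immediately whether a proposed plan still fits the release date.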
Now assign to each development cycle the stories you have selected for development in this release. Use the rough scope to balance the number of stories that fit in each cycle, based on your resource triangle. XP calls this establishing your base project velocity 6 and advocates writing each story on one of three different sizes of paper based on its rough scope. Then all the pieces of paper are fitted on a bulletin board or to a taped-off area on a white board, reflecting the total development scope for that specific cycle.
Knowing the stories assigned to each development cycle also gives you clear deadlines for when the supporting specification documents are due. We set each specification's deadline one cycle before the cycle its story is assigned to, which allows for a final specification review by the project team.
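The bulletin board exercise can be mimicked in a few lines of code. This is a rough sketch under stated assumptions: the point weights for the three paper sizes, the per-cycle capacity, the story titles and the first-fit rule are all invented for illustration and are not prescribed by XP.

```python
# Assumed weights for the three paper sizes (not from XP itself).
SIZE_POINTS = {"small": 1, "medium": 2, "large": 3}


def assign_stories(stories, cycle_names, capacity):
    """Place prioritized stories into the first cycle with room.

    stories: list of (title, size) tuples in priority order.
    capacity: assumed number of points one cycle can absorb.
    Returns (assignments, deferred); deferred stories do not fit
    in this release and move to the multi-release plan.
    """
    assignments = {name: [] for name in cycle_names}
    remaining = {name: capacity for name in cycle_names}
    deferred = []
    for title, size in stories:
        points = SIZE_POINTS[size]
        for name in cycle_names:
            if remaining[name] >= points:
                assignments[name].append(title)
                remaining[name] -= points
                break
        else:
            # No cycle had room, so the story is deferred.
            deferred.append(title)
    return assignments, deferred


# Hypothetical call center stories, highest priority first.
stories = [
    ("Call volume by agent", "large"),
    ("Average handle time trend", "medium"),
    ("Customer dimension load", "large"),
    ("Daily call summary report", "small"),
    ("Abandoned call analysis", "medium"),
]
assignments, deferred = assign_stories(stories, ["Dev 1", "Dev 2"], capacity=5)
print(assignments)
print(deferred)
```

With these made-up numbers, the first two stories fill Dev 1, the next two land in Dev 2, and the last story is deferred, which is exactly the kind of tradeoff the team should discuss with the customer.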
Finally, plan the specific cycle when you will freeze your critical design elements, especially your data model. Ideally, you are looking to freeze all critical design elements before the last development cycle and ruthlessly challenge any changes to those design elements beyond that point.
Step 3 - Execute the Cycles
Each cycle results in a test release, and it is very much worth the effort of moving the developed results of each cycle into a separate test environment even for the earliest development cycles.
In addition to the obvious benefit for QA and early demonstration to your customer, remember that any software solution is a coordinated delivery of the elements needed to solve a business problem.7 Practicing this delivery, especially identifying all the elements, will pay big benefits later on. In our projects, we treat each cycle as a software product, complete with release notes, responsibility plan, object migration plan and installation instructions. We also periodically delete the test database and rebuild it using just our release documentation, as a final check in making sure we have identified and packaged all the database, ETL and BI objects we have developed.
As we execute each cycle, all acts of discovery, whether they are specification flaws, development issues or bugs, are entered into our issue tracking system. Here we strive to make them as understandable as possible by equating the issue back to the story or requirement. Since one goal of our issue tracking process is to promote a team understanding of the issue and elicit the "best ideas" for solving it, I can't overstate the importance of writing a good issue description.
During each cycle, we triage our open issues daily to manage our scope. Our triage meetings are a team event, preferably first thing in the morning. XP calls this the daily "stand-up" meeting8 and suggests that the team stand in a circle as an aid in keeping the meeting short and to the point. We strive to make this the only formal meeting of the day, and all team members are expected to attend. In addition, once the word gets out that this is the meeting where scope decisions are made, our customers and stakeholders often also attend.
We dedicate a program manager to the triage and release process. This is an MSF team model concept.9 The program manager runs the triage meeting and issue tracking system, and since everything we do during our cycles revolves around issues, the program manager effectively controls the project team. The program manager is also responsible for accumulating, approving and distributing the release documents.
Our QA works a cycle behind the development team, performing an initial acceptance test for that cycle and then executing the test plans required for the type of cycle delivered. For development cycles, they are getting a first look at the developed stories for the creation of their final test cases. During stabilization cycles, they are testing fixes, performing regression tests and testing the stories. Each cycle is graded for completeness, bug levels and test coverage by the QA team. By examining these grades during the stabilization phase, you can predict when your release will be stable enough to migrate forward.10
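One simple way to read those grades is to watch the open bug count across cycles. The sketch below is an assumption-laden illustration, not the MSF grading method: it looks only at bug levels (ignoring completeness and test coverage), and the `is_stabilizing` name, the three-cycle window and the threshold are all hypothetical choices.

```python
def is_stabilizing(open_bugs_per_cycle, threshold, window=3):
    """Return True when the release looks stable enough to move on.

    Assumed rule: over the last `window` cycles the open bug count
    never rose, and the latest count is at or below `threshold`.
    """
    if len(open_bugs_per_cycle) < window:
        return False  # not enough history to judge a trend
    recent = open_bugs_per_cycle[-window:]
    non_increasing = all(later <= earlier
                         for earlier, later in zip(recent, recent[1:]))
    return non_increasing and recent[-1] <= threshold


# Hypothetical open bug counts at the end of four cycles.
steady = is_stabilizing([14, 9, 5, 2], threshold=3)
bumpy = is_stabilizing([14, 9, 11, 3], threshold=3)
print(steady, bumpy)
```

The second series ends at an acceptable count but rose in the middle of the window, so this rule would hold the release for another stabilization cycle.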
I also advocate using a daily schedule, especially in the later cycle stages. Here we lay out the activities we want to iteratively perform for each cycle and assign the specific days of the week we want to perform them. For example, consider a weekly stabilization cycle where we release to the test system every Friday. In this case we might want all database object changes identified by Tuesday, all resolution designs finalized by Wednesday and all resolution development completed by Thursday. Using a daily schedule helps your team members adjust to an iterative process, and keeps that process iterative rather than incremental.
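The weekly stabilization example can be written down as data so that every cycle gets the same dated checkpoints. This is a minimal sketch; `WEEKLY_SCHEDULE` and `dated_schedule` are hypothetical names, and the start date is arbitrary.

```python
from datetime import date, timedelta

# Day offsets from Monday for the weekly stabilization cycle
# described in the text.
WEEKLY_SCHEDULE = [
    (1, "All database object changes identified"),  # Tuesday
    (2, "All resolution designs finalized"),        # Wednesday
    (3, "All resolution development completed"),    # Thursday
    (4, "Release to the test system"),              # Friday
]


def dated_schedule(cycle_monday):
    """Turn the weekly template into dated deadlines for one cycle."""
    if cycle_monday.weekday() != 0:
        raise ValueError("a weekly cycle is assumed to start on a Monday")
    return [(cycle_monday + timedelta(days=offset), activity)
            for offset, activity in WEEKLY_SCHEDULE]


# Arbitrary example week starting Monday, January 1, 2024.
schedule = dated_schedule(date(2024, 1, 1))
for due, activity in schedule:
    print(due.strftime("%a %Y-%m-%d"), activity)
```

Reusing one template for every cycle is what keeps the rhythm the same from week to week.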
You will likely have some management challenges as you execute your cycles. Here are some things to watch out for:
- Delaying a cycle due to some issue. Never delay a cycle because of some act of discovery. Push the issue into some future cycle or remove it from the release altogether. Our goal here is to create periodic points where we can assess our development state. If you delay, your state is unknown.
- Too much change. Watch your rate of introduction of change, especially in the development cycles (it's assumed in the stabilization cycles). Data model changes are a good example. Major schema rework just results in the same work being performed over and over again. Create "bliss" cycles where developers can focus just on moving forward, even if the data model has a few flaws. Remember the goal of any development task is to make the "unknown, known," and you need to move forward to discover what else you don't know, or it will surprise you later.11
Step 4 - Deliver the Release
Delivering your release to the next environment forward, probably user acceptance, should be the biggest non-event possible. You know your quality trend from the QA grading of each cycle, and you now have a very defined and practiced release process. Your project team should be entirely ready for Murphy's Law.
Don't forget to post-mortem your release. A no-blame post-mortem12 of your team's successes and failures is the way in which true learning occurs and project team behavior changes. Another thing to measure in your post-mortem is your final project velocity: how many stories of each type could you accomplish in each cycle? Trend this velocity for your next releases.
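Final project velocity is easy to tabulate from a list of completed stories. The sketch below assumes stories are typed by the same three sizes used when planning; the function names and the sample data are hypothetical.

```python
from collections import Counter


def velocity_by_cycle(completed):
    """Count completed stories of each size in each cycle.

    completed: list of (cycle_name, story_size) tuples.
    """
    per_cycle = {}
    for cycle_name, size in completed:
        per_cycle.setdefault(cycle_name, Counter())[size] += 1
    return per_cycle


def average_velocity(per_cycle):
    """Average number of stories of each size finished per cycle."""
    sizes = {size for counts in per_cycle.values() for size in counts}
    return {size: sum(counts[size] for counts in per_cycle.values()) / len(per_cycle)
            for size in sorted(sizes)}


# Hypothetical results from two development cycles.
completed = [
    ("Dev 1", "large"), ("Dev 1", "small"), ("Dev 1", "small"),
    ("Dev 2", "large"), ("Dev 2", "medium"), ("Dev 2", "small"),
]
per_cycle = velocity_by_cycle(completed)
averages = average_velocity(per_cycle)
print(averages)
```

Recording these averages after every release gives you the trend to plan the next one against.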
If you haven't done so already, definitely get your scouts out now. Have them look at the challenges coming up in the next release, work on refining the next round of stories, and work on refining the data model for the next release. Get a jump on the next release while the remainder of your project team supports the acceptance testing. Now is definitely the time to reduce resources anyway, as too many cooks...
I have used the above process for all types of development cycles: new from scratch, major revisions and just stabilization. The only things I change are the duration and number of each type of cycle.
This uniform process rapidly promotes project team maturity to the point where we rarely miss our release commitments. We can then focus much less on our project timelines and much more on putting our "best ideas" into our applications.
As the one typically responsible to the customer, I also appreciate the fact that our development state is always measurable and publicly known.
I hope you try some of these ideas or, even better, become a student of agile development frameworks and develop new techniques of your own.
As Jim McCarthy said in his classic book, Dynamics of Software Development, "More people have ascended bodily into heaven than have shipped great software on time." As data warehouse project managers, let's start applying proven iterative frameworks so that our application development processes become more measurable, reliable, creative and fun.
1. Microsoft MSF Team, MSF Process Model, v3.1 (June 2002), www.microsoft.com/msf, p. 12.
2. chromatic, Extreme Programming Pocket Guide, Sebastopol: O'Reilly Media, 2003, p. 39.
3. Microsoft MSF Team, MSF Process Model, v3.1 (June 2002), www.microsoft.com/msf, p. 19 or
McCarthy, Jim, Dynamics of Software Development, Redmond: Microsoft Press, 1995, p. 24.
4. McCarthy, Jim, Dynamics of Software Development, Redmond: Microsoft Press, 1995, p. 36.
5. chromatic, Extreme Programming Pocket Guide, Sebastopol: O'Reilly Media, 2003, p. 45.
6. Ibid. p. 47.
7. Microsoft MSF Team, MSF Process Model, v3.1 (June 2002), www.microsoft.com/msf, p. 9.
8. chromatic, Extreme Programming Pocket Guide, Sebastopol: O'Reilly Media, 2003, p. 50.
9. Microsoft MSF Team, MSF Team Model, v3.1 (June 2002), www.microsoft.com/msf, p. 17.
10. Microsoft MSF Team, MSF Process Model, v3.1 (June 2002), www.microsoft.com/msf, p. 36.
11. McCarthy, Jim, Dynamics of Software Development, Redmond: Microsoft Press, 1995, p. 99.
12. Microsoft MSF Team, MSF Process Model, v3.1 (June 2002), www.microsoft.com/msf, p. 45 or
McCarthy, Jim, Dynamics of Software Development, Redmond: Microsoft Press, 1995, p. 117.
Kimball, Ralph and Ross, Margy, The Data Warehouse Toolkit, Second Edition, New York: John Wiley & Sons, 2002.
McConnell, Steve, Rapid Development, Redmond: Microsoft Press, 1996.
Microsoft MSF Team, Microsoft Solutions Framework version 3.0 Overview, (June 2003), www.microsoft.com/msf.
Microsoft MSF Team, MSF Project Management Discipline, v1.1 (June 2002), www.microsoft.com/msf.
Microsoft MSF Team, MSF Readiness Management Discipline, v1.1 (June 2002), www.microsoft.com/msf.
Microsoft MSF Team, MSF Risk Management Discipline, v1.1 (June 2002), www.microsoft.com/msf.
Wells, Don, www.extremeprogramming.org, 2001.
David Chu is currently a consultant with Mimno, Myers and Holum, where he manages large-scale enterprise data warehouse initiatives. With more than 25 years of software development experience, Chu has been managing business application development using agile development frameworks since 1995. He can be reached at email@example.com.