Data Warehousing: What’s Next? Part 9: Know Your Consumers
The last installment defined the new capabilities of packaged analytics. A future installment will cover the implications of exploiting packaged analytics in support of data warehousing. I need a little more time for research to get this topic right.
For this installment, we will advance to a topic essential to our next major category: Redirecting your data warehouse program to higher value opportunities. In Part 3, the terms consumer role and information supply chain were introduced. Here we explore these concepts in depth to help you get to know your consumers.
Why is Data Warehousing so Hard?
Success in data warehousing is both elusive and fleeting. A data warehouse project can run afoul of all the normal information technology gremlins: ambiguous objectives, poor design, faulty execution, bad management or simply having the priorities change. Many specific factors have been advanced for failure of data warehouse projects. From my perspective, they can all be lumped into the following four categories:
Too Technical - The data warehouse team focuses almost exclusively on building a technological infrastructure to acquire and store data. Delivery concerns end with the selection and implementation of a generic query/reporting/OLAP tool. Usage scenarios are not analyzed in any depth so it is impossible to optimize for anticipated access needs. Little emphasis is placed on business rationale and return on investment. There should be no surprise when these projects fall off the radar screen.
Too Tactical - A data warehouse, by any measure, is an expensive undertaking. The objectives must be high impact and strategic to be justified. All too often a data warehouse is commissioned for such low value purposes as replacement of legacy reporting systems or simply as an adjunct to an ERP implementation. The worst goal is to provide a quicker, faster, better way to do what is already being done. Without a strategic goal, the data warehouse will be just money down the drain.
Too Broad - A strategic goal must be specific and measurable. Unfortunately there is a tendency to produce mission statements with expansive but nebulous rhetoric in lieu of tangible goals. These projects fail because nobody has a clear idea what results they are intended to produce. They are the first casualties in a downturn.
Too Narrow - By far the greatest tendency is to label any database-centric reporting project as a data warehouse to give it the luster of a fad. Many projects are little more than new technology departmental reporting solutions. These projects fail because their ambition is too limited and their scope is too small to provide the type of payback that data warehousing promises to deliver.
What is Unique about Projects that Succeed?
Alone, or in combination, these factors can account for virtually all the failures. But is there something more to be learned by looking at those projects that succeed? Is there something the successful do that gives them the edge?
When we researched this question, we found a common theme that should not be surprising in retrospect: The teams that succeed understand information consumption behavior. They know their consumers.
They understand there is a diversity of usage behaviors that must be supported by a diversity of preparation and delivery channels. They understand their job is to facilitate information flow rather than just designing data stores. They see a step-by-step process with data moving from original source to ultimate action. They define their job as supporting this information supply chain from end to end.
The first step in getting to know your consumers is to understand some basic facts about information consumer behavior.
1. Very few information consumers use raw data.
When you build a data warehouse, you acquire data from multiple sources, you validate it, you integrate it and you prepare it for use. A consumer may be able to access the data interactively and the data warehouse may provide a variety of channels to deliver the right content in the right form. However, no matter how well you do at these difficult tasks, no more than 20 percent of the real information consumers in the business are likely to use your data resources directly. In fact, the penetration rate may be as low as five percent.
Why? The easy answer is to blame the technology or people's unwillingness to accept the technology. If only the tools were easier! If only we didn't have so many computer-phobic users!
Yes, technology advances will reduce barriers to broader use. Yes, more computer-literate employees and better training will mean more people can use what you offer.
And yet, all your valiant efforts will net only marginal improvements in the rate of penetration if you design a data warehouse that only serves the nonexistent "end user." It is interesting how we use common phrases like "end user" without stopping to consider the full extent of their meaning.
Regardless of the significant investment in data acquisition and storage, most data warehouses are only purveyors of raw data. The simple truth is the vast majority of information consumers cannot use raw data. Their contribution is taking intermediate results prepared by someone else, adding value and sending it on along the information supply chain of the business. Most consumers are not end users; they are "middle users."
2. Most effective information is five to seven steps from the last IT-managed source.
Let's define effective information as "content which is the direct cause of action." Let's define the last IT-managed source as the "deliverable created by the official information technology group which is the original source for the underlying content in a consumer presentation document." Examples of IT-managed sources include a database table or a canned report. The consumer presentation document may be a hard-copy report, an electronic spreadsheet or a projected image of a slide presentation.
With these definitions in mind, you are ready to do a source and uses study. Walk around your enterprise and identify the presentation documents which contain the effective information used by a given decision-maker. Trace these documents back through all the intermediate destinations to the IT-managed source(s). If your enterprise is typical, you will find, as we did, that data was manually manipulated in five to seven separate steps on its way from the IT source to final deliverable.
The person who first gets the data may not use it. The person who uses the data may not act on it. The person who acts on it may not know the data. This is the essence of the information supply chain.
The Information Supply Chain - A Macro View
Issue: Data warehousing typically addresses only the first few steps in the full information supply chain of the average enterprise.
Figure 1: Original Supply Chain
Figure 1 shows one path in an information supply chain sampled from a real business case. This diagram shows what we call the macro view from "get" to "act."
The person who first "gets the data" manually transcribes it from the IT report into a spreadsheet. This spreadsheet is then hand-delivered to other recipients via "sneaker-net." A second person merges a subset of the original data with external industry data to produce a market share pro forma spreadsheet. She hands it on to a third person who adjusts the alignment between internal and provider product codes and regroups the data to represent a specific market view. A report containing this refined data is generated for wider circulation.
At this point, there are three documents in existence with successively more refined data: The IT report with raw data, the first spreadsheet and the market report. The data has been extracted, merged, reprocessed and repackaged by three different individuals. No action has been taken yet.
At the next step, the data is rekeyed again from the report to a desktop database. The intent is to merge it with forecast data already captured in the database (via another supply chain path). This allows yet another person to scan for selected products or time periods or other criteria to produce the view desired by a decision-maker. These figures may be a small table or even a single number on a slide in a presentation. These figures are the effective information at the end of the chain.
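The path just described can be written down as a simple chain of hand-offs, which makes the count of manual steps explicit. The sketch below is only an illustration: the `Step` structure and the step labels are names assumed for this example, paraphrased from the description above rather than taken from any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hand-off in the information supply chain (illustrative structure)."""
    action: str   # what is done to the data
    output: str   # the document or data store that results
    manual: bool  # True when the work is done by hand with desktop tools


# The Figure 1 path as described in the text; the labels are paraphrased.
FIGURE_1_PATH = [
    Step("transcribe the IT report into a spreadsheet", "first spreadsheet", True),
    Step("merge with external industry data", "market share pro forma", True),
    Step("align product codes and regroup by market view", "market report", True),
    Step("rekey the report and merge with forecast data", "desktop database", True),
    Step("select products and periods for the decision-maker", "presentation slide", True),
]


def manual_steps(path):
    """Count the manual steps between the IT-managed source and the effective information."""
    return sum(1 for step in path if step.manual)


def documents_produced(path):
    """List the intermediate and final deliverables in the order they are created."""
    return [step.output for step in path]


print(manual_steps(FIGURE_1_PATH))
```

Read this way, the sampled path has five manual steps, the low end of the five-to-seven range cited earlier. Marking a step as `manual=False` once the data warehouse takes it over gives a rough measure of how far support has been extended along the chain.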
Action: This extended chain of manual processing is the greatest unanswered challenge facing data warehousing today. Data warehouse teams should partner with the business units to design support as far up the chain as possible from get to act.
Figure 2: DW Supply Chain Extension
Figure 2 shows another view of the same supply chain path. We have grouped the tasks into two new data warehouse processes to eliminate manual processing. The external market data and the forecast are now refined data available in the data warehouse. The intermediate deliverables are now delivered from the data warehouse in a consistent manner. This is an example of extending data warehouse support up the supply chain much closer to where the action really takes place.
3. Both productivity and consistency are lost due to redundant acquisition, preparation and presentation.
A data warehouse is generally designed to serve the first one, or at most a few, steps in the information supply chain. We help the initial consumer get the raw data. We may provide support for early usage steps. Rarely does our reach extend to the middle, much less to the end, of the chain.
We build a rich and deep database. We buy, install and support very expensive analytic tools. It is understandable that we assume our direct customer is the ultimate consumer. The start of the chain may be where the power (user) is but it is not where the action is.
The cost of not serving the middle-to-end of the chain can be high. The desktop processing at the upper end of the chain is massively inefficient when expensive personnel are used as data entry clerks. Consistency is jeopardized by high error rates, the lack of checks and balances, and capriciously volatile business processes.
A data warehouse is supposed to help increase the consistency of delivered results. This potential cannot be realized when so much desktop post-processing goes on between the data warehouse and the creation of effective information.
The Information Supply Chain - A Micro View
Issue: An excessive amount of process redundancy exists within each step of the information supply chain.
At each step in the chain, someone takes data in, does something with it and puts data out. Each step in the supply chain may involve one or more of:
- Capture of data
- Clean up and transformation
- Integration with new sources
- Preparation of new data structures
- Generation of new presentation formats
- Interpretation of new results
- Delivery of the old and new data to others
These tasks look very much like data warehouse services. They are just being performed by hand with desktop tools.
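One way to make this redundancy visible is to tag each step of a supply chain path with the tasks it performs and count the repeats. The sketch below is a minimal illustration: the `Task` names mirror the list above, while the three sample steps and their task assignments are hypothetical.

```python
from collections import Counter
from enum import Enum


class Task(Enum):
    CAPTURE = "capture of data"
    CLEAN = "clean up and transformation"
    INTEGRATE = "integration with new sources"
    STRUCTURE = "preparation of new data structures"
    PRESENT = "generation of new presentation formats"
    INTERPRET = "interpretation of new results"
    DELIVER = "delivery of the old and new data to others"


# Hypothetical tagging of three desktop steps; real assignments would come
# from walking the chain and interviewing the people at each step.
steps = {
    "first spreadsheet": {Task.CAPTURE, Task.DELIVER},
    "market share pro forma": {Task.CAPTURE, Task.INTEGRATE, Task.STRUCTURE, Task.DELIVER},
    "market report": {Task.CAPTURE, Task.CLEAN, Task.STRUCTURE, Task.PRESENT, Task.DELIVER},
}


def repeated_tasks(tagged_steps):
    """Return the tasks performed in more than one step, with the number of steps for each."""
    counts = Counter(task for tasks in tagged_steps.values() for task in tasks)
    return {task: n for task, n in counts.items() if n > 1}


redundant = repeated_tasks(steps)
for task, n in sorted(redundant.items(), key=lambda item: -item[1]):
    print(f"{task.value}: performed in {n} steps")
```

Tasks that show up in every step, such as capture and delivery in this made-up tagging, are the natural first candidates for replacement by a data warehouse service.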
Action: The data warehouse team must analyze each step in the supply chain to determine the most efficient techniques for redesigning the process. Options include 1) capturing intermediate results produced by middle users, 2) replacing manual processing with data warehouse services, and 3) automating analysis.
4. There are more than two types of information consumer.
The traditional approach to categorizing consumers is by either business function or skill level. A business function category like sales or finance tells us very little about actual usage behavior. Within even a narrowly defined business function such as product marketing, there is a broad range of different consumer activity.
Grouping consumers by skill level is both common and essentially meaningless. What is the value of typing a consumer as a "general user" or a "power user?" It tells us nothing about what they do. It only tells us who has the biggest tool.
We need to understand the role of each information consumer in the supply chain if we are to align our service with the business optimally. What part do they play in the end-to-end flow of information? What is their specific contribution?
Our approach defines seven different types of direct consumer roles based on their usage behavior. Do they get data, use data or act on it? If they use the data, what is their specific value added? What do they do with it?
Our classification scheme for consumers starts with three major categories:
1. Indirect Consumers who do not interact with the data warehouse directly,
2. Direct Consumers who do interact in one of seven different ways, and
3. Value-Added Distributors who interact with both direct and indirect consumers to make their work more productive.
Both indirect consumers and value-added distributors may have more business impact than direct consumers.
Indirect consumers are the largest category in the information supply chain but they come in the fewest varieties. The common characteristic of indirect consumers is that they do not interact with your data warehouse services. What differentiates them is their degree of data awareness.
- User is our label for indirect consumers who actively man a way station in the information supply chain. We intentionally reserve the ubiquitous term "user" for the greatest number of information consumers who process and deliver data throughout the organization. It is quaintly paradoxical that this group does not interact with the data warehouse utility directly.
- Skeptic refers to a rather curious breed that exists in most every organization. They claim they "don't do data" even though they clearly receive or supply information to others. Their reluctance to acknowledge their involvement with data makes them the most difficult consumer group to support. Hopefully you don't have many skeptics.
The holy grail of the data warehouse process is to convert indirect consumers to direct consumers. This can only be accomplished if we extend data warehouse services up the supply chain.
Direct consumers are the active customers who interact with your data warehouse. They differ markedly in how they access and use the data resources. Many people perform a mix of the roles we introduce here. Generally, there is a dominant role which best depicts where they spend most of their time.
- Clerk - Generates results for others. A clerk executes reports, runs queries or performs a repetitive analysis on behalf of someone else. It is easy to assume that this is a narrow classification that includes mostly administrative assistants who fetch data for the boss. Yet, when you look closely at the behavior of your direct consumers, you will likely find that a large proportion of them do not make significant use of the data themselves. They package data for others up the supply chain. A good many people with titles that include analyst are predominately data clerks or at most trackers.
- Tracker - Scans for targets. A tracker uses data only to the extent of searching to see whether predefined targets have been met or exceeded. They look for values that are too low or too high. A tracker performs the "stoplight" function that many OLAP tools are designed to support. They are looking for things outside the norm but they do not establish the norm.
- Analyst - Seeks the cause. An analyst follows up where a tracker leaves off by searching for the factors which might explain why a value is too high or too low or has changed in an unexpected way. If sales are below expectation, where did we take the biggest hit? We might find it is in the Central region and then look further to see that we are okay on everything except new products which are drastically low. If we then correlate this with the fact that the Central region is the only area that has not had new product training, we have uncovered the potential cause. This is an example of the analyst role as we have narrowly defined it here.
- Forecaster - Projects the future. Trackers monitor trends. Analysts investigate changes to understand what caused them, if possible. A forecaster predicts the future. They propose what changes may be permanent and discount changes that may only be a momentary blip. They use many forms of external data to anticipate future supply and demand conditions. Essentially, they project trends and expectations into the future to provide a baseline for planners.
- Planner - Sets new targets. A planner sets new targets for the immediate future. They provide the boundary conditions for the trackers to monitor during the next period or so. They rely on a forecaster's projection, if one is available, or on someone's intuition. If your enterprise is lucky, planners may have a miner or a hunter to provide an alternate scenario that is more powerful than a projection or raw intuition.
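The difference between the tracker and analyst roles can be made concrete with a small sketch. It assumes hypothetical sales figures by region and product line, with targets of the kind a planner would set. The numbers, the five percent tolerance and the function names are all invented for illustration, shaped to echo the Central region example above.

```python
# Hypothetical (actual, target) sales by (region, product line); all figures invented.
SALES = {
    ("East", "existing"): (100, 100),
    ("East", "new"): (40, 42),
    ("Central", "existing"): (95, 96),
    ("Central", "new"): (10, 40),
    ("West", "existing"): (102, 100),
    ("West", "new"): (41, 40),
}


def stoplight(actual, target, tolerance=0.05):
    """Tracker view: compare a value with a predefined target without asking why."""
    if actual < target * (1 - tolerance):
        return "low"
    if actual > target * (1 + tolerance):
        return "high"
    return "on target"


def biggest_shortfall(sales):
    """Analyst view: find the cell that contributes most to the miss."""
    return max(sales, key=lambda key: sales[key][1] - sales[key][0])


total_actual = sum(actual for actual, _ in SALES.values())
total_target = sum(target for _, target in SALES.values())

overall = stoplight(total_actual, total_target)  # the tracker's finding
culprit = biggest_shortfall(SALES)               # where the analyst starts digging

print(overall, culprit)
```

The tracker's check needs only the target. The analyst's step needs the detail behind the total, and explaining the shortfall (the missing new product training in the example) still requires knowledge that lives outside this data.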
Miners and hunters work out of the box. They seek to change the status quo and set the enterprise on a new track. They are called upon when the organization is in a tight spot and needs to find a way out. They go into action when a business decides it is stale and needs a new direction to maintain growth or recover momentum.
- Miner - Searches for insights. A miner sifts through masses of data to identify patterns that have not been seen before and to provide insights that break out of the norm. A miner does not start with a preconceived notion but rather undertakes a journey of discovery that is data driven. One company mines trial market data to pick the single product and the best region in which to concentrate their last remaining funds in a desperate bid to save the business. Another firm that has been stagnant for several years explores their revenue and margin details in search of a growth opportunity. The surprising result is they should sell their generations-old retail channel since their best money comes from direct and online channels.
- Hunter - Validates a vision. A hunter starts with a vision, a preconceived notion. They use data as a tool to substantiate or to support their vision of where the enterprise should go. One hunter may conceive of a means to cut tens of millions of dollars out of logistics costs by renegotiating with freight carriers rather than cutting headcount or compensation. All he needs is shipment detail for a predictive model to support negotiation. Another hunter might see a future in which an 80-year-old manufacturing firm sells all their plants on the way to becoming a media giant. She needs to marshal all the profitability and market trend facts available to win over the old-line management.
Value-added distributors (VAD) may be direct consumers themselves but their greatest impact is their role in helping other consumers be more effective.
- Builder - Creates custom solutions for consumers. A builder helps other consumers be more productive by creating custom analytic applications. A builder is often a departmental resource who is not a member of the formal information technology organization.
- Provider - Develops queries and/or delivers data. A provider uses the tools offered by the data warehouse to develop queries for others to execute. In some cases, they may run the queries for the consumer and deliver semi-prepared data. What makes them different from a clerk is that a provider is first and foremost a tool jockey. They may also be a subject-matter expert in the business data. A clerk has neither significant tool nor data expertise. They just run prepared queries, potentially with moderate tweaks.
- Mentor - Helps consumers learn the tools. A mentor may have the skills of a builder or a provider but they use their expertise to help others do it themselves rather than do it for them. A mentor can help an indirect consumer become a direct consumer or they can help an existing direct consumer become more proficient.
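The classification scheme can be captured in a small lookup table, which is handy when tagging people or usage logs with a dominant role. The sketch below simply restates the categories and roles defined in this article; the `ROLES` dictionary and the `category_of` helper are names chosen for the example.

```python
# Major category -> role -> one-line description, as defined in the article.
ROLES = {
    "indirect consumer": {
        "user": "processes and delivers data without using the data warehouse directly",
        "skeptic": "claims not to 'do data' yet receives or supplies information",
    },
    "direct consumer": {
        "clerk": "generates results for others",
        "tracker": "scans for targets",
        "analyst": "seeks the cause",
        "forecaster": "projects the future",
        "planner": "sets new targets",
        "miner": "searches for insights",
        "hunter": "validates a vision",
    },
    "value-added distributor": {
        "builder": "creates custom solutions for consumers",
        "provider": "develops queries and/or delivers data",
        "mentor": "helps consumers learn the tools",
    },
}


def category_of(role):
    """Return the major category a role belongs to; raise KeyError if the role is unknown."""
    for category, roles in ROLES.items():
        if role in roles:
            return category
    raise KeyError(role)


print(category_of("provider"))
```

Because many people perform a mix of roles, a real inventory would probably record a dominant role plus any secondary ones rather than a single label per person.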
A builder is more interested in the application than the data or the business usage of the data. They are prone to ignore alternate methods of accessing or delivering information since they get their kicks from creating code. This makes them the least valuable VAD and the one most prone to be an antagonist, rather than an ally, of the data warehouse team.
A provider helps the data warehouse team support more people than they would on their own. They offer more leverage than can be gained by adding a new member to the support staff. They are closer to the customer. They may be in a general purpose MIS or DSS group. They are often a member of the same organization as the consumer they support. They know the business better. Most successful organizations have one or more providers at their beck and call regardless of the level or type of support officially offered by the information technology group. The provider is the first person they turn to when they need help.
A mentor in the business organization is a rare gem. Unfortunately, in technology circles, the cliché that "those who can, do and those who can't, teach" is all too often true. The techie mentality drives many people who have the aptitude to be a good mentor into becoming a builder or a provider instead. A mentor is a unique individual who has the tools, data and business knowledge to be a provider but who gets their reward by showing others the best way to get the job done.
Some mix of these value-added distributors always exists. An effective data warehouse team seeks them out and aligns with them. You try to keep builders in check so they do not build yet another redundant custom analytic tool. When you find a mentor, make them an ex officio member of the team and give them any support they need.
The providers are the tricky ones. Their self-worth is defined by being the principal player in the information delivery channel for their group. They often sit at the right hand of the boss or at least have her ear. If you do not play your cards right, you will alienate this valuable resource who can do more harm than good from the outside.
Too often a data warehouse team believes it is their duty to provide support directly to all consumers. In some unfortunate cases, the mission of the data warehouse is to "empower the users to do it themselves" and to "eliminate departmental analysts." This sets the data warehouse team on a collision course with the existing providers. This is a bad place to be.
The successful team works actively to identify the trusted providers and to offer the services of the data warehouse to them. You treat them as your customer so they can better serve the business. You will get more bang for the buck this way than trying to do their job for them.
What are the implications of getting to know who your consumers really are and how they contribute to the enterprise? The key message is to align your services in a manner that provides the highest impact. You should:
- Support the information supply chain from the bottom, where you start out, up through the intermediate indirect consumers to get as close to the action as you can.
- Identify and enlist the aid of the value-added distributors that already exist in your organization. Develop and nourish this valuable channel of support.
- Customize your services to the specific role of each direct consumer.
Not enough can be said about this last point. You will be far more productive when you know the roles your consumers play and you adapt your game plan to match their needs. You can't ask a clerk how the data will be used but you can make their life easier by providing better ways to select and present content. You can't ask a tracker the rationale for a target or goal (that is the specialty of a planner) but you can remove the drudgery from the task by providing active alerts.
When someone is expected to be an analyst, forecaster or planner and they are, in reality, mired in the tasks of collecting and assembling data, you can free them up to do their real job. Fundamentally, our job is to automate the boring bits, the repetitive tasks which keep our business partners from doing the best job they can.
Help the clerks and the trackers fulfill their role more efficiently but do not get bogged down in these low value tasks that dominate most data warehouse teams. Spend the bulk of your effort supporting analysts, forecasters and planners. Keep an ever-vigilant lookout for the miners and the hunters. They know where the gold is buried.
The odds against success in data warehousing seem enormously high. But you don't need to play the odds. The more you take the task of knowing your customer to heart, the better able you will be to see the big hit opportunity that lurks somewhere in every enterprise. It is there; you just have to find it.
Michael Haisten, vice president of business intelligence at Daman Consulting, is considered one of a handful of visionaries who have helped shape the data management industry. He has accrued more than 22 years of leadership in information management and architecture development. Haisten served as chief information architect at Apple Computer Corporation, where he developed the data warehouse management team. He has been chief architect, designer and technical expert for more than 72 data warehouse and decision support projects. In addition, Haisten is the author of Data Access Architecture Guide and Data Warehouse Master Plan and has published extensively on data warehouse planning, data access facilitation and other key aspects of data warehousing. You can contact him at firstname.lastname@example.org.