-
Marketplace
-
Channel Resources
Articles from this Site
What is the Best Organizational Chart for Performance Management?
Statistical Learning for BI, Part 1
NorDx Uses InforSense
DataMentors Selected by BAI
Meijer Selects QuantiSense for Retail Business Intelligence
White Papers
HP ERP Business Intelligence
Business Intelligence for Tax Planning: Value, Strategy, and Vision
Single Sign-On for Webintelligence
A Structured Method for Specifying Business Intelligence Reporting Systems
Business Intelligence in a Real-Time World
Web Seminars
Looking for speed and accuracy in your financial planning and budgeting?
Hyperion Visual Explorer: Improve Visibility into Performance Management
Reducing the Cost of Deploying and Managing Data
Combining Microsoft Business Intelligence with the Teradata Warehouse
Espresso Shot Web Seminar: Uncorking the Data Bottleneck with Operational BI
Books
Measuring Database Operations from the Human Perspective
Database "Performance"
How well do our databases perform for the business, and by what standard can that be measured? The answers depend on what we mean by "perform."
When we use typical database monitoring tools to assess performance, we get answers from a certain perspective, which tends to be somewhat technical, and focused inwardly on the database itself. Is that a bad approach? No, actually it is quite valuable. We generally find these answers to be both useful and at least partially satisfactory. They provide important insights.
But we may also find ourselves wanting more, and for good reason: these purely technical, introspective metrics describe the performance of the machine, but they don't really describe database performance from a human and business perspective.
This common approach to database monitoring produces metrics about the internal workings of the operating system and the database such as cache hit rates, index usage, CPU usage, disk activity and so forth. These technical metrics make up the core of traditional monitoring.
The Human Dimension
However, measuring database performance using only introspective technical metrics is like measuring the performance of a skilled musician by monitoring only his vital signs, such as heart rate and blood pressure. Of course, this would be absurd. It's obvious to everyone that we wouldn't measure a musician's performance by how well his internal systems are functioning. In fact, we don't even think about such things; we listen to the music!
In the same way, the "performance" of a database goes well beyond how efficiently the database engine is running internally. There is a human dimension to database performance that is just as important as the internal database performance, and arguably more important.
What is the human dimension of database performance? It is the interaction of people and their applications with the database. It is everything happening outside the database rather than inside. It is the "music" of the database.
To understand this dimension, expanded metrics are required, such as who is using which data, when, how, how long, how often and from which applications. These are "database usage" metrics.
By learning how to measure and track database usage, IT departments can make more informed decisions toward improving the overall performance of their databases, controlling costs, increasing user satisfaction and improving IT resource planning.
Which Databases?
Is usage tracking worthwhile for all databases, or only for some? In general, the more complex a database is, the more benefit can be gained. How, then, can we decide if a database is simple or complex?
Simple databases generally have a very limited purpose, such as online transaction processing (OLTP), in which technicians can predefine and control the types of transactions that will occur and can easily optimize performance for those transactions. There is predictability about which data will be accessed and exactly how.
However, many OLTP systems have evolved to mixed-use, in which non-OLTP activity may interfere with OLTP transactions. Transactions that usually work fine may become slow or even lock up altogether due to other, non-OLTP transactions. Mixed-use databases are complex, and are definitely good candidates for usage tracking.
Databases that manage historical information for business intelligence (BI) systems are also complex. Such databases may be called data warehouses, data marts, decision-support databases, operational data stores and so forth. Whatever the name, their usage patterns can change frequently, and the nature of the work being done by such databases is inherently unpredictable. Such databases benefit greatly from usage tracking.
What Are the Benefits?
Usage tracking helps IT to manage the unique challenges of complex databases. As complex databases have become a regular part of business, various problems have arisen:
- Usability - Efficiency of access to data becomes an issue in the face of large amounts of data, and response times can go beyond acceptable limits.
- Predictability - The growth of the workload in the IT environment becomes unpredictable.
- Manageability - Undisciplined and unstructured growth in the volume and usage of data becomes very hard to manage.
- Cost - The amount of money spent on the IT data environment, especially database licenses and storage subsystems, becomes very significant.
The challenge for IT is to understand the underlying factors that drive these problems and how they can be managed and improved to ensure end user satisfaction. Then IT can make informed decisions on how to improve the usability, predictability, manageability and cost of their databases.
Fortunately, usage tracking technology is ideally suited to shedding light on these underlying factors, which include:
- Transaction patterns,
- Data usage patterns, and
- Storage requirements.
Let's look at these underlying factors and how usage tracking helps us to manage them.
Transaction Patterns
In simple databases, predefined transactions typically touch just a few rows of data, so response time is predictable and generally immediate. And these transactions generally occur at predictable times of the day.
In complex databases, however, the transactions can have unpredictable response times from subsecond up to many hours, involving just a few rows of data or millions. And the transactions may occur at unpredictable times of the day.
There are two relevant response time measurements for queries against complex databases:
- The time it takes for the first data to be returned, and
- The time it takes for the last data to be returned.
In some cases, the difference between the two times will be minimal. But in other cases, the difference will be substantial. Both measurements are needed because some will care about how long it takes to get any data, while others will care only how long it takes to get all the data.
It is also important to look at the differences in response times throughout the day. Many factors can affect these response times. Some items to consider are:
- How many queries were running at that time of day,
- What types of queries were being run,
- What was the actual SQL text the user submitted,
- Who were the users, and
- Which applications were in use.
By developing a comprehensive view of daily activities and the time of day when they occur, management can identify ways to manage the varying workload. By also measuring response times across dimensions such as department, user , query type, or which tables are accessed, IT can work with users to determine if the response times are a problem and if changes can be explored.
Many changes are possible. Perhaps new query tools are needed, or users may be using an improper or unsupported SQL generator. Maybe queries could be run outside of peak usage times, when the workload is lighter, to smooth out response times. Perhaps there are network issues that can be addressed. Perhaps a SQL tuning tool may be needed. Users may also need additional training.
IT management needs to be aware of how the choice of tools or lack of training can affect performance.
Data Usage Patterns
Like transaction patterns, data usage patterns are also fundamentally different between simple and complex databases. In simple databases, transactions usually follow predictable access paths through the data. But there are an infinite variety of transactions that may occur in complex databases, and there is no predictable pattern to the complex access paths that may be used.
Complex databases attract unpredictable questions. Often the results drive the users to ask additional questions, adding to the volume of unpredictable queries. Ad hoc querying of a database makes it extremely difficult to tune the database, something that is easily done in a simpler environment.
Tracking all queries submitted helps identify the data important to the end users. Once usage becomes clear, then patterns can be understood and we can improve the data design. For example, if users are repeatedly making queries that manually join and summarize two tables, adding a denormalized summary table might be the answer.
To effectively manage data usage, we must be able to identify:
- Which data are being used,
- How many rows are being returned to the end user after an access has been made,
- Which tables are frequently joined and using what columns,
- Which columns are used for row selection, and
- Whether repetitive queries are being submitted.
Once usage patterns are understood, IT can take the following steps to improve access to the data and, ultimately, the performance of the queries:
- Indices can be added,
- Redundant units of data can be intelligently deleted,
- The physical location of data can be optimized,
- Tables with like patterns of access can be merged,
- Summary tables can be built, and
- Standard queries can be designed.
Once the pattern of access and usage of data are understood, there are many ways in which the data can be manipulated in order to make data access most efficient. However, without knowing about the usage of the data, there is no way to intelligently reorganize it. Usage tracking allows us to find patterns so we can ensure that the data design keeps up with the changing needs of the business users.
Storage Requirements
Another major issue with complex databases is managing and controlling the volume of data. They may contain multiple years of historical information, and huge amounts of disk space are required.
Complex databases often grow at an astonishing rate. When they first grow beyond their hardware capacity, we generally expand disks, processors and database licenses. But at some point in time, as growth continues, the question must be asked: "Is all of the data actually going to be used?"
There will be a point at which significant amounts of data have collected, yet much of it is not being used. When this point is reached, economically and technologically, it makes sense to remove unused data and reuse the space it has been occupying rather than to purchase new storage. In addition, by removing dormant or redundant data, user access times are improved.
Usage tracking allows IT to manage performance by finding out which data are actually being used and which data are not being used. By measuring and tracking data usage, IT management can remove data intelligently - that is, remove unused data for the purpose of reclaiming storage. Usage tracking tools can reduce the cost of additional storage, processors and licenses and improve user experiences accessing data.
Usage Tracking Meets the Challenge
As we have seen, there is a powerful case to be made for adding usage tracking to our operational analytics toolkit, especially for complex databases. The key to effectively managing databases is to understand how the data are being used.
We saw that traditional monitoring focuses on the internal functioning of a database. By going beyond the internals and focusing on how users are interacting with the data, IT management gains a comprehensive view of how business actually uses the data environment.
By getting real-time and historical insight from external metrics - who is using which data, when, how, how long, how often and from which applications - regular improvements can be made that ensure peak performance of mission-critical applications across the enterprise. We can improve business performance, control costs, increase efficiency and effectiveness for users, achieve service-level promises and enhance the user experience.
In summary, by including usage tracking in our performance management strategy, we enable the business to react quickly in a competitive business climate, which ultimately ensures success both for the user community and for IT.
Ken Richardson is a senior director at Embarcadero Technologies, where he provides technical leadership for DSAuditor , a product employed for usage tracking against some of the largest databases in the world. Prior to Embarcadero, Richardson served as President/CTO of Ambeo, a software company acquired by Embarcadero in October 2005.
For more information on related topics, visit the following channels:


