The Transition of Data into Wisdom
Column published in DMReview.com
November 22, 2000
Editor's Note: Jonathan Wu's November column is also featured in the November 22, 2000 issue of DM Direct.
Within the information systems and databases of an organization lie tremendous opportunities in the data collected. Organizations that can leverage technology to exploit their data will realize the benefits by creating a competitive advantage for themselves. These advantages take the form of identifying trends, unusual patterns and hidden relationships that a competitor may not realize. Such insights can be used to create new opportunities and give the organization an edge on its competition. Taking data and realizing the benefits involves several layers of understanding. Figure 1 depicts the transition of data into wisdom with a narrative of each layer of understanding.
Figure 1: Transition of Data into Wisdom
Since the invention of the database management system and advances in data storage technology, organizations have been collecting, processing, storing and accumulating vast amounts of data about people, locations, transactions, concepts and events that can be easily analyzed. A great deal of this data is associated with the functional processes of the organization. For example, a grocery store collects data about the items an individual purchases at the time of checkout. The grocery clerk scans the products into the system, and the system identifies the price of the item and calculates the total sales price. Through this transaction, the system has collected the following data elements: item, quantity, price, date, which cash register, the grocery clerk and, in certain cases, who conducted the purchase if a club card was used. Figure 2 is a representation of a transaction with sample data.
Figure 2: A Sample Data Transaction
| Item || Quantity || Price || Date || Register # || Employee ID || Club Card ID |
| Diapers || 1 || 4.99 || 11/1/00 || 001 || 213 || 1209 |
Transaction processing systems are capable of collecting and processing voluminous amounts of data, which is the foundation for higher understanding.
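The data elements collected at checkout can be modeled as a simple record. The following Python sketch is one possible representation of the Figure 2 transaction. The class name, field names and types are assumptions made for illustration, as is reading the date 11/1/00 as November 1, 2000.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    """One scanned line item, mirroring the columns of Figure 2 (hypothetical model)."""
    item: str
    quantity: int
    price: Decimal                      # unit price
    sold_on: date
    register_no: str                    # text, so leading zeros such as "001" survive
    employee_id: str
    club_card_id: Optional[str] = None  # None when no club card was used

    @property
    def sales_amount(self) -> Decimal:
        """Extended amount for this line: quantity times unit price."""
        return self.quantity * self.price


# The sample row from Figure 2.
sample = Transaction("Diapers", 1, Decimal("4.99"), date(2000, 11, 1), "001", "213", "1209")
print(sample.item, sample.sales_amount)
```

Prices are held as Decimal rather than float so that currency arithmetic stays exact.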
As the number of transactions that are processed and collected by the grocery store system increases, a wealth of data is collected. While each data element is a component of a transaction, what meaning can each data element provide? On an individual basis, data elements such as "item" do not provide meaning unless they are presented in conjunction with other data elements. The accumulation of data into a meaningful context provides information. BI applications that have ad hoc query and reporting capabilities provide users with the ability to extract data from a database and turn the data into information. For example, the accumulation of item, quantity and price provides information about the items that are purchased, the quantity and the price. By calculating the extended sales amount for each item, one can then rank and determine the item that generated the greatest and least sales by dollar amount. Figure 3 is a representation of the accumulation of data into information.
Figure 3: The Accumulation of Data into Information
| Item || Quantity || Price || Sales Amount |
| Beer || 265 || 6.85 || 1,815.25 |
| Cereal || 430 || 3.90 || 1,677.00 |
| Bread || 850 || 1.59 || 1,351.50 |
| Milk || 1100 || 1.20 || 1,320.00 |
| Diapers || 200 || 4.99 || 998.00 |
By taking data and placing it in a context that produces meaning, BI applications that have ad hoc query and reporting capabilities provide users with the ability to rise up from the data layer and create information.
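The calculation behind Figure 3 can be sketched in a few lines of Python. This is a minimal illustration rather than the query any particular BI tool would generate; the function and variable names are assumptions, while the quantities and prices are taken from Figure 3.

```python
from decimal import Decimal

# (item, total quantity, unit price) as listed in Figure 3
rows = [
    ("Beer", 265, Decimal("6.85")),
    ("Cereal", 430, Decimal("3.90")),
    ("Bread", 850, Decimal("1.59")),
    ("Milk", 1100, Decimal("1.20")),
    ("Diapers", 200, Decimal("4.99")),
]


def rank_by_sales(rows):
    """Return (item, extended sales amount) pairs, highest amount first."""
    extended = [(item, quantity * price) for item, quantity, price in rows]
    return sorted(extended, key=lambda pair: pair[1], reverse=True)


ranked = rank_by_sales(rows)
for item, amount in ranked:
    print(f"{item:<8} {amount:>10,.2f}")
```

The first and last entries of the ranked list are the items with the greatest and least sales by dollar amount.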
While combining data and meaning to create information is extremely useful, separating or regrouping information extends the value of information. Applications that have online analytical processing (OLAP) capabilities provide users with the ability to analyze information and determine relationships, patterns, trends and exceptions. The data that was collected by the grocery store transaction system and the information drawn from the data can be further analyzed by separating the information by period. Figure 4 is a representation of separating information to create analytics.
Figure 4: Separating Information to Create Analytics
| Item || Period 1 || Period 2 || Period 3 || Period 4 || Total Quantity || Price || Sales Amount |
| Beer || 35 || 75 || 100 || 55 || 265 || 6.85 || 1,815.25 |
| Cereal || 110 || 110 || 100 || 110 || 430 || 3.90 || 1,677.00 |
| Bread || 200 || 215 || 235 || 200 || 850 || 1.59 || 1,351.50 |
| Milk || 200 || 300 || 300 || 300 || 1100 || 1.20 || 1,320.00 |
| Diapers || 10 || 20 || 50 || 120 || 200 || 4.99 || 998.00 |
From the table that lists item quantities purchased by period, we can conclude that diapers and beer purchases at the grocery store are influenced by the period while cereal, bread and milk purchases are consistent throughout all four periods. Our findings were developed after we performed further analytics on the information drawn from the grocery store data. By performing analytics that entail separating or regrouping information, relationships, patterns, trends and exceptions can be identified to provide further understanding about the subject matter.
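One simple way to check which items vary by period is to compare each item's busiest period with its slowest one. The sketch below uses the quantities from Figure 4. The function name and the threshold of 2.0 are assumptions chosen for illustration, not a rule stated in this column, and an OLAP application would offer far richer ways to slice the same information.

```python
# Quantities purchased per period, taken from Figure 4.
quantities = {
    "Beer": [35, 75, 100, 55],
    "Cereal": [110, 110, 100, 110],
    "Bread": [200, 215, 235, 200],
    "Milk": [200, 300, 300, 300],
    "Diapers": [10, 20, 50, 120],
}


def period_sensitive(quantities, ratio_threshold=2.0):
    """Flag items whose best period sells at least ratio_threshold times the worst.

    The threshold is an arbitrary, illustrative cut-off.
    """
    flagged = []
    for item, per_period in quantities.items():
        # Written as a multiplication to avoid dividing by a zero-sales period.
        if max(per_period) >= ratio_threshold * min(per_period):
            flagged.append(item)
    return flagged


print(period_sensitive(quantities))
```

With this threshold, the sketch flags beer and diapers and leaves cereal, bread and milk unflagged, in line with the conclusion drawn from the table.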
The next level of elevated understanding is knowledge. Knowledge is different from data, information or analytics in that it can be created from any one of those layers, or it can be created from existing knowledge using logical inferences. BI applications that have data mining capabilities provide users with the ability to identify hidden trends and unusual patterns within the data. These BI applications utilize various data mining techniques, which are based on statistics and algorithms, to provide users with the ability to discover knowledge within their data. Deploying a data mining technique called rule induction against the grocery store data generated the rule that people who buy diapers also buy beer 50 percent of the time.
The association of diapers and beer appears to be odd at first glance, but thinking back on my own buying patterns of those two products and raising two kids, I can understand and validate the correlation. Without the use of a data mining application, identifying hidden trends or unusual patterns within the data would be extremely time-consuming.
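A statement such as "people who buy diapers also buy beer 50 percent of the time" corresponds to what association-rule mining calls the confidence of a rule. The Python sketch below computes only that one measure for one rule; it is not a full rule induction algorithm, and this column does not say which algorithm or product was used. The baskets are hypothetical toy data invented for illustration, not the grocery store's transactions.

```python
# Hypothetical shopping baskets (toy data, not from the grocery store system).
baskets = [
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"diapers", "beer", "bread"},
    {"diapers", "cereal"},
    {"beer", "bread"},
    {"milk", "cereal"},
]


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [basket for basket in baskets if antecedent in basket]
    if not with_antecedent:
        return 0.0
    both = sum(1 for basket in with_antecedent if consequent in basket)
    return both / len(with_antecedent)


print(confidence(baskets, "diapers", "beer"))
```

In the toy data, four baskets contain diapers and two of those also contain beer, so the rule holds half of the time. The measure is not symmetric: the confidence of the reverse rule, beer to diapers, is computed over the baskets that contain beer.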
Wisdom is the utilization of accumulated knowledge. Data mining identified an unusual purchasing pattern within the data. From this knowledge, one can examine the analytical data set in Figure 5 to develop a series of action items.
Figure 5: Identifying a Purchasing Pattern
| Item || Period 1 || Period 2 || Period 3 || Period 4 || Total Quantity |
| Beer || 35 || 75 || 100 || 55 || 265 |
| Diapers || 10 || 20 || 50 || 120 || 200 |
| Correlated Purchases of Beer || 5 || 15 || 25 || 55 || 100 |
In periods 1, 2 and 3, beer sales exceeded the correlated purchases attributed to the rule that people who buy diapers also buy beer 50 percent of the time. However, in period 4, all 55 beer purchases were correlated purchases; there were no additional sales of beer above the rule. By utilizing the newly discovered knowledge, we can analyze the beer marketing campaigns in period 4 compared to period 3 to determine effectiveness or change in strategy with the goal of increasing beer sales in period 4. We would also want to review period 2 purchases of diapers and beer to understand what events contributed to additional sales of beer above the induced rule. By utilizing knowledge, a higher level of understanding of the data is created.
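The comparison between total and correlated beer purchases is a per-period subtraction. The following Python sketch applies it to the rows of Figure 5; the variable names are assumptions, and the correlated purchase counts are taken from the figure as given.

```python
# Rows from Figure 5, one value per period.
beer = [35, 75, 100, 55]
correlated_beer = [5, 15, 25, 55]

# Beer sold beyond the purchases correlated with diapers, per period.
additional = [total - linked for total, linked in zip(beer, correlated_beer)]

# Periods (numbered from 1) in which no beer was sold above the rule.
flat_periods = [period for period, extra in enumerate(additional, start=1) if extra == 0]

print(additional)
print(flat_periods)
```

Only period 4 shows no beer sales beyond the correlated purchases, which is why it is the candidate for a closer look at marketing activity.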
Organizations that have been collecting data from their transactional systems have the opportunity to realize the potential of that data as an asset and to leverage it in a manner that provides greater understanding of the subject matter. The table in Figure 6 is a classification of the various levels of understanding with the corresponding technology.
Figure 6: Classification of Various Levels of Understanding with Corresponding Technology
| Level of Understanding || Technology |
| Data || Online transaction processing (OLTP) systems |
| Information || Ad hoc query and reporting applications |
| Analytics || Online analytical processing (OLAP) applications |
| Knowledge || Data mining applications |
| Wisdom || The human mind |
While artificial intelligence attempts to emulate the human thought process, no technology has been able to replace the human mind. Most organizations have transitioned from data to analytics. Only those organizations that understand the value of data and technology have advanced to knowledge and wisdom, which has led to the competitive advantages they currently enjoy.
Jonathan Wu is a senior principal with Knightsbridge Solutions. He has extensive experience designing, developing and implementing information solutions for reporting, analysis and decision-making purposes. Serving Fortune 500 organizations, Knightsbridge delivers actionable and measurable business results that inform decision making, optimize IT efficiency and improve business performance. Focusing exclusively on the information management disciplines of data warehousing, data integration, information quality and business intelligence, Knightsbridge delivers practical solutions that reduce time, reduce cost and reduce risk. Wu may be reached at firstname.lastname@example.org.