Data Warehousing Lessons Learned:
Special Issues with Data Warehousing Security
Data warehousing systems present special security issues including:
- The degree of security appropriate to summaries and aggregates of data,
- The security appropriate for the exploration data warehouse, specifically designed for browsing and ad hoc queries, and
- The uses and abuses of data encryption as a method of enhancing privacy.
Many data structures in the data warehouse are completely devoid of sensitive individual identities by design and, therefore, do not require protection appropriate for the most private and sensitive data. For example, when data has been aggregated into summaries by brand or region, as is often the case with data warehousing, the data no longer presents the risk of compromising the private identities of individuals. However, the data can still have value as competitive intelligence of market trends, and thus requires careful handling to keep it out of the hands of rival firms. Relaxed security does not mean a lack of commitment to security. The point is that differing levels of security requirements ought to remind us that one-size-fits-all solutions are likely to create trouble.
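As a minimal sketch of the aggregation point above, the following Python fragment rolls transaction rows up by region and brand and drops the customer identifier along the way. The row layout, the sample values and the function name are assumptions made for illustration, not part of any particular warehouse schema.

```python
from collections import defaultdict

# Hypothetical transaction rows: (customer_id, region, brand, amount)
transactions = [
    ("C001", "East", "Acme", 20.0),
    ("C002", "East", "Acme", 35.0),
    ("C003", "West", "Acme", 15.0),
    ("C001", "East", "Zenith", 10.0),
]


def summarize_by_region_and_brand(rows):
    """Sum amounts per (region, brand); the customer_id is deliberately discarded."""
    totals = defaultdict(float)
    for _customer_id, region, brand, amount in rows:
        totals[(region, brand)] += amount
    return dict(totals)


summary = summarize_by_region_and_brand(transactions)
for (region, brand), total in sorted(summary.items()):
    print(region, brand, total)
```

The summary carries no individual identities, yet it still shows which brand sells where, which is exactly the kind of information a rival firm would value.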
Another special security problem presented by data warehousing arises from precisely the reason why such systems exist. Data warehouses are frequently used for browsing and exploring vast reams of data; undirected exploration and knowledge discovery are supported by an entire class of data mining tools. The point is to find novel combinations of products and issues. Whether authentic or mythical, the market basket analysis showing that diapers are frequently purchased with beer is now a classic case. The father going to the convenience store for "emergency" disposable diapers and picking up a six-pack on the way out suggests a novel product placement. It is hard to say in advance which restrictions would cripple such an exploratory data warehouse; therefore, the tendency is to leave the scope of the exploration unrestricted.
A similar consideration applies to simple ad hoc access to the data warehouse. When a business analyst uses end-user self-service tools such as those from Business Objects, Information Builders, Cognos or Oracle to issue queries directly against the data, with no intermediate application security, the end user has access to all the data in the data warehouse. Given privacy and security imperatives, it may be necessary to render the data anonymous before unleashing such an exploratory, ad hoc process. That creates complexity where the goal is (sanctioned) cross-selling and up-selling: the identity must be removed in such a way that it can be recovered, because the purpose is often to make an offer to an individual customer.
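One way to remove identities so that they can later be recovered is to swap each customer identifier for a random token and keep the mapping in a separately secured store. The sketch below illustrates the idea only; the `PseudonymVault` class, its method names and the sample rows are hypothetical, and a real system would keep the mapping in a protected table rather than in memory.

```python
import secrets


class PseudonymVault:
    """Maps real customer IDs to random tokens. The mapping itself is the
    sensitive asset and must be stored apart from the exploration data."""

    def __init__(self):
        self._token_by_id = {}
        self._id_by_token = {}

    def pseudonymize(self, customer_id):
        # Reuse the same token for the same customer so baskets stay linkable.
        if customer_id not in self._token_by_id:
            token = secrets.token_hex(8)
            self._token_by_id[customer_id] = token
            self._id_by_token[token] = customer_id
        return self._token_by_id[customer_id]

    def reidentify(self, token):
        # Only a sanctioned process (e.g. an approved campaign) should call this.
        return self._id_by_token[token]


vault = PseudonymVault()
rows = [("C001", "diapers"), ("C001", "beer"), ("C002", "milk")]
anonymous_rows = [(vault.pseudonymize(cid), item) for cid, item in rows]
```

Analysts explore `anonymous_rows`, where purchases by the same customer remain linked through the token. Only when an offer is approved does a controlled step call `reidentify` to recover the individual customer.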
Encryption of data has its uses, especially if the data must be transmitted over an insecure medium such as the Internet. There, Secure Sockets Layer (SSL), which relies on public-key cryptography and X.509 certificates, serves well in transmitting credit card numbers, passwords and other identifying data. However, encryption is a poor method of access control. An employee, his or her manager and the human resources clerk all require access to the employee's record, and encrypting the data does nothing to distinguish between their access levels. It is misguided to believe that if encrypting some data improves security, encrypting all the data improves security even more. Blanket, global encryption degrades performance, lessens availability and requires complex encryption key administration. Encryption is a computationally intensive operation. It may not affect performance noticeably when performed for one or two data elements, but when performed indiscriminately for an entire table, the result may very well be a noticeable performance impact. It might make sense to encrypt all the data on a laptop PC that is being taken off site if the data is extremely sensitive. If the PC is lost or stolen, only encryption will guarantee that the data is not compromised. However, an even better alternative would be selective encryption combined with organizational steps to make sure the physical site is secure and the media containing the data are handled diligently.
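To illustrate why access levels are a matter of policy rather than encryption, the sketch below gives the employee, the manager and the human resources clerk different views of the same record. The field names and the choice of which role sees which field are assumptions made for the example; an enterprise would define its own rules.

```python
# Hypothetical employee record; field names are illustrative only.
EMPLOYEE_RECORD = {
    "employee_id": "E100",
    "name": "Pat Lee",
    "manager_id": "E007",
    "salary": 72000,
    "medical_plan": "Plan B",
}

# Hypothetical policy: which fields each role may read.
READABLE_FIELDS = {
    "employee": {"employee_id", "name", "manager_id", "salary", "medical_plan"},
    "manager": {"employee_id", "name", "manager_id", "salary"},
    "hr_clerk": {"employee_id", "name", "manager_id", "medical_plan"},
}


def visible_fields(record, role):
    """Return only the fields the given role is allowed to read.
    Unknown roles get nothing (deny by default)."""
    allowed = READABLE_FIELDS.get(role, set())
    return {field: value for field, value in record.items() if field in allowed}


manager_view = visible_fields(EMPLOYEE_RECORD, "manager")
clerk_view = visible_fields(EMPLOYEE_RECORD, "hr_clerk")
```

All three roles read the same stored record, and a single decryption key would serve them identically. The distinctions come entirely from the policy table.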
As a general rule, proven security practices and solutions developed to secure networks are appropriate for, and can be extended to protect, the data warehouse. In other cases, data warehouses present special challenges because the data itself is likely to be the target that encourages hackers to try to gain access to the system. The relevant practices range from organizational measures to high-technology code. The requirement for authentication implies certain behavior: on-site staff should wear their corporate identification badges and be required to sign an agreement never to share a user ID or password with anyone. Based on your enterprise's specific confidentiality rules, make selective use of technologies in areas that are still emerging. It is essential that database administrators work together with their security colleagues to define policies and implement them using the role-based access control provided by the standard relational database data control language (DCL).
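A policy agreed between database administrators and security staff can be kept as data and turned into DCL statements mechanically. The sketch below does this in Python; the role names, table names and privilege lists are hypothetical, and although `GRANT ... ON ... TO ...` is standard SQL, role creation and some details differ between database products.

```python
# Hypothetical policy: role -> table -> list of privileges.
ROLE_PRIVILEGES = {
    "analyst": {"sales_summary": ["SELECT"]},
    "dw_admin": {
        "customer_detail": ["SELECT", "UPDATE"],
        "sales_summary": ["SELECT", "INSERT", "UPDATE", "DELETE"],
    },
}


def grant_statements(role_privileges):
    """Build one GRANT statement per (role, table) pair, in a stable order
    so the generated script can be reviewed and compared between versions."""
    statements = []
    for role in sorted(role_privileges):
        for table in sorted(role_privileges[role]):
            privileges = ", ".join(role_privileges[role][table])
            statements.append(f"GRANT {privileges} ON {table} TO {role};")
    return statements


for statement in grant_statements(ROLE_PRIVILEGES):
    print(statement)
```

Here the analyst role can read only the summary table, while access to customer detail is reserved for the administrative role, which reflects the differing security levels discussed earlier.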
Lou Agosta, Ph.D., joined IBM WorldWide Business Intelligence Solutions in August 2005 as a BI strategist focusing on competitive dynamics. He is a former industry analyst with Giga Information Group, has served as an enterprise consultant with Greenbrier & Russel and has worked in the trenches as a database administrator in prior careers. His book The Essential Guide to Data Warehousing is published by Prentice Hall. Agosta may be reached at LoAgosta@us.ibm.com.