Document Warehousing & Content Management:
InStranet Brings the Unstructured to BI
In previous columns we've looked at similarities between structured and unstructured data in business intelligence (BI), and I've argued the two are not all that different. This month, I'll hold off on the architectural and design discussions and instead look at an example implementation and one of the first commercial products for BI-oriented content management.
InStranet brings one of the pillars of business intelligence, multidimensional modeling, to bear on the problem of integrating content with traditional BI applications. The basic idea behind InStranet's flagship product, InStranet V2, is that documents can be organized, managed and disseminated using a multidimensional model of meta data. The parallels with structured BI are clear. Just as we slice and dice numbers along fixed dimensions, users want to find content based upon dimensions such as customer, partner, time and product type. Similarly, administrators can use the same dimensions to control access based upon line of business, client responsibilities and authority levels. This idea of applying multidimensional modeling to content management, combined with scalable implementations built on relational database platforms such as Oracle and DB2, is opening new ways of thinking about documents, business intelligence and customer relationship management (CRM).
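To make the parallel concrete, here is a minimal sketch of slicing documents along dimensions and reusing the same dimensions for access control. It is an illustration only and not InStranet's API: the `Document` class, the `slice_documents` and `visible_to` functions, and the sample dimension values are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """A document plus its dimensional meta data (hypothetical structure)."""
    title: str
    dims: dict  # dimension name -> member, e.g. {"customer": "Acme"}


def slice_documents(docs, **criteria):
    """Return documents whose dimension members match every criterion."""
    return [d for d in docs
            if all(d.dims.get(name) == member for name, member in criteria.items())]


def visible_to(docs, allowed):
    """Filter documents by an access profile.

    `allowed` maps a dimension name to the set of members a user may see.
    Dimensions missing from `allowed` are treated as unrestricted.
    """
    return [d for d in docs
            if all(d.dims.get(name) in members for name, members in allowed.items())]


# Made-up sample data for illustration.
DOCS = [
    Document("Auto policy terms", {"customer": "Acme", "product_type": "auto", "year": 2001}),
    Document("Claim summary", {"customer": "Acme", "product_type": "home", "year": 2001}),
    Document("Partner agreement", {"customer": "Globex", "product_type": "auto", "year": 2000}),
]

acme_2001 = [d.title for d in slice_documents(DOCS, customer="Acme", year=2001)]
auto_only = [d.title for d in visible_to(DOCS, {"product_type": {"auto"}})]
```

In this sketch the same dictionary of dimension members drives both the user's query and the administrator's access rule, which is the point of modeling content dimensionally.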
One InStranet user, a major insurance company, had successfully deployed a widely used and effective reporting system based upon its data warehouse and Business Objects. The problem was that the company still was not meeting all of its users' requirements. In an industry such as insurance, where 80 percent of the information exchanged between the insurer and customers is unstructured, providing only structured information leaves users fending for themselves to piece together the whole decision support picture. Customers, agents and even internal staff needed access to policies, contracts and claims as well as the structured data provided by the data warehouse. The insurer seized the opportunity to improve customer retention by offering CRM-like functionality along with the data warehouse reporting. Within an Enterprise Information Exchange (InStranet's term for a business application), users can examine claim statuses, review policy details and exchange information with each other from a single point. As this insurer found, integrating unstructured texts with traditional business intelligence applications can directly impact the bottom line. In this case, the benefit came in the form of improved customer retention.
There are several steps to creating an enterprise information exchange. First, dimensions are defined for organizing content. In some cases, these can be used directly from a data warehouse or OLAP application. In other cases, dimensions will be created specifically for the enterprise information exchange (e.g., access control information). Once the dimensions are defined, documents are tagged with XML-based meta data. The meta data reflects where the document falls within the organizational hierarchy, the document type, author, audience and other administrative information. At this point, automatic categorization and related feature extraction tools are not available, so tagging is not automated. Administering the exchange includes defining user profiles. This is not strictly necessary, but personalization is one of the key benefits of the system. The final step is creating links to other business intelligence, LDAP and related applications. Since the core product is J2EE-compliant and XML-oriented, integration barriers are minimized.
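The tagging step described above can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names used here (`document`, `dimensions`, `dimension`, `admin`) are assumptions made for illustration; the article does not describe InStranet's actual XML schema, and the helper functions `tag_document` and `read_dimensions` are hypothetical.

```python
import xml.etree.ElementTree as ET


def tag_document(doc_id, dimensions, admin):
    """Build an XML meta data record for one document.

    `dimensions` maps dimension names to members; `admin` holds
    administrative fields such as document type, author and audience.
    The layout is a made-up example, not a vendor schema.
    """
    root = ET.Element("document", id=doc_id)
    dims_el = ET.SubElement(root, "dimensions")
    for name, member in sorted(dimensions.items()):
        ET.SubElement(dims_el, "dimension", name=name).text = member
    admin_el = ET.SubElement(root, "admin")
    for field_name, value in sorted(admin.items()):
        ET.SubElement(admin_el, field_name).text = value
    return ET.tostring(root, encoding="unicode")


def read_dimensions(xml_text):
    """Parse a meta data record back into a {dimension: member} dict."""
    root = ET.fromstring(xml_text)
    return {d.get("name"): d.text for d in root.iter("dimension")}


# Made-up sample values for illustration.
record = tag_document(
    "policy-123",
    {"customer": "Acme", "product_type": "auto"},
    {"doc_type": "policy", "author": "jsmith", "audience": "agents"},
)
parsed = read_dimensions(record)
```

Keeping the dimension members in a predictable place in each record is what would let a portal, an OLAP tool or an access control layer read the same tags.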
InStranet V2 falls within the scope of the broad enterprise information portal market; but to put it into a more precise perspective, we'll use IDC's model of the enterprise portal evolution, which is divided into three waves. In the first wave, portals were fundamentally user interface integration tools. In the second wave, the focus is on equal but nonintegrated access to both structured and unstructured data. In the third wave, structured and unstructured data access is unified. Achieving this level of integration requires shared meta data, and dimensional models are particularly good representations for this. (See my July 2001 DM Review article for more detail.) Built from the ground up to support structured integration based upon a dimensional meta data model, InStranet V2 is clearly in the third wave class of tools.
Who will benefit from applying multidimensional techniques to content management? You are definitely a candidate if you need to exchange documents with large numbers of customers, suppliers and partners. Large organizations with many business units will also benefit. This is especially true when multiple points within an organization service large customers. For example, does an umbrella contract created in a major accounts sales department dictate terms for services provided to that customer's subsidiaries that are, in turn, serviced by a regional sales force? Are those contracts on a file server? Are they distributed to the different sales groups via e-mail and FTP? If this sounds familiar, then a more structured approach could be in order. Finally, if you find yourself chasing documents to explain anomalous trends and figures in a data warehouse report, then it's time to consider document integration with your other BI applications.
Dan Sullivan is president of the Ballston Group and author of Proven Portals: Best Practices in Enterprise Portals (Addison Wesley, 2003). Sullivan may be reached at email@example.com.