
Document Warehousing & Content Management:
The Evolution of Content and Collaboration

  Column published in DM Review Magazine
February 2002 Issue
  By Dan Sullivan

The potential for highly effective collaboration depends upon the ability to exchange relevant content rapidly and at a relatively low cost. Portals have made significant inroads in this area by providing frameworks for delivering content through personalized distribution, access controls and single points of access to a variety of corporate data sources. Portlets, or small, targeted programs designed to operate within a portal, are a common method for tying disparate systems together in a single portal interface. No one can argue with the success of portals as they exist today, and we should not expect the portlet model to change significantly. The real changes in the world of collaboration and content management will be well below the interface level of portals in the architecture of the services that provide the content for the user.

The first significant change is already widely acknowledged: the rise of Web services in distributed system design. The basic idea is that distributed applications use standard mechanisms for common interactions. For example, the Simple Object Access Protocol (SOAP) is used to invoke services and return results; the Web Services Description Language (WSDL) is used to describe services; and the Universal Description, Discovery and Integration (UDDI) protocol provides the means to publish, find and bind to different services. Together, these standards allow developers to package services and make them widely available.
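To make the SOAP part of this picture concrete, the sketch below wraps a single operation call in a SOAP 1.1 envelope. It is an illustration only: the service namespace, the FindDocuments operation and its parameters are hypothetical, and a real service would define them in its WSDL description.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/content-search"  # hypothetical service namespace


def build_soap_request(operation: str, params: dict) -> bytes:
    """Wrap one operation call and its parameters in a SOAP 1.1 envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element and its children live in the service's own namespace.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical operation and parameters, for illustration only.
request = build_soap_request("FindDocuments", {"subject": "portals", "maxResults": 10})
print(request.decode("utf-8"))
```

The resulting document would typically be sent to the service endpoint in an HTTP POST, with the result returned in a similar envelope.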

Web service wrappers can be used with existing systems to open access to content sources. Relational and XML databases, content management systems, document management systems and other repositories could all be accessed in a standardized way. Yet any reasonable portal search tool can already crawl file systems, databases and just about any place we can store content, so why should we need to change the way we access and index content? That brings us to the second major change: the emergence of standards for the open exchange of content meta data.

The Dublin Core (www.dublincore.org) is a meta data standard with the goal of creating intelligent information discovery systems. It includes commonly used elements to describe a resource, such as title, creator, subject, description, publisher, contributor, date, type, format, identifier (for example, a uniform resource identifier, or URI), source, language, relation, coverage and rights. The Dublin Core Metadata Initiative (DCMI) has recommendations or working drafts specifying how to express the Dublin Core in HTML and in the Resource Description Framework (RDF). A business special-interest group was recently formed within the DCMI to address the use of the Dublin Core in the commercial sector.
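As an illustration of the HTML form, the sketch below renders a small record as meta tags using the common "DC." naming convention and a schema link to the Dublin Core element set. The helper name and the subject values are assumptions made for the example; the other values come from this column's own byline.

```python
from html import escape

DC_SCHEMA = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace


def dublin_core_meta(record: dict) -> str:
    """Render a record as HTML <meta> tags, one tag per element value."""
    lines = [f'<link rel="schema.DC" href="{DC_SCHEMA}">']
    for element, value in record.items():
        # An element such as subject may repeat, so accept a list of values.
        values = value if isinstance(value, list) else [value]
        for item in values:
            content = escape(item, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


record = {
    "title": "The Evolution of Content and Collaboration",
    "creator": "Dan Sullivan",
    "publisher": "DM Review",
    "date": "2002-02",
    "type": "Text",
    "language": "en",
    "subject": ["collaboration", "content management"],  # illustrative values
}
print(dublin_core_meta(record))
```

The tags would be placed in the head section of the page they describe.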

The advantage of including HTML or XML embedded meta data in Web resources is obvious: it provides extensive and precise descriptions of content that can be used for more effective searching and navigating. Search engines often use meta data tags, especially the description field, to more accurately classify content found on the Web. From a collaboration perspective, portal and content management applications can use the meta data to better identify relevant information for a particular user. More importantly, it opens the doors to better management of distributed content repositories.
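A portal might put embedded meta data to work along the following lines. This is a minimal sketch, assuming pages carry Dublin Core meta tags named with a "DC." prefix; the sample page, the interest list and the matching rule are hypothetical.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collect the values of <meta name="DC.*" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.lower().startswith("dc.") and content is not None:
            element = name[3:].lower()  # "DC.subject" becomes "subject"
            self.metadata.setdefault(element, []).append(content)


def extract_dublin_core(html_text: str) -> dict:
    parser = DublinCoreExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


def matches_interests(metadata: dict, interests: list) -> bool:
    """True when any subject value matches one of the user's interests."""
    subjects = {s.lower() for s in metadata.get("subject", [])}
    return bool(subjects & {i.lower() for i in interests})


# Hypothetical page, for illustration only.
SAMPLE_PAGE = """<html><head>
<meta name="DC.title" content="Portal Design Notes">
<meta name="DC.subject" content="portals">
<meta name="DC.subject" content="metadata">
</head><body>Body text is ignored here.</body></html>"""

metadata = extract_dublin_core(SAMPLE_PAGE)
print(metadata)
print(matches_interests(metadata, ["Metadata", "text mining"]))
```

Matching on declared subjects rather than on body text is the design choice being illustrated: the publisher's own description, not a keyword index, drives relevance.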

Many organizations use enterprise-wide search engines to crawl their entire intranet and index internal content. This works well in many cases, given the practical limits of keyword indexing and statistical pattern matching. However, this model of crawling and indexing stops at the corporate borders, and therein lies the opportunity for a new model of resource discovery based upon Web services and meta data. Rather than crawling, resource discovery can move to harvesting. Harvesting gathers meta data about content rather than the content itself. The Open Archives Initiative (OAI) Protocol for Metadata Harvesting (www.openarchives.org/OAI_protocol/openarchivesprotocol.html) is one example of a harvesting protocol. The OAI protocol is based on HTTP and has been adopted for digital library, museum and other scholarly projects. OAI adopters face the same challenges as organizations engaged in business-to-business collaboration: a number of distinct organizations control access to valuable content, the content is managed on a range of decentralized platforms and users need a mechanism for discovering particular types of content. The OAI Protocol for Metadata Harvesting is one method of addressing these challenges; harvesting based on a Web services model is another.
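A harvester's two basic tasks, asking a repository for records and reading the meta data that comes back, can be sketched as follows. The sketch assumes the request syntax and XML namespaces of version 2.0 of the OAI protocol, which may differ from the version linked above; the repository URL and the sample response are hypothetical, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces assumed from version 2.0 of the OAI protocol.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request; from_date limits the harvest to newer records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Return each record's identifier and its Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        fields = {}
        for element in record.findall("oai:metadata/oai_dc:dc/*", NS):
            name = element.tag.split("}", 1)[1]  # drop the namespace part
            fields.setdefault(name, []).append(element.text or "")
        records.append({"identifier": identifier, "metadata": fields})
    return records


# Hypothetical response from a partner repository, for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:partner.example.com:doc-1</identifier>
        <datestamp>2002-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Supplier Quality Guidelines</dc:title>
          <dc:subject>procurement</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://partner.example.com/oai", from_date="2002-01-01"))
for item in parse_records(SAMPLE_RESPONSE):
    print(item["identifier"], item["metadata"])
```

Note that only descriptions cross the wire. The content itself stays in the owner's repository until a user follows an identifier back to it.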

The evolution of collaboration and content management will require the ability to effectively discover and access content across corporate boundaries. Businesses are not likely to open their firewalls to partners who want to crawl their file systems and databases; therefore, a better model is needed. The model must allow owners of content to control how it is published, and it must provide the means for potential users to discover resources. Content meta data published through Web services is the next step in the evolution of collaborative systems.

For more on the use of unstructured content in collaborative and decision support systems, see Web Farming for the Data Warehouse (Hackathorn, 1999) and Document Warehousing and Text Mining (Sullivan, 2001). Architecting Web Services (Oellermann, 2001) provides an extensive overview of Web services. The Open Archives Initiative promotes interoperability standards for content.



Dan Sullivan is president of the Ballston Group and author of Proven Portals: Best Practices in Enterprise Portals (Addison Wesley, 2003). Sullivan may be reached at dsullivan@ballstongroup.com.

SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.
SourceMedia is an Investcorp company.
Use, duplication, or sale of this service, or data contained herein, is strictly prohibited.