DM Review | Covering Business Intelligence, Integration & Analytics

Knowledge: The Essence of Meta Data:
RSS Technology - Evolution, Revolution and Extinction

By R. Todd Stephens, Ph.D., online columnist
Column published in DMReview.com
November 17, 2005

Recently, I posted a blog entry (http://www.rtodd.com/blog/) stating that Really Simple Syndication (RSS) technology would replace the need for a meta data repository in the near future. Needless to say, I got plenty of comments, linkbacks and e-mails, most of which ... (I won't complete the sentence, to keep from skewing your opinion). This month, I want to provide an overview of RSS technology and the feed aggregator utility so you can decide whether you agree with me on the possibility of repository extinction.

What is RSS?

RSS is a lightweight XML format designed for sharing headlines and other Web content. Think of it as a distributable "what's new" for your site. Originated by UserLand in 1997 and subsequently used by Netscape to fill channels for Netcenter, RSS has evolved into a popular means of sharing content between sites. RSS solves myriad problems Webmasters commonly face, such as increasing traffic and gathering and distributing news. RSS can also be the basis for additional content distribution services (Web Reference, 2005). The easiest way to describe an RSS feed is to provide an example and walk through the details.

Basically, there are two main elements of the news feed: channel and item. The channel is the high-level description of the information source. The basic attributes of the channel include the title, link and description. RSS 2.0 includes a large assortment of optional attributes including language, copyright, managingEditor, webMaster, pubDate, lastBuildDate, category, generator, docs, cloud, ttl, image, rating, textInput, skipHours and skipDays. The item element provides the detailed information that users want to see and, like the channel element, includes three basic attributes: title, link and description. Additional attributes defined by RSS 2.0 include author, category, comments, enclosure, guid, pubDate and source. Detailed descriptions of these elements and attributes can be found on the Harvard Law Web site (http://blogs.law.harvard.edu/tech/rss).
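To make the channel and item structure concrete, here is a minimal Python sketch that reads the three basic attributes from a channel and from each of its items. The feed text, titles and example.com URLs are placeholders invented for illustration; they are not taken from any feed mentioned in this column.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: every title and URL below is a placeholder.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Channel</title>
    <link>http://www.example.com/</link>
    <description>A placeholder feed used for illustration.</description>
    <language>en-us</language>
    <item>
      <title>First entry</title>
      <link>http://www.example.com/1</link>
      <description>Summary of the first entry.</description>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://www.example.com/2</link>
      <description>Summary of the second entry.</description>
    </item>
  </channel>
</rss>"""

BASIC_FIELDS = ("title", "link", "description")


def parse_feed(xml_text):
    """Return the basic channel attributes and a list of item attributes."""
    channel = ET.fromstring(xml_text).find("channel")
    return {
        "channel": {name: channel.findtext(name) for name in BASIC_FIELDS},
        "items": [
            {name: item.findtext(name) for name in BASIC_FIELDS}
            for item in channel.findall("item")
        ],
    }


feed = parse_feed(SAMPLE_FEED)
print(feed["channel"]["title"])
for entry in feed["items"]:
    print(" -", entry["title"], entry["link"])
```

Optional attributes such as language or pubDate could be read the same way by adding their names to the field list.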

Take a look at the sample RSS feed in Figure 1.

Figure 1

This feed includes a single channel with four attributes (title, link, description and language). Three item elements are included with the three basic attributes (title, link and description). Would it be possible to envision a logical model RSS feed as that shown in Figure 2?

Figure 2
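One way to picture a logical model as a feed is to treat the model as the channel and each entity as an item. The Python sketch below builds such a feed with the same shape as the sample described above: one channel with four attributes and three items. The model name, entity names and URLs are hypothetical, and the mapping itself is only an assumption about how such a feed might be laid out.

```python
import xml.etree.ElementTree as ET


def build_model_feed(model, entities):
    """Serialize a logical model (channel) and its entities (items) as RSS 2.0."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for tag in ("title", "link", "description", "language"):
        ET.SubElement(channel, tag).text = model[tag]
    for entity in entities:
        item = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            ET.SubElement(item, tag).text = entity[tag]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical logical model and entities, for illustration only.
model = {
    "title": "Customer Logical Model",
    "link": "http://models.example.com/customer",
    "description": "Entities in the customer subject area.",
    "language": "en-us",
}
entities = [
    {
        "title": name,
        "link": "http://models.example.com/customer/" + name.lower(),
        "description": "Definition of the " + name + " entity.",
    }
    for name in ("Customer", "Account", "Address")
]

feed_xml = build_model_feed(model, entities)
print(feed_xml)
```

Nothing in this layout records how the entities relate to one another; each item stands alone.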

The Use of RSS Technology

The typical use of the RSS feed is within the blog environment. Once an author updates a blog with an entry, the system updates the RSS file and sends a "ping" message to the aggregation ping server indicating that the site has been updated. Several organizations, such as Feedster and Technorati, monitor the blog feeds and publish the information in a centralized location for content aggregation. The other option is for end users to purchase or download a news aggregator application (reader), which allows the user to subscribe to any blog that supports the RDF/XML feed. The application can check the blog for updates once an hour or once a day, depending on the configuration of the reader. This eliminates the need to engage search engines or news collection sites in order to read the content from a specific source of information.

From Wikipedia (2005), we can define the RSS newsreader as a news aggregator, or simply aggregator. The reader is a software application that collects syndicated content, such as RSS and other XML feeds, from blogs and mainstream media sites. Aggregators reduce the time and effort needed to regularly check Web sites of interest for updates, creating a unique information space or "personal newspaper." The content is sometimes described as being "pulled" to the subscriber, as opposed to "pushed" with e-mail or Instant Messenger (IM). Unlike recipients of some "pushed" information, the aggregator user can easily unsubscribe from a feed.
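The "pull" behavior of a reader can be sketched in a few lines of Python. This is a simplified illustration, not how any particular aggregator works: the Aggregator class, the make_feed helper and the example.com URL are all hypothetical, and the fetch function reads from an in-memory dictionary so the example runs without a network. A real reader might instead fetch each URL over HTTP on an hourly or daily schedule.

```python
import xml.etree.ElementTree as ET


class Aggregator:
    """Toy pull-style reader: remembers which items each subscription has shown."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable taking a feed URL and returning RSS text
        self.seen = {}      # feed URL -> set of item keys already reported

    def subscribe(self, url):
        self.seen.setdefault(url, set())

    def unsubscribe(self, url):
        self.seen.pop(url, None)

    def poll(self):
        """Fetch every subscribed feed and return (channel, item) titles not seen before."""
        new_items = []
        for url, seen in self.seen.items():
            channel = ET.fromstring(self.fetch(url)).find("channel")
            for item in channel.findall("item"):
                # Prefer guid as the item identity and fall back to link.
                key = item.findtext("guid") or item.findtext("link")
                if key not in seen:
                    seen.add(key)
                    new_items.append((channel.findtext("title"), item.findtext("title")))
        return new_items


def make_feed(post_titles):
    """Build a tiny placeholder feed with one item per post title."""
    items = "".join(
        f"<item><title>{t}</title><link>http://blog.example.com/{i}</link></item>"
        for i, t in enumerate(post_titles, start=1)
    )
    return f"<rss version='2.0'><channel><title>Example Blog</title>{items}</channel></rss>"


URL = "http://blog.example.com/rss.xml"  # placeholder address
published = {URL: make_feed(["First post"])}

reader = Aggregator(published.get)
reader.subscribe(URL)
first = reader.poll()    # the existing post is new to this reader
second = reader.poll()   # nothing has changed since the last check
published[URL] = make_feed(["First post", "Second post"])
third = reader.poll()    # only the newly published post is reported
reader.unsubscribe(URL)
fourth = reader.poll()   # no subscriptions remain
print(first, second, third, fourth)
```

The subscriber controls the whole exchange: the reader decides when to check, and dropping a subscription is a single call.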

Implications to the Meta Data Environment

The implications for the meta data environment are enormous. Taking a closer look at the RSS standard reveals that simplicity and consistency are critical regardless of context. This indicates that a simple metamodel, such as the Dublin Core, could be easily exchanged by the use of RSS technology. Several of the sample feeds, included at the bottom of this article, contain Dublin Core expansions. Newsreaders could replace the majority of the functionality currently held within the centralized meta data repository. The information required to publish new content is very similar to the information required to publish technology asset meta data, or will be in the near future. Advancements in RSS technology will allow code objects, analysis documents, modeling artifacts and other system development life cycle products to publish information about the assets automatically. This will eliminate the need to extract information by hand or to force integration into a single methodology. RSS already has search functionality and personal taxonomies in which end users can catalog their own content, which may prove to be much more valuable than the traditional IT-based taxonomies.
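As a rough illustration of how Dublin Core elements can ride along in an RSS item, the Python sketch below reads the dc-prefixed elements from an item that describes a technology asset. The namespace URI is the standard Dublin Core element set namespace; the asset, its values and the example.com URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

# Hypothetical asset feed: the script name, creator and URLs are placeholders.
ASSET_FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example Asset Feed</title>
    <link>http://repo.example.com/</link>
    <description>Placeholder feed of technology assets.</description>
    <item>
      <title>customer_load.sql</title>
      <link>http://repo.example.com/assets/customer_load.sql</link>
      <description>Script that loads the customer table.</description>
      <dc:creator>jdoe</dc:creator>
      <dc:date>2005-11-01</dc:date>
      <dc:type>Software</dc:type>
      <dc:format>text/plain</dc:format>
    </item>
  </channel>
</rss>"""


def dublin_core(item):
    """Collect an item's Dublin Core elements as a name -> value mapping."""
    prefix = "{" + DC_NS + "}"
    return {
        child.tag[len(prefix):]: child.text
        for child in item
        if child.tag.startswith(prefix)
    }


asset_item = ET.fromstring(ASSET_FEED).find("channel/item")
asset_meta = dublin_core(asset_item)
print(asset_item.findtext("title"), asset_meta)
```

A reader that ignores the dc namespace still sees an ordinary item with a title, link and description, so the extra meta data does not break existing aggregators.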

Why Not?

Why can't we move forward with RSS technology? At a high level, RSS and aggregator technology have two flaws. The first is that none of the news items can be directly related to each other. They can be related by semantic methodologies such as search and content inference, but the basic relationship is not included in the feed itself. So while I can provide an RSS feed from a logical model (channel) and include the entities and attributes (items), I lose the relationships between these elements. Of course, you could expand the feed to include classification entries and move to a more hierarchical item structure, but that isn't really in use today. Second, most assets within the enterprise are currently described by a complex metamodel such as OMG, CWM, RAS or IDDU. These standards are specifically designed for the asset(s) they are describing. That level of complexity does not exist in the current RSS standard. Clearly, the news aggregator lacks some of the complex business functions, such as impact analysis or unstructured document association.

This view constrains us to what is currently available but does not allow us to see the vision of what is possible. In my mind, we are not very far from having a standard where applications that generate meta data can publish in the RSS format. The RSS movement is not a threat but rather an opportunity to take meta data to the next level. Perhaps the best use of RSS is simply to integrate the disparate repository applications and leave the core meta data at home. Thoughts?

Sample RSS Files

Tom Peters: http://www.tompeters.com/index.rdf

Nicholas Carr: http://www.roughtype.com/index.xml

R. Todd Stephens: http://www.rtodd.com/rss.htm

Louis Rosenfeld: http://louisrosenfeld.com/home/index.xml

Claudia Imhoff: http://www.b-eye-network.com/blogs/imhoff/rss_feed.php

David Loshin: http://www.b-eye-network.com/blogs/loshin/index.xml

David Allen: http://www.davidco.com/blogs/david/index.xml


R. Todd Stephens, Ph.D. is the director of Meta Data Services Group for the BellSouth Corporation, located in Atlanta, Georgia. He has more than 20 years of experience in information technology and speaks around the world on meta data, data architecture and information technology. Stephens recently earned his Ph.D. in information systems and has more than 70 publications in the academic, professional and patent arenas. You can reach him via e-mail at Todd@rtodd.com, or visit http://www.rtodd.com/ to learn more.

SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.