Published in DM Review Online in November 2005.
Printed from DMReview.com


Thoughts from the Integration Consortium: SOA: When Application and Data Integration Worlds Collide

by Integration Consortium

This month's column is contributed by Jake Freivald, director with iWay Software.

It's not a perfect rule, but our clients generally focus on one of two integration styles: data or application. SOA is bound to change that, and soon.

Data integration specialists usually define data structures, especially marts, warehouses and operational data stores. Sometimes they're happy with virtual structures: although pundits demonized "virtual data warehousing" a decade ago, EII has made federated queries respectable again.

These structures enable particular use cases. They emphasize entities (as in entity relationship diagrams). They accept redundancy, as long as it is controlled and rationalized. And they make users learn their schema, which adds complexity but makes usage patterns as flexible as possible.

This last element is crucial: in data integration, people add their own intelligence to raw data. Nothing stands between the data and the programmer or analyst except an IDE or a query tool.

It's different with application integration specialists. They're usually process-oriented. Their data hides behind a wall of application logic, making them more concerned about interfaces than about data structures. They eschew redundancy; they're chartered to integrate existing applications, not to develop new ones. If a data warehouse already exists, they might use it - but they're certainly not interested in building a new one.

In other words, they ignore the data itself and trust the work of smart people - programmers, primarily - who have gone before them.

It seems that data integration and application integration have different goals, different tools and even different philosophies. Yet I believe that SOA will force them together. How is that possible?

Clearly, we're not giving up our data warehouses. They make too much sense: cleansed, reconciled, properly modeled data doesn't come easily, but when it is critical, it is the only thing that will do the job. The underlying data structures still matter.

Therefore it is the interfaces that must change. SOA evolved from application integration's dependence on interfaces, so that now every business function or service has its own interface. In data integration scenarios, these interfaces will provide access to data sources.

That might make some of us pause and ask, "Don't we already have this?" ODBC and JDBC are interfaces that give us access to data. We can make Web services look somewhat similar with embedded SQL statements. So in what way will we change?

The answer lies in the information contained in the interface. ODBC and JDBC provide a generic data pipe between an application and a data source. Until now, that data source has been used either by more-or-less well-understood business logic (e.g., a client/server or Web application) or by knowledgeable users of business intelligence tools. The use cases for the data have been well understood, so administrators could tune their databases accordingly.

But not with SOA. Consider Web services, currently the most common way of implementing SOA. We build Web services because we want to open up our information systems to more users, more use cases, more queries - more portals and enterprise service buses (ESBs) and IDEs. By definition, we let authorized users consume services for whatever purpose they can conceive.

If we allow Web services to act as generic data pipes like ODBC, we'll have problems:

  • Difficult maintenance. Occasionally we have to change our warehouses' data models, especially if we share our data with many user communities. Even trivial structural changes may affect multiple data consumers - and we often won't even know who they are.
  • Bad data usage. Data warehouses should provide a so-called single version of the truth. If we open our databases up to people who aren't used to them, they'll start to make mistakes as they query the data model.
  • Unpredictable queries. Right now we can manage data use because we're in close contact with the application designers or BI users who use our data. As we open it up to more user communities, data usage will become increasingly difficult to predict.
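
The maintenance problem in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for the warehouse, and the names generic_pipe, build_warehouse, CONSUMER_SQL and the customer table are hypothetical, not taken from any real system. The consumer embeds a column name in its own SQL, so a trivial rename on the warehouse side breaks it.

```python
import sqlite3


def generic_pipe(conn, sql):
    """Hypothetical generic data pipe: runs whatever SQL the consumer sends."""
    return conn.execute(sql).fetchall()


def build_warehouse(name_column):
    """Build a toy in-memory 'warehouse' whose name column is configurable."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE customer (id INTEGER PRIMARY KEY, {name_column} TEXT)"
    )
    conn.execute(f"INSERT INTO customer (id, {name_column}) VALUES (1, 'Acme')")
    return conn


# The consumer wrote this against the original schema and owns it from then on.
CONSUMER_SQL = "SELECT cust_name FROM customer WHERE id = 1"


def consumer_query_works(conn):
    """Return True if the consumer's embedded SQL still runs."""
    try:
        generic_pipe(conn, CONSUMER_SQL)
        return True
    except sqlite3.OperationalError:
        # SQLite reports a missing column as an OperationalError.
        return False


# Original schema: the embedded query runs.
before = consumer_query_works(build_warehouse("cust_name"))
# After a trivial rename, the same query fails.
after = consumer_query_works(build_warehouse("customer_name"))
```

Nothing in the pipe itself tells the data owner that this consumer exists, which is why even a small structural change is hard to make safely.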

All three problems have a common resolution: prepackage queries and expose them as services. The services we create will use the data model correctly and be completely predictable, and we'll often be able to keep the interface stable when we change the underlying data model. Developers usually won't mind, because they won't have to learn a data model to get what they need.
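
A minimal sketch of a prepackaged query exposed as a service might look like the following. It rests on assumptions: Python's built-in sqlite3 module stands in for the warehouse, a plain class stands in for a Web service, and CustomerNameService, get_customer_name, build_v1, build_v2 and the customer table are hypothetical names chosen for illustration. The point is only that the query lives with the data owner, so the consumer's call stays the same when the data model changes.

```python
import sqlite3


class CustomerNameService:
    """Hypothetical data service wrapping one prepackaged, parameterized query."""

    def __init__(self, conn, sql):
        self._conn = conn
        self._sql = sql  # owned by the data team; consumers never see it

    def get_customer_name(self, customer_id):
        """Stable interface: takes an id, returns a name or None."""
        row = self._conn.execute(self._sql, (customer_id,)).fetchone()
        return None if row is None else row[0]


def build_v1():
    """Original data model: the name lives in cust_name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, cust_name TEXT)")
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    return CustomerNameService(
        conn, "SELECT cust_name FROM customer WHERE id = ?"
    )


def build_v2():
    """Changed data model: the column was renamed; the owner updates the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, customer_name TEXT)"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    return CustomerNameService(
        conn, "SELECT customer_name FROM customer WHERE id = ?"
    )


def consumer_lookup(service):
    """Consumer code: identical against both versions of the data model."""
    return service.get_customer_name(1)


name_v1 = consumer_lookup(build_v1())
name_v2 = consumer_lookup(build_v2())
```

Because the query is fixed and parameterized, the data owner also knows exactly which statement will run and can tune for it, which is the predictability the paragraph above describes.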

And that's application integration - data shielded by business logic, exposed through a standard interface. With SOA, data services will become mini-applications, created by data owners to make sure that their data gets used correctly. Data integration will require a veneer of application integration.

Now, this doesn't mean that classic data integration - unbridled querying of carefully defined data structures - will go away completely. It can't. There needs to be a balance, a set of best practices that helps us choose when to prepackage queries and when to let people run free through our databases.

That's why the Integration Consortium is so important. We need a forum for practitioners, vendors and end users to talk about what works, what doesn't and what hasn't been tried yet, so we can strike the right balance when our turn comes to work on a new problem. Start the conversation by visiting www.integrationconsortium.org.

Jake Freivald is a director with iWay Software, an Information Builders company that specializes in integration and SOA. During his tenure there, he has balanced his time between data integration for business intelligence and application integration.


The Integration Consortium is a non-profit, leading industry body responsible for influencing the direction of the integration industry. Its members champion Integration Acumen by establishing standards, guidelines, best practices, research and the articulation of strategic and measurable business benefits. The Integration Consortium's motto is "Forging Integration Value." The mission of the member-driven Integration Consortium is to establish universal seamless integration that engages industry stakeholders from the business and technology communities. Among the sectors represented in the Integration Consortium membership are end-user corporations, independent software vendors (ISVs), hardware vendors, system integrators, academic institutions, non-profit institutions and individual members as well as various industry leaders. Information on the Integration Consortium is available at www.integrationconsortium.org.

Copyright 2007, SourceMedia and DM Review.