Is real-time integration or staging generally a better approach for enterprise-class software?
Question: Data-level integration falls into two major categories: real time and staging. Which one is generally a better approach for enterprise-class software? If you choose real time, how do you deal with scalability issues and overload of the application server? If you choose staging, how do you deal with synchronization, especially the lag in data updates? Is there a way to have the best of both worlds?

Chuck Kelley's Answer: First, I am not a big fan of the real-time data warehouse, although I am starting to change my opinion (it will still take more time!). You have picked two good points to list here, both of which need to be taken into consideration when making the decision. However, I believe that you first need to understand what the business requirement is. Do they (really!) need real-time analysis, or are they looking to use the data warehouse/data integration to get a new transaction system that they believe they need? I happen to think that synchronization is less of an issue than scalability/overload, but I am sure others will disagree.

Les Barbusinski's Answer: Generally speaking, not all information needs to be real time. Furthermore, not all real-time information has to be "real" real time ... it can be "near" real time (with some acceptable latency built in). Usually, the more strategic the information is, the more processing it will require (i.e., cleansing, integration, aggregation, statistical computation, pattern recognition, etc.) and the longer the latency factor will be. The more tactical the information is, the less processing it will require (i.e., "give it to me raw") and the less time it will take to deliver it. End users and managers are often willing to wait for strategic information, but will not tolerate delays in receiving tactical information. Given this perspective, you need to "triage" your integration requirements and apply the right technology for each requirement.

The "raw" tactical data that needs to be available immediately can be culled from the database log or trapped by an online application, minimally integrated with some look-up data by an EAI tool, sped to the end user via a message bus and delivered to a dashboard application or a wireless device in seconds. The strategic data, on the other hand, must be trapped in periodic snapshots, heavily scrubbed, integrated with historical data, aggregated over time and interpreted (using ETL, BI and statistical analysis tools) before being delivered overnight (or periodically throughout the day) to the end user as a report, chart, e-mail or alert.

Each approach (real time vs. batch) requires different hardware, software and methodologies. If you "triage" your integration requirements, you can mitigate the stress on the application servers and still provide the "right information to the right person at the right time." Hope this helps.
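The "triage" step described in the answer above can be sketched roughly in code. This is only an illustration, not a method taken from either expert: the `Requirement` fields, the `triage` function and both latency thresholds (60 seconds and one hour) are hypothetical and would need to come from the actual business requirements.

```python
from dataclasses import dataclass
from enum import Enum


class Path(Enum):
    """Delivery paths loosely matching the answer above (names are illustrative)."""
    REAL_TIME = "log capture / EAI / message bus"
    NEAR_REAL_TIME = "frequent small batches"
    BATCH = "periodic snapshot / ETL / BI"


@dataclass(frozen=True)
class Requirement:
    name: str
    max_latency_seconds: float    # how long the consumer will tolerate waiting
    needs_heavy_processing: bool  # cleansing, aggregation, history, statistics


def triage(req: Requirement,
           real_time_limit: float = 60.0,     # assumed threshold
           near_limit: float = 3600.0) -> Path:  # assumed threshold
    """Assign one integration requirement to a delivery path."""
    # Strategic data needs scrubbing and aggregation, so it goes to batch.
    # A requirement that wants heavy processing AND tiny latency is a
    # conflict to take back to the business; here it still lands in batch.
    if req.needs_heavy_processing or req.max_latency_seconds > near_limit:
        return Path.BATCH
    # Tactical "give it to me raw" data with a tight latency budget.
    if req.max_latency_seconds <= real_time_limit:
        return Path.REAL_TIME
    # Everything in between tolerates some built-in latency.
    return Path.NEAR_REAL_TIME


if __name__ == "__main__":
    for r in (Requirement("order status dashboard", 5, False),
              Requirement("inventory refresh", 900, False),
              Requirement("monthly sales trend", 86400, True)):
        print(f"{r.name}: {triage(r).name}")
```

Sorting requirements this way keeps the real-time path limited to the small set of feeds that genuinely need it, which is how the triage relieves load on the application servers.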
Chuck Kelley is an internationally known expert in database and data warehousing technology. He has 30 years of experience in designing and implementing operational/production systems and data warehouses. Kelley has worked in some facet of the design and implementation phase of more than 50 data warehouses and data marts. He also teaches seminars, has co-authored four books on data warehousing and has been published in many trade magazines on database technology, data warehousing and enterprise data strategies. He can be contacted at chuckkelley@usa.net.
Les Barbusinski is vice president of technology and co-founder of Digital Symmetry, LLC, a consulting firm that specializes in the design and development of data warehousing and business intelligence solutions. He has more than 20 years of experience in data warehouse and operational systems development and provides hands-on expertise in data warehouse design, development and project management. Les can be reached at dwexpert@dsym.com.