Which CDC method is best for building a staging database with changed data?
Question: I'm trying to implement change data capture (CDC) on an Oracle source database to feed a real-time staging database, and then load the changed data into a specific "data layer" through an extract, transform and load (ETL) tool for intra-day reporting. I found two types of CDC methods: synchronous and asynchronous. Asynchronous is further classified into four methods: hotlog, distributed hotlog, autolog archived and autolog online.
I've implemented all the methods on small source databases, but I'm not able to judge the performance of the individual methods.
My question is: which method is best for achieving my goal (a staging database with changed data), taking into consideration performance, latency and the impact on the source database?
Chuck Kelley's Answer: I think that it depends on your requirements and how soon the data has to be in the data warehouse. If it must be immediate and always in sync with the source system, then use synchronous. The negative is that it will slow down your source systems, since you are in effect applying a two-phase commit between source and staging. I have found that, in most cases, distributed hotlog is probably the best of the four asynchronous methods.
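To make the synchronous trade-off concrete, here is a minimal sketch of trigger-based capture, where each change is recorded inside the same transaction as the source write. It uses Python's built-in sqlite3 module as a stand-in and is not Oracle's CDC interface. The table names (orders, orders_changes), the triggers and the helper functions are all hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: a source table plus a change table filled by triggers.
# SQLite stands in for the source database here; this is not Oracle CDC.
SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE orders_changes (
    change_id INTEGER PRIMARY KEY AUTOINCREMENT,
    op        TEXT NOT NULL,      -- 'I' = insert, 'U' = update
    order_id  INTEGER NOT NULL,
    amount    REAL
);
CREATE TRIGGER orders_ins AFTER INSERT ON orders
BEGIN
    INSERT INTO orders_changes (op, order_id, amount)
    VALUES ('I', NEW.id, NEW.amount);
END;
CREATE TRIGGER orders_upd AFTER UPDATE ON orders
BEGIN
    INSERT INTO orders_changes (op, order_id, amount)
    VALUES ('U', NEW.id, NEW.amount);
END;
"""


def read_changes(conn, last_seen):
    """Return change rows newer than last_seen, oldest first."""
    cur = conn.execute(
        "SELECT change_id, op, order_id, amount FROM orders_changes "
        "WHERE change_id > ? ORDER BY change_id",
        (last_seen,),
    )
    return cur.fetchall()


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)

    # Committed insert: the trigger writes the change row in the same transaction.
    conn.execute("INSERT INTO orders (id, amount) VALUES (1, 10.0)")
    conn.commit()

    # Committed update: captured the same way.
    conn.execute("UPDATE orders SET amount = 12.5 WHERE id = 1")
    conn.commit()

    # Rolled-back insert: the change row is rolled back with it,
    # so the change table never gets ahead of the source.
    conn.execute("INSERT INTO orders (id, amount) VALUES (2, 99.0)")
    conn.rollback()

    changes = read_changes(conn, 0)
    conn.close()
    return changes


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

In this sketch an ETL job would call read_changes with the last change_id it has processed. Every source write also pays for the trigger's extra insert, which is the kind of source-side slowdown described above. The asynchronous methods avoid that cost by reading the database logs after the fact, at the price of some latency, and are not modeled here.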
Chuck Kelley is an internationally known expert in database and data warehousing technology. He has 30 years of experience in designing and implementing operational/production systems and data warehouses. Kelley has worked in some facet of the design and implementation phase of more than 50 data warehouses and data marts. He also teaches seminars, has co-authored four books on data warehousing and has been published in many trade magazines on database technology, data warehousing and enterprise data strategies. He can be contacted at chuckkelley@usa.net.