Portals eNewsletters Web Seminars dataWarehouse.com DM Review Magazine
DM Review | Covering Business Intelligence, Integration & Analytics
   Covering Business Intelligence, Integration & Analytics Advanced Search

View all Portals

Scheduled Events

White Paper Library
Research Papers

View Job Listings
Post a job


DM Review Home
Current Magazine Issue
Magazine Archives
Online Columnists
Ask the Experts
Industry News
Search DM Review

Buyer's Guide
Industry Events Calendar
Monthly Product Guides
Software Demo Lab
Vendor Listings

About Us
Press Releases
Advertising/Media Kit
Magazine Subscriptions
Editorial Calendar
Contact Us
Customer Service

High Performance Marketing:
An ETL Framework for Marketing Databases

online columnist Steve Schultz     Column published in DMReview.com
May 12, 2005
  By Steve Schultz

Steve would like to thank Paul Becker, president of Quaero Corp., for contributing this month's column.

Last month we discussed how the design of marketing databases differed from other data warehouse designs. This month we address a closely related topic, the process for updating the content of a marketing database. In data warehousing parlance, this process is known as extract, transformation and load, and commonly referred to as ETL. Figure 1 shows the nine fundamental steps in the marketing database ETL process.

Figure 1

The first step, transaction update, is an accumulation of all the customer activity around the enterprise since the last update cycle. This data is the basis for what is known as the "360-degree view of the customer." Typically this data is sourced from multiple operational systems that support sales, product delivery, customer service and other functions across the various interaction channels (stores, telephone, Web, kiosk, mail, etc.). This act of gathering similar data from disparate systems into a single common semantic format is the traditional challenge of data warehousing. In fact, in many marketing databases, this step is performed by the enterprise data warehouse system, which then acts as a single source of high quality, normalized data on customer behavior for the marketing database. In organizations where there is no enterprise data warehouse, this step must be performed by the marketing database ETL. The techniques for performing this step are well documented in the data warehousing literature so we will not elaborate here.

Perhaps the most important type of data in a marketing database is the customer contact information, such as name, e-mail address(es), phone number(s) and mailing address(es). This information in its raw form is collected through customer requests for information, account applications, the sales and ordering process and other transactions. It is available in the marketing database as a result of the transaction update step. Customer contact data quality typically is highly variable due to human participation in the data recording process. The goal of the customer contact data hygiene step is to validate and standardize this data to the greatest extent possible. Contact data hygiene functions include name and address format standardization, address delivery point validation, national change of address (NCOA) processing and reverse phone number append processing (name and address look-up from phone number). Note that processes that utilize external sources should be applied against both current cycle activity records and applicable records from previous update cycles. For example, in a reverse phone number append situation, appends should be reattempted for previously unmatched phone numbers in order to maximize the number of customer names. Customer contact data hygiene also must include the application of contact suppression indicators from both internal sources and external lists (e.g., from the national "do not call" list). The hygiene processes are important for two primary reasons. The first is that accurate contact information saves time and money during the direct marketing process. The second is that comparison of the customer contact information present on two transactions from disparate systems is often the only means to match those transactions to the same individual. Thus, it is critical that all aspects of the data hygiene step be performed prior to the customer matching process.

Customer matching links the individual activity records produced during the Transaction Update step using the clean contact information plus any available internal customer codes (e.g., account numbers or loyalty program numbers). The result is a 360 degree view of the customers' behavior, over time and across product lines and interaction channels. The algorithm used to match customer activity records to each other typically is an inexact matching process that takes into account many of the available customer contact data items to determine if two activity records with names, addresses and phone numbers that differ somewhat are in fact for the same customer. When designing this part of the ETL process, it is important to match new activity records against both the previously existing customers in the database and against each other.

The data consolidation step combines the various data from the activity records with any existing data for each individual in order to produce an up-to-date customer profile. The process typically takes into account the timestamp of the activity record as well as the trustworthiness of the data source generating the activity record when resolving multiple values from different activity records into a single customer profile data item.

Householding is the process of grouping multiple customers who either reside at the same location or are linked to each other in other ways in the atomic data into marketing households. As with customer matching, householding is often an inexact matching algorithm. It usually involves some combination of mailing address and last name, SSN if available and DUNS information if your customer is a business. Householding often includes a determination of a single customer as "head of household," or possibly one customer as "female head of household" and one customer as "male head of household" in gender-targeted marketing situations. The householding ETL logic may include computation of household level attributes and metrics derived from the multiple profiles of the customers comprising the household.

Response determination compares activity in the current cycle against previous contact history to determine if any of that activity constitutes a response to prior communications. Where offer codes are captured at the point of sale and recorded with sales transactions, response determination is straightforward. In many cases, however, less precise rules are used to infer a response, such as specific customer activities occurring shortly after a direct marketing contact prompting those activities. The ETL logic to execute these rules should be processed after the householding step so that response can be recorded at both the individual and household level.

Data enrichment overlays data acquired from external sources on the profiles of individuals and households in order to improve segmentation and targeting. The process of data enrichment typically involves extracting a file of names and addresses, sending that file to the data vendor(s), receiving the purchased data attributed back from the vendor(s) and loading those attributes into the marketing database. The process for each cycle should include steps to enrich data for customers who are new to the database in that cycle and to refresh attributes for all individuals on a periodic basis.

At this point in the cycle, all relevant atomic data has been updated in the marketing database and organized by individual and household. Marketing segmentation often involves more complex metrics spanning simple aggregations of activity over time to complex statistical models designed to predict specific behaviors. A segmentation metric often originates during the planning of a particular marketing campaign, gets reused for other campaigns and eventually gets migrated into the production ETL stream if it has broad marketing applicability. The customer scoring ETL step computes each of these production level metrics. Considerations for this step include the amount of metrics history to keep (e.g., customer profitability changes over time) and the frequency of recomputation for each metric (e.g., are scores for a model recomputed when model inputs change or periodically regardless of atomic data change).

One of the most important types of content in a marketing database is the history of direct marketing contacts originated by the organization. This history may be obtained from various sources including internal "customer touchpoint" systems, third-party direct marketers (e.g., direct mail houses, e-mail providers and telemarketers) and campaign management software applications. The contact history update ETL step must ensure the complete record of communications for each individual and household is linked to all other information in the marketing database about that individual and household, and linked to the other major dimensions of the database such as product, promotion and marketing channel. This step may involve ETL to move promotion data from campaign development systems into the marketing database to accomplish the contact-to-promotion linkage.

The marketing database ETL framework discussed above provides an architectural guide and checklist for designing an ETL subsystem for any marketing database used to perform relationship marketing.

Quaero President Paul Becker brings 25 years of experience developing and implementing complex information systems, with the most recent 10 years focused on delivering CRM solutions for Fortune 500 clients in the financial services and telecommunications industries. Becker leads Quaero's talented staff of marketing, analytic and technology consultants to provide integrated CRM solutions to Quaero's clients. Quaero's service offerings are focused on leveraging client's customer information to enhance the value of their customer relationships. Specific services include customer data repository implementation, marketing automation, marketing analytics and customer contact programs. For more information, visit www.quaero.com.


For more information on related topics visit the following related portals...
Database Marketing and ETL.

Steve Schultz is a leading customer relationship management (CRM) practitioner who combines an understanding of information technology with extensive business process design experience and information-based decision-making methodologies. As executive VP of Client Services for Quaero (www.quaero.com), he helps clients identify, justify, implement and leverage leading edge analytical CRM environments to create or/and improve their database marketing capabilities. Schultz has worked with companies in the financial services, telecommunications, retail, publishing and hospitality industries. Contact him at schultzs@quaero.com.

Solutions Marketplace
Provided by IndustryBrains

Customer Relationship Management for IT
Web-based CRM and more with Autotask: Great business management software optimizes resources and track billable project and service work. Get a demo, then try it free with sample data. Click here for your free trial!

Numara Track-It! Help Desk & CRM Software
Numara IT Solutions provides Track-It! - the leading help desk software solution for employee & customer self-help, call tracking, problem resolution, remote control, asset management, LAN/PC auditing, and electronic software distribution. Free demo

Stop Bad Data from Entering Your CRM Database
Protect against fraud, waste and high marketing costs by cleaning your customer database of inaccurate, incomplete or undeliverable addresses. Add on phone checking, name parsing and geocoding as needed. FREE trial of Data Quality dev tools here.

ACT! CRM Helps You Help Clients.
Discover how ACT! Database automation can make your life easier. Learn More.

Help Desk On-Demand
CRMdesk automates online customer support and knowledge base creation through the Internet. It is a highly efficient technology solution for companies looking to dramatically improve quality and reduce costs of Web-based customer support.

Click here to advertise in this space

E-mail This Column E-Mail This Column
Printer Friendly Version Printer-Friendly Version
Related Content Related Content
Request Reprints Request Reprints
Site Map Terms of Use Privacy Policy
SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.
SourceMedia is an Investcorp company.
Use, duplication, or sale of this service, or data contained herein, is strictly prohibited.