
Canonical Data Model

Design Challenge

A number of new buzzwords are popping up in our field. One term I have heard quite frequently lately is “canonical data model.” I recently explained this term to a fellow Design Challenger, but I was interested in other explanations as well.

The Challenge

Please define “canonical data model” and give an example.

The Response

My definition of canonical data model, expanded with input from our Design Challengers, is as follows:

The canonical data model is the definition of a standard organization view of a particular subject, plus the mapping back to each application view of this same subject. The standard organization view is traditionally built using simple yet useful structures. Employee and Contractor, for example, might be represented as Person Role; Order and Credit as Event; Warehouse and Distribution Point as Site. The canonical data model is frequently implemented as an XML hierarchy. Specific uses include delivering enterprise-wide business intelligence (BI), defining a common view within a service-oriented architecture (SOA) and streamlining software interfaces.
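As a rough illustration of a standard view plus its mappings, the sketch below folds an Employee record and a Contractor record into one Person Role structure. The class name, the two application record layouts and their field names (emp_name, contractor_full_name) are hypothetical, invented only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonRole:
    """Canonical (organization-wide) view of a person acting in a role."""
    person_name: str
    role: str


# Mapping from a hypothetical HR application's view of an employee.
def from_hr_employee(record: dict) -> PersonRole:
    return PersonRole(person_name=record["emp_name"], role="Employee")


# Mapping from a hypothetical procurement application's view of a contractor.
def from_procurement_contractor(record: dict) -> PersonRole:
    return PersonRole(person_name=record["contractor_full_name"], role="Contractor")


employee = from_hr_employee({"emp_name": "Ana Ruiz", "emp_no": "17"})
contractor = from_procurement_contractor({"contractor_full_name": "Li Wei"})
```

Both application views end up in the same simple structure, and the only thing that distinguishes an employee from a contractor is the value of the role field.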

Figure 1 is an oversimplified example of the use of a canonical data model. The “before” view shows point-to-point interfaces, each of which needs to be aware of how the target system sees its world. In the “after” view, on the other hand, the canonical data model knows how each system sees its world and therefore can translate between any two systems.


I’d like to explore the boldfaced terms in this definition (standard, mapping, simple yet useful and hierarchy) in more detail.

Standard. Bob Schork, metadata architect, states, “Canonical means the accepted and only acceptable standard of a system. Likewise, a canonical data model would be the accepted structure for an application system. It promotes reusability.” Claire Frankel, EDM manager, equates canonical data model with the reference or ruling data model and states: “When referring to an enterprise, the canonical data model is the basic or fundamental logical model of the firm’s business. When referring to data modeling itself, a canonical data model is one of the known, industry-standard models for that industry or business.” Ralph Nijpels, business analyst, mentions that canonical models typically have company-wide scope that describes terms, their definitions and their relations in the language of the business. Nandi Iyer, solutions architect, agrees: “The canonical data model unifies information fragments at an enterprise level to facilitate consistent data usage for enterprise integration.”

Mapping. Instead of writing translators between each and every pair of applications, it is sufficient just to write a translator between each application format and the canonical format. Craig Jordan, advisor, offers this analogy: “Some nations are comprised of people who speak many different tribal languages. In these cases, a national language can sometimes provide a means for communication between tribes that is not prejudiced toward any particular group. In the realm of information systems, the data or information models that are specific to a particular application are tribal, and one that is independent from them all is canonical.”
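The mapping idea can be sketched as a small hub-and-spoke translator. This is a minimal sketch under stated assumptions: the three applications (hr, payroll, crm), their field names and the one-field canonical record are all hypothetical. Each application supplies one mapping to the canonical format and one mapping from it, and any pair of applications is translated by going through the canonical format.

```python
from typing import Callable, Dict

Record = Dict[str, str]

# One translator per application INTO the canonical format (hypothetical fields).
TO_CANONICAL: Dict[str, Callable[[Record], Record]] = {
    "hr": lambda r: {"person_name": r["emp_name"]},
    "payroll": lambda r: {"person_name": r["payee"]},
    "crm": lambda r: {"person_name": r["contact"]},
}

# One translator per application OUT OF the canonical format.
FROM_CANONICAL: Dict[str, Callable[[Record], Record]] = {
    "hr": lambda c: {"emp_name": c["person_name"]},
    "payroll": lambda c: {"payee": c["person_name"]},
    "crm": lambda c: {"contact": c["person_name"]},
}


def translate(record: Record, source: str, target: str) -> Record:
    """Translate between any two applications via the canonical format."""
    return FROM_CANONICAL[target](TO_CANONICAL[source](record))


def point_to_point_count(n: int) -> int:
    """One-way translators needed when every application talks to every other."""
    return n * (n - 1)


def canonical_count(n: int) -> int:
    """One-way translators needed when each application maps to and from the hub."""
    return 2 * n


crm_view = translate({"emp_name": "Ana Ruiz"}, source="hr", target="crm")
```

With 10 applications, the point-to-point approach needs 90 one-way translators and the canonical approach needs 20. The gap widens as applications are added, because the first count grows quadratically and the second grows linearly.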

Simple yet useful. Sathsh Parameshwara, BI architect, says that the canonical data model is a generic data model that can be plugged into any platform without any dependency on applications used. Lee LeClair, senior system engineer, states, “The term means a data model that conforms to acceptable practices and is in its simplest form.” Steve White, information architect, adds, “A canonical data model is one that’s abstracted, that is to say not linked to a specific application.” Jeff Lawyer, senior data architect, adds, “A canonical data model is an overall, basic and generally indisputable data model for an enterprise, sufficiently high-level enough to be boundary, organization and application independent.”

Hierarchy. Jeff Pekrul, data architect, says that a canonical schema can be a physical model that is typically an XML schema (i.e., hierarchical) and intended for use in data integration applications. He states, “Much of the confusion about the term ‘canonical’ relates to the distinction between canonical schemas - typically XSDs - and logical data models from which these may or may not be derived.”
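To make the hierarchical form concrete, here is a minimal sketch that serializes canonical structures as a small XML hierarchy using Python's standard xml.etree.ElementTree module. The element and attribute names (Site, PersonRole, PersonName, name, role) and the decision to nest person roles under a site are assumptions made for illustration; they do not come from any particular canonical schema or XSD.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def site_to_xml(site_name: str, person_roles: List[Tuple[str, str]]) -> ET.Element:
    """Build a Site element containing one PersonRole child per (name, role) pair."""
    site = ET.Element("Site", attrib={"name": site_name})
    for person_name, role in person_roles:
        person_role = ET.SubElement(site, "PersonRole", attrib={"role": role})
        ET.SubElement(person_role, "PersonName").text = person_name
    return site


doc = site_to_xml(
    "Warehouse 7",
    [("Ana Ruiz", "Employee"), ("Li Wei", "Contractor")],
)

# Serialize the hierarchy to text, for example to send over an interface.
xml_text = ET.tostring(doc, encoding="unicode")
```

In practice a canonical schema of this kind would usually be governed by an XSD, which may or may not be derived from a logical data model, as Pekrul notes.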

If you would like to become a Design Challenger and have the opportunity to submit modeling solutions, please add your email address at http://www.stevehoberman.com/. There is also an overview on how to read a data model at my Web site.


Steve Hoberman has worked as a business intelligence and data management practitioner and trainer since 1990. He is a Certified Business Intelligence Professional (CBIP), having achieved mastery level certification in data analysis and design. He is a popular and frequent presenter at industry conferences, both nationally and internationally. Hoberman is a columnist and frequent contributor to industry publications, as well as the author of Data Modeler's Workbench and Data Modeling Made Simple (available for purchase through the DM Review bookstore). He is the founder of the Design Challenges group, inventor of the Data Model Scorecard and a recognized innovator and thought leader in the field of data modeling. He can be reached at me@stevehoberman.com.

