When do you use a star schema and when do you use a snowflake schema?
Question: Could you please let me know when to use a star schema and when to use a snowflake schema? What are the major differences between them?
Chuck Kelley's Answer: My personal opinion is to use the star by default, but if the product you are using for the business community prefers a snowflake, then I would snowflake it. The major difference between snowflake and star is that a snowflake will have multiple tables for a dimension, while a star will have a single table. For example, your company structure might be:
Corporate → Region → Department → Store
In a star schema, you would collapse those into a single "store" dimension. In a snowflake, you would keep them apart with the store connecting to the fact.
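The two layouts can be compared side by side. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database. All table names, column names and sample rows are hypothetical and chosen only for illustration. It builds the Corporate → Region → Department → Store hierarchy once as a single "store" dimension and once as separate tables, then totals sales by region in each.

```python
import sqlite3

# Hypothetical names and made-up sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: the whole hierarchy is collapsed into one "store" dimension.
CREATE TABLE store_dim (
    store_key  INTEGER PRIMARY KEY,
    store      TEXT,
    department TEXT,
    region     TEXT,
    corporate  TEXT
);
CREATE TABLE sales_fact_star (
    store_key INTEGER REFERENCES store_dim(store_key),
    amount    REAL
);
INSERT INTO store_dim VALUES
    (1, 'Store A', 'Grocery',  'East', 'Acme'),
    (2, 'Store B', 'Grocery',  'East', 'Acme'),
    (3, 'Store C', 'Hardware', 'West', 'Acme');
INSERT INTO sales_fact_star VALUES (1, 100.0), (2, 50.0), (3, 25.0);

-- Snowflake: one table per level; only the store table connects to the fact.
CREATE TABLE corporate_sf (
    corporate_key INTEGER PRIMARY KEY,
    corporate     TEXT
);
CREATE TABLE region_sf (
    region_key    INTEGER PRIMARY KEY,
    region        TEXT,
    corporate_key INTEGER REFERENCES corporate_sf(corporate_key)
);
CREATE TABLE department_sf (
    department_key INTEGER PRIMARY KEY,
    department     TEXT,
    region_key     INTEGER REFERENCES region_sf(region_key)
);
CREATE TABLE store_sf (
    store_key      INTEGER PRIMARY KEY,
    store          TEXT,
    department_key INTEGER REFERENCES department_sf(department_key)
);
CREATE TABLE sales_fact_snow (
    store_key INTEGER REFERENCES store_sf(store_key),
    amount    REAL
);
INSERT INTO corporate_sf VALUES (1, 'Acme');
INSERT INTO region_sf VALUES (1, 'East', 1), (2, 'West', 1);
INSERT INTO department_sf VALUES (1, 'Grocery', 1), (2, 'Hardware', 2);
INSERT INTO store_sf VALUES (1, 'Store A', 1), (2, 'Store B', 1), (3, 'Store C', 2);
INSERT INTO sales_fact_snow VALUES (1, 100.0), (2, 50.0), (3, 25.0);
""")

# Star: a single join from the fact to the flattened dimension.
star_totals = conn.execute("""
    SELECT d.region, SUM(f.amount)
    FROM sales_fact_star f
    JOIN store_dim d ON d.store_key = f.store_key
    GROUP BY d.region
    ORDER BY d.region
""").fetchall()

# Snowflake: the same question walks up the hierarchy one table at a time.
snow_totals = conn.execute("""
    SELECT r.region, SUM(f.amount)
    FROM sales_fact_snow f
    JOIN store_sf s      ON s.store_key = f.store_key
    JOIN department_sf d ON d.department_key = s.department_key
    JOIN region_sf r     ON r.region_key = d.region_key
    GROUP BY r.region
    ORDER BY r.region
""").fetchall()

print(star_totals)
print(snow_totals)
```

Both queries return the same totals. The difference is in the number of joins: the star needs one, while the snowflake needs one per level between the fact and the level being reported on.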
Joe Oates' Answer: First of all, some definitions are in order. In a star schema, dimensions that reflect a hierarchy are flattened into a single table. For example, a star schema Geography Dimension would have columns like country, state/province, city and postal code. In the source system, this hierarchy would probably be normalized into multiple tables with one-to-many relationships.
A snowflake schema does not flatten a hierarchy dimension into a single table. It would, instead, have two or more tables with a one-to-many relationship. This is a more normalized structure. For example, one table may have state/province and country columns and a second table would have city and postal code. The table with city and postal code would have a many-to-one relationship to the table with the state/province columns.
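The relationship between the two forms of the Geography Dimension can be sketched as follows. This is an illustrative example only, assuming Python's built-in sqlite3 module. The table names, key columns and sample rows are hypothetical. It starts from the two-table snowflake layout described above and flattens it into a single star-style table with one join.

```python
import sqlite3

# Hypothetical names and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Snowflake-style geography: two tables with a one-to-many relationship.
CREATE TABLE state_province (
    state_key      INTEGER PRIMARY KEY,
    state_province TEXT,
    country        TEXT
);
CREATE TABLE city (
    city_key    INTEGER PRIMARY KEY,
    city        TEXT,
    postal_code TEXT,
    state_key   INTEGER REFERENCES state_province(state_key)
);
INSERT INTO state_province VALUES
    (1, 'Colorado', 'USA'),
    (2, 'Ontario',  'Canada');
INSERT INTO city VALUES
    (10, 'Denver',  '80202',   1),
    (11, 'Boulder', '80301',   1),
    (12, 'Toronto', 'M5H 2N2', 2);

-- Star-style geography: the same hierarchy flattened into one table.
CREATE TABLE geography_dim AS
SELECT c.city_key AS geography_key,
       s.country,
       s.state_province,
       c.city,
       c.postal_code
FROM city c
JOIN state_province s ON s.state_key = c.state_key;
""")

flat_rows = conn.execute("""
    SELECT country, state_province, city, postal_code
    FROM geography_dim
    ORDER BY geography_key
""").fetchall()

for row in flat_rows:
    print(row)
```

In the flattened table the state/province and country values repeat on every city row. That redundancy is the price the star pays for avoiding the extra join.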
There are some good reasons for snowflake dimension tables. One example is a company that has many types of products. Some products have a few attributes; others have a great many. The products are very different from each other. The thing to do here is to create a core Product dimension that has common attributes for all the products such as product type, manufacturer, brand, product group, etc. Create a separate sub-dimension table for each distinct group of products where each group shares common attributes. The sub-product tables must contain a foreign key to the core Product dimension table.
One of the criticisms of using snowflake dimensions is that it is difficult for some of the multidimensional front-end presentation tools to generate a query on a snowflake dimension. However, you can create a view for each combination of the core product/sub-product dimension tables and give the view a suitably descriptive name (Frozen Food Product, Hardware Product, etc.) and then these tools will have no problem.
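The core product/sub-product design and its views might look like the sketch below. It assumes Python's built-in sqlite3 module. The sub-dimension attributes (storage temperature, shelf life, material, warranty) and all sample rows are hypothetical, since the answer does not specify them. Only the core attributes and the view names come from the text above.

```python
import sqlite3

# Hypothetical sub-dimension columns and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Core Product dimension: attributes common to every product.
CREATE TABLE product_dim (
    product_key   INTEGER PRIMARY KEY,
    product_type  TEXT,
    manufacturer  TEXT,
    brand         TEXT,
    product_group TEXT
);

-- One sub-dimension per distinct group of products,
-- each carrying a foreign key to the core Product dimension.
CREATE TABLE frozen_food_sub (
    product_key     INTEGER PRIMARY KEY REFERENCES product_dim(product_key),
    storage_temp_c  INTEGER,
    shelf_life_days INTEGER
);
CREATE TABLE hardware_sub (
    product_key     INTEGER PRIMARY KEY REFERENCES product_dim(product_key),
    material        TEXT,
    warranty_months INTEGER
);

INSERT INTO product_dim VALUES
    (1, 'Frozen Food', 'Polar Foods', 'IcyPeas', 'Grocery'),
    (2, 'Hardware',    'IronWorks',   'GripPro', 'Tools');
INSERT INTO frozen_food_sub VALUES (1, -18, 365);
INSERT INTO hardware_sub VALUES (2, 'steel', 24);

-- One view per core/sub-product combination, so a front-end tool
-- sees each combination as a single flat dimension.
CREATE VIEW frozen_food_product AS
SELECT p.product_key, p.product_type, p.manufacturer, p.brand, p.product_group,
       f.storage_temp_c, f.shelf_life_days
FROM product_dim p
JOIN frozen_food_sub f ON f.product_key = p.product_key;

CREATE VIEW hardware_product AS
SELECT p.product_key, p.product_type, p.manufacturer, p.brand, p.product_group,
       h.material, h.warranty_months
FROM product_dim p
JOIN hardware_sub h ON h.product_key = p.product_key;
""")

# The tool queries the view as if it were one table.
frozen = conn.execute(
    "SELECT brand, storage_temp_c, shelf_life_days FROM frozen_food_product"
).fetchall()
hardware = conn.execute(
    "SELECT brand, material, warranty_months FROM hardware_product"
).fetchall()

print(frozen)
print(hardware)
```

Each view returns only the products in its own group, with the common and the group-specific attributes on the same row.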
Chuck Kelley is an internationally known expert in database and data warehousing technology. He has 30 years of experience in designing and implementing operational/production systems and data warehouses. Kelley has worked in some facet of the design and implementation phase of more than 50 data warehouses and data marts. He also teaches seminars, has co-authored four books on data warehousing and has been published in many trade magazines on database technology, data warehousing and enterprise data strategies. He can be contacted at chuckkelley@usa.net.
Joe Oates is an internationally known speaker, author and consultant on data warehousing. Oates has more than 30 years of experience in the successful management and technical development of business, real-time and data warehouse applications for industry and government clients. He has designed or helped design and implement more than 30 successful data warehouse projects.