Data Warehouse vs Data Mart
A Data Warehouse is a central store where data is kept for more convenient mining. It generally runs on a fast computer system with very large data storage capacity. Data from all the company's systems is copied to the Data Warehouse, where it is scrubbed and reconciled to remove redundancy and conflicts. Complex queries can then be made against the information stored in the Warehouse.
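As a rough illustration of what "scrubbed and reconciled" can mean in practice, the sketch below merges customer records from two source systems. Everything in it is an assumption made for the example: the record layout, the `crm` and `billing` source names, the choice of a normalized e-mail address as the matching key, and the rule that the most recently updated record wins a conflict. Real warehouses apply far richer rules.

```python
from datetime import date

def normalize_email(email):
    """Scrub step: trim whitespace and lowercase so equal addresses compare equal."""
    return email.strip().lower()

def reconcile(*sources):
    """Merge records from several source systems into one record per customer.

    Redundant copies collapse onto the same normalized e-mail key; when two
    systems disagree, the record with the later 'updated' date is kept
    (an assumed rule, chosen only for this example).
    """
    merged = {}
    for records in sources:
        for rec in records:
            key = normalize_email(rec["email"])
            current = merged.get(key)
            if current is None or rec["updated"] > current["updated"]:
                merged[key] = {**rec, "email": key}
    return sorted(merged.values(), key=lambda r: r["email"])

# Hypothetical extracts from two operational systems.
crm = [{"email": "Ann@Example.com ", "city": "Boston", "updated": date(2023, 5, 1)}]
billing = [
    {"email": "ann@example.com", "city": "Chicago", "updated": date(2024, 1, 10)},
    {"email": "bob@example.com", "city": "Denver", "updated": date(2023, 11, 2)},
]

customers = reconcile(crm, billing)
for customer in customers:
    print(customer["email"], customer["city"])
```

The two records for Ann collapse into one, and the newer billing address replaces the older CRM address.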
A Data Mart is an index and extraction system. Rather than bring all the company's data into a single warehouse, the data mart knows what data each database contains and how to extract information from multiple databases when asked.
The basic difference between a data warehouse and a data mart is that the former is usually designed with an enterprise perspective, while the latter is usually created with a built-in organizational or functional bias (i.e., it is designed to generate a particular set of metrics from a specific business perspective). Data marts frequently obtain all or most of their data from a data warehouse.
Creating and maintaining a Data Warehouse is a huge job even for the largest companies. It can take a long time and cost a lot of money. In fact, it is such a major project that some companies are turning to Data Mart solutions instead.
Creating a Data Mart can be considered the "quick and dirty" solution, because the data from different databases is not scrubbed and reconciled, but it may be the difference between having information available and not having it available.
Another difference is that a data warehouse consists of many different types of data structures (staging, ODS, extracts, etc.), while a data mart typically consists of a single data structure (such as a star schema, snowflake schema or hypercube).
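To make the "single data structure" idea concrete, here is a minimal sketch of a star schema: one fact table surrounded by dimension tables, built in an in-memory SQLite database. The table names, column names and sample rows are invented for illustration and do not come from any particular product.

```python
import sqlite3

def build_star_schema(conn):
    """Create a tiny star schema: one fact table keyed to two dimension tables."""
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity INTEGER,
            amount REAL);
    """)
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(20240101, 2024, 1), (20240201, 2024, 2)])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Widget", "Hardware"), (2, "Manual", "Books")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(20240101, 1, 3, 30.0),
                      (20240101, 2, 1, 12.5),
                      (20240201, 1, 2, 20.0)])

def sales_by_category(conn):
    """Typical mart query: join the fact table to a dimension and aggregate."""
    return conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()

conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(sales_by_category(conn))
```

Every analytical query follows the same shape: start at the fact table, join out to whichever dimensions the question needs, then aggregate. A snowflake schema would further split the dimensions (for example, moving `category` into its own table).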
The data warehouse is used as a back-end data store that allows data marts or cubes to be redesigned or replaced to meet changing business requirements or focus. Data marts and cubes can be completely regenerated from the detail-level information contained in the data warehouse. The data marts or cubes can therefore focus on optimizing performance for end-user analysis (limiting data content, aggregating, deriving data, categorizing data, fragmenting the database) without being coupled to back-end ETL processing requirements.
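The regeneration idea can be sketched as follows, again with SQLite standing in for both the warehouse and the mart. The `wh_sales_detail` and `mart_sales_by_region` tables and their columns are hypothetical; the point is only that the mart holds nothing that cannot be rebuilt from warehouse detail, so it can be dropped and recreated at will.

```python
import sqlite3

def load_warehouse_detail(conn):
    """Stand-in for the warehouse: one row per order, at the atomic level."""
    conn.execute("""
        CREATE TABLE wh_sales_detail (
            order_id INTEGER PRIMARY KEY, region TEXT, sold_on TEXT, amount REAL)
    """)
    conn.executemany("INSERT INTO wh_sales_detail VALUES (?, ?, ?, ?)",
                     [(1, "East", "2024-01-15", 100.0),
                      (2, "East", "2024-01-20", 50.0),
                      (3, "West", "2024-01-05", 70.0),
                      (4, "East", "2024-02-01", 25.0)])

def rebuild_sales_mart(conn):
    """Drop the mart and regenerate it as a monthly aggregate of the detail rows."""
    conn.executescript("""
        DROP TABLE IF EXISTS mart_sales_by_region;
        CREATE TABLE mart_sales_by_region AS
        SELECT region,
               strftime('%Y-%m', sold_on) AS month,
               SUM(amount) AS total_amount,
               COUNT(*) AS order_count
        FROM wh_sales_detail
        GROUP BY region, month;
    """)

def read_mart(conn):
    """Return the mart contents in a stable order."""
    return conn.execute("""
        SELECT region, month, total_amount, order_count
        FROM mart_sales_by_region
        ORDER BY region, month
    """).fetchall()

conn = sqlite3.connect(":memory:")
load_warehouse_detail(conn)
rebuild_sales_mart(conn)
print(read_mart(conn))
```

If the business later wants the mart grouped by week or by product instead, only `rebuild_sales_mart` changes; the warehouse detail and the processes that load it are untouched.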
A data warehouse is a logical concept: it houses the atomic data (and some aggregated or summarized data) for strategic analysis. Data marts are fed from the data warehouse with a subset of its data, often aggregated or summarized, both to improve performance and to bring the data closer to the user.
Data marts are two-tier warehouses, consisting of operational source systems and the data mart that can quickly be tailored to individual user needs. Vendors often offer data mart suites as integrated, packaged offerings that include data access middleware for extracting data from heterogeneous data sources, a warehouse RDBMS for the consolidation of the extracted volumes and a query tool to turn that data into information. Most also provide a GUI control point from which an administrator or technical business analyst can define and manage data source extractions, transformations and enrichments.
The enterprise data warehouse model is a three-tier model that includes your data sources, a single central data warehouse (what you referred to as a data warehouse), and one or more departmental data marts. The central data warehouse is the heart of this model. It is the point at which the data model and process models merge, transforming raw transaction data into useful information. If all data marts are sourced from this consolidation point, you ensure that they all receive the same integrated, time-consistent, cleansed and reconciled data. This central data warehouse usually consists of normalized data structures or tables and is the point at which the temporal element is first introduced to transactional data. Data here is still generic and must be of global interest, since all of the very specific departmental data marts are sourced from it.
In the enterprise data warehouse model, data will move from the central data warehouse to the individual data marts where it is optimized for the needs of specific users and the tools that will be used for analysis.