Data Warehouse Overview
Information Technology (IT) has historically influenced organizational performance and competitive standing. The increasing processing power and sophistication of analytical tools and techniques have laid a strong foundation for the data warehouse. There are a number of reasons for an organization to consider a data warehouse, which can be a critical tool for maximizing the organization's investment in the information it has collected and stored throughout the enterprise. IT managers need to understand the rationale and benefits of data warehouses because they may need to design and implement, or procure, this centerpiece of business intelligence.
Data warehouses are intended to provide storage, functionality and responsiveness to queries beyond the capabilities of today's transaction-oriented databases. They are also intended to improve data access performance. Traditional databases balance the requirement of data access with the need to ensure integrity of data. In present-day organizations, users of data are often completely removed from the data sources. Many people need only read access to data, but still need very rapid access to a larger volume of data than can conveniently be downloaded to the desktop. Often such data comes from multiple databases. Because many of the analyses performed are recurrent and predictable, software vendors and systems support staff have begun to design systems to support these functions. Decision makers from middle management upward now need information at the correct level of detail to support decision-making. Data warehousing, online analytical processing (OLAP) and data mining provide this functionality.
Data Warehouse - An Introduction
A data warehouse is defined as a subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management's decisions. More generally, data warehousing is a collection of decision support technologies aimed at enabling knowledge workers, such as executives, managers, and analysts, to arrive at better and faster decisions. Data warehouses provide access to data for complex analysis, knowledge discovery, and decision-making. They support high performance demands on an organization's data and information. A data warehouse provides an enormous amount of historical and static data from three tiers:
- Relational databases
- Multidimensional OLAP applications
- Client analysis tools
Data warehouses support several types of applications, such as online analytical processing (OLAP), decision-support systems (DSS) and data mining.
OLAP is a term used to describe the analysis of complex data from the data warehouse. OLAP is a software technology that allows users to easily and quickly analyze and view data from multiple points-of-view. OLAP provides dynamic and multi-dimensional support to executives and managers who need to understand different aspects of the data. Activities that are supported include:
- Analyzing financial trends
- Creating slices of data
- Finding new relationships among the data
- Drilling down into sales statistics
- Performing calculations across different dimensions, where each category of data (for example, product, location, sales numbers or time period) is considered a dimension.
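The activities listed above can be sketched with a small in-memory fact table. This is only an illustration using Python's standard `sqlite3` module, not how any particular OLAP product works; the `sales` table, its columns, the sample rows and the function names are all assumptions made for the example.

```python
import sqlite3

# Hypothetical fact table: each row is one sales figure described by
# three dimensions (product, location, quarter).
ROWS = [
    ("widget", "east", "Q1", 100),
    ("widget", "east", "Q2", 120),
    ("widget", "west", "Q1", 80),
    ("gadget", "east", "Q1", 50),
    ("gadget", "west", "Q2", 70),
]


def build_cube():
    """Load the sample rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales "
        "(product TEXT, location TEXT, quarter TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)
    return conn


def slice_by_quarter(conn, quarter):
    """Slice: fix one dimension (quarter) and keep the other two."""
    cur = conn.execute(
        "SELECT product, location, amount FROM sales "
        "WHERE quarter = ? ORDER BY product, location",
        (quarter,),
    )
    return cur.fetchall()


def total_by_product(conn):
    """Summary view: total sales per product across all other dimensions."""
    cur = conn.execute(
        "SELECT product, SUM(amount) FROM sales "
        "GROUP BY product ORDER BY product"
    )
    return cur.fetchall()


def drill_down(conn, product):
    """Drill down: break one product's total out by location and quarter."""
    cur = conn.execute(
        "SELECT location, quarter, SUM(amount) FROM sales "
        "WHERE product = ? GROUP BY location, quarter "
        "ORDER BY location, quarter",
        (product,),
    )
    return cur.fetchall()
```

A typical session would start from `total_by_product(conn)` and then call `drill_down(conn, "widget")` to see which locations and quarters make up that product's total.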
There are OLAP tools that use distributed computing capabilities for analyses that require more storage and processing power than can be economically and efficiently located on an individual desktop.
DSS support an organization's leading decision makers with higher-level data for complex and critical decisions. A DSS queries a data warehouse or an OLAP database for relevant information that can be compared in order to make a business decision and predict the impact of that decision.
Finally, data mining is being used for knowledge discovery, the process of searching data for unanticipated new knowledge.
Knowledge workers and decision makers use tools ranging from parametric queries to ad hoc queries to data mining. Thus, the access component of the data warehouse must provide support for structured queries (both parametric and ad hoc). These together make up a managed query environment.
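The difference between the two query styles can be illustrated as follows. This is a minimal sketch using Python's standard `sqlite3` module; the `orders` table, its sample rows, and the `revenue_for` and `ad_hoc` helpers are hypothetical names introduced for the example. The read-only check in `ad_hoc` is deliberately simple and is not a substitute for real access control.

```python
import sqlite3

# Hypothetical warehouse table with a few sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, year INTEGER, revenue INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("north", 2023, 400), ("north", 2024, 450), ("south", 2024, 300)],
)

# Parametric query: the shape of the query is fixed in advance and the
# user supplies only the parameter values.
REVENUE_BY_REGION_YEAR = (
    "SELECT SUM(revenue) FROM orders WHERE region = ? AND year = ?"
)


def revenue_for(region, year):
    """Run the predefined query with user-supplied parameters."""
    row = conn.execute(REVENUE_BY_REGION_YEAR, (region, year)).fetchone()
    return row[0] or 0  # SUM over no rows yields NULL, reported here as 0


def ad_hoc(sql):
    """Ad hoc query: the analyst writes the whole statement.

    Only SELECT statements are accepted, since warehouse users in this
    sketch are assumed to need read access only.
    """
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only read-only SELECT statements are allowed")
    return conn.execute(sql).fetchall()
```

For example, `revenue_for("north", 2024)` answers a recurring question with new parameter values, while `ad_hoc("SELECT region, SUM(revenue) FROM orders GROUP BY region")` answers a question nobody planned for in advance.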
Distinctive Characteristics of Data Warehouses
Data warehouses are expected to have the following distinctive features:
- Multidimensional conceptual view and generic dimensionality
- Unlimited dimensions and aggregation levels and unrestricted cross-dimensional operations
- Dynamic sparse matrix handling
- Client/server architecture and multi-user support
- Accessibility and transparency, intuitive data manipulation and consistent reporting performance
Because data warehouses are not burdened with transaction processing, query processing can be made more efficient. Specialized tools and techniques include query transformation, index intersection and union, special ROLAP (Relational OLAP), MOLAP (multidimensional OLAP), DOLAP (Database OLAP) and WOLAP (Web OLAP) functions, SQL extensions, advanced join methods, and intelligent scanning.
Traditional OLAP products are also known as multidimensional OLAP (MOLAP). Relational OLAP tools take data from traditional two-dimensional or relational databases and create multidimensional views upon request, rather than preparing them in advance as MOLAP does. ROLAP is often used on complex data with a large number of fields, such as customer data. DOLAP is a relational database management system designed to perform OLAP calculations. WOLAP refers to OLAP data that can be reached from a Web server.
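The contrast between building a view on request and preparing views in advance can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the internals of any real ROLAP or MOLAP product; the sample rows, the `DIMENSIONS` tuple and the function names `rolap_view` and `molap_precompute` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical relational rows with two dimensions and one measure.
DIMENSIONS = ("product", "region")
ROWS = [
    {"product": "widget", "region": "east", "sales": 100},
    {"product": "widget", "region": "west", "sales": 80},
    {"product": "gadget", "region": "east", "sales": 50},
]


def rolap_view(rows, group_by):
    """ROLAP style: aggregate the relational rows only when a view is requested."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[dim] for dim in group_by)
        totals[key] += row["sales"]
    return dict(totals)


def molap_precompute(rows):
    """MOLAP style: build every aggregate in advance, for each subset of dimensions."""
    cube = {}
    for size in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, size):
            cube[dims] = rolap_view(rows, dims)
    return cube
```

With the precomputed cube, answering a query is a dictionary lookup such as `cube[("region",)]`, at the cost of storing every aggregate up front. The on-request approach stores nothing extra but repeats the scan over the rows for each view.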
Data warehouses are an important asset for organizations seeking to maintain efficiency, profitability and competitive advantage. Organizations collect data through many sources, such as online channels, call centers, sales leads and inventory management. The data collected have varying degrees of value and business relevance. As data is collected, it passes along a "conveyor belt" called data life cycle management. An organization's data life cycle management policy dictates its data warehousing design and methodology.
Overview of Data Warehousing Infrastructure
The goal of Data Warehousing is to generate front-end analytics that will support business executives and operational managers.