SAP BW Business Warehouse - ETL Fundamentals
Fundamentals of ETL Service Architecture
The ETL service comprises two parts: the staging engine and the
storage service. The staging engine manages the staging process for all data received
from the various source systems. It interfaces with the Administrator Workbench (AWB)
scheduler and monitor for scheduling and monitoring data load processes. The storage
service, in turn, manages and provides access to the data targets in SAP BW and to the
aggregates that are stored in relational and multidimensional database management systems.
The extraction technology provided as an integral part of SAP BW is restricted to
the database management systems supported by mySAP technology; it does not allow
extracting data from other database systems such as IBM IMS and Sybase. Nor does it
support proprietary file formats such as dBase, Microsoft Access, or Microsoft Excel
files. The ETL services layer of SAP BW, on the other hand, provides all the
functionality required to load data from non-SAP systems in exactly the same way as it
loads data from SAP systems. In fact, SAP BW does not distinguish between different
types of source systems once data has arrived in the staging area. The ETL services
layer provides open interfaces for loading non-SAP data.
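The point that the staging area treats all sources alike can be illustrated with a minimal Python sketch. It is not SAP BW code: the classes `StagedRecord` and `StagingArea`, the function `load_to_target`, and the data source names are all hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StagedRecord:
    """One row in the staging area; the source system type is deliberately not stored."""
    data_source: str
    fields: Dict[str, str]


@dataclass
class StagingArea:
    """Hypothetical stand-in for the staging area, not an SAP BW API."""
    records: List[StagedRecord] = field(default_factory=list)

    def receive(self, data_source: str, rows: List[Dict[str, str]]) -> int:
        # Every row is stored the same way, whichever system delivered it.
        for row in rows:
            self.records.append(StagedRecord(data_source, dict(row)))
        return len(rows)


def load_to_target(area: StagingArea) -> List[Dict[str, str]]:
    # Downstream processing sees only staged records, never the source system type.
    return [rec.fields for rec in area.records]


staging = StagingArea()
staging.receive("sap_sales_orders", [{"order": "1001", "amount": "250.00"}])
staging.receive("legacy_flat_file", [{"order": "9001", "amount": "75.50"}])
target_rows = load_to_target(staging)
```

Once both loads have been received, `load_to_target` handles the SAP and the non-SAP rows through the same code path.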
Extraction at Service Levels
SAP BW can be integrated with other SAP components through the service application
programming interface (service API). The service API provides a framework for
comprehensive data replication based on data extractors that encapsulate the
application logic. A data extractor fills the extract structure of a data source
with data from that source and offers sophisticated handling of changes. In
addition to supporting extractors, the service API enables online access
via RemoteCube technology and flexible staging for hierarchies.
For non-SAP sources, SAP provides an open interface called the Staging Business
Application Programming Interface (Staging BAPI). The Staging BAPI connects
third-party ETL tools to SAP BW and provides access to SAP BW objects, which makes it
possible to use customer extraction routines.
Data can also be extracted at the database or file level by using DB Connect, flat
files, and XML. DB Connect extracts data directly from a DBMS; the metadata is loaded
by replicating the metadata of tables and views into the metadata repository of
SAP BW. Data can be uploaded from flat files by creating routines for extracting the
data, and XML data can be loaded through the Administrator Workbench in SAP BW.
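As a rough illustration of a flat file upload combined with an extraction routine, the following Python sketch parses a delimited file and runs a customer-defined routine over each row. The file layout, the semicolon delimiter, and the names `read_flat_file`, `apply_routine`, and `normalize_customer` are assumptions made for this example; in SAP BW itself such routines are defined inside the system rather than in Python.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def read_flat_file(text: str, delimiter: str = ";") -> List[Row]:
    """Parse a delimited flat file whose first line holds the field names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


def apply_routine(rows: Iterable[Row], routine: Callable[[Row], Row]) -> List[Row]:
    """Run a customer-defined routine over every extracted row."""
    return [routine(row) for row in rows]


def normalize_customer(row: Row) -> Row:
    # Hypothetical routine: pad the customer number and upper-case the country code.
    return {
        "customer": row["customer"].strip().zfill(10),
        "country": row["country"].strip().upper(),
        "revenue": row["revenue"].strip(),
    }


sample = "customer;country;revenue\n4711;de;1200.50\n42;us;980.00\n"
extracted = apply_routine(read_flat_file(sample), normalize_customer)
```

Keeping the parsing step separate from the routine means the same routine could be reused if the data later arrived through a different interface.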
Components of ETL Services at Database or File Level
Operational Data Store: The operational data store (ODS) stores detailed data and
supports tactical, day-to-day decision making. SAP views the ODS as a near-real-time
informational environment that supports operational reporting by interacting with
existing transactional systems, data warehouses, or analytical applications. SAP BW
allows flexible access to data in the ODS, the data warehouse, and the
multidimensional models.
Data Marts: A data mart provides the data needed by a decentralized function,
department, or business area. You need to weigh the pros and cons before
developing a data mart. For example, a data mart can be implemented faster and
cheaper than a data warehouse, sometimes costing 80% less than a full data
warehouse. But as data marts proliferate, the cost advantages can disappear. The
IT organization must maintain the individual data marts and the multitude of ETL
and warehouse management processes that go with them. Multiple data marts can
complicate data integration efforts, increase the amount of inconsistent data,
require more business rules, and create the data stovepipes that data
warehousing strives to eliminate.
Interfaces: The data mart interface enables users to transfer and update
transactional data and metadata from one SAP BW system to other SAP BW systems.
Open Hub Services: The open hub service is used to share data in SAP BW with
non-SAP data marts, analytical applications, and other applications. This
service controls data distribution and maintains data consistency across
systems. With the open hub service, actual data and the corresponding metadata
are retrieved from InfoCubes or ODS objects.
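The idea of handing over data together with its metadata can be sketched in Python as follows. This is only an illustration under stated assumptions: the function `build_export`, the JSON package layout, and the object name `SALES_CUBE` are hypothetical and do not reflect the actual open hub service formats or interfaces.

```python
import json
from typing import Dict, List


def build_export(name: str, rows: List[Dict[str, object]]) -> str:
    """Bundle rows with simple field metadata derived from the first row."""
    first = rows[0] if rows else {}
    fields = [
        {"name": key, "type": type(value).__name__}
        for key, value in first.items()
    ]
    package = {
        "source_object": name,   # hypothetical name of the InfoCube or ODS object
        "fields": fields,        # metadata travels with the data
        "row_count": len(rows),
        "rows": rows,
    }
    return json.dumps(package, sort_keys=True)


payload = build_export("SALES_CUBE", [{"region": "EMEA", "revenue": 1200.5}])
decoded = json.loads(payload)
```

A consuming application can read the `fields` entry first and then interpret the rows without needing any knowledge of the system that produced them.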
Understanding the Role of the Storage Services Layer in the Architectural Model
Master Data Manager: The master data manager generates the master data
infrastructure, which contains the master data tables as well as the master data
update and retrieval routines. It also maintains master data and provides access to
master data for use by SAP BW reporting components and for exporting to other
data warehouse services for analysis and access services.
ODS Manager: The ODS manager generates the ODS object infrastructure. It maintains
an active data table that holds the ODS object data and a change log that records
every update applied to the ODS object data as part of the application process, and it
provides access to ODS object data for SAP BW reporting and analysis
functionality.
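The relationship between the active data table and the change log can be sketched with a small Python model. It is a simplification based only on the description above and not how SAP BW implements ODS objects; the class `OdsObject`, its methods, and the record layout are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Record = Dict[str, float]


@dataclass
class OdsObject:
    """Hypothetical model: one active table plus a change log of applied updates."""
    active: Dict[str, Record] = field(default_factory=dict)
    change_log: List[Tuple[str, Optional[Record], Record]] = field(default_factory=list)

    def update(self, key: str, new_values: Record) -> None:
        before = self.active.get(key)
        if before == new_values:
            return  # nothing changed, so nothing is logged
        # Log the before image (None for a new key) together with the after image.
        before_image = dict(before) if before is not None else None
        self.change_log.append((key, before_image, dict(new_values)))
        self.active[key] = dict(new_values)

    def read(self, key: str) -> Optional[Record]:
        """Reporting access goes against the active table only."""
        return self.active.get(key)


ods = OdsObject()
ods.update("order-1001", {"amount": 250.0})
ods.update("order-1001", {"amount": 300.0})
ods.update("order-1001", {"amount": 300.0})  # identical update, not logged
```

The active table always holds the current state for reporting, while the change log keeps enough history to pass only the changes on to downstream data targets.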
Archiving Manager: The archiving manager stores unused, dormant data in an
archive with the help of the Archive Development Kit (ADK), which is connected to
SAP BW through the archiving manager. It also keeps track of relevant metadata, such
as the InfoCubes and ODS objects, which may change over time.
InfoCube Manager: The InfoCube manager generates the InfoCube meta tables.
It maintains the InfoCube data tables and provides access to them
for SAP BW reporting and analysis.