Wednesday, August 27, 2008

Architecture Review and Design

The Architecture is the logical and physical foundation on which the Data Warehouse will be built. The Architecture Review and Design stage, as the name implies, is both a requirements analysis and a gap analysis activity. It is important to assess what pieces of the architecture already exist in the organization (and in what form) and to assess what pieces are missing which are needed to build the complete Data Warehouse architecture.

During the Architecture Review and Design stage, the logical Data Warehouse architecture is developed. The logical architecture is a configuration map of the necessary data stores that make up the Warehouse; it includes a central Enterprise Data Store, an optional Operational Data Store, one or more (optional) individual business area Data Marts, and one or more Metadata stores. The metadata store(s) hold two different kinds of metadata, both of which catalog reference information about the primary data.
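
As a rough illustration, the configuration map described above could be written down as a small data structure. This is only a sketch: the class name, field names, and example store names are hypothetical and are not part of any methodology or tool mentioned here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LogicalArchitecture:
    """Hypothetical configuration map of the Warehouse data stores."""
    enterprise_data_store: str                    # central store, required
    metadata_stores: List[str]                    # one or more, required
    operational_data_store: Optional[str] = None  # optional
    data_marts: List[str] = field(default_factory=list)  # optional, per business area

    def validate(self) -> None:
        # The required stores must be present for the map to be complete.
        if not self.enterprise_data_store:
            raise ValueError("a central Enterprise Data Store is required")
        if not self.metadata_stores:
            raise ValueError("at least one Metadata store is required")

    def stores(self) -> List[str]:
        # Flatten the map into a single list of configured store names.
        names = [self.enterprise_data_store]
        if self.operational_data_store:
            names.append(self.operational_data_store)
        names.extend(self.data_marts)
        names.extend(self.metadata_stores)
        return names


# Example with made-up store names.
arch = LogicalArchitecture(
    enterprise_data_store="EDS",
    metadata_stores=["META_1"],
    data_marts=["SALES_MART", "FINANCE_MART"],
)
arch.validate()
```

Keeping the optional stores optional in the structure mirrors the text: a Warehouse with no Operational Data Store and no Data Marts is still a valid configuration.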

Once the logical configuration is defined, the Data, Application, Technical and Support Architectures are designed to physically implement it. Requirements of these four architectures are carefully analyzed so that the Data Warehouse can be optimized to serve the users. Gap analysis is conducted to determine which components of each architecture already exist in the organization and can be reused, and which components must be developed (or purchased) and configured for the Data Warehouse.
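
The gap analysis described above amounts to comparing, for each of the four architectures, what is required against what already exists. A minimal sketch, assuming components can be identified by simple names (the function name and all component names below are invented for illustration):

```python
from typing import Dict, Set


def gap_analysis(
    required: Dict[str, Set[str]],
    existing: Dict[str, Set[str]],
) -> Dict[str, Dict[str, Set[str]]]:
    """For each architecture, split required components into those that
    already exist (reuse) and those that must be developed or purchased."""
    result: Dict[str, Dict[str, Set[str]]] = {}
    for architecture, needed in required.items():
        have = existing.get(architecture, set())
        result[architecture] = {
            "reuse": needed & have,    # already in the organization
            "acquire": needed - have,  # must be developed or purchased
        }
    return result


# Hypothetical inventory for two of the four architectures.
required = {
    "technical": {"dbms", "network", "server"},
    "support": {"backup", "monitoring"},
}
existing = {
    "technical": {"network", "server"},
}
gaps = gap_analysis(required, existing)
```

In practice each component would carry more detail (version, capacity, owner), but the reuse/acquire split is the core of the comparison.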

The Data Architecture organizes the sources and stores of business information and defines the quality and management standards for data and metadata.

The Application Architecture is the software framework that guides the overall implementation of business functionality within the Warehouse environment; it controls the movement of data from source to user, including the functions of data extraction, data cleansing, data transformation, data loading, data refresh, and data access (reporting, querying).
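
The movement of data from source to user can be pictured as a chain of small functions, one per step named above. The sketch below is illustrative only: it uses in-memory lists in place of real source systems and a real Warehouse, and the column names (customer_id, amount) and cleansing rules are assumptions made for the example.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def extract(source: Iterable[Row]) -> List[Row]:
    # Copy rows out of the (hypothetical) source system.
    return [dict(row) for row in source]


def cleanse(rows: List[Row]) -> List[Row]:
    # Trim whitespace from text values and drop rows without a customer_id.
    cleaned = []
    for row in rows:
        stripped = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        if stripped.get("customer_id") in (None, ""):
            continue
        cleaned.append(stripped)
    return cleaned


def transform(rows: List[Row]) -> List[Row]:
    # Convert the amount column from text to a number.
    return [{**row, "amount": float(row["amount"])} for row in rows]


def load(rows: List[Row], target: List[Row]) -> int:
    # Append rows to the (hypothetical) Warehouse table; return rows loaded.
    target.extend(rows)
    return len(rows)


def run_pipeline(source: Iterable[Row], target: List[Row]) -> int:
    return load(transform(cleanse(extract(source))), target)


source_rows = [
    {"customer_id": " 42 ", "amount": "10.50"},
    {"customer_id": None, "amount": "3"},
]
warehouse_table: List[Row] = []
loaded = run_pipeline(source_rows, warehouse_table)
```

Data refresh would rerun the same chain on a schedule, and data access (reporting, querying) reads from the loaded target; both are left out here to keep the sketch short.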

The Technical Architecture provides the underlying computing infrastructure that enables the data and application architectures. It includes platform/server, network, communications and connectivity hardware/software/middleware, DBMS, the choice between a 2-tier and a 3-tier client/server approach, and end-user workstation hardware/software. Technical architecture design must address the requirements of scalability, capacity and volume handling (including sizing and partitioning of tables), performance, availability, stability, chargeback, and security.

The Support Architecture includes the software components (e.g., tools and structures for backup/recovery, disaster recovery, performance monitoring, reliability/stability compliance reporting, data archiving, and version control/configuration management) and organizational functions necessary to effectively manage the technology investment.

Architecture Review and Design applies to the long-term strategy for development and refinement of the overall Data Warehouse, and is not conducted merely for a single iteration. This stage develops the blueprint of an encompassing data and technical structure, software application configuration, and organizational support structure for the Warehouse. It forms a foundation that drives the iterative Detail Design activities. Where Detail Design tells you what to do, Architecture Review and Design tells you what pieces you need in order to do it.

The Architecture Review and Design stage can be conducted as a separate project that runs mostly in parallel with the Business Question Assessment (BQA) stage, because the technical, data, application and support infrastructure that enables and supports the storage and access of information is generally independent of the business requirements that determine which data is needed to drive the Warehouse. However, the data architecture depends on input from certain BQA activities (data source system identification and data modeling), so the BQA stage must conclude before the Architecture stage can conclude.

The Architecture will be developed based on the organization's long-term Data Warehouse strategy, so that future iterations of the Warehouse will have been provided for and will fit within the overall architecture.
