Monday, November 2, 2009

New Evolutions

Many vendors in the traditional data consolidation market position their products as Extract-Transform-Load (ETL), Extract-Load-Transform (ELT), or even Transform-Extract-Load (TEL) tools. Each vendor naturally touts the strengths of its chosen approach and highlights the weaknesses of its competitors' approaches.

So, which approach is best? The truth is that each approach has its strengths and weaknesses, and most organizations will likely need a combination of these techniques. The real key to the alphabet soup of ETL vs. ELT vs. TEL is therefore flexibility: the ability to support the technique that best suits the job at hand. Forcing a data flow that fits well into an ETL architecture into an ELT architecture, just because the tool cannot adequately support one process or the other, is a recipe for disaster.

WebSphere DataStage is an inherently flexible data consolidation tool that can natively support ETL, ELT, and TEL topologies. This article shows how the combination of WebSphere DataStage and WebSphere Federation Server can extend the alphabet soup by effectively supporting Transform-Extract-Transform-Load (T-ETL) data consolidation topologies. Within a T-ETL topology, the two products complement each other so that significant performance benefits and CPU savings can be achieved relative to using WebSphere DataStage alone.

In this scenario, WebSphere Federation Server performs processing close to the input sources, so less data is presented to the extraction stage and less transformation needs to be done by WebSphere DataStage. This benefit is achieved because the T-ETL architecture plays exactly to the strengths of both products: WebSphere Federation Server for its cost-based optimizer and set processing efficiency in a heterogeneous environment, and WebSphere DataStage for its powerful parallel transformation and data flow engine.
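To make the idea concrete, here is a minimal sketch of the difference between extracting everything and pushing the first transform down to the source. It is an illustration only: Python's built-in sqlite3 module stands in for a federated source, the orders table and the function names (build_source, plain_etl, t_etl) are hypothetical, and nothing here uses the actual WebSphere DataStage or WebSphere Federation Server interfaces.

```python
import sqlite3


def build_source():
    """Create a small in-memory table that stands in for a remote source (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, "east", 10.0),
            (2, "east", 15.0),
            (3, "west", 7.5),
            (4, "west", 2.5),
            (5, "north", 100.0),
        ],
    )
    return conn


def plain_etl(conn):
    """ETL: extract every row, then filter and aggregate in the transformation engine."""
    rows = conn.execute("SELECT id, region, amount FROM orders").fetchall()
    totals = {}
    for _id, region, amount in rows:
        if region != "north":  # filter applied after extraction
            totals[region] = totals.get(region, 0.0) + amount
    return len(rows), totals


def t_etl(conn):
    """T-ETL: the source filters and aggregates first, so fewer rows are extracted."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders "
        "WHERE region <> 'north' GROUP BY region"
    ).fetchall()
    # Any remaining transformation would run here, on the reduced row set.
    totals = {region: total for region, total in rows}
    return len(rows), totals


if __name__ == "__main__":
    conn = build_source()
    print("plain ETL (rows extracted, totals):", plain_etl(conn))
    print("T-ETL     (rows extracted, totals):", t_etl(conn))
```

Both functions produce the same totals, but the second one moves fewer rows across the extraction boundary. In a real T-ETL topology the decision about what to push to each source would presumably be made by the federation layer's optimizer rather than hand-written as it is here.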

The next section of this article briefly introduces WebSphere Federation Server before describing the T-ETL architecture in more detail. The use-case sections that follow detail four different cases that highlight the benefits of T-ETL. The article then summarizes the general traits of WebSphere DataStage jobs that are likely to benefit from this architecture.
