|
Blogs
This blog looks at the situation of a customer who migrated his hand-crafted Oracle data warehouse to SAP BW. It is a tangible, real-world case that contradicts many of the alleged pros and cons of the differing approaches to data warehousing. Beside TCO, the new SQL interface on SAP BW allows even for a surprising 1:1 comparison regarding query performance. There is a long running, almost religious debate on how to build up a data warehouse (DW), either using a DW management application – like SAP BW – or using a set of loosely coupled tools and manually crafting the management of that DW with an RDBMS as the central infrastructure piece. The latter often dilutes the difference between an RDBMS software suitable for high data volumes and a DW. Examples are the traditional RDBMS vendors like Oracle, Microsoft, IBM and Sybase but also the upcoming and/or specialized vendors like Teradata, Greenplum, Netezza or Vertica. Fundamentally, they offer an RDBMS. In order to build up a DW this RDBMS needs to be supplemented by a management software that is based on ETL and other tools (e.g. for data enrichment, data quality etc.). That software manages challenges like
In summary: the DW is the combination of an RDBMS with a DW management software. The latter can be created proprietarily (via consultant projects and/or via evolution over time) or a standard package such as SAP BW can be used. It is obvious that non-standard, hand-crafted approaches are likely to imply a high TCO as every change (to a data source or to a model) requires expert programmers to adjust the software. On the bonus side, it is often argued that a manually managed DW can be tailored to the individual needs and that there is so many more options that one can use, especially for query performance tuning. One of SAP's customers ran such a hand-crafted DW based on an Oracle RDBMS. Many of source systems were operational SAP systems. Extraction was proprietary. The system started small and grew over the years and with that grew the management complexity and consequently the TCO. In order to tackle the latter, BW was considered, particularly for the DW management advantages. End users accessed the RDBMS via Business Objects's WebI using SQL as the query language. With the switch to BW the customer wanted to keep WebI as the frontend tool for his end users. Considering the SQL interface on SAP BW there was suddenly a 1:1 comparison between (productive) queries running on an Oracle RDBMS versus the same queries running on a BW system that leveraged BI Accelerator (BIA aka BWA). The result can be seen in the chart below; each query is represented by an individual color. While the overall superior performance of BW and BWA is evident in the chart, it is important to note that fast queries under Oracle remain fast under BW/BWA and that slow queries under Oracle become fast under BW/BWA. The overall variance of query runtimes is minimal under BW/BWA leading to a stable performance pattern. Finally, the history of the compared systems suggests that the Oracle DW has received many many hours of manual tuning over the years while the BW/BWA system has hardly received any tuning. So, even here, the TCO aspect is apparent. This is just an instance that provides the rare opportunity of directly comparing two widely differing approaches. In my opinion, this example also questions the value and the significance of assessing data warehouses by (only) its underlying RDBMS. See Gartner's magic quadrant on DW DBMS as an instance of such an analysis. Here, for this customer, the RDBMS does not make the difference: his BW also runs on top of an Oracle RDBMS. Thomas Zurek
| |||||||||||||||||||||||