This blog post will guide you through one of the very important performance features in SAP BW – partitioning and the new semantically partitioned object (SPO). You will read in details about the different types, benefits and reasons why to implement partitioning.
Nowadays data warehouse systems contain already terabytes of data and the number grows with every single day. However, the demand coming from the business users is to always have ultra-fast reporting and BI applications. Therefore, performance in both the loading and the subsequent reporting is becoming more and more crucial. Good consultants always pay attention to performance factors –a proper model for the dataflow, good multi-dimensional modeling of the data mart layer, parallelism in extraction and loading, etc. However, it seems that quite often people disregard the importance of partitioning and SAP BW gives quite some options in this aspect.
So, let’s talk about partitioning in SAP BW and its benefits for the technical team and the business users. First of all we have to make a clear statement that there are two different partitioning types available – logical and database.
Database partitioning is used to split the total dataset for an InfoProvider into several, smaller, physically independent chunks, which are without redundancies. This process drastically improves performance when users analyze the data at query runtime and delete data from the InfoProvider. In other words when you partition a database table, the system will find faster the requested information, because it will filter out all the irrelevant partitions and thus reduce the data volume to be read.
For some cases the system is designed to automatically generate partitions. These are PSA tables and the F-fact tables of InfoCubes, which are partitioned by request ID. For splitting the E-fact table of an InfoCube in partitions, the developer has to specify how the data should be partitioned. The option that SAP BW provides is to do the split based on the time characteristics - 0CALMONTH and 0FISCPER. One of these of course has to exist in the InfoProvider, so that you can partition based on it.
In BW 7.3 SAP introduced database partitioning on data store objects as well. The principle is the same and the partitioning relevant characteristics should exist in the key and refer to the 0CALMONTH or 0FISCPER InfoObjects. Be sure to have your DSO partitioned before it is loaded with data.
This type of partitioning refers to the creation of a flexible data flow model, where the data is logically spread among several InfoProviders, which have the same structure. This happens usually for one of the following reasons:
- The InfoProvider’s data is expected to grow so much, that the InfoProvider becomes very slow and not manageable.
- There are several identical, but independent processes on the source system side. This is often a case for multinational companies, which have operations in several countries. In general they follow one and the same process and share one system, but they generate independent master and transactional data.
In both cases, the InfoProvider is split to several smaller BW objects. In comparison to database partitioning, here you do not have any limitation for the characteristic by which you want to split. A typical example for such characteristics would be 0CALYEAR or 0COMPCODE. Implementing logical partitioning requires more initial effort in the development phase, but for large scale implementations this for sure pays back in the long term for two reasons. First, the reports and applications become significantly faster because the system can execute several database requests simultaneously. Second, the model is much easier and nicer for administration purposes.
Usually partitioning is done on the propagator layer and the DataMart layer, but there is no strict rule and the data flow model can be designed as flexible as necessary. The only important thing is to really keep the transformation logic and the Infoproviders’ structures the same.
In SAP BW 7.3 SAP surprised the developers with a very nice InfoProvider, which shortens the initial implementation effort and makes the post maintenance easier. It is called Semantically Partitioned Object (SPO) and is an InfoProvider that consists of several InfoCubes or DataStore objects with the same structure. Semantic partitioning in the context of SPO is a property of the InfoProvider, which is specified upon creation. As logic, the SPO is exactly the same as the old sematical partitioning that we discussed earlier with the only difference that it does not allow you to cover separately exception cases in the loading process. This means that your transformation should have a unified logic. SPO-s are also quite flexible, because you are not restricted to use the whole object for further data loading or for analysis. For example, you can create a MultiProvider on top of only several partitions or load data to an Infocube from only one particular partition of the SPO.
Now, as we have discussed the features around partitioning and the new functionality in that area in BW 7.3, we can clearly say that SAP has done a great job to ensure a well performing data warehousing system. The only thing that we have to do, is to use this in implementations and guarantee fast BI solutions, which require minimal administration efforts.