|
Blogs
So here you get it.While Oracle DBAs are used to be exposed to features like segment fragmentation (that is, although data is removed from tables/indexes these segments still allocate the same amount of space which won't be automatically released to the freespace) MaxDB users don't have to think about this. In fact, MaxDB users cannot do a single thing to change the way the data is stored. Or could they? They could. Just by sorting the data before inserting them into the table, quite some space can be saved. Don't believe me? Behold and believe!I've created two sets of identical data. One of them will get loaded with data in a sorted order (SORTLOADED) the other one will get loaded in a random order (UNSORTLOADED). Anyhow, both tables contain the very same data afterwards. The table structure is simple and the same for both tables:
Next I generate and load the test data. Now this is actually a part that is not that easy with MaxDB. In Oracle you can do things like "CREATE TABLE AS SELECT ... ORDER BY ..." and thereby create arbitrarily sorted data sets. The unequal twin tablesSo, now each table contains the exact very same rows, right? Of course we can doublecheck this:
Now let's check how equal the two tables really are in terms of storage. Let's use what we've learned in "MaxDB space oddities" and run:
Hmm... the unsorted load takes eight pages more in this example, that's 16% more space usage. What is going on here?Let's check the storage of these tables a bit more detailed. Therefore it's better to use the ‘tablestoragedetails' resp. ‘indexstoragedetails' instead as these both tables deliver their data based on samples. For the easier comparison I reformatted the result a bit:
Ok, so what do we see here? Obviously the difference in the total number of pages is due to the difference in the number of 'Leaf pages' - those pages that contain the records completely. But why is it like this?To answer this question, we've to remember how B*-Trees are build up. I'm not going to reinterate on B*-Trees in general, but just like to point out the important facts about them, that are necessary here to explain the effect:
To fullfill these requirements, an insert to a B*-Tree usually is done like this:
Due to the mentioned features of B*-Trees there is one case where no rebalancing is done, because it's never necessary: inserts in ascending key order. When the key value increases with each insert, than a rebalancing with the right neighbour page is not possible - there simply is no further right neighbour. And since the data is correctly sorted anyhow, all what's left to do is to update the index pages. That way a lot of time can be saved and the pages get filled up much denser than it would be the case with random inserts. This feature is also present in other DBMS, e.g. Oracle. For Oracle's B*-Tree-Indexes there are two types of page/block-splits: the ‘usual' 50:50 block split, which is just the standard index balancing and the 90:10 block split. By the way: inserting the sorted data in descending order also leads to the 'bigger' table, since there is no special handling of sequential descending inserts. What's this knowledge useful for?If you ever did a system copy with R3Load and checked the table sizes in the target system against that in the source system, then you might have asked yourself: did I loose any data? The tables are smaller than in the source system! If you then are aware of the fact that R3load reads and writes it's data sorted by the primary key, then you now know the answer: all data is there, just saved in a denser way. It's also worth noting that such dense tables need more balancing operations when you start inserting data into the middle of the table. Since there is not much freespace in the pages that could be used for the rebalancing, new pages need be added to the B*-Tree and that causes the rebalancing of whole subtrees. For that reason I would not say, that having such dense B*-Trees is actually something to wish for in a OLTP system. Hope you liked this blog as the first "Space oddities" blog from myself. Lars p.s. While researching for this I came across another blog entry called "A space oddity": http://richardfoote.wordpress.com/2008/01/18/introduction-to-reverse-key-indexes-part-iii-a-space-oddity/ by Richard Foote, who seems to be really deep into Oracle index internals.
Lars Breddemann
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||