How Accurate Was Bloor Research on Oracle In-Memory?
Executive Summary
- Bloor Research produced a study that questioned the benefits of Oracle In-Memory.
- Bloor Research proposes that the maintenance overhead increases with Oracle In-Memory.
- This perspective is mostly missing from media coverage of in-memory databases.
Introduction
On October 1, 2013, Bloor Research published the article Oracle In Memory, by Philip Howard. We will review this article for accuracy.
Our References for This Article
If you want to see our references for this article and other related Brightwork articles, see this link.
Notice of Lack of Financial Bias: We have no financial ties to SAP or any other entity mentioned in this article.
Article Quotations
The Oracle Announcement
“Oracle has announced an in-memory option for the 12c version of the Oracle Database. It’s expected to be available sometime in the early part of next year but no official release data has been announced.
Basically, the idea is that data will be stored on disk in a conventional row-format and a second copy of the data will be stored in columnar format in memory. This will mean that you won’t have to define analytic indexes for the data on disk as the columns act as self-indexing constructs. This in turn means less administration and tuning for the data on disk and you should improve analytic performance significantly (because data is in memory) and OLTP performance (because there are fewer indexes).”
Yes, that is quite right.
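To make the dual-format design concrete, here is a minimal sketch using the INMEMORY syntax as it shipped with the Oracle Database 12c In-Memory option; the sales table and index name are hypothetical. The row format on disk is untouched; the statement simply requests an additional columnar copy in memory.

```sql
-- The row format stays on disk as before; this requests a second,
-- columnar copy of the table in the in-memory column store.
ALTER TABLE sales INMEMORY;

-- Analytic indexes that existed only to accelerate column scans
-- can then, in principle, be dropped (hypothetical index name).
DROP INDEX sales_amount_idx;
```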
What is Not to Like?
“So, what’s not to like?
Well, the first thing is that all that memory is going to be relatively expensive. More importantly, you will either need to deploy a lot of memory or the DBA will need to include/exclude specified columns (which will be an option) from the table to be loaded into memory. However, this is a manual step (therefore the DBA must know in advance what queries will come into the database so they can set those columns up in memory or otherwise you default to putting the full table in memory), which means that what you gain on the non-indexing administration you lose on the memory administration. Secondly, the DBA must allocate a set amount of memory for the in-memory tables. Once that memory fills up that’s it: you can’t load any more tables into memory. And that means that you have to statically set the amount of space you need for your in-memory objects – whereas what you would really like is a dynamic way of re-allocating memory as required – which means a further administrative burden.”
All true. So it is more expensive, and it means more maintenance. And while it was primarily SAP that drove Oracle to make this addition, the question remains how necessary it is.
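Both maintenance points Bloor raises are visible in the syntax itself. A hedged sketch, with illustrative table, column, and size values:

```sql
-- 1. Manual column administration: the DBA must decide in advance
--    which columns to exclude from the in-memory copy.
ALTER TABLE sales INMEMORY
  NO INMEMORY (prod_description, customer_notes);

-- 2. Static memory allocation: the column store is carved out of the
--    SGA at a fixed size; in 12c, resizing it requires a restart.
ALTER SYSTEM SET INMEMORY_SIZE = 32G SCOPE=SPFILE;
```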
The Current Data Warehouse Design
Currently, the data warehouse produces most of the analytics, particularly the complex analytics, and can use a database customized for this task. Meanwhile, the application can use a database customized for its task, be it transaction processing (such as for an ERP system) or intensive processing (such as for a supply chain optimizer). What is the overriding need to combine these two database needs into one database?
Second, how good is the performance of such a database? An AWS blog questions this with the following comment.
“And, if you’ve mixed online transaction processing (OLTP) and analytics-style data access, moving from a one-tool-for-everything Oracle setup to using a separate warehouse for reporting and analytics can improve both your application responsiveness and your analytics capabilities. There are options to create a dedicated Postgres-XL–based warehouse or use Amazon Redshift as a powerful managed warehouse.”
What!! How is that possible? Isn't Oracle In-Memory (combined OLAP and OLTP) supposed to be the best?? This is the problem with copying a vendor like SAP that does not know very much about databases and places so much emphasis on marketing over reality.
The Overhead of the Design
“The other question that you might ask is how this in-memory column store relates to the hybrid columnar compression (HCC) used in Oracle Exadata?
It turns out that the compression for in-memory is different than HCC.
You can still have the tables stored on disk using HCC but when they are loaded into memory you have to decompress the data, then break the table apart and recompress it using the new in-memory compression algorithms for each column. Frankly, and to use an old English term: that sounds barmy.”
It appears that there is confusion here between the in-memory column store and HCC. HCC is used by Exadata to compress older data, but it is independent of Oracle In-Memory.
This commingling of Oracle In-Memory functionality with HCC makes it challenging to interpret what Bloor is attempting to say.
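For what it is worth, the two features are configured independently, which a short sketch makes clear (HCC requires Exadata or similar Oracle storage; the table name is illustrative):

```sql
-- Hybrid Columnar Compression applies to the on-disk segment.
CREATE TABLE sales_archive
  COMPRESS FOR ARCHIVE HIGH
  AS SELECT * FROM sales WHERE sale_date < DATE '2012-01-01';

-- The in-memory column store uses its own MEMCOMPRESS algorithms,
-- chosen separately from whatever compression is used on disk.
ALTER TABLE sales_archive INMEMORY MEMCOMPRESS FOR QUERY LOW;
```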
But the question that is not generally asked is how necessary any of this is.
“I have to say that while the top line story sounds good, I am less impressed when you look under the covers. Oracle does not appear to have done the in-depth re-engineering that you would really like to support this sort of feature. No doubt this will come in due course but from what we know now this is in contrast to, for example, IBM’s BLU Acceleration. In this context, IBM seems to have really gone down into the weeds of the technology to make sure not just that it works but that the different elements of BLU Acceleration complement one another and do not take away with one hand what they give with the other.”
This is an excellent point brought up by Philip, and it is the first time I can recall anyone bringing up this issue. But there is not enough evidence in this quotation for me to evaluate if this is true.
But something is gnawing at me that relates to this topic.
How Necessary is a Dual Type (Row and Column Store) Database?
So if we look at what drove Oracle and IBM to add column stores to their relational/row-oriented databases, it was undoubtedly HANA.
SAP proposed that it was critical that a single database could do both OLTP and OLAP.
However, how important is this need?
Companies still tend to use row-oriented databases for most applications and then move data over to a specialized analytics system, which has different hardware and either column-oriented capabilities or star schemas to speed relational database performance for analytics. SAP emphasized in the marketplace that OLTP and OLAP should be performed equally well by a single database, but this turns out to be challenging to do. Indeed, SAP has never been able to do it.
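As a hedged illustration of this conventional pattern, the separate analytics system often uses a simple star schema; all table and column names below are hypothetical:

```sql
-- Dimension table: small, descriptive attributes.
CREATE TABLE dim_product (
  product_key  NUMBER PRIMARY KEY,
  product_name VARCHAR2(100),
  category     VARCHAR2(50)
);

-- Fact table: large, narrow rows keyed to the dimensions, so the
-- row-oriented warehouse can still answer analytic queries quickly.
CREATE TABLE fact_sales (
  product_key NUMBER REFERENCES dim_product (product_key),
  sale_date   DATE,
  amount_sold NUMBER(10,2)
);
```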
Oracle and IBM have more database capabilities than SAP, so naturally, they have gotten closer. But if we step back for a moment, I question SAP’s proposal that this is, in fact, all that necessary. It seems we are driving database vendors to push their databases in this direction without asking the question of how vital it is in the first place.
Conclusion
There are multiple dimensions to the question of adding column stores to row-oriented databases. This is a trend started by SAP and then marketed to the hilt with HANA. But there is little evidence that this functionality is particularly necessary.
Philip Howard does an outstanding job of surfacing issues related to complexity and maintenance that are entirely original. However, some of the technical observations are hindered by commingling HCC with Oracle In-Memory, and the analysis would benefit from a fuller description of how Oracle optimizes memory during queries.
This article receives a score of 8 out of 10 on the Brightwork Accuracy Scale.