How to Understand Why In Memory Computing is a Myth
Executive Summary
- AWS covers HANA’s in-memory nature. Placing 100% of a database into memory is not a good thing.
- We cover the long history of database memory optimization.
Video Introduction: In Memory Computing
Text Introduction (Skip if You Watched the Video)
SAP has been one of the major proponents of something called “in memory computing.” Hasso Plattner has written four books on the topic, making grandiose claims about what HANA’s in-memory design would achieve for customers that purchased HANA. The problem is that there is no database that operates without loading data into memory. This means getting into how memory actually works for databases. You will learn how Hasso and SAP have been misrepresenting in-memory computing to low-information corporate buyers.
Our References for This Article
If you want to see our references for this article and other related Brightwork articles, see this link.
Notice of Lack of Financial Bias: We have no financial ties to SAP or any other entity mentioned in this article.
Hasso Plattner has been pushing the importance of in-memory computing for many years. His books aren’t books in the traditional sense; they are sales material for SAP. The books we have read by Hasso Plattner uniformly contain exaggerations as to the benefits one can expect from “in memory computing.”
If you read any of Hasso’s books or interviews, he is continually jumping from one topic to the next. Programmers will recognize this pattern as a series of sequential and unending goto statements. After two days of watching an infinite loop run, people eventually figure out, “hey, these goto statements are not doing anything.” Programmatic goto statements run too quickly to be useful in tricking people, but it appears that evidence-free assertions can last a very long time.
There have been many inaccuracies concerning the specific topic of memory management with HANA.
Consider an article titled SAP’s HANA Deployment Leapfrog’s Oracle, IBM and Microsoft, published in ReadWrite. The following quote reiterates the popularity of this idea.
In-Memory Databases
Many companies today offer in-memory databases for a variety of tasks. The databases are much faster than traditional technology because all data is stored in system memory where it can be accessed quickly. Standard relational databases write and read to disks, which is a much slower process.
This may rock ReadWrite’s world, but other databases use memory as well. They load into memory the tables that the application needs. There is a debate about whether one should load all tables into memory, which is how SAP does it. However, the benefits of doing this are not demonstrated in any benchmark, as covered in the article What is the Actual Performance of SAP HANA? SAP has made many statements about HANA’s benefits, but in their entirety, they are nothing more than unverifiable anecdotes about mostly anonymous customers.
In the absence of evidence, SAP’s proposal that they have the best way of dealing with memory and databases should be considered conjecture. But ReadWrite seems to treat it as if it is a natural law, as established as gravity.
What is Non In Memory Computing?
So all computing occurs in memory; there is no such thing as non-in-memory computing. No form of computing is performed without memory, because the performance would be unacceptable. Computing has been using more and more memory, as anyone who purchases a computer can see for themselves. While at one time a personal computer might sell with 4 GB of memory (or RAM), 16 GB is now quite common on new computers.
The Problem with the Term In Memory Computing
SAP took a shortcut when it coined the phrase “in memory” computing. The computer I am typing on has loaded this program into memory, as all computers do with all programs. So the term “in-memory computing” is meaningless.
Instead, what makes HANA different is that it requires more of the database to be loaded into memory. HANA is the only database I cover that works that way. With this in mind, the term should have been
“more database in memory computing.”
There is a debate as to how many tables are actually loaded into memory. It appears that the large tables and the column-oriented tables are not, which is the opposite of what SAP has said about HANA. The reason for this debate is that SAP has provided contradictory information on this topic.
The longer phrase is accurate. SAP’s term may roll off the tongue better, but it has the unfortunate consequence of being inaccurate, and one can’t argue that it is correct.
Here is a quote from AWS’s guide on SAP HANA, which tends to be more accurate than anything SAP says about HANA.
“Storage Configuration for SAP HANA: SAP HANA stores and processes all or most of its data in memory, and provides protection against data loss by saving the data in persistent storage locations. To achieve optimal performance, the storage solution used for SAP HANA data and log volumes should meet SAP’s storage KPI.”
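The AWS quote captures the basic design: data is served from memory, while protection against data loss comes from writing to persistent storage. As a rough illustration of that general pattern (our own toy example, not HANA’s actual implementation), here is a minimal Python sketch of an in-memory store that appends every write to a log file so the data can be rebuilt after a restart. The file name and method names are hypothetical.

```python
import json
import os

class InMemoryStore:
    """Toy key-value store: reads are served from memory,
    durability comes from an append-only log on disk."""

    def __init__(self, log_path="store.log"):
        self.log_path = log_path
        self.data = {}           # all current data lives in memory
        self._recover()          # rebuild memory state from the log

    def _recover(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Write to the persistent log first, then update memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[key] = value

    def get(self, key):
        # Reads never touch the disk.
        return self.data.get(key)

if __name__ == "__main__":
    store = InMemoryStore()
    store.put("material", "M-100")
    print(store.get("material"))   # served from memory
```

The point of the sketch is simply that “in memory” never means memory alone; any database that intends to survive a restart must also write to disk, which is exactly what the AWS guide states.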
However, interestingly, the following statement by AWS on HANA’s sizing is incorrect.
“Before you begin deployment, please consult the SAP documentation listed in this section to determine memory sizing for your needs. This evaluation will help you choose Amazon EC2 instances during deployment. (Note that the links in this section require SAP support portal credentials.)”
It is likely not feasible for AWS to observe that SAP’s sizing documentation will cause the customer to undersize the database. The customer will purchase HANA licenses under false pretenses and then have to go back to buy more HANA licenses after the decision to go with HANA has already been made.
Bullet Based Guns?
Calling HANA “in-memory computing” is the same as saying “bullet-based shooting” when discussing firearms.
Let us ask the question: How would one shoot a firearm without using a bullet?
If someone were to say their gun was better than your gun (which in essence SAP does regarding its in-memory computing), and the reason they give is that it uses “bullet-based shooting technology,” you would be justified in asking what they are smoking. Every gun is bullet-based technology.
How to Use a Term to Create Confusion Automatically
This has also led to a great deal of confusion about how computers use memory among those who don’t spend their days focusing on these issues. And this is not exclusive to SAP. Oracle now uses the term in-memory computing, as do many other IT entities, as can be seen in the following screenshot taken from Oracle’s website.
Is 100% of the Database Placed into Memory a Good Thing?
Even if the entire database can be placed into memory, the question is whether it is a good or necessary thing. And it isn’t easy to see how it is.
It means that with S/4HANA, even though only a small fraction of the tables are part of any given query or transaction, the entire database of tables is in memory at all times.
Now, let us consider the implications of this for a moment. Think about how many tables SAP’s applications have and how few of them are in use at any one time.
Why do tables not involved in the present activity, even tables that are very rarely accessed, need to be in memory at all times?
Oracle’s Explanation of This
In August 2019, Oracle published the Oracle for SAP Database Update document. In this document, Oracle made the following statement about HANA versus Oracle.
Oracle Database 12c comes with a Database In-Memory option, however it is not an in-memory database. Supporters of the in-memory database approach believe that a database should not be stored on disk, but (completely) in memory, and that all data should be stored in columnar format. It is easy to see that for several reasons (among them data persistency and data manipulation via OLTP applications) a pure in-memory database in this sense is not possible. Therefore, components and features not compatible with the original concept have silently been added to in-memory databases such as HANA.
Here Oracle is calling out SAP for lying, and we agree with this. SAP’s proposal about placing all data into memory was always based upon ignorance, primarily on the part of Hasso Plattner.
If SAP had followed Oracle’s design approach, companies would not have to perform extensive code remediation — as we covered in the article SAP’s Advice on S/4HANA Code Remediation.
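Oracle’s point about data manipulation via OLTP applications is easier to see with a concrete illustration. The following Python sketch is our own simplified example (not Oracle’s or SAP’s code): it contrasts a row-oriented layout with a column-oriented layout for the same table and shows why a single-record change touches more structures in the columnar case, which is one reason purely columnar, purely in-memory designs end up quietly adding row-store and persistence features.

```python
# Toy illustration of row vs. column layout for one table.
# Assumed schema: (order_id, material, quantity).

# Row store: each record is kept together, so an OLTP-style
# insert or update touches exactly one entry.
row_store = [
    {"order_id": 1, "material": "M-100", "quantity": 5},
    {"order_id": 2, "material": "M-200", "quantity": 3},
]

# Column store: each column is kept together, which is efficient for
# scanning or aggregating one column, but a single-record change
# must be applied to every column list.
column_store = {
    "order_id": [1, 2],
    "material": ["M-100", "M-200"],
    "quantity": [5, 3],
}

# OLTP-style insert: one operation for the row store,
# one append per column for the column store.
new_row = {"order_id": 3, "material": "M-100", "quantity": 7}
row_store.append(new_row)
for column, value in new_row.items():
    column_store[column].append(value)

# Analytic-style scan: the column store reads only one column.
total_qty_row = sum(r["quantity"] for r in row_store)   # reads whole rows
total_qty_col = sum(column_store["quantity"])            # reads one column
print(total_qty_row, total_qty_col)
```

The trade-off shown here is the same one Oracle describes: columnar layouts favor analytics, row layouts favor transaction processing, and a database serving both workloads ends up needing both.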
The Long History of Database Memory Optimization
People should be aware that IBM, Oracle, and Microsoft all have specialists focusing on memory optimization.
Microsoft has documents on this topic at this link.
Outsystems, a PaaS development environment that connects exclusively to SQL Server, has its own page on database memory optimization, which you can see at this link.
The specialists who work in this area figure out how to configure the database so that the right tables are in memory to meet the system’s demands, and there has been quite a lot of work in this area for quite a long time. Outside of SAP, there is little dispute that this is the logical way to design the relationship between the database and the hardware’s memory. A simplified sketch of this selective approach follows.
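To make the idea of selective caching concrete, here is a minimal Python sketch of the general approach conventional databases take: keep a bounded cache of recently used tables (or pages) in memory and evict the least recently used ones when space runs out. This is our own simplified illustration of the concept, not the implementation of any particular database; the table names are only examples.

```python
from collections import OrderedDict

class TableCache:
    """Least-recently-used cache of tables loaded from disk.
    Only tables the workload actually touches occupy memory."""

    def __init__(self, capacity, load_from_disk):
        self.capacity = capacity            # max tables held in memory
        self.load_from_disk = load_from_disk
        self.cache = OrderedDict()          # table name -> table data

    def get(self, table_name):
        if table_name in self.cache:
            # Cache hit: mark the table as most recently used.
            self.cache.move_to_end(table_name)
            return self.cache[table_name]
        # Cache miss: load from disk and evict the coldest table if full.
        data = self.load_from_disk(table_name)
        self.cache[table_name] = data
        if len(self.cache) > self.capacity:
            evicted, _ = self.cache.popitem(last=False)
            print(f"evicted {evicted} to free memory")
        return data

if __name__ == "__main__":
    fake_disk = {"VBAK": ["..."], "VBAP": ["..."], "MARA": ["..."]}
    cache = TableCache(capacity=2, load_from_disk=fake_disk.__getitem__)
    cache.get("VBAK")
    cache.get("VBAP")
    cache.get("VBAK")   # hit, stays hot
    cache.get("MARA")   # evicts VBAP, the least recently used
```

The design choice this illustrates is the one the memory-optimization specialists work on: memory is spent on the tables the workload actually uses, rather than on the entire database at all times.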
Conclusion
In summary, if a person says “in-memory computing,” the response should be “can we be more specific?” Clear thinking requires the use of accurate terms as a logical beginning point.
SAP’s assertion that the entire database must be loaded into memory is unproven. A statement cannot be accepted if it both has no meaning and if what it actually means (the entire database in memory) is unproven.