How Accurate Was John Appleby on HANA Not Working Fast for OLTP?
Executive Summary
- John Appleby made bold predictions on HANA.
- We review how accurate he was in his article on HANA Not Working Fast for OLTP.
Introduction
John Appleby’s article on the SAP HANA blog was titled, Who says HANA doesn’t work fast for OLTP?, and was published on November 18, 2014. We review this article for accuracy.
Our References for This Article
If you want to see our references for this article and other related Brightwork articles, see this link.
Notice of Lack of Financial Bias: We have no financial ties to SAP or any other entity mentioned in this article.
The Quotations
HANA Performance with OLTP?
“Sometimes people think that because HANA is a columnar database, it doesn’t run fast for simple OLTP operations. I was just looking at a performance problem with class /IWBEP/CL_MGW_ABS_MODEL, method GET_LAST_MODIFIED.
This had some screwy ABAP, which is incidentally fixed in SAP Note 2023100 (thanks Oliver Rogers for finding the note), and it generated the following SQL:
SELECT "PROGNAME", "CDAT" AS c, "UDAT" AS c, "UTIME" AS c
FROM "REPOSRC"
WHERE "PROGNAME" LIKE 'CL_ABAP_COMP_CLASS%' AND "R3STATE" = 'A'
ORDER BY "CDAT", "UDAT", "UTIME"
That SQL is pretty nasty, because it does a wildcard search on a big table. On the non-HANA system it was running in 20 seconds. I did the root cause analysis in the database and found that it was searching the primary clustered index, which was 98% fragmented.
Obviously I rebuilt the index – these are the results.
CPU time = 2750 ms, elapsed time = 2746 ms.
CPU time = 2594 ms, elapsed time = 2605 ms.
CPU time = 2750 ms, elapsed time = 2764 ms.
I realized at this point this was some bad coding, so I found the fix thanks to Oli and we put the change in. That fixed the performance problem.
But then I thought… what happens if you run this bad query on a HANA system? This is just what custom code looks like a lot of the time…
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” LIKE …’
successfully executed in 12 ms 414 µs (server processing time: 11 ms 613 µs)
Fetched 8 row(s) in 0 ms 68 µs (server processing time: 0 ms 0 µs)
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” LIKE …’
successfully executed in 9 ms 778 µs (server processing time: 9 ms 136 µs)
Fetched 8 row(s) in 0 ms 64 µs (server processing time: 0 ms 0 µs)
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” LIKE …’
successfully executed in 12 ms 677 µs (server processing time: 11 ms 830 µs)
Fetched 8 row(s) in 0 ms 56 µs (server processing time: 0 ms 0 µs)
So anyDB is averaging 2705ms, and HANA is averaging 10.86ms, an average speedup of 249x.
You may be saying… OK well that’s for poorly written SQL – what about when it was optimized. Sure, let’s test in that scenario. Here’s the SQL:
SELECT "PROGNAME", "CDAT" AS c, "UDAT" AS c, "UTIME" AS c
FROM "REPOSRC"
WHERE
"PROGNAME" IN ('CL_ABAP_COMP_CLASS============CCDEF', 'CL_ABAP_COMP_CLASS============CCIMP', 'CL_ABAP_COMP_CLASS============CCMAC', 'CL_ABAP_COMP_CLASS============CI', 'CL_ABAP_COMP_CLASS============CO', 'CL_ABAP_COMP_CLASS============CP', 'CL_ABAP_COMP_CLASS============CT', 'CL_ABAP_COMP_CLASS============CU')
AND "R3STATE" = 'A'
ORDER BY "CDAT", "UDAT", "UTIME"
So ran it on anyDB, I couldn’t get accurate results from the SQL console so I had to use the ABAP trace to get the numbers. They were 5.504ms, 1.484ms, 4.605ms for an average of 3.86ms. Let’s see how HANA compares.
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” IN …’
successfully executed in 1 ms 977 µs (server processing time: 1 ms 156 µs)
Fetched 8 row(s) in 0 ms 63 µs (server processing time: 0 ms 0 µs)
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” IN …’
successfully executed in 1 ms 946 µs (server processing time: 1 ms 250 µs)
Fetched 8 row(s) in 0 ms 60 µs (server processing time: 0 ms 0 µs)
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” IN …’
successfully executed in 2 ms 230 µs (server processing time: 1 ms 127 µs)
Fetched 8 row(s) in 0 ms 59 µs (server processing time: 0 ms 0 µs)
With HANA then, we get an average of 1.18ms for an average speedup of 3.27x.“
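As a quick sanity check (ours, not Appleby's), his reported averages can be reproduced from the timings quoted above, using the server processing times for HANA. A minimal sketch:

```python
# Sanity check on the averages and speedups quoted above.
# anyDB elapsed times (ms) from the three runs of the bad query:
anydb_bad = [2746, 2605, 2764]
# HANA server processing times (ms) for the same query:
hana_bad = [11.613, 9.136, 11.830]

# anyDB ABAP-trace times (ms) for the optimized query:
anydb_opt = [5.504, 1.484, 4.605]
# HANA server processing times (ms) for the optimized query:
hana_opt = [1.156, 1.250, 1.127]

def mean(xs):
    return sum(xs) / len(xs)

speedup_bad = mean(anydb_bad) / mean(hana_bad)  # ~249x, as claimed
speedup_opt = mean(anydb_opt) / mean(hana_opt)  # ~3.28x, vs. the claimed 3.27x

print(round(mean(anydb_bad)), round(mean(hana_bad), 2), round(speedup_bad))
# → 2705 10.86 249
```

The second ratio lands at about 3.28 rather than exactly 3.27; Appleby presumably divided the already-rounded averages (3.86 / 1.18). The arithmetic, at least, checks out.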
Yes, HANA does not run fast for OLTP. This is now well established as of 2019, as we covered in HANA as a Mismatch for S/4HANA and ERP.
Appleby's overall explanation of this test is illogical. The detail he provides does not answer the question posed by his article's title, because the framework of the test is entirely wrong. This particular example may well be a poorly designed query, but HANA's long-term transaction processing issues cannot be entirely due to bad queries. Secondly, this test has nothing to do with OLTP at all, as Ahmed Azmi points out.
“Query speed is NOT OLTP. OLTP means online TRANSACTION processing. You need to at least do a single transaction and perform an insert/update an a commit/rollback. This tests how your DB can do things like locking and redo log management.”
So why is Appleby using a query as an example of transaction processing? A query is OLAP, or analytics. This article indicates that Appleby is unaware of the distinction between OLAP and OLTP.
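To make Azmi's distinction concrete: a minimal OLTP-style test would exercise inserts, updates, commits, and rollbacks, not just a read. A sketch using Python's built-in sqlite3 as a stand-in database (not HANA; the table and column names are invented for illustration):

```python
import sqlite3

# In-memory stand-in database. An OLTP test exercises writes and
# transaction control (commit/rollback), not just reads.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")

# Transaction 1: insert and commit -- the write must persist.
conn.execute("INSERT INTO orders (id, status) VALUES (1, 'NEW')")
conn.commit()

# Transaction 2: update, then roll back -- the change must be undone.
conn.execute("UPDATE orders SET status = 'SHIPPED' WHERE id = 1")
conn.rollback()

status = conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
print(status)  # the rollback left the committed value intact: NEW
```

A read-only SELECT, by contrast, touches none of the machinery Azmi names: locking, logging, and transaction isolation.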
Shouldn’t the article’s title be how improving a query or rebuilding an index can improve query performance?
What on God’s Green Earth is Appleby’s Understanding of Testing?
“For poorly constructed OLTP queries at the database level, we can get enormous benefits of running HANA – up to 250x or more. With optimized SQL that hits database indexes on anyDB, that drops to around 3.27x, but SAP have only ever claimed a 2-3x increase of running ERP on HANA for transactional workloads.
And remember if you move to the sERP suite, you’ll see another 2-3x because the data structures are simpler. That’s going to mean response times of 5-10x faster than on anyDB.
I don’t know about, you, but that feels significant to me.
Yes, I know I didn’t do concurrency, inserts, updates and all that jazz. This was just a quick test run with 15 minutes of spare time. Hope it is informative. It’s also worth noting that with program changes, I was in this case able to get acceptable performance using anyDB for our timesheet app. The only area where performance is a problem is for the WBS element search, which is a wildcard search again.
For those searches, HANA rocks. For customer, product – anything with a free text search, HANA is going to kill anyDB.“
Ok, where to begin with this quote.
First, this is a constant feature of how Appleby performs tests. He has a short attention span and writes up results as if he could not stay focused long enough to finish the work. No person with this type of attention span, or this lack of attention to detail, can credibly work in testing or in communicating test results.
Something else apparent from reading Appleby's tests and comparisons is that he always has an excuse for why he was unable to perform thorough work. For example, in his "TCO" article, which we covered in How Accurate Was John Appleby on HANA TCO for Cloud vs. On-Premises?, he stated…
“Its an extremely simplistic TCO and doesn’t take into account the operational costs of running HANA”
Or the consulting costs, for that matter!
It isn't a TCO at all, which means the article, titled "HANA TCO for Cloud vs. On Premises," is completely misnamed. The title should have been "The License and Cloud Costs of HANA," as he made zero attempt to calculate TCO. Appleby seems to think that he can get by without putting work into his analysis, and furthermore that he can call his analyses things that they are not. If I state that I intend to estimate the earth's weight and then stop after estimating the weight of the oceans, I cannot call my estimate "quick and dirty," because I did not even attempt to complete the job.
In this article, he states…
“This was just a quick test run with 15 minutes of spare time.”
Why is Appleby so pressed for time?
Appleby had people working for him at this time at Bluefin Solutions. There were plenty of more junior Bluefin consultants that he could have directed to perform this test, and he was one of the best-known sources on HANA at this time. So why the unwillingness to apply either his mental effort and time or that of any of the other Bluefin resources to this task?
Appleby does not seem to realize that he has not set up a logical test. This test only shows how a single query can be sped up. It does not answer the question of why HANA is slow at transaction processing overall.
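For contrast, even a basic single-query timing harness would run warm-up iterations, take many samples, and report a distribution rather than three ad hoc numbers. A minimal sketch, where `run_query` is a hypothetical placeholder for a real database call:

```python
import statistics
import time

def time_query(run_query, warmup=3, samples=30):
    """Time a callable: discard warm-up runs, then report the
    distribution of elapsed times in milliseconds."""
    for _ in range(warmup):  # warm caches and plan/statement caches
        run_query()
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        run_query()
        timings.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(timings),
        "median_ms": statistics.median(timings),
        "stdev_ms": statistics.stdev(timings),
    }

# Placeholder workload standing in for a real query.
stats = time_query(lambda: sum(range(10_000)))
print(sorted(stats))  # → ['mean_ms', 'median_ms', 'stdev_ms']
```

Even this sketch, of course, only measures query latency; it still says nothing about concurrency, writes, or commit behavior, which is the OLTP question Appleby's article claims to answer.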
Notice in the next quote how he quickly inserts an evidence-free claim or hypothesis into the test analysis.
Why is the following sentence true?
“And remember if you move to the sERP suite, you’ll see another 2-3x because the data structures are simpler. That’s going to mean response times of 5-10x faster than on anyDB.”
What? What does that have to do with the test? The comment about simpler data structures is a hypothesis that has proven not to be true, as we covered in the article How Accurate Was SAP About S/4HANA and a Simplified Data Model?
Data points from HANA projects have shown that these estimates are false and that HANA underperforms competing databases. Secondly, this hypothesis should not be included in this “test.”
Then he states the following.
“I don’t know about, you, but that feels significant to me.”
A 5-10x speedup would indeed be significant, but it is not demonstrated. This would be like saying that the moon is made of cheese, and adding,
“I don’t know about you but that sounds like it would taste good to me.”
Yes, but the first step is to prove the claim, not to comment on the claim’s benefit. Or in Appleby’s world, would we simply accept the claim and then move on to discuss how we will eat all of that cheese?
What is the most important question when the claim is made that the moon is made of cheese? Is it to…
- Marvel that that is a lot of cheese, and that it would probably taste terrific?
- Question whether the moon is indeed made of cheese?
Then Appleby goes on to make another claim.
“For those searches, HANA rocks. For customer, product – anything with a free text search, HANA is going to kill anyDB.”
Appleby is back to what he does best, making claims. He is certainly not good at proving claims. Even his articles that propose to establish a claim do nothing of the sort. And SAP has produced no comparative benchmarks with HANA to back up their claims or Appelby’s claims, as we covered in the article The Hidden Issue with the SD HANA Benchmark.
Comment from Shkelzen Hasanaj: The Issue of Comparable Hardware
“I understand that claims about speed increase are tested and I am not saying that it is not true, but I find your comparison in this case a little bit incomplete. From what information you provide, only the query is the same. We have no information about HW specifications etc. Moreover, I think it should be compared to its in-memory counterparts, not just any DB.
Yeah, HANA rocks but still, from this information, you can not pretend anything at all.”
I believe Shkelzen meant to say, “you cannot pretend to know anything at all.” Which is true.
John Appleby’s Response
“This is a little ditty and not a large-scale comparison. I’d sure like to do such a thing, but only had 15 minutes to spare and thought the results were interesting enough to write about.
The hardware was certainly different. The MSSQL has a much faster CPU (Ivy Bridge vs Westmere) but HANA has a lot more cores and RAM.
250x is still significant whatever the hardware.”
And Appleby pivots away from the question by again stating that the test is not thorough. And is that not an understatement? Appleby only has 15 minutes to test his hypothesis? Interesting. If Appleby had so little time to prove his theories, perhaps he should have stopped making so many of them. In all the time I have spent reviewing research, I cannot recall a "15-minute time limitation" ever being used as an excuse to sidestep the question under study.
Appleby's sloppiness is evident in how he brushes aside the hardware issue. This illustrates his imprecise thinking and how little he understands, or seems to care, about testing.
Comment from Bill Ramos: Where Are the Benchmarks?
“If SAP HANA is so fast with OLTP, why has SAP yet to publish benchmark results for it’s own SD benchmark – http://global.sap.com/solutions/benchmark/sd2tier.epx? It’s getting old to hear that SAP HANA is “different” and enables a completely different way to thing about databases and that the old benchmarks no longer matter. I’d love to see a heads up OLTP type benchmark run where you can really contrast OLTP performance in a reproducible environment.
If you could provide the data and code environment, I’d love to run SAP HANA against say SQL Server 2014 In-Memory OLTP on an identical AWS configuration. 🙂
Also, SAP HANA really needs to take a look at optimizing stored procedure execution. During some testing I did on build 76, stored procedures executed almost twice as slow as running the query that was part of the stored procedure. I had to get rid of all my stored procedures to get my app to run at top speed.
As John mentioned below, publishing benchmarks – even in a blog like this could land someone it hot water, but you can bet the vendors are always internally comparing each other. They just can’t publish results. This is why you have cross vendor “standards” like TPC and the SAP SD benchmark which require careful reviews of the results. Just as John suggests, SAP isn’t ready to throw HANA into the fray. At least SAP Sybase ASE shows up to the table.”
Bill is correct. SAP never published an OLTP benchmark when it introduced HANA. We hypothesize this was because publishing one would have illuminated the fact that HANA is deficient in transaction processing, which we covered in the article What is the Actual HANA Performance?
Appleby’s Response
“You know, you get no-nonsense from me Bill.
I’ve not run the benchmark but I believe it’s because:
1) SD doesn’t run well on HANA
2) SD doesn’t accurately represent how customers actually use systems in 2014
3) HANA does run well for how customers really use systems in 2014
SAP are in the process of creating a new benchmark which I understand will include mixed-workload OLTAP queries.
There is a BW-EML benchmark which does mimic how customers use OLAP systems in 2014. Unfortunately it has some design flaws which mean it can be gamed (for instance you can run loads on one area and queries on the other, rather than forcing both at the same time).
Benchmarking is much better done on real customer data with real scenarios. I have done a lot of it, but none of it is possible to share in the public domain.
Also, the license agreements with Microsoft/IBM/Oracle/SAP specifically don’t allow benchmarking. That’s why I referred to the database above as anyDB, so it can’t be recognized.“
What? So Appleby proposes that SD transaction processing does not run well on HANA, and furthermore that SAP suppresses benchmarks that do not make HANA look good? We knew this, but Appleby is admitting it? And why would this be the case? HANA, according to Bill McDermott, runs 100,000 times faster than any competing technology; there should be no need to hide any benchmark. Recall that S/4HANA was restricted to HANA under the logic that…
“no other database could keep up with SAP’s innovation.”
And what is this comment that HANA does not run SD well, but does run well for how customers actually use systems? At this point, S/4HANA had yet to be released, so 100% of SAP's customers in 2014 were running SD from ECC!
The following questions naturally arise.
- Why does SAP need to create a new benchmark?
- Is this because HANA performs so poorly on the previous benchmark?
- Is this how SAP does things: rigging benchmarks when its products cannot perform well in them?
Conclusion
This article scores a 0 out of 10 for accuracy.
The article covers what purports to be a test, but it does not test HANA's transaction processing performance. Appleby cannot seem to put real effort into testing; every test is "quick and dirty" or otherwise lacking and incomplete.
Appleby then intermingles more exaggerated claims about HANA's transaction processing into the analysis. In the comments section, he admits that SD (a transaction processing module) probably does not run well on HANA! He then tops this off by confirming what we already knew: that SAP rigs benchmarks to favor its products over competitors, and simply does not release benchmarks that make its products look bad.
What? What?
This has to be categorized as the most insane article we have ever critiqued. This is the brain that was being used to advise companies on HANA?