Huawei & Honor's Recent Benchmarking Behaviour: A Cheating Headache
by Andrei Frumusanu & Ian Cutress on September 4, 2018 8:59 AM EST
The Reality of Silicon And Market Pressure
Section By Andrei Frumusanu
In a sense, the Kirin 960 and Kirin 970 have been a welcome addition to our mobile testing suite. As a result of having devices powered by the two chipsets, we have switched over to a new testing methodology where we now always publish peak and sustained performance figures alongside each other. Without the behavior of these devices, we might never have changed our methods to catch these shenanigans.
But if we’re to go back to a paragraph in the Kirin 970 SoC piece:
Indeed, the Kirin 960 and 970’s vast discrepancies between peak performance and their inability to sustain that performance was one of the key reasons why, for this year, I opted to change our mobile GPU performance testing methodology. All reviews this year were published with peak and sustained performance figures alongside each other, trying to unveil some of the more negative aspects of sustained performance among some of today’s smartphones.
The behaviour of this year’s Kirin 970 devices is, in a sense, not surprising. Huawei & Honor's power throttling adjustments are a great positive for the actual user-experience as they solve one of the key issues I had brought up about the chips in the review: they limit phone power consumption to reasonable levels, rather than burning through power and battery capacity like crazy. This new behavior on power throttling is essentially an aftershock to the Kirin 960’s awful GPU power characteristics. Somebody smart at Huawei decided that the high power draw was indeed not good, and they introduced a new strict throttling mechanism to keep temperatures in check.
This means that when we look at the efficiency table, it makes a lot of sense. Both chips showcase instantaneous power draws way above the sustainable levels for their form-factors, which the throttling mechanism keeps in check.
Competing Against Cheaters: Two Options
While I fully support Huawei in introducing the new throttling mechanisms, the big faux pas here was exempting benchmark applications from them via a whitelist. During the Kirin 950 days, when we talked to HiSilicon’s managers, GPU power was already an important topic of discussion. That generation of chipsets had substantially lower GPU performance than the competition; however, GPU power was always within the sustainable thermal envelope of the phones, at around 3.5W.
Now, when we look at total system power, we see that Huawei has made improvements:
GFXBench Manhattan 3.1 Offscreen Power Efficiency (System Active Power)
|AnandTech||Mfc. Process||FPS||Avg. Power (W)||Perf/W Efficiency|
|Galaxy S9+ (Snapdragon 845)||10LPP||61.16||5.01||11.99 fps/W|
|Galaxy S9 (Exynos 9810)||10LPP||46.04||4.08||11.28 fps/W|
|Galaxy S8 (Snapdragon 835)||10LPE||38.90||3.79||10.26 fps/W|
|LeEco Le Pro3 (Snapdragon 821)||14LPP||33.04||4.18||7.90 fps/W|
|Galaxy S7 (Snapdragon 820)||14LPP||30.98||3.98||7.78 fps/W|
|Huawei Mate 10 (Kirin 970)||10FF||37.66||6.33||5.94 fps/W|
|Galaxy S8 (Exynos 8895)||10LPE||42.49||7.35||5.78 fps/W|
|Galaxy S7 (Exynos 8890)||14LPP||29.41||5.95||4.94 fps/W|
|Meizu PRO 5 (Exynos 7420)||14LPE||14.45||3.47||4.16 fps/W|
|Nexus 6P (Snapdragon 810 v2.1)||20SoC||21.94||5.44||4.03 fps/W|
|Huawei Mate 8 (Kirin 950)||16FF+||10.37||2.75||3.77 fps/W|
|Huawei Mate 9 (Kirin 960)||16FFC||32.49||8.63||3.77 fps/W|
|Huawei P9 (Kirin 955)||16FF+||10.59||2.98||3.55 fps/W|
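The efficiency column above is frames per second divided by average system active power. As a minimal sketch of that arithmetic, the Python below recomputes it for three rows copied from the table. The function name `perf_per_watt` is an illustrative assumption, and a straight division of the rounded columns can land about 0.01 fps/W away from the published figure, presumably because the published values were derived before rounding.

```python
def perf_per_watt(fps: float, avg_power_w: float) -> float:
    """Return efficiency in fps/W for one benchmark run."""
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return fps / avg_power_w


# (device, FPS, average power in W), copied from the table above
rows = [
    ("Huawei Mate 10 (Kirin 970)", 37.66, 6.33),
    ("Huawei Mate 9 (Kirin 960)", 32.49, 8.63),
    ("Huawei Mate 8 (Kirin 950)", 10.37, 2.75),
]

# Map each device to its recomputed efficiency
efficiencies = {name: perf_per_watt(fps, watts) for name, fps, watts in rows}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.2f} fps/W")
```

The Mate 8 and Mate 9 end up at nearly the same efficiency even though the Mate 9 draws roughly three times the power, which is the point the table makes about the Kirin 960.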
The Kirin 960’s GPU power and inefficiency were a direct response to market pressure, as well as negative user feedback regarding GPU performance. I don’t really blame Huawei. I highly praised the Mate 8 with its Kirin 950: irrespective of the lower GPU performance, it was an excellent device because the thermals and sustained performance were outstanding. Despite this, the very first comment of that review was a 'despite the GPU …'. Here the average user will just look at the benchmarks, see it’s ranked lower, and not know any better. It also shows that companies do care what users want, and do listen to requests, but might react in a way users were not expecting.
Unfortunately, the only way to avoid this perceived performance deficit altogether is for us as journalists, and companies like Huawei, to educate users better. It also helps if device vendors have a more steadfast philosophy about remaining within reasonable power budgets.
Huawei and Its Future
Last Friday Huawei’s CEO announced the new Kirin 980, which is set to be the centerpiece in the Mate 20 lineup coming soon. The big messaging for this new chip is that it is on a new 7nm manufacturing node, and the biggest improvements have been on the GPU side. Huawei has promised power efficiency increases of a staggering 178%. If the math checks out and Kirin 980 devices indeed deliver these figures, then it would mean the company would finally get back to sustainable ~3.5W for GPU workloads, and simultaneously be competitive to some degree.
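To see where the ~3.5W figure could come from, here is a rough sketch of the arithmetic. It assumes that "178%" means 1.78 times the Kirin 970's fps/W, that the Mate 10 row of the table above is the baseline, and that the frame rate stays the same; none of these assumptions is confirmed by Huawei's announcement, and the function name is illustrative.

```python
KIRIN_970_POWER_W = 6.33  # Mate 10 average system power, from the table above


def power_at_same_fps(baseline_power_w: float, efficiency_ratio: float) -> float:
    """Power needed to hold the baseline frame rate at a higher efficiency.

    Assumes efficiency scales uniformly, which real silicon does not
    guarantee across operating points.
    """
    if efficiency_ratio <= 0:
        raise ValueError("efficiency ratio must be positive")
    return baseline_power_w / efficiency_ratio


# Assumed reading: "178%" = 1.78x the Kirin 970's efficiency
projected_power_w = power_at_same_fps(KIRIN_970_POWER_W, 1.78)
print(f"Projected power at Kirin 970 frame rates: {projected_power_w:.2f} W")
```

Under that reading the projected draw comes out close to 3.5W. If the claim instead meant an increase of 178% on top of the baseline (2.78x), the same frame rate would need even less power.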
I’ve already seen a lot of users dismiss the GPU performance of the new SoC. It seemingly, as admitted by Huawei, doesn’t beat the peak performance of the Snapdragon 845, the Qualcomm flagship announced last year. Yet this doesn’t matter, because the efficiency should be better for the new SoC. Because of this, real world sustained performance would be better as well, even if the peak figures don’t quite compete.
Here the only thing I can do is reiterate the balance between performance and efficiency as much as I can, in the hope of shifting more people away from the narrative of only looking at peak performance. I’m quite happy with our new GPU testing methodology, because frankly it works – our sustained performance numbers were mostly unaffected by the cheating behaviour. Here I see the sustained scores as a good showcase of performance and efficiency across all devices.
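The difference between the two figures can be sketched as follows. This is not our actual test procedure: the sketch simply assumes a benchmark is run back to back, takes the best run as "peak", and averages the last few runs (after throttling has settled) as "sustained". The run values are hypothetical, not measurements.

```python
from statistics import mean


def peak_and_sustained(fps_runs: list[float], steady_runs: int = 3) -> tuple[float, float]:
    """Return (peak, sustained) FPS from back-to-back benchmark runs.

    peak      = best single run
    sustained = mean of the last `steady_runs` runs (assumed steady state)
    """
    if steady_runs < 1 or len(fps_runs) < steady_runs:
        raise ValueError("need at least `steady_runs` runs")
    return max(fps_runs), mean(fps_runs[-steady_runs:])


# Hypothetical per-run averages for a device that throttles as it heats up
runs = [38.0, 37.5, 30.0, 24.0, 21.0, 20.5, 20.0]
peak, sustained = peak_and_sustained(runs)
print(f"peak: {peak:.1f} fps, sustained: {sustained:.1f} fps")
```

A whitelist that lifts the power limit for a benchmark mainly inflates the early runs, which is why a figure taken from the tail of a long run is harder to game.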
The Honor Play: A Gaming Phone, or Just More Marketing?
Returning to square one, one of the reasons we’ve been analysing Huawei & Honor's phones in this level of detail again is because we've been trying to determine what exactly GPU Turbo is. We've addressed that technology in a separate article, and find that it does have technical merit. Here Huawei tried to compensate for its hardware disadvantages by innovating through software. However, software can only do so much, and Huawei tries to exaggerate the benefits of the new technology on devices like the Honor Play.
Unfortunately I see the reasons for the overzealous marketing of GPU Turbo, and the cheating behaviour of this article, as one and the same: the current SoCs are far behind in graphics performance and efficiency. The reality of things is that currently Qualcomm’s GPU architecture has a major advantage in terms of efficiency, which allows it to reach far higher performance figures.
So Honor is trying to promote the Honor Play as a gaming-centric phone, making bold marketing claims about its performance and experience. This is a quite courageous marketing strategy given the fact that the SoC powering the phone is currently the worst of its generation when it comes to gaming. Here the competition just has a major power efficiency advantage, and there is no way around that.
We actively discourage such marketing strategies, as they just try to pull the wool over users’ eyes. While the Honor Play is quite a good phone in itself, a gaming phone it is not. Here we just hope that in the future we’ll see more responsible and honest marketing, as this summer’s materials were rather incredible, in the worst sense of the word.
Cicerone - Friday, September 7, 2018
But sometimes Kirin 970 is on the same level with 2016 Exynos 8890 found on Samsung S7.
shogun18 - Tuesday, September 4, 2018
> I think it's important for users to know that the Kirin 970 has a significantly weaker GPU than the S845
How so? If some popular game needs 10,000 shader OPS to run at 800x600 at 30 frames/sec what difference does it make if one SoC can pump out 8000 (admittedly synthetic - are you really going to tell me you're going to notice 24FPS vs 30? pahlease), or 15,000 or another 40,000? Ok, so does OPS/Watt actually matter in anybody's evaluation metric? No. Does anyone choose a phone based on this one lets me run X game for 30 minutes before running out of batt but I can get 40 minutes with this other one because in "game mode" the manufacturer took liberties with wattage?
cfenton - Tuesday, September 4, 2018
What modern phone runs at 800x600? Also, faster GPUs can get closer to 60fps, which is definitely a noticeable improvement over 30fps.
If all you're playing is Candy Crush, then it doesn't matter what GPU you have, but if you're playing Fortnite or the upcoming Elder Scrolls game, then GPU performance is important. If two phones are roughly the same price, but one of them has 3x the GPU power with no downsides, I'm going to go with the faster one every time.
shogun18 - Tuesday, September 4, 2018
The human eye in games like Fortnite etc can only process a very limited frame rate. So anything over 30 is basically pointless. Plus factor in using a 27+ monitor(s) vs a piddly-ass phone screen with lousy (by comparison to "gaming" monitors) refresh characteristics the benchmark is even less useful.
cfenton - Tuesday, September 4, 2018
That article makes it very clear that people can tell the difference between 60fps and 30fps. Its claim is that it's only an improvement in smoothness, not an improvement in our ability to track changes. A higher frame rate won't improve my ability to pick out movement.
60fps looks better than 30fps. If I can choose between the two, at the same resolution, I'm always going to pick 60fps. Will it make me better at the game? No. Does it make the game look and feel better? Yes.
techconc - Monday, September 10, 2018
@shogun18 - I always find it amusing when people present "evidence" to support their position only to find out the evidence they are producing very clearly refutes their position. The article very clearly states:
"Certainly 60 Hz is better than 30 Hz, demonstrably better." - Professor Thomas Busey
From my own perspective, I would suggest to you that games need to have a 30 fps at minimum to be playable and to appear to be somewhat fluid. 60 fps is clearly better, but not "twice as good". You can see the difference though. On my iPad, I can do 120 fps on games like World of Tanks Blitz and can even notice that difference. For some games, reaction time is critical and network performance also plays a role in this. However, higher frame rates can indeed provide a competitive advantage.
shogun18 - Tuesday, September 11, 2018
Did you BOTHER to read to the end let alone comprehend what was being put forth? The human brain is SLOW! It's massively parallel but it's SLOW. Just like our ears are crap compared to other creatures who actually have good hearing. If you're playing FPS on a phone you're an idiot to begin with. Fluidity or more properly the perception of same doesn't make your performance better. Your reaction time is also completely shit compared to the theoretical frame rate you think you are perceiving. Anyone who cares about game play on a phone is a moron.
Reflex - Tuesday, September 4, 2018
Buyers should be able to value whatever they wish when making their purchasing decisions. Lying to them denies them the right to make decisions based on the criteria that matter most to them, whether it be nice cameras, great screens, excellent call quality, or yes, 'geekmarks' or whatever.
It's not for you to determine what is most important to a customer, nor is it ethical to lie about one of those or other items in order to trick people who value them into buying your product.
boozed - Tuesday, September 4, 2018
Funny you should say that, considering the reason for the existence of this website.
Samus - Wednesday, September 5, 2018
You need to put a performance metric on things somehow. Cars have horsepower and torque, batteries have volts and milliamps, and food has protein and carbs.
Unfortunately these metrics do not come from the SoC manufacturer, but the phone vendor. Therein lies the problem. "Overclocking" or boosting a SoC beyond reasonable thermal design limitations is blatant cheating if it can't be sustained throughout, say, a game that the benchmark is momentarily mimicking.
At the end of the day, this is really an Android problem too, because of the freedom the OS gives phone vendors to manipulate the kernel, scheduler, and frequency curve of the CPU/GPU. This kind of flexibility didn't exist (and still doesn't exist) in other mobile operating systems.
So imagine if this were happening in the PC space. Where vendors were selling overclocked systems WITHOUT SAYING they were overclocked. Where vendors were manipulating the real-world benefits of a GPU with software that faked benchmark results.
I would liken it to what happened with the game console clones of the 80's, when there were third-party Ataris, Intellivisions, etc, that had custom CPUs running at higher frequencies. In that case, it actually hurt developers more than consumers (but still hurt consumers) because developers couldn't even depend on a performance metric for the platform they were developing for. This is partially why there were virtually no third party developers (Activision and Hudsonsoft - who later developed their own console simply to have some control over the hardware environment! - were effectively the first cross-platform developers.)