Mea-Culpa: It Should Have Been Caught Earlier

Section By Andrei Frumusanu

As stated on the previous page, I had initially seen the effects of this behaviour back in January when I was reviewing the Kirin 970 in the Mate 10. The numbers I originally obtained showed worse-than-expected performance from the Mate 10, which was being beaten by the Mate 9. When we discussed the issue with Huawei, they attributed it to a firmware bug and pushed me a newer build which resolved the performance issues. At the time, Huawei never discussed what that 'bug' was, and I didn't push the issue, as performance bugs do happen.

For the Kirin 970 SoC review, I went through my testing and published the article. Later on, in the P20 reviews, I observed the same lower performance again. As Huawei had previously told me this was a firmware issue, I attributed the bad performance to a similar problem, and expected Huawei to 'fix' the P20 in due course.

In hindsight, it is pretty obvious there have been some less-than-honest communications with Huawei. The newly detected performance issues were not actually issues; they were the real representation of the SoC's performance. As the results were somewhat lower, and Huawei was saying that they were highly competitive, I never would have expected these numbers to be genuine.

It's worth noting here that I naturally test with our custom benchmark versions, as they enable us to get more data from the tests than just a simple FPS value. It never crossed my mind to test the public versions of the benchmarks to check for any discrepancy in behaviour. Suffice it to say, this will change in our testing in the future, with numbers verified on both versions.
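As a rough illustration of verifying numbers on both versions, the cross-check could be sketched as below. This is only a minimal sketch and not the tooling actually used for our testing: the BenchmarkRun record, the "public" and "custom" build labels, the flag_discrepancies helper and the 10% tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRun:
    """One result for one benchmark build (hypothetical record layout)."""
    device: str
    test: str
    build: str  # "public" or "custom" (hypothetical labels)
    fps: float


def flag_discrepancies(runs, tolerance=0.10):
    """Return (device, test, public/custom ratio) for every pair of runs
    whose FPS differs by more than the given relative tolerance."""
    by_key = {}
    for run in runs:
        by_key.setdefault((run.device, run.test), {})[run.build] = run.fps

    flagged = []
    for (device, test), builds in sorted(by_key.items()):
        if "public" not in builds or "custom" not in builds:
            continue  # both builds are needed for a comparison
        if builds["custom"] <= 0:
            continue  # avoid dividing by a missing or invalid score
        ratio = builds["public"] / builds["custom"]
        if abs(ratio - 1.0) > tolerance:
            flagged.append((device, test, round(ratio, 2)))
    return flagged
```

A device whose public build scores well above the renamed custom build of the same test would be flagged for a closer look. The tolerance is an arbitrary placeholder and would need to be tuned against normal run-to-run variance.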

Analyzing the New Competitive Landscape

With all that being said, our past published results for Kirin 970 devices were mostly correct - we had used a variant of the benchmark that wasn’t detected by Huawei’s firmware. There is one exception however, as we weren't using a custom version of 3DMark at the time. I’ve now re-tested 3DMark, and updated the corresponding figures in past reviews to reflect the correct peak and sustained performance figures.

As far as I could tell in my testing, the cheating behaviour has only been introduced in this year’s devices. Phones such as the Mate 9 and P10 were not affected. If I’m to be more precise, it seems that only EMUI 8.0 and newer devices are affected. Based on our discussions with Huawei, we were told that this was purely a software implementation, which also corroborates our findings.

Here is the competitive landscape across our whole mobile GPU performance suite, with updated figures where applicable. We are also including new figures for the Honor Play, and the newly introduced GFXBench 5.0 Aztec tests across all of our recent devices:

3DMark Sling Shot 3.1 Extreme Unlimited - Graphics 

3DMark Sling Shot 3.1 Extreme Unlimited - Physics 

GFXBench Aztec Ruins - High - Vulkan/Metal - Off-screen

GFXBench Aztec Ruins - Normal - Vulkan/Metal - Off-screen

GFXBench Manhattan 3.1 Off-screen 

GFXBench T-Rex 2.7 Off-screen

Overall, the graphs are very much self-explanatory. The Kirin 960 and Kirin 970 are lacking in both performance and efficiency compared to almost every device in our small test here. This is something Huawei is hoping to address with the Kirin 980, and features such as GPU Turbo.

84 Comments

  • beginner99 - Wednesday, September 5, 2018 - link

    The most interesting aspect is that it shows that ARM also struggles with power once they get into x86 performance area. No free lunch. And I wonder how the other devices cheat. Probably most do somehow. Huawei just wasn't that clever.
  • ncsaephanh - Wednesday, September 5, 2018 - link

    Great work on this piece. I really appreciate good journalism giving light to industry issues while having the technical expertise to dive deep and explain everything in a concise manner. And I wouldn't worry about catching this earlier, what's important is we know now. And hopefully at least some consumers now won't fall for the marketing/benchmarking hype.
  • yhselp - Wednesday, September 5, 2018 - link

    The GFXBench T-Rex Offscreen Power Efficiency benchmark in the Kirin 970 piece still shows the cheating result for the Mate 10.

    It's astonishing to see the difference in sustained performance cooling alone can account for - P20 Pro and Honor Play have the same maker, same SoC, similar dimensions, and yet, the performance is quite different.
  • Hyper72 - Wednesday, September 5, 2018 - link

    I thought that ever since Samsung was caught doing the same thing in 2013 you put in active countermeasures (randomly named benchmark software, etc.) or at least a test for cheating as a standard part of your setup?
  • tommo1982 - Wednesday, September 5, 2018 - link

    These tests show similar behavior with iPhone. It's not any faster than the other leading brands. The difference between peak and sustained is huge. Same goes for Samsung and Xiaomi.

    I understand why the UI seems so fast and responsive, and why many people complained about the performance. It just can't stay at peak forever.
  • eastcoast_pete - Thursday, September 6, 2018 - link

    To clarify up front: I don't own or like iOS devices. However, I have to give Apple its due here: the idea of really high, short burst performance coupled with okay longer-term speed is pretty much what I (and probably many other mobile users) want in smartphones. This is useful for multitasking while opening multiple browser windows etc., i.e. scenarios that really benefit from well above-normal CPU/GPU speeds for a few seconds, resulting in a fluid user experience. This is different from running the SoC to heat exhaustion and shutdown whenever a benchmarking app is recognized. Some current Android flagships are sort-of able to do that short burst ("turbo" in PCs) also, but none has yet the (momentary) peak performance of Apple's wide and deep cores. The Mongoose M3 was an attempt, the Kirin 980 was an apparent step towards this, sort of, but is now marred by this benchmark cheating BS. Let's see what QC can cook up, they tend to get closest to Apple's top SoC.
  • techconc - Monday, September 10, 2018 - link

    Thermal throttling happens on ALL phones. That's not what's in question. The issue is with companies that artificially white list specific benchmarks in order to achieve results that would not be seen in real applications.

    To that end, Anandtech's battery tests have always demonstrated the difference between peak and sustained performance in mobile devices. Up through the iPhone 6s, there was very little throttling going on with iPhones on peak loads. To your point, the level of throttling in iPhones has been approaching practices of common Android equivalents.
  • psychobriggsy - Thursday, September 6, 2018 - link

    Naughty. Makes running a benchmark in a 'loop mode' until the battery runs out very important IMO. If the device dies in an hour in benchmarks, but 3 hours elsewhere, then you know something's awry.

    However there is a potential positive - it shows that the Kirin 970 can perform well at higher power consumption - there's no performance wall between 3.5W and 9W, and the perf/W scales fairly well too.

    So - why not look into a 'docked' mode option in the future? One option could be a Switch-like dock, using external power (to protect the battery), optional cooling assistance, HDMI out to a TV, provide a controller in this pack as well, and allow the SoC to run as fast as this setup can keep the device from damaging itself. That's flippin' marketable. The dock would cost a few dollars, and it sounds like the software is already there in the main.

    Hopefully the Mali G76 in the Kirin 980 actually fixes a lot of the performance issues with Mali, which surely were a factor in this sad situation (also clearly saving money by using a smaller GPU, wide and slow beats narrow and fast for GPUs where power consumption matters).
  • hanselltc - Friday, September 7, 2018 - link

    wut if: the white list includes popular games as well? is that still cheating?
  • s.yu - Monday, September 10, 2018 - link

    Obviously you haven't read the article, the so-called whitelisting's performance can't be sustained, it's not as simple as merely activating some sort of game mode automatically.
