Caching And Tiering: Intel Optane Memory H20 and Enmotus FuzeDrive SSD Reviewed
by Billy Tallis on May 18, 2021 2:00 PM EST
An Alternative: Enmotus FuzeDrive SSD
Enmotus is a well-established commercial vendor of storage management software. Their existing FuzeDrive software is a hardware-independent competitor to Intel's RST and Optane Memory software. At CES 2020, Enmotus announced their first hardware product: MiDrive, an SSD combining QLC and SLC using Enmotus FuzeDrive software. This eventually made it to market as the FuzeDrive SSD, more closely matching the branding of their software products.
Like the Intel Optane Memory H20, the FuzeDrive SSD is almost two drives in one: a small SLC SSD and a large QLC SSD. But Enmotus implements it in a way that avoids all the compatibility limitations of the Optane Memory H20. The hardware is that of a standard 1 or 2 TB QLC SSD using the Phison E12S controller, the same as a Sabrent Rocket Q or Corsair MP400. The SSD's firmware does some very non-standard things behind the scenes: a fixed portion of the drive's NAND is set aside to permanently operate as SLC. This pool of NAND is wear-leveled independently from the QLC portion of the drive. The host system sees a single device with one pool of storage, but the first 24GB or 128GB of logical block addresses are mapped to the SLC part of the drive and the rest to the QLC portion. The Enmotus FuzeDrive software abstracts over this to move data in and out of the SLC portion.
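The address split described above can be illustrated with a small sketch. This is not Enmotus code: the function name is hypothetical, the 512-byte logical sector size is an assumption, and the region sizes use the 24 GiB and 128 GiB figures from the specifications table.

```python
GIB = 1024 ** 3
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def tier_for_lba(lba: int, slc_gib: int, sector_bytes: int = SECTOR_BYTES) -> str:
    """Return 'SLC' if the LBA lies in the fixed SLC region at the
    start of the drive, otherwise 'QLC' (hypothetical helper)."""
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    # Number of logical sectors covered by the static SLC region.
    slc_sectors = slc_gib * GIB // sector_bytes
    return "SLC" if lba < slc_sectors else "QLC"


if __name__ == "__main__":
    # Smaller model: the first 24 GiB of addresses are SLC.
    print(tier_for_lba(0, slc_gib=24))
    # Larger model: an address just past the 128 GiB boundary is QLC.
    print(tier_for_lba(128 * GIB // SECTOR_BYTES, slc_gib=128))
```

Because the boundary is just a fixed LBA, ordinary partitioning tools can isolate each region, which is how the two parts can be tested separately without the tiering software.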
Enmotus FuzeDrive does tiered storage rather than caching: the faster SLC portion adds to the total usable capacity of the volume, rather than just being a temporary home for data that will eventually be copied to the slower device. By contrast, putting a cache drive in front of a slower device using Intel's caching solution doesn't increase usable capacity; it just improves performance.
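The capacity difference between the two approaches comes down to simple arithmetic. The sketch below is only an illustration with a hypothetical function name; the example figures are the SLC and QLC capacities from the FuzeDrive P200 specifications table.

```python
def usable_capacity_gib(fast_gib: int, slow_gib: int, mode: str) -> int:
    """Usable volume size under each scheme (hypothetical helper).

    tiering: data lives on exactly one tier, so the capacities add up.
    caching: the fast device only holds copies, so it adds nothing.
    """
    if mode == "tiering":
        return fast_gib + slow_gib
    if mode == "caching":
        return slow_gib
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Smaller FuzeDrive model: 24 GiB SLC + 814 GiB QLC.
    print(usable_capacity_gib(24, 814, "tiering"))
    # The same hardware used as a cache would expose only the QLC part.
    print(usable_capacity_gib(24, 814, "caching"))
```

For the smaller model the tiered sum matches the 838 GiB total usable capacity listed in the table.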
As an extra complication to the FuzeDrive SSD, the QLC portion of the drive operates exactly like a regular consumer QLC SSD, albeit with an unusual capacity. That means the QLC portion has its own drive-managed dynamic SLC caching that is entirely separate from the static SLC portion at the beginning of the drive.
Boot support is achieved by installing a UEFI driver module that the motherboard firmware loads and uses to access the tiered storage volume where the OS resides. Intel ships a comparable UEFI implementation of their caching system as part of the motherboard firmware, whereas Enmotus needs to install it separately to the SSD's EFI System Partition. Some NVMe RAID solutions, such as those from HighPoint, put their UEFI driver in an option ROM.
|Enmotus FuzeDrive P200 SSD Specifications|
|Form Factor||double-sided M.2 2280|
|NAND Flash||Micron 96L 1Tbit QLC|
|QLC NAND Capacity||814 GiB||1316 GiB|
|Fixed SLC NAND Capacity||24 GiB||128 GiB|
|Total Usable Capacity||838 GiB
|Sequential Read||3470 MB/s|
|Sequential Write||2000 MB/s||3000 MB/s|
|Retail Price||$199.99 (22¢/GB)||$349.99 (23¢/GB)|
The performance specs for the Enmotus FuzeDrive P200 are nothing special; after all, advertised performance for ordinary consumer SSDs is already based on the peak performance attainable from the drive-managed SLC cache. The FuzeDrive SSD can't really aspire to offer much better peak performance than mainstream NVMe SSDs. Rather, the host-managed tiering instead of drive-managed caching changes the dynamics of when and how long a real-world workload will experience that peak performance from SLC NAND. The ability to manually mark certain files as permanently resident in the SLC portion of the drive means most of the unpredictability of consumer SSD performance can be eliminated. This is possible with both the Enmotus FuzeDrive SSD and with Intel Optane Memory caching (when configured in the right mode), but the larger FuzeDrive SSD model's 128GB SLC portion can accommodate a much wider range of applications and datasets than a 32GB Optane cache, even if the latter does have lower latency.
Write endurance is a bit more complicated when there are two drives in one, because in principle it is possible to wear out one section of the drive before the other. Intel sidesteps this question by making the Optane Memory H20 an OEM-only drive, so the warranty is whatever the PC vendor feels like offering for the system as a whole. Enmotus is selling their drive direct to consumers, so they need to be a bit clearer about warranty terms. Ultimately, the drive's own SMART indicators for wear are what determine whether the FuzeDrive SSD is considered to have reached its end of life. Enmotus has tested the FuzeDrive SSD and their software against the JEDEC-standard workloads used for determining write endurance, and from that they've extrapolated the above estimated write endurance numbers that should be roughly comparable to what applies to traditional consumer SSDs. Unusual workloads or bypassing the Enmotus tiering software could violate the above assumptions and lead to a different total lifespan for the drive.
The estimated write endurance figures for the FuzeDrive SSD look great for a QLC drive, and getting more than 1 DWPD as on the 1.6TB model is good even by the standards of high-end consumer SSDs. The tiering strategy used by FuzeDrive will tend to produce less data movement than caching as done by Intel's Optane Memory, and the SLC portion of the FuzeDrive SSD is rated for 30k P/E cycles. So it really is plausible that the 1.6TB model could last for 3.6 PB of carefully-placed writes, despite using QLC NAND for the bulk of the storage.
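The arithmetic behind those endurance claims can be checked with a rough sketch. The function names are hypothetical, the five-year warranty period is an assumption not stated in the text above, and write amplification and QLC wear are ignored, so the SLC figure is only an upper bound on what that region could absorb.

```python
def drive_writes_per_day(total_writes_tb: float, capacity_tb: float,
                         warranty_years: float) -> float:
    """DWPD = total rated writes / (capacity * days in warranty period)."""
    return total_writes_tb / (capacity_tb * warranty_years * 365)


def slc_raw_write_budget_pb(slc_gib: int, pe_cycles: int) -> float:
    """Raw bytes the SLC region can absorb, in decimal petabytes,
    ignoring write amplification (an idealized upper bound)."""
    return slc_gib * 1024 ** 3 * pe_cycles / 1e15


if __name__ == "__main__":
    WARRANTY_YEARS = 5  # assumption: typical consumer SSD warranty length
    # 3.6 PB (3600 TB) of rated writes on the 1.6 TB model.
    print(round(drive_writes_per_day(3600, 1.6, WARRANTY_YEARS), 2))
    # 128 GiB of SLC rated for 30k P/E cycles.
    print(round(slc_raw_write_budget_pb(128, 30_000), 2))
```

Under these assumptions the rating works out to a little over 1.2 DWPD, and the SLC region alone has a raw budget of roughly 4.1 PB, which is why 3.6 PB is plausible if most writes land in SLC.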
Caching makes storage benchmarking harder, by making current performance depend highly on previous usage patterns. Our usual SSD test suite is designed to account for ordinary drive-managed SLC caching, and includes tests intended to stress just a drive's cache as well as tests designed to go beyond the cache and reveal the performance of the slower storage behind the cache.
Software-managed caching and tiering make things even harder. The Intel Optane Memory and Enmotus FuzeDrive software is Windows-only, but large parts of our test suite use Linux for better control and lower overhead. There are SSD caching software solutions for Linux, but they come with their own data placement algorithms and heuristics that are entirely different from what Intel and Enmotus have implemented in their respective drivers, so testing bcache or lvmcache on Linux would not provide useful information about how the Intel and Enmotus drivers behave.
Our ATSB IO trace tests bypass the filesystem layer and deal directly with block devices, so caching/tiering software cannot do file-level tracking of hot data during those tests. Even if we could get these tests to run on top of software-managed caching or tiering, we'd be robbing the software of valuable information it could use to make smarter decisions than a purely drive-managed cache.
All of our regular SSD test suite is set up to have the drive under test as a secondary drive, with the testbed's OS and benchmarking software running off a separate boot drive. For the Optane Memory H20, Intel has provided a laptop that only has one M.2 slot, so testing the H20 as a secondary drive would be a bit inconvenient.
For all of these reasons, we're using a slightly different testing strategy and mix of benchmarks for this review. Where possible, we've tested the individual components on our regular test suite without the caching/tiering software. Our regular AMD Ryzen testbed detects the Optane side of the H20 and H10 when they are installed into the M.2 slots, so we've tested those with our usual synthetic tests to assess how much extra performance Intel is really getting out of the newer Optane device. The FuzeDrive SSD was partitioned and the SLC and QLC partitions tested independently. Our power measurements for these tests are still for the whole M.2 card even when only using part of the hardware.
The synthetic benchmarks tell us the performance characteristics of the fast and slow devices that the storage management software has to work with, but we need other tests to show how the combination behaves with the vendor-provided caching or tiering software. For this, we're using two suites of application benchmarks: BAPCo SYSmark 25 and UL PCMark 10. These tests cover common consumer PC usage scenarios and the scores are intended to reflect overall system performance. Since most consumer workloads are relatively lightweight from a storage perspective, there isn't much opportunity for faster storage to bring a big change in these scores. (Ironically, the process of installing SYSmark 25 would make for a much more strenuous storage benchmark than actually running it, but the installer unfortunately does not have a benchmark mode.)
To look a bit closer at storage performance specifically while using the caching or tiering software, we turn to the PCMark 10 Storage tests. These are IO trace based tests like our ATSB tests, but they can be run on an ordinary filesystem and don't bypass or interfere with caching or tiering software.
Since the Intel Optane Memory H20 is only compatible with select Intel platforms, we're using Intel-provided systems for most of the testing in this review. Intel shipped our H20 sample preinstalled in an HP Spectre x360 15-inch notebook, equipped with a Tiger Lake processor, 16GB of RAM and a 4k display. We're using the OS image Intel preloaded, which included their drivers plus PCMark and a variety of other software and data to get the drive roughly half full, so that not everything can fit in the cache. For other drives, we cloned that image, so software versions and configurations match. We have also run some tests on the Whiskey Lake notebook Intel provided for the Optane Memory H10 review in 2019.
|Optane Memory Review Systems|
|Platform||Tiger Lake||Whiskey Lake|
|CPU||Intel Core i7-1165G7||Intel Core i7-8565U|
|Motherboard||HP Spectre x360 15.6"||HP Spectre x360 13t|
|Memory||16GB DDR4-2666||16GB DDR4-2400|
|Power Supply||HP 90W||HP 65W USB-C|
|OS||Windows 10 20H2, 64-bit|
A few things are worth noting about this Tiger Lake notebook: While the CPU provides some PCIe 4.0 lanes, this machine doesn't let them run beyond PCIe 3.0 speed. The DRAM running at just DDR4-2666 also falls far short of what the CPU should be capable of (DDR4-3200 or LPDDR4-4266). The price of this system as configured is about $1400, which really should be enough to get a machine that comes closer to using the full capabilities of its main components. There's also a ridiculous amount of coil whine while it's booting.