Test Configurations

So while the Intel SSD DC P4800X is technically launching today, 3D XPoint memory is still in short supply. Only the 375GB add-in card model has been shipped, and only as part of an early limited release program. The U.2 version of the 375GB model and the add-in card 750GB model are planned for a Q2 release, and the U.2 750GB model and the 1.5TB model are expected in the second half of 2017. Intel's biggest enterprise customers, such as the Super Seven, have had access to Optane devices throughout the development process, but broad retail availability is still a little ways off.

Citing the current limited supply, Intel has taken a different approach to review sampling for this product. Their general desire for secrecy regarding the low-level details of 3D XPoint has also likely been a factor. Instead of shipping us the Optane SSD DC P4800X to test on our own system, as is normally the case with our storage testing, this time around Intel has only provided us with remote access to a DC P4800X system housed in their data center. Their Non-Volatile Memory Solutions Group maintains a pool of servers to provide partners and customers with access to the latest storage technologies, and their software partners have been using these systems for months to develop and optimize applications to take advantage of Optane SSDs.

Intel provisioned one of these servers for our exclusive use during the testing period, and equipped it with a 375GB Optane SSD DC P4800X and an 800GB SSD DC P3700 for comparison. The P3700 was the U.2 version of the drive and was connected through a PLX PEX 9733 PCIe switch. The Optane SSD under test was initially going to be a U.2 version connected to the same backplane, but Intel found that the PCIe switch was introducing some inconsistency in the access latency on the order of a microsecond or two, which is a problem when trying to benchmark a drive with ~8µs best-case latency. Intel swapped out the U.2 Optane SSD for an add-in card version that uses PCIe lanes direct from the processor, but the P3700 was still potentially subject to whatever problems the PCIe switch may have caused. Clearly, there's some work to be done to ensure the ecosystem is ready to take full advantage of the performance promised by Optane SSDs, but debugging such issues is beyond the scope of this review.
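
To put the switch's effect in proportion, here is a minimal arithmetic sketch. It uses only the figures quoted above (one to two microseconds of inconsistency against a best case of roughly 8µs); the function name is purely illustrative.

```python
def relative_jitter(jitter_us: float, base_latency_us: float) -> float:
    """Return latency jitter as a fraction of the baseline access latency."""
    return jitter_us / base_latency_us


BEST_CASE_US = 8.0  # approximate best-case latency quoted in the text

for jitter_us in (1.0, 2.0):
    share = relative_jitter(jitter_us, BEST_CASE_US)
    print(f"{jitter_us:.0f} us of switch jitter = {share:.1%} of an {BEST_CASE_US:.0f} us access")
```

In other words, the inconsistency Intel observed would amount to roughly an eighth to a quarter of the latency being measured.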

Intel NSG Marketing Test Server
CPU: 2x Intel Xeon E5 2699 v4
Motherboard: Intel S2600WTR2
Chipset: Intel C612
Memory: 256GB total, Kingston DDR4-2133 CL11 16GB modules
OS: Ubuntu Linux 16.10, kernel 4.8.0-22

The system was running a clean installation of Ubuntu 16.10, with no Intel or Optane-specific software or drivers installed, and the rest of the system configuration was as expected. We had full administrative access to tweak the software to our liking, but chose to leave it mostly in its default state.

Our benchmarks consist of a variety of synthetic workloads generated and measured using fio version 2.19. There are quite a few operating system and fio options that can be tuned, but we generally left them alone: for example, the NVMe driver wasn't manually switched to polling mode, CPU affinity was not manually set, and nothing was tweaked about power management or CPU turbo clock speeds. There is work underway to switch fio over to nanosecond-precision time measurement, but it has not reached a usable state yet. Our tests only record latencies in whole-microsecond increments, so mean latencies that report fractional microseconds are just weighted averages reflecting, e.g., how many operations were closer to 8µs than 9µs.
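
As a rough illustration of how a fractional mean arises from whole-microsecond samples, consider the following sketch. The sample counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from this review.

```python
from collections import Counter


def mean_latency_us(samples_us):
    """Mean of latencies that were each recorded as a whole number of microseconds."""
    histogram = Counter(samples_us)  # recorded value -> number of operations
    total_ops = sum(histogram.values())
    return sum(us * count for us, count in histogram.items()) / total_ops


# Hypothetical run: 700 operations recorded as 8 us and 300 recorded as 9 us.
samples = [8] * 700 + [9] * 300
print(mean_latency_us(samples))  # 8.3
```

A reported mean of 8.3µs in this case says nothing about sub-microsecond timing; it only reflects that 70% of the operations landed in the 8µs bucket.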

All tests were run directly on the SSD with no intervening filesystem. Real-world applications will almost always be accessing the drive through a filesystem, but will also be benefiting from the operating system's cache in main RAM, which is bypassed with this testing methodology.
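
For readers who want to try this style of testing, the sketch below assembles an fio command line for a raw-device, cache-bypassing random read test. It is an assumption-laden example rather than our actual job definition: the device path, block size, queue depth, runtime, and job name are all placeholders, and only the general approach (raw block device, direct I/O, at most 4 threads) comes from the text.

```python
def raw_device_fio_cmd(device, threads=4, queue_depth=1, runtime_s=60):
    """Build (but do not run) an fio command for 4kB random reads on a raw block device."""
    return [
        "fio",
        "--name=raw-randread",      # hypothetical job name
        f"--filename={device}",     # the block device itself, no filesystem in between
        "--direct=1",               # O_DIRECT: bypass the operating system's page cache
        "--readonly",               # safety check: refuse to issue writes to the device
        "--ioengine=libaio",
        "--rw=randread",
        "--bs=4k",
        f"--iodepth={queue_depth}",
        f"--numjobs={threads}",
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
        "--output-format=json",
    ]


cmd = raw_device_fio_cmd("/dev/nvme0n1")  # hypothetical device path
print(" ".join(cmd))
```

The function only returns the argument list, so it can be inspected safely before anything is run against a real drive.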

To provide an extra point of comparison, we also tested the Micron 9100 MAX 2.4TB on one of our systems, using a Xeon E3 1240 v5 processor. In order to not unfairly disadvantage the Micron 9100, most of the tests were limited to at most 4 threads. Our test system was running the same Linux kernel as the Intel NSG marketing test server and used a comparable configuration, with the Micron 9100 connected directly to the CPU's PCIe lanes rather than through the PCH.

AnandTech Enterprise SSD Testbed
CPU: Intel Xeon E3 1240 v5
Motherboard: ASRock Fatal1ty E3V5 Performance Gaming/OC
Chipset: Intel C232
Memory: 4x 8GB G.SKILL Ripjaws DDR4-2400 CL15
OS: Ubuntu Linux 16.10, kernel 4.8.0-22

Because this was not a hands-on test of the Optane SSD on our own equipment, we were unable to conduct any power consumption measurements. Due to the limited time available for testing, we were unable to make any systematic test of write endurance or the impact of extra overprovisioning on performance. We hope to have the opportunity to conduct a full hands-on review later in the year to address these topics.

Due to time constraints, we were unable to cover Intel's new Memory Drive Technology software, an optional add-on that can be purchased with the Optane SSD. It is a minimal virtualization system that lets software treat the Optane SSD as if it were RAM. The hypervisor presents to the guest OS a pool of memory equal to the amount of available DRAM plus up to 320GB of the Optane SSD's 375GB capacity. The hypervisor manages the placement of data and automatically caches hot data in DRAM, so applications and the guest OS cannot explicitly address or allocate Optane storage. We may get a chance to look at this in the future, as it offers an interesting look at the new ways multi-tiered storage will affect the enterprise market over the next few years.
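
The sizing rule described above can be written out as a small calculation. This is a sketch under stated assumptions: it considers a single 375GB drive, applies the 320GB ceiling from the text, and uses the 256GB of DRAM in the Intel test server as the example input. The function and constant names are illustrative and are not part of Intel's software.

```python
MAX_OPTANE_CONTRIBUTION_GB = 320  # ceiling quoted in the text for the 375GB drive


def presented_memory_gb(dram_gb, optane_share_gb=MAX_OPTANE_CONTRIBUTION_GB):
    """Upper bound on the memory pool the hypervisor presents to the guest OS."""
    optane_share_gb = min(optane_share_gb, MAX_OPTANE_CONTRIBUTION_GB)
    return dram_gb + optane_share_gb


print(presented_memory_gb(256))  # 576
```

On a server like the one we tested, the guest OS could therefore see up to 576GB of memory, well over half of it backed by the Optane SSD.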


117 Comments

  • melgross - Tuesday, April 25, 2017

    You're making the mistake those who know nothing make, which is surprising for you. This is a first generation product. It will get much faster, and much cheaper as time goes on. NAND will stagnate. You also have to remember that Intel never made the claim that this was as fast as RAM, or that it would be. The closest they came was to say that this would be in between NAND and RAM in speed. And yes, for some uses, it might be able to replace RAM. But that could be several generations down the road, in possibly 5 years, or so.
  • tuxRoller - Sunday, April 23, 2017

    I'm not sure I understand you.
    You talk about "pages", but, I hope, the reviewer was only using dio, so there would be no page cache.
    It's very unclear where you are getting this "~100x" number. NVMe-connected DRAM has a plurality of hits around 4-6 us (depending on software) but it also has a distributed latency curve. However, I don't know what the latency is at the 99.999th percentile. The point is that even with DRAM's sub-100ns latency, it's still not staying terribly close to the theoretical min latency of the bus.
    Btw, it's not just the controller. A very large amount of latency comes from the block layer itself (amongst other things).
  • Santoval - Tuesday, June 6, 2017

    It is quite possible that Intel artificially weakened P4800X's performance and durability in order to avoid internal competition with their SSD division (they already did the same with Atoms). If your new technology is *too* good it might make your other more mainstream technology look bad in comparison and you could see a big drop in sales. Or it might have a "deflationary" effect, where their customers might delay buying in hope of lower prices later. This way they can also have a more clear storage hierarchy, business segment wise, where their mainstream products are good, and their niche ones are better but not too good.

    I am not suggesting that it could ever compete with DRAM, just that the potential of 3D XPoint technology might actually be closer to what they mentioned a year ago than the first products they shipped.
  • albert89 - Friday, April 21, 2017

    Intel won't be reducing the price of the Optane but rather will be giving the average consumer a watered down version which will be charged at a premium but perform only slightly better than the top SSD. The conclusion? Another overpriced ripoff from Intel.
  • TheinsanegamerN - Thursday, April 20, 2017

    The fastest SSD on the consumer market is the 960 Pro, which can hit 3.2GB/s read under certain circumstances.

    This is the equivalent of single channel DDR 400 from 2001, and DDR had far lower latencies to boot.

    We are a long, long way from replacing RAM with storage.
  • ddriver - Friday, April 21, 2017

    What makes the most impression is it took a completely different review format to make this product look good. No doubt strictly following intel's own review guidelines. And of course, not a shred of real world application. Enter hypetane - the paper dragon.
  • ddriver - Friday, April 21, 2017

    Also, bandwidth is only one side of the coin. Xpoint is 30-100+ times more latent than dram, meaning the CPU will have to wait 30-100+ times longer before it has data to compute, and dram is already too slow in this aspect, so you really don't want to go any slower.

    I see a niche for hypetane - ram-less systems, sporting very slow CPUs. Only a slow CPU will not be wasted on having to wait on working memory. Server CPUs don't really need to crunch that much data either, if any, which is paradoxical, seeing how intel will only enable avx512 on xeons, so it appears that the "amazingly fast" and overpriced hypetane is at home only in simple low end servers, possibly paired with those many-core atom chips. Even overpriced, it will be kind of a decent deal, as it offers about 3 times the capacity per dollar as dram; paired with wimpy atoms it could make for a decent simple, low cost, frequent access server.
  • frenchy_2001 - Friday, April 21, 2017

    You are missing the usefulness of it entirely.
    Yes, it is a niche product.
    And I even agree, Intel is hyping it and offering it to consumers with minimal benefit (besides Intel's bottom line).
    But it realistically slots between NAND and DRAM.
    This review shows that it has lower latency than NAND and it has higher density than DRAM.
    This is the play.

    You say it cannot replace DRAM and for most usage (by far) you are right. However, for a small niche that works with very big data sets (like for finance or exploration), having more memory, although slower, will still be much faster than memory + swap (to a slower NAND storage).

    Let me repeat, this is a niche product, but it has its uses.
    Intel marketing is hyping it and trying to use it where its tradeoffs (particularly price) make little sense, but the technology itself is good (if limited).
  • wumpus - Sunday, April 23, 2017

    Don't be so sure that latency is keeping it from being used as [secondary] main memory. A 4GB machine can actually function (more or less) for office duty and some iffy gaming capability. I'd strongly suspect that a 4-8GB stack of HBM (preferably the low-cost 512 bit systems, as the CPU really only wants 512bit chunks of memory at a time) with the rest backed by 3dxpoint would still be effective at this high latency. Any improvement is likely to remove latency as something that would stop it (and current software can use the current stack [with PCIe connection] to work 3dxpoint as "swappable ram").

    The endurance may well keep this from happening (it is on par with SLC).

    The other catch is that this is a pretty steep change along the entire memory system. Expect Intel to have huge internal fights as to what the memory map should look like, where the HBM goes (does Intel pay to manufacture an expensive CPU module or foist it on down the line), do you even use HBM (if Ravenridge does, I'd expect that Intel would have to if they tried to use xpoint as main memory)? The big question is what would be the "cache line" of the DRAM memory: the current stack only works with 4k, the CPU "wants" 512 bits, HBM is closer to 4k. 4k looks like a no-brainer, but you still have to put a funky L5/buffer that deals with the huge cache line or waste a ton of [top level, not sure if L3 or L4] cache by giving it 4k cache lines.
  • melgross - Tuesday, April 25, 2017

    What is it with you and RAM? This isn't a RAM replacement for most any use. Intel hasn't said that it is. Why are you insisting on comparing it to RAM?