Flash storage, especially Non-Volatile Memory Express (NVMe) flash, is becoming the gold standard for high-performance primary storage for modern applications. According to Gartner, NVMe-based storage promises 10 times more input/output operations per second (IOPS) and 5 times lower latency than the present generation of solid-state storage solutions. The most common deployment platform for NVMe is a shared storage system connected over a high-speed fabric. However, NVMe solid-state drives (SSDs) are expected not only to replace the SATA SSDs in current solid-state array designs, but also to take the place of the SSDs currently deployed as direct-attached storage in servers. The key goal of the DriveScale Composable Flash solution is to provide an alternative to direct-attached NVMe flash storage that maintains the same performance while improving Total Cost of Ownership (TCO).
Solid-state drives are a cornerstone of the high-performance storage market across a variety of applications. The SSD market is well established and growing at a faster rate than the market for conventional hard drives. However, current SSD technology relies on hard drive protocols and form factors, which limit speed and efficiency. The NVMe standard removes these limitations and delivers a significant jump in performance. A number of modern workloads are deployed with SSDs installed in servers, either as add-in cards plugged into the server’s PCIe bus or in the 2.5” drive form factor. Examples include cloud-native applications such as NoSQL databases, Spark, Splunk and Elasticsearch, as well as many newer applications deployed in Linux containers.
These modern workloads have two important characteristics in common: they all run on commodity x86 server hardware, and they all use direct-attached storage. Solid-state arrays are almost never used in such applications because they are expensive and add layers of storage functionality that duplicate what is already built into the cloud-native applications.
When SSDs are deployed inside servers, the storage-to-compute ratio is fixed before the workload runs, and it cannot be changed as application needs change. To err on the side of caution, most customers therefore over-provision and end up buying more storage than they need. In addition, the optimal purchase point in terms of the lowest dollar per terabyte ($/TB) moves to higher-capacity drives each year. Customers therefore either pay more per terabyte to populate servers with smaller-capacity drives or, worse, purchase larger drives than they need in order to get the best $/TB. Either way, the result is over-provisioning and poor utilization.
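The utilization argument above can be made concrete with a small back-of-the-envelope calculation. The following is a minimal sketch, not a sizing tool: the drive capacity, price and per-server capacity needs are hypothetical values chosen only for illustration, and the pooled case deliberately ignores redundancy and performance headroom.

```python
import math
from dataclasses import dataclass


@dataclass
class DriveOption:
    """A purchasable drive model (capacity and price are hypothetical)."""
    capacity_tb: float
    price_usd: float

    @property
    def usd_per_tb(self) -> float:
        return self.price_usd / self.capacity_tb


def direct_attached_cost(needs_tb, drive):
    """Each server buys whole drives of its own; unused capacity is stranded."""
    drives = sum(math.ceil(need / drive.capacity_tb) for need in needs_tb)
    return drives, drives * drive.price_usd


def pooled_cost(needs_tb, drive):
    """Drives are shared; only the total demand is rounded up to whole drives."""
    drives = math.ceil(sum(needs_tb) / drive.capacity_tb)
    return drives, drives * drive.price_usd


def utilization(needs_tb, drives, drive):
    """Fraction of purchased capacity that the workloads actually need."""
    return sum(needs_tb) / (drives * drive.capacity_tb)


if __name__ == "__main__":
    # Hypothetical inputs: four servers and one 4 TB drive model at $100/TB.
    needs = [1.0, 1.5, 0.5, 1.0]
    drive = DriveOption(capacity_tb=4.0, price_usd=400.0)

    for label, cost_fn in (("direct-attached", direct_attached_cost),
                           ("pooled", pooled_cost)):
        count, cost = cost_fn(needs, drive)
        used = utilization(needs, count, drive)
        print(f"{label}: {count} drives, ${cost:.0f}, {used:.0%} utilized")
```

With these assumed inputs, the direct-attached layout buys four drives and uses 25% of their capacity, while the pooled layout meets the same 4 TB of demand with a single drive. Real deployments would add capacity for redundancy and throughput, so the gap would be smaller in practice.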
DriveScale’s SCI for Flash ‘composes’ the NVMe drives and presents them to servers on the same network as if they were locally attached drives, for use by cloud-native workloads. The solution can present entire drives or a slice of a drive, according to application requirements. Each drive or drive slice is presented as a unique block device with the throughput of NVMe flash storage. The drive assignments are managed from the centralized DriveScale Composer and can be modified on demand, so that application tuning and processor upgrades are seamless.
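To illustrate the idea of assigning whole drives or drive slices to servers and changing those assignments on demand, here is a minimal sketch of the bookkeeping involved. It is not DriveScale's API or implementation: the `Drive`, `Slice` and `Composer` classes, their methods and the first-fit placement policy are all hypothetical and exist only to show the concept.

```python
from dataclasses import dataclass, field


@dataclass
class Slice:
    """A portion of one drive assigned to one server (hypothetical model)."""
    drive_id: str
    size_gb: int
    server: str


@dataclass
class Drive:
    drive_id: str
    capacity_gb: int
    slices: list = field(default_factory=list)

    def free_gb(self) -> int:
        return self.capacity_gb - sum(s.size_gb for s in self.slices)


class Composer:
    """Toy stand-in for a central controller that tracks drive assignments."""

    def __init__(self, drives):
        self.drives = {d.drive_id: d for d in drives}

    def attach(self, server: str, size_gb: int) -> Slice:
        """First-fit: carve the slice from the first drive with enough room.

        Requesting a drive's full capacity assigns the entire drive.
        """
        for drive in self.drives.values():
            if drive.free_gb() >= size_gb:
                new_slice = Slice(drive.drive_id, size_gb, server)
                drive.slices.append(new_slice)
                return new_slice
        raise RuntimeError(f"no drive has {size_gb} GB free")

    def detach(self, server: str) -> int:
        """Release every slice held by a server; return the capacity freed."""
        freed = 0
        for drive in self.drives.values():
            freed += sum(s.size_gb for s in drive.slices if s.server == server)
            drive.slices = [s for s in drive.slices if s.server != server]
        return freed


if __name__ == "__main__":
    composer = Composer([Drive("nvme-a", 1000), Drive("nvme-b", 1000)])
    print(composer.attach("server-1", 1000))  # an entire drive
    print(composer.attach("server-2", 400))   # a slice of the second drive
    print(composer.detach("server-2"), "GB returned to the pool")
```

Because assignments live in one place rather than inside each server, capacity released by one server immediately becomes available to another, which is the property the on-demand reassignment described above depends on.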
The Ethernet-attached Bunch of Flash (EBOF) enclosures and the servers are connected via iSCSI or RDMA over Converged Ethernet (RoCE v2). RoCE v2 requires RDMA-capable network cards on the client side and Ethernet switches that support the protocol.
The benefits of DriveScale’s SCI for Flash include:
- The ability to take advantage of commodity pricing on flash drives and utilize drives that provide the best price-to-capacity ratio
- Improved overall resiliency and recovery from system failures
- Optimization of storage per server or container, instead of over- or under-provisioning
- Separation of storage and compute lifecycles, which eliminates waste during server upgrades by preserving expensive flash storage investments
- The ability to ‘pay as you grow’ by purchasing storage only as and when needed
How It Works
DriveScale’s SCI for Flash solution includes EBOFs, agent software and the DriveScale Composer. When DriveScale’s own EBOF is used, the solution also includes the DriveScale NVMe Appliance (a hardware component). The DriveScale NVMe Appliance is a dual-node x86 system that houses PCIe-attached, dual-ported NVMe flash drives. Each drive in the system is accessible from, and can be controlled by, either node. These dual-ported flash drives are available from several manufacturers, including Western Digital, Intel and DriveScale. Each x86 controller node is populated with single-socket or dual-socket Intel Xeon processors and sufficient memory to run the DriveScale Data Engine. Each controller node also has two to four 100 Gbit Ethernet ports capable of supporting multiple transport protocols, including iSCSI and RoCE v2.
Agent software is installed on the client servers to communicate with the host iSCSI or RoCE stack and present the NVMe devices either as iSCSI block devices or as locally attached NVMe devices. In addition, the DriveScale Composer must be installed on hosts or virtual machines outside of the client cluster. The systems running the Composer must be connected to a common ‘management’ network used for out-of-band communication with the client servers and EBOFs. An example design is shown in the figure.
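Because composed devices appear to the client as ordinary block devices, standard operating system interfaces can be used to confirm that they are visible. The sketch below is not part of the DriveScale agent; it assumes a Linux client on which the devices are presented as NVMe namespaces (named like `nvme0n1`) and simply reads the kernel's sysfs block-device listing. Devices presented over iSCSI would normally appear under SCSI disk names instead and are not covered here. The function name is hypothetical.

```python
from pathlib import Path

# The kernel reports /sys/block/<dev>/size in 512-byte units.
SECTOR_BYTES = 512


def list_nvme_block_devices(sysfs_block="/sys/block"):
    """Return {device_name: size_in_bytes} for NVMe block devices on Linux."""
    devices = {}
    root = Path(sysfs_block)
    if not root.is_dir():
        return devices  # not a Linux host, or sysfs is not mounted
    for entry in sorted(root.iterdir()):
        if not entry.name.startswith("nvme"):
            continue
        size_file = entry / "size"
        if size_file.is_file():
            sectors = int(size_file.read_text().strip())
            devices[entry.name] = sectors * SECTOR_BYTES
    return devices


if __name__ == "__main__":
    for name, size in list_nvme_block_devices().items():
        print(f"/dev/{name}: {size / 1e12:.2f} TB")
```

The listing does not distinguish fabric-attached namespaces from drives physically installed in the server, which is consistent with the goal that applications treat both the same way.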
Flash storage arrays are not ideal for modern, distributed cloud-native workloads because these systems come burdened with redundant features at a very high cost. Direct-attached SSDs are also a poor fit because they result in inflexibility and gross under-utilization. DriveScale allows cloud infrastructure deployments of applications such as NoSQL databases, Spark and containers to take full advantage of commodity NVMe flash storage with the best price-performance by incorporating it in a Composable Infrastructure for Flash solution.