Software Composable Infrastructure for Flash

Introduction

Flash storage, especially Non-Volatile Memory Express (NVMe) flash, is becoming the gold standard for high-performance primary storage for modern applications. According to Gartner, NVMe-based storage promises 10 times more input/output operations per second (IOPS) and 5 times lower latency than the present generation of solid-state storage solutions. The most common deployment platform for NVMe is a shared storage system connected over a high-speed fabric. However, NVMe Solid State Drives (SSDs) are expected not only to replace the SATA SSDs in current Solid State Array designs, but also to replace the SSDs currently deployed as direct-attached storage in servers. The key goal of the DriveScale Composable Flash solution is to provide an alternative to direct-attached NVMe flash storage that maintains the same performance while improving Total Cost of Ownership (TCO).

Overview

Solid State Drives are a cornerstone of the high-performance storage market across a variety of applications. The SSD market is well established and growing at a faster rate than the market for conventional hard drives. However, current SSD technology has relied on hard drive protocols and form factors, which limit speed and efficiency. The NVMe standard removes these limitations and delivers a significant jump in performance. A number of modern workloads are deployed with SSDs installed in servers, either as add-in cards plugged into the server’s PCIe bus or in the 2.5” drive form factor. Examples include cloud-native applications such as NoSQL databases, Spark, Splunk and Elasticsearch, as well as many newer applications deployed in Linux containers.

These modern workloads have two important characteristics in common: they all run on commodity x86 server hardware and they all use direct-attached storage. Solid State Arrays are almost never used in such applications because they are expensive and add layers of storage functionality that duplicate what is already built into the cloud-native applications.

When SSDs are deployed inside servers, the storage-to-compute ratio is fixed before the workload runs. This makes the deployment inflexible, because the ratio cannot be changed as application needs change. To err on the side of caution, most customers over-provision and end up buying more storage than they need. In addition, each year the optimal purchase point in terms of the lowest dollar per terabyte ($/TB) keeps moving to higher-capacity drives. Customers therefore either pay more in $/TB to populate servers with smaller-capacity drives or, worse, purchase larger drives than they need in order to get the optimal $/TB. Either way, the result is over-provisioning and poor utilization.
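The utilization argument above can be made concrete with a little arithmetic. The sketch below compares the capacity purchased when every server receives its own whole drives with the capacity purchased when the same requirements are served from a shared pool. The per-server requirements and the drive size are hypothetical figures chosen only for illustration, and the pooled case assumes that a drive can be sliced across servers.

```python
import math

def direct_attached_tb(needs_tb, drive_tb):
    """Capacity bought when each server is populated with its own whole drives."""
    return sum(math.ceil(need / drive_tb) * drive_tb for need in needs_tb)

def pooled_tb(needs_tb, drive_tb):
    """Capacity bought when drives are shared and only the total is rounded up."""
    return math.ceil(sum(needs_tb) / drive_tb) * drive_tb

# Hypothetical per-server requirements (TB) and a hypothetical drive size (TB).
needs = [1.5, 2.5, 0.5, 3.0]
drive = 4.0

das = direct_attached_tb(needs, drive)
pool = pooled_tb(needs, drive)
das_utilization = sum(needs) / das
pool_utilization = sum(needs) / pool

print(f"direct-attached: {das} TB bought, {das_utilization:.0%} utilized")
print(f"shared pool:     {pool} TB bought, {pool_utilization:.0%} utilized")
```

With these illustrative numbers, the direct-attached design buys twice the capacity of the pooled design to serve the same 7.5 TB of demand.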

DriveScale’s Software Composable Infrastructure (SCI) for Flash solution allows enterprises to always right-size their storage because the flash drives are never attached directly to servers and are instead provisioned from a shared pool. DriveScale uses the new Ethernet-attached Bunch Of Flash (EBOF) systems coming to market from multiple storage vendors. These systems are typically designed with dual x86 controllers, multiple 100 Gbit Ethernet links and 24 to 48 dual-ported NVMe flash drives.

DriveScale’s SCI for Flash ‘composes’ the NVMe drives and presents them to servers on the same network as if they were locally attached drives, which is what cloud-native workloads expect. The solution can present entire drives or a slice of a drive, according to application requirements. Each drive or slice is presented as a unique block device with the throughput of NVMe flash storage. The drive assignments are managed from the centralized DriveScale Management System and can be modified on demand, so that application tuning and processor upgrades are seamless.
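The idea of composing whole drives or slices from a shared pool can be sketched as a small allocator. This is an illustration only and not the DriveScale Management System or its API: the class names, the first-fit placement policy and the drive sizes are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Drive:
    name: str
    capacity_gb: int
    allocated_gb: int = 0

    @property
    def free_gb(self):
        return self.capacity_gb - self.allocated_gb

@dataclass
class Pool:
    drives: list
    # Maps a server name to the (drive name, size in GB) slices assigned to it.
    assignments: dict = field(default_factory=dict)

    def compose(self, server, size_gb):
        """Assign a slice (or a whole drive) to a server, first-fit (assumed policy)."""
        for drive in self.drives:
            if drive.free_gb >= size_gb:
                drive.allocated_gb += size_gb
                self.assignments.setdefault(server, []).append((drive.name, size_gb))
                return drive.name
        raise RuntimeError("no drive has enough free capacity")

    def release(self, server):
        """Return all of a server's slices to the pool."""
        by_name = {d.name: d for d in self.drives}
        for drive_name, size_gb in self.assignments.pop(server, []):
            by_name[drive_name].allocated_gb -= size_gb

# Hypothetical pool of two 4000 GB drives.
pool = Pool([Drive("nvme0", 4000), Drive("nvme1", 4000)])
first = pool.compose("server-a", 4000)   # an entire drive
second = pool.compose("server-b", 1000)  # a slice of the next drive
third = pool.compose("server-c", 2000)   # another slice of that same drive
pool.release("server-a")                 # the assignment is modified on demand
fourth = pool.compose("server-d", 3000)  # the freed capacity is reused
```

Because assignments live in the pool rather than in the server chassis, releasing and recomposing capacity is a bookkeeping operation and requires no physical change.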

The EBOFs and servers are connected via iSCSI or RDMA over Converged Ethernet (RoCE v2). RoCE v2 requires RoCE-capable network cards on the client side and Ethernet switches that support the protocol.

Benefits

The benefits of DriveScale’s SCI for Flash include:

  • The ability to take advantage of commodity pricing on flash drives and utilize drives that provide the best price-to-capacity ratio
  • Improved overall resiliency and recovery from system failures
  • Optimization of storage per server or container, instead of over- or under-provisioning
  • Separation of storage and compute lifecycles, which eliminates waste during server upgrades by preserving expensive flash storage investments
  • The ability to ‘pay as you grow’ by purchasing storage only as and when needed

How It Works

DriveScale’s SCI for Flash solution includes EBOFs, agent software and the DriveScale Management System. If using DriveScale’s EBOF, then the solution also includes the DriveScale Composable Flash System (a hardware component). The EBOF is a dual-node x86 system that houses PCIe-attached, dual-ported NVMe flash drives. Each drive in the system is accessible from, and can be controlled by, either node. These dual-ported flash drives are available from several manufacturers including Western Digital, Intel and DriveScale. Each x86 controller node is populated with single-socket or dual-socket Intel Xeon processors and sufficient memory to run the DriveScale Data Engine. Each controller node also has two to four 100 Gbit Ethernet ports capable of supporting multiple transport protocols, including iSCSI and RoCE v2.

Client or host servers to which flash storage is assigned are x86 Linux systems. DriveScale recommends that these servers contain at least two 25 Gbit Ethernet ports. Top-of-rack network switches with 25 Gbit links to the client servers and 100 Gbit links to the EBOFs are also required. The Linux client servers should run Ubuntu 14.04, Ubuntu 16.04, CentOS 6/7, Red Hat Enterprise Linux 6/7 or SUSE Linux 12 SP3. If RoCE v2 is chosen, client OS support is limited to Ubuntu 16.04 or CentOS/RHEL 7. RoCE v2 support also requires that the client servers be equipped with Mellanox ConnectX network cards and have the Mellanox RoCE drivers installed.
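The client requirements above amount to a small compatibility matrix, which can be written down as a lookup. The sketch below encodes only the operating systems and the RoCE v2 conditions listed in this section; the function name, the distribution keys and the boolean flag are hypothetical and do not correspond to any DriveScale tool.

```python
# Client operating systems listed as supported (iSCSI is available on all of them).
ISCSI_OK = {
    ("ubuntu", "14.04"), ("ubuntu", "16.04"),
    ("centos", "6"), ("centos", "7"),
    ("rhel", "6"), ("rhel", "7"),
    ("suse", "12 SP3"),
}
# RoCE v2 is limited to a subset of those operating systems.
ROCE_OK = {("ubuntu", "16.04"), ("centos", "7"), ("rhel", "7")}

def supported_transports(distro, version, has_mellanox_connectx=False):
    """Return the transports a client could use, per the requirements above.

    has_mellanox_connectx stands for both the Mellanox ConnectX card and
    the Mellanox RoCE drivers being present (a simplifying assumption).
    """
    key = (distro.lower(), version)
    transports = []
    if key in ISCSI_OK:
        transports.append("iscsi")
    if key in ROCE_OK and has_mellanox_connectx:
        transports.append("rocev2")
    return transports

print(supported_transports("Ubuntu", "16.04", has_mellanox_connectx=True))
print(supported_transports("CentOS", "6", has_mellanox_connectx=True))
```

A check of this kind is useful at planning time: a CentOS 6 client can be composed over iSCSI, but moving it to RoCE v2 would require both an OS upgrade and the Mellanox hardware and drivers.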

Agent software is installed on the client servers to communicate with the host iSCSI or RoCE stack and present the NVMe devices as either iSCSI block devices or as locally attached NVMe devices. In addition, DriveScale requires the installation of the DriveScale Management System (DMS) on hosts or virtual machines outside of the client cluster. The systems running the DMS are required to be connected to a common ‘management’ network used for out-of-band communication with the client servers and EBOFs. An example design is shown in the figure below.

Conclusion

Flash storage arrays are not ideal for modern, distributed cloud-native workloads because they come burdened with redundant features at a very high cost. Direct-attached SSDs are also not suitable because they result in inflexibility and gross under-utilization. By incorporating commodity NVMe flash storage into a Software Composable Infrastructure for Flash solution, DriveScale allows cloud infrastructure deployments of applications like NoSQL, Spark and containers to take full advantage of that storage with the best price-performance.

About DriveScale

DriveScale is the leader in Software Composable Infrastructure for modern workloads. Our innovative data center solution empowers IT to disaggregate compute and storage resources and quickly and easily recompose them to meet the needs of the business. Enterprises can respond faster to changing application environments, maximize the efficiency of their assets, and save on equipment and operating expenses. DriveScale supports modern workloads such as Hadoop, Spark, Kafka, NoSQL, Cassandra, Docker, Kubernetes and other distributed applications at a fraction of the cost of alternative platforms. DriveScale, based in Sunnyvale, CA, was founded by technologists with deep roots in IT architecture who built enterprise-class systems for Cisco and Sun Microsystems. Investors include Pelion Venture Partners, Nautilus Venture Partners and Ingrasys, a wholly owned subsidiary of Foxconn. Visit www.drivescale.com or follow us on Twitter at @DriveScale_Inc.
