An optimized disaggregated data center consists of thin ("diskless") compute nodes and dense storage nodes (increasingly JBOFs in our solid-state world), all connected by a high-speed data fabric, most likely Ethernet. Adding GPU disaggregation to the mix makes the disaggregated data center not only cost-effective but powerful enough for any data-driven application you can throw at it.
But it’s a journey to an optimized disaggregated data center, and customers exploring the approach for the first time wonder how to begin.
As is often the case in technology evaluation, an IT architect will stage a Proof of Concept (POC) to explore the fundamental capabilities of a new solution. Composable infrastructure is new enough to most organizations that a POC makes a lot of sense, but optimized platforms for disaggregated infrastructure are unlikely to be present in a typical data center. For the compute node, any common Linux platform used for the target application will serve for evaluation purposes (though it is probably not the platform you would buy going forward for composable infrastructure). Storage is more problematic: an intelligent networked JBOD or JBOF is likely unavailable in the typical data center. What can you do?
To solve this problem, DriveScale engineering extended work first done in partnership with Western Digital to support their OpenFlex initiative. Western Digital OpenFlex storage, configured through the OpenFlex APIs, is managed by a DriveScale-provided storage control proxy (DS Proxy) that plumbs an end-to-end NVMe connection from the compute node to the storage. In a typical scenario the DriveScale agent resides in the storage target itself (as with the Seagate Exos storage platforms); with OpenFlex storage, the agent runs outside the target, as a proxy.
Once we had the proxy capability defined, it was a small step to use the same approach to integrate legacy enterprise storage into our composable solution using the respective storage vendor’s proprietary provisioning API.
It is important to note that the proxy (likely residing on the HA composer nodes themselves) is not in the data path for normal I/O between the compute and storage nodes. Instead, the proxy translates the storage configuration specified by the DriveScale orchestration manager into the operations on the Pure Storage array necessary to create the requested storage volumes and make them available to the configured servers, which connect automatically through the actions of their DriveScale agents. The proxy and server agents then stay out of the way of the data flow until another configuration change is requested.
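To picture the translation step, the sketch below turns a single slice request into the pair of Pure Storage REST API 1.x calls (create volume, connect it to a host) that the proxy would issue. This is an illustrative sketch only, not DriveScale's actual implementation: the array hostname, slice name, and `provision` helper are invented here, and session authentication against the array is omitted.

```python
import json
import urllib.request

# Hypothetical array endpoint; Pure's REST API 1.x exposes volume and
# host resources under /api/<version>/.
API_BASE = "https://flasharray.example.com/api/1.17"

def volume_request(slice_name, size_gb):
    """Build the (URL, body) pair that creates one right-sized volume."""
    url = f"{API_BASE}/volume/{slice_name}"
    body = json.dumps({"size": f"{size_gb}G"}).encode()
    return url, body

def connect_request(host, slice_name):
    """Build the URL that exposes a volume (LUN) to a compute node."""
    return f"{API_BASE}/host/{host}/volume/{slice_name}"

def provision(slice_name, size_gb, host, opener=urllib.request.urlopen):
    """Create a volume, then connect it to the host (auth omitted)."""
    for url, data in (volume_request(slice_name, size_gb),
                      (connect_request(host, slice_name), b"{}")):
        req = urllib.request.Request(url, data=data, method="POST")
        req.add_header("Content-Type", "application/json")
        opener(req)  # the real proxy would handle errors and roll back

# provision("drivescale-slice-0001", 500, "compute-01") would create a
# 500 GB volume and attach it to the node's host entry on the array.
```

The server-side DriveScale agent then discovers and attaches the new LUN, so neither the proxy nor the agent sits in the subsequent data path.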
There is plenty of typical enterprise storage deployed in the data center from vendors such as NetApp or Pure Storage that can be used for a POC. In the example below, we’ll show how a Pure Storage FlashArray can be used with the DriveScale solution.
The first diagram above shows the logical view of the components deployed in the data center, including the Pure Storage FlashArray. The following diagram shows the Pure Storage view of the FlashArray. The proxy is connected to the Pure Storage arrays you want to use to dynamically instantiate LUNs (volumes in Pure Storage parlance) to attach to compute nodes.
A simple way to create a cluster with the DriveScale solution is to use the DriveScale Composer. When creating an application cluster, the DriveScale Composer automatically allocates compute and storage from the available pools. For a POC, typically a few compute nodes are assigned to DriveScale use, and the Pure Storage FlashArray is configured as the Composer's only storage target.
In the example above, we create a cluster with a single compute node and four SSD slices initialized with the XFS file system (called out in green rectangles). The DriveScale Composer automatically allocates right-sized LUNs (called slices in DriveScale terminology) from the Pure Storage FlashArray and dynamically connects them to the compute node. Inspecting one of the configured Linux clients shows the four dynamically instantiated LUNs in the node's storage list.
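"Right-sized" here means each slice is carved to exactly the capacity the request calls for, rather than bound to a physical drive size. DriveScale's actual sizing policy is not public; the helper below is a hypothetical sketch of one reasonable policy, splitting a requested capacity into equal slices rounded up to the array's allocation granularity.

```python
import math

def right_size_slices(requested_gb, slice_count, granularity_gb=1):
    """Split a requested capacity into equal slices, each rounded up
    to the array's allocation granularity (illustrative policy only)."""
    per_slice = requested_gb / slice_count
    rounded = math.ceil(per_slice / granularity_gb) * granularity_gb
    return [rounded] * slice_count

# e.g. 2 TB spread across the four XFS slices at 10 GB granularity:
# right_size_slices(2000, 4, 10) -> [500, 500, 500, 500]
```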
Inspecting the single node cluster we created in the DriveScale Composer shows the “pure” cluster with one compute node with four SSD slices attached.
What’s fun about the DriveScale solution with the Pure Storage FlashArray is that the LUNs are completely dynamic, right-sized, and carry all the properties of a Pure Storage array volume (on-array deduplication, compression, and thin provisioning).
Power users will want to deploy storage in production environments using our APIs for scripting. The Pure Storage FlashArray is treated like any disaggregated target by our software, so its use is completely scriptable.
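Because the FlashArray is treated like any other disaggregated target, a cluster definition can be expanded into a provisioning plan programmatically. The sketch below is hypothetical: the spec format, the `plan_cluster` helper, and the volume naming are invented for illustration and are not DriveScale's real API objects.

```python
def plan_cluster(spec):
    """Expand a cluster spec into one (node, volume, size_gb) entry
    per slice; each entry would become a right-sized array volume."""
    plan = []
    for node in spec["nodes"]:
        for i in range(spec["slices_per_node"]):
            plan.append((node, f"{spec['name']}-{node}-slice{i}",
                         spec["slice_gb"]))
    return plan

# A spec matching the single-node, four-slice "pure" cluster above:
spec = {"name": "pure", "nodes": ["compute-01"],
        "slices_per_node": 4, "slice_gb": 500}
for entry in plan_cluster(spec):
    print(entry)
```

Feeding each plan entry to the provisioning API in a loop is all a script needs to stand up or tear down a cluster's storage on demand.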
I leave you with the Pure Storage panel showing the result of the previous script, which dynamically instantiates three right-sized LUNs (slices, or volumes) whose names start with ‘vqn-2013-com-drivescale’. One more observation: the DriveScale proxy solution also provides a production transition strategy, letting an organization use its existing enterprise storage in a disaggregated data center while moving toward optimized platforms in future tech refreshes.
Interested in seeing the demo? Watch the video below to see Jim Hanko, Fellow Engineer at DriveScale, walk you through it firsthand.