Modern Workloads Need a Scalable, Distributed Infrastructure
Modern workloads, such as Hadoop and other big data technologies (see Modern Workloads Require a New Approach), typically manage datasets that are too large to be processed on a single computer. These workloads are deployed on scale-out infrastructures that support a distributed processing framework. Scale-out architectures are built from common off-the-shelf (COTS) servers with internal drives (Direct-Attached Storage, or DAS), and data centers deploy anywhere from dozens to tens of thousands of them to achieve the required scale.
Scale-out architectures offer the lowest-cost option for these big data applications, since the servers are standardized, available from several manufacturers, and have become commodities. Scale-out also supports the “data locality” principle, which is necessary for the scale and performance modern workloads require.
Software Composable Infrastructure (SCI) is a next-generation data center architecture that delivers significant advantages over standard scale-out infrastructures, as we’ll describe below. But first, what are the issues with these scale-out architectures?
Scale-Out Lacks the Agility of the Cloud
Despite its advantages for big data applications, a scale-out architecture lacks a key capability that has driven the growth of cloud computing: the ability to easily and quickly scale the compute and storage resources needed for each workload up or down.
Cloud-like agility is particularly important for modern workloads for several reasons:
- It’s difficult to ascertain the correct footprint up front when planning the deployment of a new workload. To avoid delays from re-configuring systems later, IT teams often over-provision resources for an application, leading to wasteful spending and poor utilization.
- Modern workloads are highly dynamic, and even if you guessed reasonably well up front, you’ll soon find that the workload requires more or fewer resources than initially deployed. That means you’re either scrambling to upgrade the cluster or leaving valuable resources idle.
- Data growth is unpredictable, and in many cases it is exponential. IT teams are under constant pressure to respond as quickly as they can to the needs of the business, but provisioning new servers is not a quick task and can result in missed SLAs to the organization.
- As the value of big data grows within enterprises, so does the number and size of the workloads being deployed. This leads to multiple application clusters being created, increasing the inflexibility of the infrastructure as resources become “siloed” in separate clusters.
- Scale-out architectures often contain thousands of disk drives, so failures are not just anticipated, they are routine. Responding to server or disk failures is a manual effort, leading to downtime and degraded application performance until the repair is complete.
Composable Infrastructure Explained
You may have heard the term “Infrastructure as Code.” That’s essentially what Composable Infrastructure is: it lets you create the infrastructure you want by software command (either GUI or API). Let’s take a closer look.
To employ SCI, you start by disaggregating the components of a physical server into separate pools of resources. For example, rather than using standard servers with internal drives as the building block of a scale-out infrastructure, you would use disk-lite servers (with only enough disk to boot) to create a compute pool, and JBODs (Just a Bunch Of Disks) to create a storage pool.
With SCI, you can attach any drive to any server, effectively “composing” servers and clusters optimized for the needs of a particular workload. If a workload needs additional compute or storage, it’s as easy as a few keystrokes to add more to the cluster. Once a workload is complete, those resources can be returned to the pool for use by other applications. This composition and re-composition happens under software control, which is fast and requires no one to physically touch or re-configure any equipment. Resources are no longer trapped in separate silos.
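As a rough sketch of what this software-controlled composition looks like, the snippet below models a storage pool from which drives are attached to a server and later released back. The `DrivePool` class and its methods are hypothetical illustrations, not any vendor’s actual composition API.

```python
# Hypothetical sketch of composing servers from a disaggregated drive pool.
# DrivePool and its methods are illustrative, not a real SCI API.

class DrivePool:
    """Tracks free and attached drives in a JBOD-backed storage pool."""
    def __init__(self, drive_ids):
        self.free = set(drive_ids)
        self.attached = {}  # server_id -> set of drive ids

    def attach(self, server_id, count):
        """Compose: move `count` free drives to the given server."""
        if count > len(self.free):
            raise RuntimeError("not enough free drives in pool")
        drives = {self.free.pop() for _ in range(count)}
        self.attached.setdefault(server_id, set()).update(drives)
        return drives

    def release(self, server_id):
        """Re-compose: return a server's drives to the pool."""
        drives = self.attached.pop(server_id, set())
        self.free.update(drives)
        return drives

pool = DrivePool(f"jbod0-slot{i}" for i in range(24))
pool.attach("hadoop-node-1", 8)   # compose: 8 drives to one server
pool.attach("hadoop-node-2", 8)
print(len(pool.free))             # 8 drives still free
pool.release("hadoop-node-1")     # workload done: drives back to the pool
print(len(pool.free))             # 16 drives free again
```

The point of the sketch is the lifecycle: attach, use, release — all by software command, with no physical re-cabling.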
Since the servers and drives are physically separated in an SCI architecture, there must be a way to connect them together. This comes in the form of an “adapter,” such as a SAS-to-Ethernet bridge. The adapter connects to the JBODs via SAS ports, and to the top-of-rack (ToR) switch via Ethernet. Since the servers are also connected to the ToR switch, you now have a fabric that can connect any drive to any server. If this fabric is sufficiently fast (e.g., 10GigE at the ToR switch), performance is effectively identical to a bare-metal server with Direct-Attached Storage. Equally important, the composed server is indistinguishable from a standard server to the software running on it, so no changes are required to the application stack. Server composition happens within a rack (using the adapter and ToR switch), while clusters can span multiple racks.
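A back-of-envelope calculation shows why a fast fabric can keep up with DAS. The figures below are illustrative assumptions, not measurements: a 10GigE link moves roughly 1.25 GB/s of raw bandwidth, and a SAS HDD streams at about 200 MB/s sequentially.

```python
# Rough check: how many streaming drives a single 10GigE link can feed
# before the fabric, not the disks, becomes the bottleneck.
# Throughput figures are illustrative assumptions, not measurements.

LINK_GBPS = 10                                # 10GigE ToR link
link_bytes_per_sec = LINK_GBPS * 1e9 / 8      # ~1.25 GB/s raw
hdd_stream_bytes_per_sec = 200e6              # ~200 MB/s per SAS HDD

drives_per_link = link_bytes_per_sec / hdd_stream_bytes_per_sec
print(f"~{drives_per_link:.1f} streaming drives saturate one 10GigE link")
```

In other words, under these assumptions a single 10GigE link supports roughly six drives streaming at full speed, so a composed server with a handful of drives sees DAS-like throughput; more drives per server would call for a faster link or multiple links.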
Implementing SCI doesn’t require a “forklift” upgrade in which you replace your existing servers to take advantage of it. Rather, you simply start deploying new racks using the SCI architecture, and they work alongside your existing, standard servers. You can even combine standard servers and “composed” servers in the same cluster. As you refresh existing equipment as part of its lifecycle, you would replace it with disaggregated components.
Composable Infrastructure vs. Virtualization
How does Composable Infrastructure compare with server virtualization? Both SCI and server virtualization help IT optimize data center resource utilization, but they achieve that in very different ways: virtualization works from the top down, dividing servers into smaller pieces, while SCI works from the bottom up, assembling servers from pooled components.
Server virtualization lets you effectively “slice” a physical server into multiple virtual servers, or “virtual machines” (VMs). This consolidation is great for applications that easily fit on a single server, or more accurately, on a fraction of a server. The benefit is that you reduce the number of servers required to run these applications and get more value from each server.
Composable Infrastructure, on the other hand, works by combining separate compute and storage resources into “physical” servers and clusters under software control. This allows you to optimize the servers or clusters for each workload, and adjust them quickly and easily as needed. The result is a much more efficient use of data center resources.
Another difference is that virtualization’s abstraction layer adds significant overhead, so a good amount of your processing power is spent enabling virtualization rather than running your applications. With SCI, you are composing physical servers and clusters that deliver bare-metal performance for the workloads you run on them — no processing power is lost in the process.