What are Modern Workloads?
They go by several names: Next Generation, Scale-Out, or Cloud Native applications (even though they are not exclusive to the cloud). Although relatively new on the IT landscape, they have had a profound impact on data center architectures.
Prominent examples of modern workloads come from Big Data, IoT, social and mobile applications. Unlike traditional enterprise applications such as email or accounting software, modern workloads are more dynamic and unpredictable and have data sets which often grow exponentially.
Whereas traditional applications scale modestly and can be managed by increasing the number of virtual machines running on a server, modern workloads scale horizontally, with performance or capacity increases addressed by adding servers. The dynamic and unpredictable nature of these applications means IT must continuously provision and re-configure resources to handle the fluctuations in resource requirements.
Modern Workload Architectural Principles
Looking to design platforms that could store and process vast quantities of data at low cost, the architects of the Big Data systems at Google and Yahoo established several key principles for system architecture. These principles underpin Hadoop, Spark, Cassandra, and the other frameworks that run modern workloads.
5 Key Modern Workload Principles
Use the lowest cost storage and compute servers, traditionally called “PC Servers.”
Every computer manufacturer has several versions of these systems; the resulting competition has turned them into low-margin, low-cost platforms. Building clusters of hundreds or thousands of these servers provides vast storage and compute capacity at the lowest possible cost.
Ensure no co-dependency between servers.
If anything is shared, even a minor resource, the cluster will eventually reach a point where that shared component becomes a bottleneck and prevents further scaling. Each node must be independent, equal, and parallel to all other nodes in the cluster. This allows for linear scaling from tens to tens-of-thousands of nodes.
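The shared-nothing idea can be sketched in a few lines: each key is mapped to exactly one node by hashing, so no coordinator or shared component sits on the data path, and adding nodes adds capacity. This is a minimal illustration, not any particular system's placement algorithm; production systems such as Cassandra use consistent hashing so that adding a node reshuffles only a fraction of the keys, whereas this simple modulo scheme reshuffles most of them.

```python
import hashlib

def owner(key: str, nodes: list[str]) -> str:
    """Map a key to exactly one node by hashing -- no shared lookup
    service or coordinator is consulted on the data path."""
    digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]

# Hypothetical four-node cluster; every node owns a disjoint slice of
# the key space, so doubling the node list doubles aggregate capacity
# without introducing any shared bottleneck.
nodes = [f"node-{i}" for i in range(4)]
placement = {k: owner(k, nodes) for k in ("user:42", "user:43", "user:44")}
```

Because the mapping is a pure function of the key and the node list, any client can compute it independently, which is what allows clusters built this way to scale linearly.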
Process data where it is stored ("data locality").
Since the cheapest way to store data is in commodity servers with high-performance connectivity to their local drives, the fastest way to process huge data stores is to use each of these nodes as a parallel computing unit that processes the data stored on its own drives. This "data locality" is handled by a central control node that sends the applications out to the nodes; each node then processes its own data.
Minimize data movement between nodes.
Both MapReduce and in-memory frameworks, such as Spark, keep the majority of data transfers local. Only the results from each level of processing are moved to other nodes for further processing. This keeps bandwidth requirements to a minimum between nodes and, in particular, between racks.
Move sophisticated resiliency and efficiency functions out of the hardware and “up” into the software layers.
Big Data deals with data resiliency by triplicating the data rather than by using RAID or other data recovery schemes. Management of the triple copies is handled by the file system itself, rather than the storage subsystem. This may seem less efficient, but it enables the use of commodity drives rather than sophisticated storage arrays. Commodity drives cost orders of magnitude less than storage arrays, resulting in a system that is vastly less expensive. In addition, the chance of a double disk failure (from which common RAID schemes cannot recover) becomes a near certainty in very large data stores. Triplicated (and in some cases, quadruplicated) data is much more robust against failure than RAID arrays.
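Software-managed triplication can be sketched as a placement function: the file system picks three nodes in three different racks for each block, so the block survives any two simultaneous node (or drive) failures and any single rack failure. This is a simplified round-robin sketch with invented rack and node names, not the actual HDFS algorithm (HDFS's default policy places one copy on the writer's node and two on a second rack).

```python
import itertools

REPLICATION = 3  # HDFS-style default: three copies of every block

def place_replicas(block_id: int, racks: dict[str, list[str]]) -> list[str]:
    """Choose one node per rack, round-robin, so no single rack
    failure can lose all copies of a block. Simplified sketch."""
    rack_cycle = itertools.cycle(sorted(racks))
    chosen = []
    for i in range(REPLICATION):
        rack = next(rack_cycle)
        nodes = racks[rack]
        chosen.append(nodes[(block_id + i) % len(nodes)])
    return chosen

# Hypothetical cluster: three racks of two nodes each.
racks = {"rack-a": ["a1", "a2"], "rack-b": ["b1", "b2"], "rack-c": ["c1", "c2"]}
replicas = place_replicas(block_id=7, racks=racks)
```

Because placement is ordinary application-level code rather than array firmware, the same commodity drives serve every node, and re-replicating a lost copy is just another file-system operation.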
Hadoop, Spark, Cassandra and other Big Data modern workloads are all based on these principles. By following them, enterprises can reap the same benefits first achieved by the hyperscale organizations that developed them.
Modern Workloads Need a Scalable, Distributed Infrastructure
Modern workloads, such as Hadoop and other big data technologies, are typically managing datasets that are too large to be processed on a single computer. These workloads are deployed on scale-out infrastructures which support a distributed processing framework.
Standard scale-out architectures are based on commercial off-the-shelf (COTS) servers with Direct Attached Storage (DAS) internal drives. Data centers deploy up to tens of thousands of COTS servers to achieve the scale required. Standard scale-out architectures offer the lowest cost option for these big data applications since COTS servers are standardized, commodity equipment available from several manufacturers. Scale-out architectures also support the "data locality" principle, which is necessary to meet the scale and performance needs of modern workloads.
Modern Workloads Need Cloud-like Agility
Despite the advantages of standard scale-out architectures for modern workloads, a key capability driving the growth of cloud computing is missing: the ability to easily and quickly scale the compute and storage resources needed for each workload up or down.
Cloud-like agility is particularly important for modern workloads for several reasons:
- It is difficult to ascertain the correct footprint up-front when planning the deployment of a new workload. To avoid the delays of re-configuring systems later, IT teams will often over-provision resources for an application, leading to wasteful spending and poor utilization levels.
- Modern workloads are highly dynamic. Even when initial assessments are reasonably accurate, the workload will eventually require more or fewer resources than initially deployed. That will require either rushing to upgrade the cluster or leaving valuable resources idle.
- Data growth is unpredictable, and in many cases data sets grow exponentially. IT teams are constantly stressed to respond as quickly as possible to meet business needs. Provisioning new servers is not a quick task, and the delay can result in missed SLAs to the organization.
- As the value of big data grows within enterprises, so does the number and size of the workloads being deployed. This leads to multiple application clusters being created, increasing the inflexibility of the infrastructure as resources become “siloed” in separate clusters.
- Scale-out architectures often contain thousands of disk drives, so failures are not an edge case; they are a routine, expected event. Responding to server or disk failures is a manual effort, leading to downtime and slow application performance until the repair is completed.