What are Modern Workloads?
They go by several names: Next Generation, Scale-Out or Cloud Native applications (even though they aren’t exclusive to the cloud). We like to call them “modern workloads”. They are relatively new on the IT landscape, and their very nature has had a profound impact on data center architectures.
Prominent examples of modern workloads come from Big Data, IoT, social and mobile applications (see examples on the right). Unlike traditional enterprise applications such as email or accounting software, modern workloads are more dynamic and unpredictable, and their data sets often grow exponentially. Traditional applications scale modestly and can be managed by increasing the number of virtual machines running on a server. Modern workloads scale horizontally: performance or capacity increases are addressed by adding servers. Because these applications are dynamic and unpredictable, IT has to continuously provision and reconfigure resources to keep up with fluctuating demand.
Modern Workload Architectural Principles
The developers of the Big Data architectures at Google and Yahoo were looking to design a platform that could store and process a vast quantity of data at low cost. To achieve this, they developed several key principles around system architecture to support modern workloads such as Hadoop, Spark, Cassandra, etc.
Deploy Commodity Servers – Use the lowest-cost storage and compute servers, traditionally called “PC Servers”. Every computer manufacturer offers several versions of these systems, and competition has turned them into low-margin, low-cost platforms. Building clusters of hundreds or thousands of these servers provides vast storage and compute capacity at the lowest possible cost.
Share Nothing, Parallel Architecture – Ensure there are no dependencies between servers. If anything is shared, even a minor resource, the cluster will eventually reach a point where the shared component becomes a bottleneck that prevents further scaling. Each node needs to be independent of, equal to, and able to run in parallel with every other node in the cluster. This allows for linear scaling from tens of nodes to tens of thousands.
Move the Application Processing to the Data – Keep the data close to the processor. The cheapest way to store data is on commodity servers with high-performance connectivity to their local drives, so the fastest way to process huge data stores is to use each of these nodes as a parallel computing unit that processes the data stored on its own drives. This “data locality” is managed by a central control node that sends the application out to the nodes; each node then processes its own data.
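As a rough illustration of the last two principles, the sketch below models a tiny shared-nothing cluster in Python. It is not how Hadoop, Spark or Cassandra are implemented. The `Node` class, the `run_on_cluster` coordinator and the word-count task are hypothetical names invented for this example, and threads stand in for separate servers. The point is only that the task travels to each node, each node reads nothing but its own records, and the coordinator merges the small partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Hypothetical cluster node that owns its data and shares nothing."""

    def __init__(self, name, local_records):
        self.name = name
        # Stands in for the data held on this node's local drives.
        self.local_records = list(local_records)

    def run(self, task):
        # The task is applied only to records stored on this node.
        return task(self.local_records)


def count_words(records):
    """Example task shipped to each node: count words in its local records."""
    counts = Counter()
    for line in records:
        counts.update(line.split())
    return counts


def run_on_cluster(nodes, task):
    """Coordinator: send the task to every node, then merge partial results."""
    # Threads simulate independent servers working in parallel (an assumption
    # of this sketch; a real cluster would dispatch work over the network).
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda node: node.run(task), nodes))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


nodes = [
    Node("node-1", ["big data", "data locality"]),
    Node("node-2", ["move code to data"]),
    Node("node-3", ["data stays put"]),
]
totals = run_on_cluster(nodes, count_words)
print(totals["data"])
```

Adding capacity in this model means appending another `Node` with its own records. No existing node changes, which is the property that lets a shared-nothing cluster scale out.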
Hadoop, Spark, Cassandra and other Big Data modern workloads are all based on these principles. By following them, enterprises can reap the same benefits first achieved by the hyperscale organizations that developed them.
That’s where Composable Infrastructure comes in. It supports all of these Big Data principles, and allows IT organizations to realize even greater cost savings with far greater agility to respond to the ever-changing nature of these modern workloads.