S is for Scale Out

Picking up where we left off in this four-part series, we move into part three to explore scale-out. (For a refresher, visit Part 1 and Part 2.)

“Scale-out” exists in relation to “scale-up”, the latter being the traditional method of adding performance and capacity in the data center. Simply put, if you want more, make it bigger: more CPU power, memory, and storage in a single node. For an example of scale-up today, think of an Oracle database on an Oracle SPARC M8-8 Server, with lots of cores in a single physical domain running a single application. The Oracle database is an exquisitely engineered piece of enterprise software designed to exploit the many cores found on mainframe-equivalent data center servers. Ubiquitous virtualization arose from the collision of two forces: the inertia against change of large, complex, single-system applications, and the CPU clock wall that was hit in the mid-2000s.

Scale Out

Initially, virtualization was used to host multiple individual applications securely on the now many-core platforms. But increasingly, applications were designed to span multiple low-end platforms, where they cooperated and presented themselves to the user as a single application. This method of large application development was called “scale-out”. When we say “scale-out” for applications, it simply means you add performance and capacity by creating more instances of the application.

I want to distinguish the emergence of scale-out applications from clustered applications. Clustered applications have been around since at least the 1990s. They had limited scale, typically two to four nodes and, rarely, eight nodes, along with a restricted definition of fault tolerance.

Scale-out applications as we know them today emerged from the hyperscalers. Scale-out applications are designed as cooperating instances that run on a large set of commercial off-the-shelf (COTS) hardware. The first significant scale-out application that saw widespread adoption was the map-reduce application Hadoop, whose design traces back to the MapReduce work done at Google in the early 2000s. While map-reduce is the core function of Hadoop, what we call Apache Hadoop is actually a collection of technologies, including what I would term an application-specific storage layer called HDFS.
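To make the map-reduce idea concrete, here is a minimal single-process sketch in Python. It is not Hadoop's API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and a real framework would run the map and reduce steps on many nodes in parallel.

```python
from collections import defaultdict


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(splits):
    # Each split could be mapped on a different node; here we loop in-process.
    pairs = [pair for split in splits for pair in map_phase(split)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["scale out", "scale up"]))
```

Because each split is mapped independently and each key is reduced independently, adding capacity is a matter of adding more workers, which is the scale-out property described above.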

Today there are many production examples of scale-out applications. They run on COTS servers in the data center or a COTS platform like Amazon Web Services (AWS) in the cloud. The term can also be applied to storage solutions, such as Ceph, where the scale-out application is the storage service itself.

In clustered applications and clustered platforms of the past, the fault-tolerance model was narrowly and strictly defined. In a sense, clustered applications were simpler to define, if fragile to deploy. Scale-out applications, by contrast, are characterized by failure and recovery semantics that are defined and implemented in the application itself, with each application adopting a slightly different model depending on its requirements. The failover and recovery model for scale-out is also much more relaxed. Failover times are often non-existent, because no compute node is distinguished and data is typically triple replicated. A failed node simply results in a rebuild of its data onto a new node, or a redistribution of data across existing nodes, until triple-replica protection is reached again.
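The recovery behavior described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular system does it: the round-robin placement, the least-loaded target choice, and the names `place_blocks` and `recover` are all hypothetical.

```python
REPLICAS = 3  # triple replication, as in the text


def place_blocks(blocks, nodes, replicas=REPLICAS):
    """Assign each block to `replicas` distinct nodes, round-robin.

    Assumes len(nodes) >= replicas so the chosen nodes are distinct.
    """
    nodes = sorted(nodes)
    return {
        block: {nodes[(i + r) % len(nodes)] for r in range(replicas)}
        for i, block in enumerate(blocks)
    }


def recover(placement, failed, live_nodes, replicas=REPLICAS):
    """Drop the failed node, then re-replicate under-protected blocks.

    `live_nodes` may include a freshly added node (rebuild onto a new node)
    or only the survivors (redistribution across existing nodes).
    """
    load = {node: 0 for node in live_nodes}
    for holders in placement.values():
        holders.discard(failed)
        for node in holders:
            if node in load:
                load[node] += 1
    for holders in placement.values():
        while len(holders) < replicas:
            candidates = [n for n in live_nodes if n not in holders]
            if not candidates:
                break  # not enough nodes left to restore full protection
            target = min(candidates, key=lambda n: (load[n], n))
            holders.add(target)
            load[target] += 1
    return placement


if __name__ == "__main__":
    placement = place_blocks(["b0", "b1", "b2", "b3"], ["a", "b", "c", "d"])
    recover(placement, failed="b", live_nodes=["a", "c", "d"])
    for block, holders in sorted(placement.items()):
        print(block, sorted(holders))
```

Note that no node plays a special role here: recovery is just a matter of copying data until every block is back to three replicas, which is why there is no failover time to speak of.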

A subject for a future blog post is a discussion of the control plane versus the data plane in deploying scale-out applications today. I touched on this in my previous post on Kubernetes, but it bears a closer look.

The last blog in this series is D is for DriveScale, where I will describe how DriveScale technology fits into all of this.

About the Author:

Brian Pawlowski has years of experience in building technologies and leading teams in high-growth environments at global technology companies. He is currently the CTO at DriveScale. Previously, Brian served as Vice President and Chief Architect at Pure Storage, focused on improving the user experience for the all-flash storage platform provider’s rapidly growing customer base. As CTO at storage pioneer NetApp, Brian led the first SAN product effort and founded NetApp labs to collaborate with universities and research centers on new directions in data center architectures.