Will Functional Programming Systems Save the Day for #HPC?

First, a confession: I was a teenage FORTRAN programmer. And I saw the best minds of my generation destroyed by imperative programming. (Apologies to Ginsberg).

Now, I don’t consider myself an HPC expert, but I have been watching the field since the ’80s. It seems to me that the entire field of HPC systems and middleware has been about optimizing FORTRAN, instead of encouraging alternative approaches to problems.  Of course, there is huge value to keeping the “dusty decks” working, but at some point the future needs more attention than the past.

Functional programming is something I never really got the hang of, probably because too many brain pathways were destroyed by FORTRAN.  But ever since those old C.S. classes, smarter people than I have been telling me that it is a better way. Now I’m finally coming to believe.

Imperative programming is all about “DO THIS NOW.”  Functional programming is all about “GET THIS DONE.”  When taken to the parallel and distributed domains, there’s really no agreed-upon NOW anymore, and for large systems there’s no way a central intelligence, or a programmer, can keep track of the complete state of the system to micro-manage all the entities.  This happens with people too – micro-managers don’t scale, but goal-setters do.
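The contrast can be sketched in a few lines of Python (my illustration, not from the original post): the imperative version dictates the exact sequence of state changes, while the functional version only declares the result to be computed.

```python
from functools import reduce

data = [1, 2, 3, 4, 5]

# Imperative: "DO THIS NOW" -- step-by-step mutation of shared state.
total = 0
for x in data:
    total += x * x

# Functional: "GET THIS DONE" -- declare the result; the runtime is
# free to decide how, when, and where each term is evaluated.
total_fp = reduce(lambda a, b: a + b, map(lambda x: x * x, data))

assert total == total_fp == 55
```

The functional form carries no notion of "now": there is no shared accumulator whose intermediate values matter, which is exactly what lets a scheduler reorder or distribute the work.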

A fundamental problem with the imperative approach is that it relies on synchronous communication paradigms such as RPC, MPI, and barriers. These in turn need ever-decreasing latency for system performance to improve.  The speed of light is a factor in some latency limits, but a much more relevant limit is that the broader market just won’t pay for the extremely low latency required to carry forward traditional HPC architectures.  And, like it or not, the HPC world has become almost entirely dependent on commodity processor and network chips.  IBM’s Blue Gene/Q was probably the last HPC system built that did not rely on chips built for larger markets.  Intel may have a trick up its sleeve with its new Omni-Path network tied to the Xeon Phi, but I suspect its proprietary nature will cause it to fail.

Another gotcha with the synchronous imperative model is in energy efficiency. In order for the thousands of processors to react to each other with lowest latency, they must all be in a “live,” high power state and preferably not need a context switch out of some other job.  This can burn a lot of energy compared with a system in which nodes may wake up more gradually and process larger streams of data at once. Power efficiency is the biggest problem to be addressed in reaching exascale HPC, and I think we’re off on the wrong foot.

So systems will most likely be built for the lowest cost and lowest energy to attain some throughput, not so that the scientist with the biggest budget can run his one job more quickly.

Functional programming makes parallel and distributed computing far easier: because pure functions have no side effects, computations can be split, reordered, and pipelined freely.  Large, asynchronous systems using pipelined data transfer are more naturally built this way.
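A minimal sketch of why this is so, using only Python’s standard library (again my illustration, not the post’s): each call of a pure function is independent, so a map can be farmed out to workers with no locks and no ordering constraints.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Pure function: no shared state, so calls can run in any order.
    return x * x

data = list(range(8))

# Sequential and parallel maps must agree: only the schedule changes,
# never the meaning of the program.
sequential = list(map(square, data))
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(square, data))

assert sequential == parallel == [0, 1, 4, 9, 16, 25, 36, 49]
```

Swapping the sequential map for the parallel one required no change to `square` itself; that substitutability is what large distributed systems exploit.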

So where are the functional programming systems that will save the day for HPC?  They’re not here yet, but I’m betting that they’re coming from the Big Data industry.  Google’s MapReduce, which was the inspiration for Apache Hadoop, was in turn inspired by the map and reduce features of functional languages.  Apache Spark takes things further by enabling full functional “programming in the large,” without the programmer worrying about exactly how, when, and where computation is carried out.  Apache Flink is a Spark alternative that, while somewhat similar, enables lower-latency streaming of data.  Both Spark and Flink are implemented in Scala, a hybrid functional/object-oriented language for the JVM.  As more advanced algorithms are developed in these analytical environments, I think we’ll see new code kernels that could also be used for HPC problems.
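As a toy illustration (mine, not from the post) of the map and reduce idea behind MapReduce, here is a word count expressed as a map step that emits (word, 1) pairs and a reduce step that folds them into per-word totals:

```python
from functools import reduce

lines = ["to be or not to be", "to think is to be"]

# Map step: each line independently emits (word, 1) pairs --
# trivially parallel, since no line depends on another.
pairs = [(word, 1) for line in lines for word in line.split()]

# Reduce step: fold the pairs into per-word totals.
def combine(counts, pair):
    word, n = pair
    counts[word] = counts.get(word, 0) + n
    return counts

word_counts = reduce(combine, pairs, {})
assert word_counts["to"] == 4 and word_counts["be"] == 3
```

Real MapReduce, Spark, and Flink distribute exactly these two steps across a cluster, shuffling pairs by key between the map and reduce phases.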

On a more scientific note, most machine learning systems are built with big data technologies rather than traditional HPC tech. Machine learning is also consuming many thousands of GPUs for computing; I’d bet that more GPUs are already used in ML (by the cloud giants) than in HPC.

So, fellow FORTRAN users, I think it’s time to admit that imperative programming is, like bell bottoms, DDT, and Ford Pintos, something best left for the history books of the ‘60s and ‘70s. If you’re lucky enough to fully grok functional programming, then I leave to you the future of HPC. Here are some links for provocative background reading:

About the Author:

Tom Lyon is a computing systems architect, a serial entrepreneur and a kernel hacker. Prior to founding DriveScale, Tom was founder and Chief Scientist of Nuova Systems, a start-up that led a new architectural approach to systems and networking. Nuova was acquired in 2008 by Cisco, whose highly successful UCS servers and Nexus switches are based on Nuova’s technology. He was also founder and CTO of two other technology companies. Netillion, Inc. was an early promoter of memory-over-network technology. At Ipsilon Networks, Tom invented IP Switching. Ipsilon was acquired by Nokia and provided the IP routing technology for many mobile network backbones. As employee #8 at Sun Microsystems, Tom was there from the beginning, where he contributed to the UNIX kernel, created the SunLink product family, and was one of the NFS and SPARC architects. He started his Silicon Valley career at Amdahl Corp., where he was a software architect responsible for creating Amdahl’s UNIX for mainframes technology. Tom holds numerous U.S. patents in system interconnects, memory systems, and storage. He received a B.S. in Electrical Engineering and Computer Science from Princeton University.
