First, a confession: I was a teenage FORTRAN programmer. And I saw the best minds of my generation destroyed by imperative programming. (Apologies to Ginsberg).
Now, I don’t consider myself an HPC expert, but I have been watching the field since the ’80s. It seems to me that the entire field of HPC systems and middleware has been about optimizing FORTRAN, instead of encouraging alternative approaches to problems. Of course, there is huge value to keeping the “dusty decks” working, but at some point the future needs more attention than the past.
Functional programming is something I never really got the hang of, probably because too many brain pathways were destroyed by FORTRAN. But ever since those old C.S. classes, smarter people than I have been telling me that it is a better way. Now I’m finally coming to believe.
Imperative programming is all about “DO THIS NOW.” Functional programming is all about “GET THIS DONE.” When taken to the parallel and distributed domains, there’s really no agreed-upon NOW anymore, and for large systems there’s no way a central intelligence, or a programmer, can track the complete state of the system and micro-manage every entity. This happens with people too: micro-managers don’t scale, but goal-setters do.
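To make the contrast concrete, here is a minimal sketch in Python (not FORTRAN, but the idea is language-independent): the imperative version prescribes a sequence of state mutations, while the functional version just declares the result in terms of pure transformations.

```python
from functools import reduce

data = [1, 2, 3, 4, 5]

# Imperative, "DO THIS NOW": step-by-step mutation of shared state,
# with an implicit ordering a parallel runtime would have to honor.
total = 0
for x in data:
    total += x * x

# Functional, "GET THIS DONE": the result is declared as a composition
# of pure functions; the runtime is free to reorder or parallelize.
total_fn = reduce(lambda acc, sq: acc + sq, map(lambda x: x * x, data))

assert total == total_fn == 55
```

The functional form carries no notion of NOW at all, which is exactly what makes it amenable to distribution.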
A fundamental problem with the imperative programming approach is that it relies on synchronous communication paradigms such as RPC, MPI, and barriers, which in turn need ever-decreasing latency for system performance to improve. The speed of light sets some latency limits, but a far more relevant limit is that the broader market just won’t pay for the extremely low-latency hardware required to carry the traditional HPC architectures forward. And, like it or not, the HPC world has become almost entirely dependent on commodity processor and network chips; IBM’s Blue Gene/Q was probably the last HPC system built that did not rely on chips built for larger markets. Intel may have a trick up its sleeve with its new Omni-Path network tied to the Xeon Phi, but I suspect its proprietary nature will cause it to fail.
Another gotcha with the synchronous imperative model is energy efficiency. For thousands of processors to react to each other with the lowest latency, they must all sit in a “live,” high-power state, ideally without needing a context switch out of some other job. That burns a lot of energy compared with a system whose nodes can wake up more gradually and process larger streams of data at once. Power efficiency is the biggest problem to be addressed in reaching exascale HPC, and I think we’re off on the wrong foot.
So systems will most likely be built for the lowest cost and lowest energy at a given throughput, not so that the scientist with the biggest budget can run a single job more quickly.
Functional programming lends itself to parallel and distributed computing: large, asynchronous systems using pipelined data transfer are more naturally built in a functional style.
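The “easy parallelism” claim can be seen in miniature with Python’s standard-library executors (a toy sketch, not an HPC runtime): because the mapped function is pure, swapping a serial `map` for a pool’s `map` requires no change to the surrounding logic.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Pure function: no shared state, so calls need no coordination
    # and may run in any order, on any worker.
    return x * x

# Serial version.
serial = list(map(square, range(8)))

# Parallel version: same call shape, the pool decides the "when" and "where".
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(square, range(8)))

assert serial == parallel == [0, 1, 4, 9, 16, 25, 36, 49]
```

Try doing the same to an imperative loop that mutates an accumulator, and you immediately need locks or barriers — the synchronization baggage the paragraphs above complain about.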
So where are the functional programming systems that will save the day for HPC? They’re not here yet, but I’m betting that they’re coming from the Big Data industry. Google’s MapReduce, which inspired Apache Hadoop, was itself inspired by the map and reduce functions found in functional languages. Apache Spark takes things further by enabling full functional “programming in the large,” without the programmer worrying about exactly how, when, and where computation is carried out. Apache Flink is a Spark alternative that, while somewhat similar, enables lower-latency streaming of data. Both Spark and Flink are implemented in Scala, a hybrid functional/object-oriented language for the JVM. As more advanced algorithms are developed in these analytical environments, I think we’ll see new code kernels that could also be used for HPC problems.
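That lineage from functional primitives to MapReduce can be illustrated with the classic word count, written here as a sketch in plain Python rather than the Hadoop or Spark APIs:

```python
from collections import Counter
from functools import reduce

lines = ["to be or not to be", "to do is to be"]

# "Map" phase: each line is independently transformed into a bag of
# (word, count) pairs -- embarrassingly parallel, one task per line.
mapped = map(lambda line: Counter(line.split()), lines)

# "Reduce" phase: merge the partial counts. Counter addition is
# associative, which is what lets a real MapReduce (or Spark's
# reduceByKey) merge partial results in any order, on any node.
counts = reduce(lambda a, b: a + b, mapped, Counter())

assert counts["to"] == 4 and counts["be"] == 3
```

The programmer states *what* to compute; the framework owns the scheduling, placement, and fault handling — exactly the division of labor Spark generalizes.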
On a more scientific note, most machine learning systems are built with big data technologies rather than traditional HPC tech. Machine learning is also consuming many thousands of GPUs for computing; I’d bet that more GPUs are already used in ML (by the cloud giants) than in HPC.
So, fellow FORTRAN users, I think it’s time to admit that imperative programming is, like bell bottoms, DDT, and Ford Pintos, something best left to the history books of the ’60s and ’70s. If you’re lucky enough to fully grok functional programming, then I leave the future of HPC to you. Here are some links for provocative background reading: