Next Generation HPC? What Spark, TensorFlow, and Chapel are teaching us about large-scale numerical computing

Jonathan Dursi, Ontario Institute for Cancer Research
March 2016

For years, the academic science and engineering community was almost alone in pursuing very large-scale numerical computing, and MPI, the 1990s-era message-passing library, was the lingua franca for such work. But starting in the mid-2000s, others became interested in large-scale computing on data. First, internet-scale companies like Google and Yahoo! started performing fairly basic analytics tasks at enormous scale; now many others are tackling increasingly complex and data-heavy machine-learning computations, which involve very familiar scientific computing tasks such as linear algebra, unstructured mesh decomposition, and numerical optimization. These new communities have created programming environments that emphasize what we’ve learned about computer science and programmability since 1994, with greater levels of abstraction and encapsulation that separate high-level computation from low-level implementation details, and some in HPC are starting to notice. Slides and examples for each package, along with a virtual machine that can be used for running them, will be available at https://github.com/ljdursi/Spark-Chapel-TF-UMich-2016