SC18 Proceedings

Algorithms

Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass

Biology Applications

Extreme Scale De Novo Metagenome Assembly

Best Paper Finalists

Evangelos Georganas (Intel Corporation) and Rob Egan, Steven Hofmeyr, Eugene Goltsman, Bill Arndt, Andrew Tritt, Aydin Buluc, Leonid Oliker, and Katherine Yelick (Lawrence Berkeley National Laboratory)

Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting

Tony C. Pan (Georgia Institute of Technology, School of Computational Science and Engineering); Sanchit Misra (Intel Corporation, Parallel Computing Lab); and Srinivas Aluru (Georgia Institute of Technology, School of Computational Science and Engineering)

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight

Xiaohui Duan, Ping Gao, Tingjian Zhang, Meng Zhang, and Weiguo Liu (Shandong University); Wusheng Zhang, Wei Xue, Haohuan Fu, Lin Gan, and Dexun Chen (Tsinghua University); Xiangxu Meng (Shandong University); and Guangwen Yang (Tsinghua University)

Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass

Large-Scale Algorithms

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers

Liandeng Li (Tsinghua University; National Supercomputing Center, Wuxi); Teng Yu (University of St Andrews); Wenlai Zhao and Haohuan Fu (Tsinghua University; National Supercomputing Center, Wuxi); Chenyu Wang (University of St Andrews; National Supercomputing Center, Wuxi); Li Tan (Beijing Technology and Business University); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and John Thomson (University of St Andrews)

TriCore: Parallel Triangle Counting on GPUs

Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Distributed-Memory Hierarchical Compression of Dense SPD Matrices

Best Student Paper Finalists

Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)

Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass

Algorithms on Sparse Data

HiCOO: Hierarchical Storage of Sparse Tensors

Best Student Paper Finalists

Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)

Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures

Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution

Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)

Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

Task-Based Programming

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes

Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints

Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Physics and Tensor Applications

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight

Bingwei Chen, Haohuan Fu, Yanwen Wei, and Conghui He (Tsinghua University; National Supercomputing Center, Wuxi); Wenqiang Zhang (University of Science and Technology of China); Yuxuan Li (Tsinghua University; National Supercomputing Center, Wuxi); Wubin Wan and Wei Zhang (National Supercomputing Center, Wuxi); Lin Gan (Tsinghua University; National Supercomputing Center, Wuxi); Wei Zhang and Zhenguo Zhang (Southern University of Science and Technology, China); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and Xiaofei Chen (Southern University of Science and Technology, China)

Accelerating Quantum Chemistry with Vectorized and Batched Integrals

Hua Huang and Edmond Chow (Georgia Institute of Technology)

High-Performance Dense Tucker Decomposition on GPU Clusters

Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)

Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass

Resilience III: GPUs

Optimizing Software-Directed Instruction Replication for GPU Error Detection

Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs

Jieyang Chen, Hongbo Li, Sihuan Li, and Xin Liang (University of California, Riverside); Panruo Wu (University of Houston); Dingwen Tao (University of Alabama); Kaiming Ouyang, Yuanlai Liu, and Kai Zhao (University of California, Riverside); Qiang Guan (Kent State University); and Zizhong Chen (University of California, Riverside)

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods

Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Astrophysics Applications

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows

Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver

Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)

Applications

Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass

Biology Applications

Extreme Scale De Novo Metagenome Assembly

Best Paper Finalists

Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Physics and Tensor Applications

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight

Accelerating Quantum Chemistry with Vectorized and Batched Integrals

Hua Huang and Edmond Chow (Georgia Institute of Technology)

High-Performance Dense Tucker Decomposition on GPU Clusters

Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Paper · Applications, Graph Algorithms, Security, Tech Program Reg Pass

Graph Algorithms and Systems

iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees

Yuede Ji (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Adaptive Anonymization of Data with b-Edge Covers

Arif Khan (Pacific Northwest National Laboratory), Krzysztof Choromanski (Google LLC), Alex Pothen and S M Ferdous (Purdue University), and Mahantesh Halappanavar and Antonino Tumeo (Pacific Northwest National Laboratory)

faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU

Martin Winter and Daniel Mlakar (Graz University of Technology); Rhaleb Zayer and Hans-Peter Seidel (Max Planck Institute for Informatics); and Markus Steinberger (Graz University of Technology, Max Planck Institute for Informatics)

Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Deep Learning

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines

Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Mr Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures

Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Astrophysics Applications

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows

Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver

Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)

Architectures

Paper · Architectures, Data Analytics, Networks, Tech Program Reg Pass

Next-Generation Networking

Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage

Best Paper Finalists

Matthias A. Blumrich, Nan Jiang, and Larry R. Dennison (Nvidia Corporation)

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences

Qiao Xiang (Yale University); J. Jensen Zhang, X. Tony Wang, and Y. Jace Liu (Tongji University); Chin Guok (Lawrence Berkeley National Laboratory); Franck Le (IBM); John MacAuley (Lawrence Berkeley National Laboratory); Harvey Newman (California Institute of Technology); and Y. Richard Yang (Yale University)

Light-Weight Protocols for Wire-Speed Ordering

Hans Eberle and Larry Dennison (Nvidia Corporation)

Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass

Large-Scale Algorithms

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers

TriCore: Parallel Triangle Counting on GPUs

Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Distributed-Memory Hierarchical Compression of Dense SPD Matrices

Best Student Paper Finalists

Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)

Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass

MPI Optimization and Characterization

Cooperative Rendezvous Protocols for Improved Performance and Overlap

Best Student Paper Finalists

S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)

Framework for Scalable Intra-Node Collective Operations Using Shared Memory

Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)

Characterization of MPI Usage on a Production Supercomputer

Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)

Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

Task-Based Programming

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints

Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)

Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass

Large Scale System Deployments

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems

Sudharshan S. Vazhkudai (Oak Ridge National Laboratory); Bronis R. de Supinski (Lawrence Livermore National Laboratory); Arthur S. Bland and Al Geist (Oak Ridge National Laboratory); James Sexton and Jim Kahle (IBM); Christopher J. Zimmer, Scott Atchley, Sarp H. Oral, Don E. Maxwell, and Veronica G. Vergara Larrea (Oak Ridge National Laboratory); Adam Bertsch and Robin Goldstone (Lawrence Livermore National Laboratory); Wayne Joubert (Oak Ridge National Laboratory); Chris Chambreau (Lawrence Livermore National Laboratory); David Appelhans and Robert Blackmore (IBM); Ben Casses (Lawrence Livermore National Laboratory); George Chochia and Gene Davison (IBM); Matthew A. Ezell (Oak Ridge National Laboratory); Tom Gooding (IBM); Elsa Gonsiorowski (Lawrence Livermore National Laboratory); Leopold Grinberg, Bill Hanson, and Bill Hartner (IBM); Ian Karlin and Matthew L. Leininger (Lawrence Livermore National Laboratory); Dustin Leverman (Oak Ridge National Laboratory); Chris Marroquin (IBM); Adam Moody (Lawrence Livermore National Laboratory); Martin Ohmacht (IBM); Ramesh Pankajakshan (Lawrence Livermore National Laboratory); Fernando Pizzano (IBM); James H. Rogers (Oak Ridge National Laboratory); Bryan Rosenburg (IBM); Drew Schmidt, Mallikarjun Shankar, and Feiyi Wang (Oak Ridge National Laboratory); Py Watson (Lawrence Livermore National Laboratory); Bob Walkup (IBM); Lance D. Weems (Lawrence Livermore National Laboratory); and Junqi Yin (Oak Ridge National Laboratory)

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience

Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA

Kazuhiko Komatsu (Tohoku University); Shintaro Momose, Yoko Isobe, Osamu Watanabe, and Akihiro Musa (Tohoku University, NEC Corporation); Mitsuo Yokokawa (Kobe University, NEC Corporation); Toshikazu Aoyama (NEC Corporation); and Masayuki Sato and Hiroaki Kobayashi (Tohoku University)

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass

Resilience III: GPUs

Optimizing Software-Directed Instruction Replication for GPU Error Detection

Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods

Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)

Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass

File Systems: Data Movement and Provenance

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems

Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows

Pradeep Subedi, Philip Davis, and Shaohua Duan (Rutgers University); Scott Klasky (Oak Ridge National Laboratory); Hemanth Kolla (Sandia National Laboratories); and Manish Parashar (Rutgers University)

A Year in the Life of a Parallel File System

Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Shane Snyder (Argonne National Laboratory), Teng Wang and Suren Byna (Lawrence Berkeley National Laboratory), Philip Carns (Argonne National Laboratory), and Nicholas J. Wright (Lawrence Berkeley National Laboratory)

Clouds and Distributed Computing

Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass

Data and Storage

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition

Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)

BESPOKV: Application Tailored Scale-Out Key-Value Stores

Ali Anwar (IBM), Yue Cheng (George Mason University), Hai Huang (IBM), Jingoo Han (Virginia Tech), Hyogi Sim (Oak Ridge National Laboratory), Dongyoon Lee (Virginia Tech), Fred Douglis (Perspecta Labs), and Ali R. Butt (Virginia Tech)

Scaling Embedded In Situ Indexing with DeltaFS

Qing Zheng, Charles D. Cranor, Danhao Guo, Gregory R. Ganger, George Amvrosiadis, and Garth A. Gibson (Carnegie Mellon University) and Bradley W. Settlemyer, Gary Grider, and Fan Guo (Los Alamos National Laboratory)

Paper · Clouds and Distributed Computing, Resource Management, Scheduling, Tech Program Reg Pass

Clouds and Distributed Computing

A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments

Georgios Andreadis (Delft University of Technology, Vrije University Amsterdam); Laurens Versluis (Vrije University Amsterdam); Fabian Mastenbroek (Delft University of Technology); and Alexandru Iosup (Vrije University Amsterdam, Delft University of Technology)

Dynamically Negotiating Capacity Between On-Demand and Batch Clusters

Feng Liu (University of Minnesota), Kate Keahey (Argonne National Laboratory), Pierre Riteau (University of Chicago), and Jon Weissman (University of Minnesota)

A Lightweight Model for Right-Sizing Master-Worker Applications

Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain (University of Notre Dame)

Compiler Analysis and Optimization

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Computational Biology

Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass

Biology Applications

Extreme Scale De Novo Metagenome Assembly

Best Paper Finalists

Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight

Computational Physics

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Physics and Tensor Applications

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight

Accelerating Quantum Chemistry with Vectorized and Batched Integrals

Hua Huang and Edmond Chow (Georgia Institute of Technology)

High-Performance Dense Tucker Decomposition on GPU Clusters

Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Astrophysics Applications

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows

Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver

Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)

Cosmology

Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Deep Learning

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines

Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures

Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)

Data Analytics

Paper · Architectures, Data Analytics, Networks, Tech Program Reg Pass

Next-Generation Networking

Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage

Best Paper Finalists

Matthias A. Blumrich, Nan Jiang, and Larry R. Dennison (Nvidia Corporation)

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences

Light-Weight Protocols for Wire-Speed Ordering

Hans Eberle and Larry Dennison (Nvidia Corporation)

Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass

Large-Scale Algorithms

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers

TriCore: Parallel Triangle Counting on GPUs

Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Distributed-Memory Hierarchical Compression of Dense SPD Matrices

Best Student Paper Finalists

Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)

Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass

Performance Optimization Studies

Many-Core Graph Workload Analysis

Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading

Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations

Preeti Malakar (Indian Institute of Technology Kanpur); Todd Munson, Christopher Knight, and Venkatram Vishwanath (Argonne National Laboratory); and Michael E. Papka (Argonne National Laboratory, Northern Illinois University)

Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Deep Learning

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines

Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures

Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)

Data Management

Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass

File Systems: Data Movement and Provenance

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems

Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows

A Year in the Life of a Parallel File System

Deep Learning

Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass

Large-Scale Algorithms

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers

TriCore: Parallel Triangle Counting on GPUs

Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Distributed-Memory Hierarchical Compression of Dense SPD Matrices

Best Student Paper Finalists

Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)

Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Deep Learning

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines

Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures

Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)

File Systems

Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass

Data and Storage

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition

Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)

BESPOKV: Application Tailored Scale-Out Key-Value Stores

Scaling Embedded In Situ Indexing with DeltaFS

Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass

File Systems: Data Movement and Provenance

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems

Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows

A Year in the Life of a Parallel File System

Abstract

pdf

Floating Point

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Abstract

pdf

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

Abstract

pdf

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Abstract

pdf

GPUs

Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass

Resilience

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan

Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)

Abstract

pdf

FlipTracker: Understanding Natural Error Resilience in HPC Applications

Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)

Abstract

pdf

Doomsday: Predicting Which Node Will Fail When on Supercomputers

Best Student Paper Finalists

Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)

Abstract

pdf

Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass

Non-Volatile Memory

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs

Kai Wu, Jie Ren, and Dong Li (University of California, Merced)

Abstract

pdf

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access

Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)

Abstract

pdf

Siena: Exploring the Design Space of Heterogeneous Memory Systems

Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)

Abstract

pdf

Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass

Resilience III: GPUs

Optimizing Software-Directed Instruction Replication for GPU Error Detection

Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)

Abstract

pdf

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs

Abstract

pdf

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods

Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)

Abstract

pdf

Graph Algorithms

Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass

Algorithms on Sparse Data

HiCOO: Hierarchical Storage of Sparse Tensors

Best Student Paper Finalists

Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)

Abstract

pdf

Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures

Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)

Abstract

pdf

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution

Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)

Abstract

pdf

Paper · Applications, Graph Algorithms, Security, Tech Program Reg Pass

Graph Algorithms and Systems

iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees

Yuede Ji (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Abstract

pdf

Adaptive Anonymization of Data with b-Edge Covers

Abstract

pdf

faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU

Abstract

pdf

I/O

Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass

Data and Storage

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition

Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)

Abstract

pdf

BESPOKV: Application Tailored Scale-Out Key-Value Stores

Abstract

pdf

Scaling Embedded In Situ Indexing with DeltaFS

Abstract

pdf

Linear Algebra

Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass

Algorithms on Sparse Data

HiCOO: Hierarchical Storage of Sparse Tensors

Best Student Paper Finalists

Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)

Abstract

pdf

Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures

Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)

Abstract

pdf

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution

Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)

Abstract

pdf

Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass

Programming Systems Tools

Dynamic Data Race Detection for OpenMP Programs

Yizi Gu and John Mellor-Crummey (Rice University)

Abstract

pdf

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism

Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)

Abstract

pdf

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution

Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)

Abstract

pdf

Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass

Resilience III: GPUs

Optimizing Software-Directed Instruction Replication for GPU Error Detection

Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)

Abstract

pdf

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs

Abstract

pdf

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods

Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)

Abstract

pdf

Machine Learning

Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass

Algorithms on Sparse Data

HiCOO: Hierarchical Storage of Sparse Tensors

Best Student Paper Finalists

Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)

Abstract

pdf

Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures

Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)

Abstract

pdf

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution

Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)

Abstract

pdf

Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Deep Learning

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines

Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)

Abstract

pdf

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Abstract

pdf

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures

Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)

Abstract

pdf

Memory

Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass

Non-Volatile Memory

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs

Kai Wu, Jie Ren, and Dong Li (University of California, Merced)

Abstract

pdf

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access

Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)

Abstract

pdf

Siena: Exploring the Design Space of Heterogeneous Memory Systems

Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)

Abstract

pdf

Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

Task-Based Programming

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes

Abstract

pdf

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

Abstract

pdf

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints

Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)

Abstract

pdf

Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass

Programming Systems Tools

Dynamic Data Race Detection for OpenMP Programs

Yizi Gu and John Mellor-Crummey (Rice University)

Abstract

pdf

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism

Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)

Abstract

pdf

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution

Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)

Abstract

pdf

MPI

Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass

MPI Optimization and Characterization

Cooperative Rendezvous Protocols for Improved Performance and Overlap

Best Student Paper Finalists

S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)

Abstract

pdf

Framework for Scalable Intra-Node Collective Operations Using Shared Memory

Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)

Abstract

pdf

Characterization of MPI Usage on a Production Supercomputer

Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)

Abstract

pdf

Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass

Programming Systems Tools

Dynamic Data Race Detection for OpenMP Programs

Yizi Gu and John Mellor-Crummey (Rice University)

Abstract

pdf

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism

Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)

Abstract

pdf

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution

Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)

Abstract

pdf

NVRAM

Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass

Non-Volatile Memory

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs

Kai Wu, Jie Ren, and Dong Li (University of California, Merced)

Abstract

pdf

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access

Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)

Abstract

pdf

Siena: Exploring the Design Space of Heterogeneous Memory Systems

Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)

Abstract

pdf

Networks

Paper · Architectures, Data Analytics, Networks, Tech Program Reg Pass

Next-Generation Networking

Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage

Best Paper Finalists

Matthias A. Blumrich, Nan Jiang, and Larry R. Dennison (Nvidia Corporation)

Abstract

pdf

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences

Abstract

pdf

Light-Weight Protocols for Wire-Speed Ordering

Hans Eberle and Larry Dennison (Nvidia Corporation)

Abstract

pdf

Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass

Large-Scale Algorithms

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers

Abstract

pdf

TriCore: Parallel Triangle Counting on GPUs

Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Abstract

pdf

Distributed-Memory Hierarchical Compression of Dense SPD Matrices

Best Student Paper Finalists

Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)

Abstract

pdf

Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass

Resource Management and Interference

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management

Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)

Abstract

pdf

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters

Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)

Abstract

pdf

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing

Best Student Paper Finalists

Staci A. Smith, Clara E. Cromey, and David K. Lowenthal (University of Arizona); Jens Domke (Tokyo Institute of Technology); and Nikhil Jain, Jayaraman J. Thiagarajan, and Abhinav Bhatele (Lawrence Livermore National Laboratory)

Abstract

pdf

Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass

MPI Optimization and Characterization

Cooperative Rendezvous Protocols for Improved Performance and Overlap

Best Student Paper Finalists

S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)

Abstract

pdf

Framework for Scalable Intra-Node Collective Operations Using Shared Memory

Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)

Abstract

pdf

Characterization of MPI Usage on a Production Supercomputer

Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)

Abstract

pdf

Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

Task-Based Programming

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes

Abstract

pdf

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

Abstract

pdf

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints

Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)

Abstract

pdf

Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass

Large Scale System Deployments

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems

Abstract

pdf

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience

Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)

Abstract

pdf

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA

Abstract

pdf

Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass

Resilience III: GPUs

Optimizing Software-Directed Instruction Replication for GPU Error Detection

Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)

Abstract

pdf

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs

Abstract

pdf

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods

Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)

Abstract

pdf

Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass

File Systems: Data Movement and Provenance

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems

Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)

Abstract

pdf

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows

Abstract

pdf

A Year in the Life of a Parallel File System

Abstract

pdf

OpenMP

Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass

Performance and Energy Analysis

A Parallelism Profiler with What-If Analyses for OpenMP Programs

Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)

Abstract

pdf

Energy Efficiency Modeling of Parallel Applications

Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)

Abstract

pdf

HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor

John D. McCalpin (University of Texas, Texas Advanced Computing Center)

Abstract

pdf

Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass

Programming Systems Tools

Dynamic Data Race Detection for OpenMP Programs

Yizi Gu and John Mellor-Crummey (Rice University)

Abstract

pdf

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism

Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)

Abstract

pdf

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution

Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)

Abstract

pdf

Parallel Programming Languages, Libraries, and Models

Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

Task-Based Programming

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes

Abstract

pdf

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

Abstract

pdf

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints

Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)

Abstract

pdf

Performance

Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass

Performance and Energy Analysis

A Parallelism Profiler with What-If Analyses for OpenMP Programs

Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)

Abstract

pdf

Energy Efficiency Modeling of Parallel Applications

Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)

Abstract

pdf

HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor

John D. McCalpin (University of Texas, Texas Advanced Computing Center)

Abstract

pdf

Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass

Performance Optimization Studies

Many-Core Graph Workload Analysis

Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)

Abstract

pdf

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading

Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)

Abstract

pdf

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations

Abstract

pdf

Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass

MPI Optimization and Characterization

Cooperative Rendezvous Protocols for Improved Performance and Overlap

Best Student Paper Finalists

S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)

Abstract

pdf

Framework for Scalable Intra-Node Collective Operations Using Shared Memory

Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)

Abstract

pdf

Characterization of MPI Usage on a Production Supercomputer

Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)

Abstract

pdf

Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass

Non-Volatile Memory

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs

Kai Wu, Jie Ren, and Dong Li (University of California, Merced)

Abstract

pdf

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access

Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)

Abstract

pdf

Siena: Exploring the Design Space of Heterogeneous Memory Systems

Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)

Abstract

pdf

Paper · Performance, Resiliency, Tools, Tech Program Reg Pass

Resilience II

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo

Scott Levy and Kurt B. Ferreira (Sandia National Laboratories), Nathan DeBardeleben (Los Alamos National Laboratory), Taniya Siddiqua and Vilas Sridharan (Advanced Micro Devices Inc), and Elisabeth Baseman (Los Alamos National Laboratory)

Abstract

pdf

Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities

Zaeem Hussain, Taieb Znati, and Rami Melhem (University of Pittsburgh)

Abstract

pdf

Evaluating and Accelerating High-Fidelity Error Injection for HPC

Chun-Kai Chang, Sangkug Lym, and Nicholas Kelly (University of Texas); Michael B. Sullivan (Nvidia Corporation); and Mattan Erez (University of Texas)

Abstract

pdf

Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass

Large Scale System Deployments

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems

Abstract

pdf

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience

Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)

Abstract

pdf

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA

Abstract

pdf

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Abstract

pdf

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

Abstract

pdf

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Abstract

pdf

Power

Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass

Performance and Energy Analysis

A Parallelism Profiler with What-If Analyses for OpenMP Programs

Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)

Abstract

pdf

Energy Efficiency Modeling of Parallel Applications

Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)

Abstract

pdf

HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor

John D. McCalpin (University of Texas, Texas Advanced Computing Center)

Abstract

pdf

Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

Task-Based Programming

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes

Abstract

pdf

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

Abstract

pdf

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints

Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)

Abstract

pdf

Precision

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Abstract

pdf

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

Abstract

pdf

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Abstract

pdf

Programming Systems

Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass

Performance Optimization Studies

Many-Core Graph Workload Analysis

Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)

Abstract

pdf

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading

Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)

Abstract

pdf

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations

Abstract

pdf

Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass

MPI Optimization and Characterization

Cooperative Rendezvous Protocols for Improved Performance and Overlap

Best Student Paper Finalists

S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)

Abstract

pdf

Framework for Scalable Intra-Node Collective Operations Using Shared Memory

Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)

Abstract

pdf

Characterization of MPI Usage on a Production Supercomputer

Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)

Abstract

pdf

Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

Task-Based Programming

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes

Abstract

pdf

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

Abstract

pdf

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints

Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)

Abstract

pdf

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Abstract

pdf

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

Abstract

pdf

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Abstract

pdf

Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass

Programming Systems Tools

Dynamic Data Race Detection for OpenMP Programs

Yizi Gu and John Mellor-Crummey (Rice University)

Abstract

pdf

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism

Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)

Abstract

pdf

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution

Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)

Abstract

pdf

Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Deep Learning

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines

Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)

Abstract

pdf

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Abstract

pdf

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures

Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)

Abstract

pdf

Resiliency

Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass

Resilience

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan

Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)

Abstract

pdf

FlipTracker: Understanding Natural Error Resilience in HPC Applications

Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)

Abstract

pdf

Doomsday: Predicting Which Node Will Fail When on Supercomputers

Best Student Paper Finalists

Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)

Abstract

pdf

Paper · Performance, Resiliency, Tools, Tech Program Reg Pass

Resilience II

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo

Abstract

pdf

Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities

Zaeem Hussain, Taieb Znati, and Rami Melhem (University of Pittsburgh)

Abstract

pdf

Evaluating and Accelerating High-Fidelity Error Injection for HPC

Chun-Kai Chang, Sangkug Lym, and Nicholas Kelly (University of Texas); Michael B. Sullivan (Nvidia Corporation); and Mattan Erez (University of Texas)

Abstract

pdf

Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass

Resilience III: GPUs

Optimizing Software-Directed Instruction Replication for GPU Error Detection

Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)

Abstract

pdf

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs

Abstract

pdf

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods

Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)

Abstract

pdf

Resource Management

Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass

Resource Management and Interference

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management

Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)

Abstract

pdf

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters

Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)

Abstract

pdf

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing

Best Student Paper Finalists

Abstract

pdf

Paper · Clouds and Distributed Computing, Resource Management, Scheduling, Tech Program Reg Pass

Clouds and Distributed Computing

A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments

Abstract

pdf

Dynamically Negotiating Capacity Between On-Demand and Batch Clusters

Feng Liu (University of Minnesota), Kate Keahey (Argonne National Laboratory), Pierre Riteau (University of Chicago), and Jon Weissman (University of Minnesota)

Abstract

pdf

A Lightweight Model for Right-Sizing Master-Worker Applications

Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain (University of Notre Dame)

Abstract

pdf

Scheduling

Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass

Resource Management and Interference

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management

Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)

Abstract

pdf

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters

Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)

Abstract

pdf

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing

Best Student Paper Finalists

Abstract

pdf

Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

Task-Based Programming

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes

Abstract

pdf

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

Abstract

pdf

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints

Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)

Abstract

pdf

Paper · Clouds and Distributed Computing, Resource Management, Scheduling, Tech Program Reg Pass

Clouds and Distributed Computing

A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments

Abstract

pdf

Dynamically Negotiating Capacity Between On-Demand and Batch Clusters

Feng Liu (University of Minnesota), Kate Keahey (Argonne National Laboratory), Pierre Riteau (University of Chicago), and Jon Weissman (University of Minnesota)

Abstract

pdf

A Lightweight Model for Right-Sizing Master-Worker Applications

Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain (University of Notre Dame)

Abstract

pdf

Scientific Computing

Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass

Biology Applications

Extreme Scale De Novo Metagenome Assembly

Best Paper Finalists

Abstract

pdf

Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting

Abstract

pdf

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight

Abstract

pdf

Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass

Large-Scale Algorithms

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers

Abstract

pdf

TriCore: Parallel Triangle Counting on GPUs

Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Abstract

pdf

Distributed-Memory Hierarchical Compression of Dense SPD Matrices

Best Student Paper Finalists

Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)

Abstract

pdf

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Physics and Tensor Applications

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight

Abstract

pdf

Accelerating Quantum Chemistry with Vectorized and Batched Integrals

Hua Huang and Edmond Chow (Georgia Institute of Technology)

Abstract

pdf

High-Performance Dense Tucker Decomposition on GPU Clusters

Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)

Abstract

pdf

Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass

Large Scale System Deployments

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems

Abstract

pdf

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience

Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)

Abstract

pdf

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA

Abstract

pdf

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Astrophysics Applications

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows

Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)

Abstract

pdf

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver

Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)

Abstract

pdf

Security

Paper · Applications, Graph Algorithms, Security, Tech Program Reg Pass

Graph Algorithms and Systems

iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees

Yuede Ji (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Abstract

pdf

Adaptive Anonymization of Data with b-Edge Covers

Abstract

pdf

faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU

Abstract

pdf

Sparse Computation

Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass

Algorithms on Sparse Data

HiCOO: Hierarchical Storage of Sparse Tensors

Best Student Paper Finalists

Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)

Abstract

pdf

Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures

Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)

Abstract

pdf

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution

Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)

Abstract

pdf

State of the Practice

Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass

Resilience

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan

Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)

Abstract

pdf

FlipTracker: Understanding Natural Error Resilience in HPC Applications

Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)

Abstract

pdf

Doomsday: Predicting Which Node Will Fail When on Supercomputers

Best Student Paper Finalists

Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)

Abstract

pdf

Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass

Resource Management and Interference

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management

Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)

Abstract

pdf

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters

Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)

Abstract

pdf

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing

Best Student Paper Finalists

Abstract

pdf

Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass

MPI Optimization and Characterization

Cooperative Rendezvous Protocols for Improved Performance and Overlap

Best Student Paper Finalists

S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)

Abstract

pdf

Framework for Scalable Intra-Node Collective Operations Using Shared Memory

Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)

Abstract

pdf

Characterization of MPI Usage on a Production Supercomputer

Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)

Abstract

pdf

Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass

Large Scale System Deployments

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems

Abstract

pdf

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience

Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)

Abstract

pdf

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA

Abstract

pdf

Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass

File Systems: Data Movement and Provenance

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems

Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)

Abstract

pdf

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows

Abstract

pdf

A Year in the Life of a Parallel File System

Abstract

pdf

Storage

Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass

Data and Storage

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition

Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)

Abstract

pdf

BESPOKV: Application Tailored Scale-Out Key-Value Stores

Abstract

pdf

Scaling Embedded In Situ Indexing with DeltaFS

Abstract

pdf

Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass

Performance Optimization Studies

Many-Core Graph Workload Analysis

Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)

Abstract

pdf

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading

Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)

Abstract

pdf

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations

Abstract

pdf

Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Deep Learning

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines

Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)

Abstract

pdf

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Abstract

pdf

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures

Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)

Abstract

pdf

System Software

Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass

Resilience

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan

Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)

Abstract

pdf

FlipTracker: Understanding Natural Error Resilience in HPC Applications

Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)

Abstract

pdf

Doomsday: Predicting Which Node Will Fail When on Supercomputers

Best Student Paper Finalists

Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)

Abstract

pdf

Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass

Resource Management and Interference

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management

Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)

Abstract

pdf

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters

Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)

Abstract

pdf

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing

Best Student Paper Finalists

Abstract

pdf

Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass

Non-Volatile Memory

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs

Kai Wu, Jie Ren, and Dong Li (University of California, Merced)

Abstract

pdf

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access

Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)

Abstract

pdf

Siena: Exploring the Design Space of Heterogeneous Memory Systems

Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)

Abstract

pdf

Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass

File Systems: Data Movement and Provenance

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems

Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)

Abstract

pdf

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows

Abstract

pdf

A Year in the Life of a Parallel File System

Abstract

pdf

Tools

Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass

Performance and Energy Analysis

A Parallelism Profiler with What-If Analyses for OpenMP Programs

Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)

Abstract

pdf

Energy Efficiency Modeling of Parallel Applications

Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)

Abstract

pdf

HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor

John D. McCalpin (University of Texas, Texas Advanced Computing Center)

Abstract

pdf

Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass

Performance Optimization Studies

Many-Core Graph Workload Analysis

Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)

Abstract

pdf

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading

Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)

Abstract

pdf

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations

Abstract

pdf

Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass

Non-Volatile Memory

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs

Kai Wu, Jie Ren, and Dong Li (University of California, Merced)

Abstract

pdf

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access

Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)

Abstract

pdf

Siena: Exploring the Design Space of Heterogeneous Memory Systems

Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)

Abstract

pdf

Paper · Performance, Resiliency, Tools, Tech Program Reg Pass

Resilience II

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo

Abstract

pdf

Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities

Zaeem Hussain, Taieb Znati, and Rami Melhem (University of Pittsburgh)

Abstract

pdf

Evaluating and Accelerating High-Fidelity Error Injection for HPC

Chun-Kai Chang, Sangkug Lym, and Nicholas Kelly (University of Texas); Michael B. Sullivan (Nvidia Corporation); and Mattan Erez (University of Texas)

Abstract

pdf

Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass

Large Scale System Deployments

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems

Abstract

pdf

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience

Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)

Abstract

pdf

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA

Abstract

pdf

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Abstract

pdf

Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed Up Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

Abstract

pdf

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Abstract

pdf

Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass

Programming Systems Tools

Dynamic Data Race Detection for OpenMP Programs

Yizi Gu and John Mellor-Crummey (Rice University)

Abstract

pdf

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism

Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)

Abstract

pdf

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution

Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)

Abstract

pdf

Visualization

Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass

Large-Scale Algorithms

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers

Abstract

pdf

TriCore: Parallel Triangle Counting on GPUs

Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Abstract

pdf

Distributed-Memory Hierarchical Compression of Dense SPD Matrices

Best Student Paper Finalists

Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)

Abstract

pdf

Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass

Performance Optimization Studies

Many-Core Graph Workload Analysis

Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)

Abstract

pdf

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading

Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)

Abstract

pdf

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations

Abstract

pdf

Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Deep Learning

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines

Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)

Abstract

pdf

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Abstract

pdf

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures

Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)

Abstract

pdf

Workflows

Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass

File Systems: Data Movement and Provenance

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems

Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)

Abstract

pdf

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows

Abstract

pdf

A Year in the Life of a Parallel File System

Abstract

pdf

Tech Program Reg Pass

Paper · Architectures, Data Analytics, Networks, Tech Program Reg Pass

Next-Generation Networking

Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage

Best Paper Finalists

Matthias A. Blumrich, Nan Jiang, and Larry R. Dennison (Nvidia Corporation)

Abstract

pdf

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences

Abstract

pdf

Light-Weight Protocols for Wire-Speed Ordering

Hans Eberle and Larry Dennison (Nvidia Corporation)

Abstract

pdf

Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass

Resilience

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan

Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)

Abstract

pdf

FlipTracker: Understanding Natural Error Resilience in HPC Applications

Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)

Abstract

pdf

Doomsday: Predicting Which Node Will Fail When on Supercomputers

Best Student Paper Finalists

Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)

Abstract

pdf

Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass

Data and Storage

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition

Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)

Abstract

pdf

BESPOKV: Application Tailored Scale-Out Key-Value Stores

Abstract

pdf

Scaling Embedded In Situ Indexing with DeltaFS

Abstract

pdf

Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass

Biology Applications

Extreme Scale De Novo Metagenome Assembly

Best Paper Finalists

Abstract

pdf

Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting

Abstract

pdf

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight

Abstract

pdf

Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass

Performance and Energy Analysis

A Parallelism Profiler with What-If Analyses for OpenMP Programs

Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)

Abstract

pdf

Energy Efficiency Modeling of Parallel Applications

Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)

Abstract

pdf

HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor

John D. McCalpin (University of Texas, Texas Advanced Computing Center)

Abstract

pdf

Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass

Large-Scale Algorithms

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers

Abstract

pdf

TriCore: Parallel Triangle Counting on GPUs

Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Abstract

pdf

Distributed-Memory Hierarchical Compression of Dense SPD Matrices

Best Student Paper Finalists

Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)

Abstract

pdf

Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass

Resource Management and Interference

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management

Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)

Abstract

pdf

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters

Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)

Abstract

pdf

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing

Best Student Paper Finalists

Abstract

pdf

Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass

Algorithms on Sparse Data

HiCOO: Hierarchical Storage of Sparse Tensors

Best Student Paper Finalists

Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)

Abstract

pdf

Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures

Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)

Abstract

pdf

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution

Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)

Abstract

pdf

Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass

Performance Optimization Studies

Many-Core Graph Workload Analysis

Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)

Abstract

pdf

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading

Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)

Abstract

pdf

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations

Abstract

pdf

Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass

MPI Optimization and Characterization

Cooperative Rendezvous Protocols for Improved Performance and Overlap

Best Student Paper Finalists

S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)

Abstract

pdf

Framework for Scalable Intra-Node Collective Operations Using Shared Memory

Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)

Abstract

pdf

Characterization of MPI Usage on a Production Supercomputer

Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)

Abstract

pdf

Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass

Non-Volatile Memory

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs

Kai Wu, Jie Ren, and Dong Li (University of California, Merced)

Abstract

pdf

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access

Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)

Abstract

pdf

Siena: Exploring the Design Space of Heterogeneous Memory Systems

Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)

Abstract

pdf

Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

Task-Based Programming

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes

Abstract

pdf

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs

Abstract

pdf

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints

Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)

Abstract

pdf

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Physics and Tensor Applications

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight

Abstract

pdf

Accelerating Quantum Chemistry with Vectorized and Batched Integrals

Hua Huang and Edmond Chow (Georgia Institute of Technology)

Abstract

pdf

High-Performance Dense Tucker Decomposition on GPU Clusters

Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)

Abstract

pdf

Paper · Clouds and Distributed Computing, Resource Management, Scheduling, Tech Program Reg Pass

Clouds and Distributed Computing

A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments

Abstract

pdf

Dynamically Negotiating Capacity Between On-Demand and Batch Clusters

Feng Liu (University of Minnesota), Kate Keahey (Argonne National Laboratory), Pierre Riteau (University of Chicago), and Jon Weissman (University of Minnesota)

Abstract

pdf

A Lightweight Model for Right-Sizing Master-Worker Applications

Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain (University of Notre Dame)

Abstract

pdf

Paper · Performance, Resiliency, Tools, Tech Program Reg Pass

Resilience II

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo

Abstract

pdf

Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities

Zaeem Hussain, Taieb Znati, and Rami Melhem (University of Pittsburgh)

Abstract

pdf

Evaluating and Accelerating High-Fidelity Error Injection for HPC

Chun-Kai Chang, Sangkug Lym, and Nicholas Kelly (University of Texas); Michael B. Sullivan (Nvidia Corporation); and Mattan Erez (University of Texas)

Abstract

pdf

Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass

Large Scale System Deployments

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems

Abstract

pdf

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience

Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)

Abstract

pdf

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA

Abstract

pdf

Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Arithmetic and Optimization

Associative Instruction Reordering to Alleviate Register Pressure

Abstract

pdf

Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed Up Mixed-Precision Iterative Refinement Solvers

Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)

Abstract

pdf

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning

Abstract

pdf

Paper · Applications, Graph Algorithms, Security, Tech Program Reg Pass

Graph Algorithms and Systems

iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees

Yuede Ji (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)

Abstract

pdf

Adaptive Anonymization of Data with b-Edge Covers

Abstract

pdf

faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU

Abstract

pdf

Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass

Programming Systems Tools

Dynamic Data Race Detection for OpenMP Programs

Yizi Gu and John Mellor-Crummey (Rice University)

Abstract

pdf

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism

Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)

Abstract

pdf

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution

Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)

Abstract

pdf

Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Deep Learning

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines

Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)

Abstract

pdf

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Abstract

pdf

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures

Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)

Abstract

pdf

Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass

Resilience III: GPUs

Optimizing Software-Directed Instruction Replication for GPU Error Detection

Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)

Abstract

pdf

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs

Abstract

pdf

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods

Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)

Abstract

pdf

Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Astrophysics Applications

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows

Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)

Abstract

pdf

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver

Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)

Abstract

pdf

Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass

File Systems: Data Movement and Provenance

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems

Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)

Abstract

pdf

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows

Abstract

pdf

A Year in the Life of a Parallel File System

Abstract

pdf

Other

ACM Gordon Bell Finalist

Gordon Bell Prize Finalist #1

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing

Tsuyoshi Ichimura, Kohei Fujita, and Takuma Yamaguchi (University of Tokyo); Akira Naruse (Nvidia Corporation); Jack C. Wells (Oak Ridge National Laboratory); Thomas C. Schulthess (Swiss National Supercomputing Centre); Tjerk P. Straatsma and Christopher J. Zimmer (Oak Ridge National Laboratory); Maxime Martinasso (Swiss National Supercomputing Centre); and Kengo Nakajima, Muneo Hori, and Lalith Maddegedara (University of Tokyo)

Abstract

pdf

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation

Robert M. Patton, J. Travis Johnston, Steven R. Young, Catherine D. Schuman, Don D. March, Thomas E. Potok, Derek C. Rose, Seung-Hwan Lim, Thomas P. Karnowski, Maxim A. Ziatdinov, and Sergei V. Kalinin (Oak Ridge National Laboratory)

Abstract

pdf

Exascale Deep Learning for Climate Analytics

Thorsten Kurth (Lawrence Berkeley National Laboratory), Sean Treichler and Joshua Romero (Nvidia Corporation), Mayur Mudigonda (Lawrence Berkeley National Laboratory), Nathan Luehr and Everett Phillips (Nvidia Corporation), Ankur Mahesh (Lawrence Berkeley National Laboratory), Michael Matheson (Oak Ridge National Laboratory), Jack Deslippe (Lawrence Berkeley National Laboratory), Massimiliano Fatica (Nvidia Corporation), Prabhat (Lawrence Berkeley National Laboratory), and Michael Houston (Nvidia Corporation)

Abstract

pdf

ACM Gordon Bell Finalist

Gordon Bell Prize Finalist #2

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing

Evan Berkowitz (Forschungszentrum Juelich); M.A. Clark (Nvidia Corporation); Arjun Gambhir (Lawrence Livermore National Laboratory, Lawrence Berkeley National Laboratory); Ken McElvain (University of California, Berkeley; Lawrence Berkeley National Laboratory); Amy Nicholson (University of North Carolina); Enrico Rinaldi (RIKEN BNL Research Center, Lawrence Berkeley National Laboratory); Pavlos Vranas (Lawrence Livermore National Laboratory, Lawrence Berkeley National Laboratory); André Walker-Loud (Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory); Chia Cheng Chang (Lawrence Berkeley National Laboratory, RIKEN); Bálint Joó (Thomas Jefferson National Accelerator Facility); Thorsten Kurth (Lawrence Berkeley National Laboratory); and Kostas Orginos (College of William & Mary, Thomas Jefferson National Accelerator Facility)

Abstract

pdf

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds

Heng Lin (Tsinghua University, Fma Technology); Xiaowei Zhu (Tsinghua University, Qatar Computing Research Institute); Bowen Yu (Tsinghua University); Xiongchao Tang (Tsinghua University, Qatar Computing Research Institute); Wei Xue and Wenguang Chen (Tsinghua University); Lufei Zhang (State Key Laboratory of Mathematical Engineering and Advanced Computing); Torsten Hoefler (ETH Zurich); Xiaosong Ma (Qatar Computing Research Institute); Xin Liu (National Research Centre of Parallel Computer Engineering and Technology); Weimin Zheng (Tsinghua University); and Jingfang Xu (Beijing Sogou Technology Development Company)

Abstract

pdf

Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction

Wayne Joubert (Oak Ridge National Laboratory); Deborah Weighill (Oak Ridge National Laboratory, University of Tennessee); David Kainer (Oak Ridge National Laboratory); Sharlee Climer (University of Missouri, St Louis); Amy Justice (Yale University, US Department of Veterans Affairs); Kjiersten Fagnan (Lawrence Berkeley National Laboratory, US Department of Energy Joint Genome Institute); and Daniel Jacobson (Oak Ridge National Laboratory)

Abstract

pdf
