SC18 Proceedings



Algorithms
Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass
Biology Applications
Extreme Scale De Novo Metagenome Assembly
Best Paper Finalists
Evangelos Georganas (Intel Corporation) and Rob Egan, Steven Hofmeyr, Eugene Goltsman, Bill Arndt, Andrew Tritt, Aydin Buluc, Leonid Oliker, and Katherine Yelick (Lawrence Berkeley National Laboratory)
Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting
Tony C. Pan (Georgia Institute of Technology, School of Computational Science and Engineering); Sanchit Misra (Intel Corporation, Parallel Computing Lab); and Srinivas Aluru (Georgia Institute of Technology, School of Computational Science and Engineering)
Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight
Xiaohui Duan, Ping Gao, Tingjian Zhang, Meng Zhang, and Weiguo Liu (Shandong University); Wusheng Zhang, Wei Xue, Haohuan Fu, Lin Gan, and Dexun Chen (Tsinghua University); Xiangxu Meng (Shandong University); and Guangwen Yang (Tsinghua University)
Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass
Large-Scale Algorithms
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
Liandeng Li (Tsinghua University; National Supercomputing Center, Wuxi); Teng Yu (University of St Andrews); Wenlai Zhao and Haohuan Fu (Tsinghua University; National Supercomputing Center, Wuxi); Chenyu Wang (University of St Andrews; National Supercomputing Center, Wuxi); Li Tan (Beijing Technology and Business University); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and John Thomson (University of St Andrews)
TriCore: Parallel Triangle Counting on GPUs
Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Distributed-Memory Hierarchical Compression of Dense SPD Matrices
Best Student Paper Finalists
Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)
Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass
Algorithms on Sparse Data
HiCOO: Hierarchical Storage of Sparse Tensors
Best Student Paper Finalists
Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)
Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures
Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)
PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution
Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)
Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass
Task-Based Programming
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Physics and Tensor Applications
Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight
Bingwei Chen, Haohuan Fu, Yanwen Wei, and Conghui He (Tsinghua University; National Supercomputing Center, Wuxi); Wenqiang Zhang (University of Science and Technology of China); Yuxuan Li (Tsinghua University; National Supercomputing Center, Wuxi); Wubin Wan and Wei Zhang (National Supercomputing Center, Wuxi); Lin Gan (Tsinghua University; National Supercomputing Center, Wuxi); Wei Zhang and Zhenguo Zhang (Southern University of Science and Technology, China); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and Xiaofei Chen (Southern University of Science and Technology, China)
Accelerating Quantum Chemistry with Vectorized and Batched Integrals
Hua Huang and Edmond Chow (Georgia Institute of Technology)
High-Performance Dense Tucker Decomposition on GPU Clusters
Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)
Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass
Resilience III: GPUs
Optimizing Software-Directed Instruction Replication for GPU Error Detection
Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)
Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs
Jieyang Chen, Hongbo Li, Sihuan Li, and Xin Liang (University of California, Riverside); Panruo Wu (University of Houston); Dingwen Tao (University of Alabama); Kaiming Ouyang, Yuanlai Liu, and Kai Zhao (University of California, Riverside); Qiang Guan (Kent State University); and Zizhong Chen (University of California, Riverside)
PRISM: Predicting Resilience of GPU Applications Using Statistical Methods
Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Astrophysics Applications
Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows
Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)
Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver
Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)

Applications
Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass
Biology Applications
Extreme Scale De Novo Metagenome Assembly
Best Paper Finalists
Evangelos Georganas (Intel Corporation) and Rob Egan, Steven Hofmeyr, Eugene Goltsman, Bill Arndt, Andrew Tritt, Aydin Buluc, Leonid Oliker, and Katherine Yelick (Lawrence Berkeley National Laboratory)
Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting
Tony C. Pan (Georgia Institute of Technology, School of Computational Science and Engineering); Sanchit Misra (Intel Corporation, Parallel Computing Lab); and Srinivas Aluru (Georgia Institute of Technology, School of Computational Science and Engineering)
Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight
Xiaohui Duan, Ping Gao, Tingjian Zhang, Meng Zhang, and Weiguo Liu (Shandong University); Wusheng Zhang, Wei Xue, Haohuan Fu, Lin Gan, and Dexun Chen (Tsinghua University); Xiangxu Meng (Shandong University); and Guangwen Yang (Tsinghua University)
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Physics and Tensor Applications
Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight
Bingwei Chen, Haohuan Fu, Yanwen Wei, and Conghui He (Tsinghua University; National Supercomputing Center, Wuxi); Wenqiang Zhang (University of Science and Technology of China); Yuxuan Li (Tsinghua University; National Supercomputing Center, Wuxi); Wubin Wan and Wei Zhang (National Supercomputing Center, Wuxi); Lin Gan (Tsinghua University; National Supercomputing Center, Wuxi); Wei Zhang and Zhenguo Zhang (Southern University of Science and Technology, China); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and Xiaofei Chen (Southern University of Science and Technology, China)
Accelerating Quantum Chemistry with Vectorized and Batched Integrals
Hua Huang and Edmond Chow (Georgia Institute of Technology)
High-Performance Dense Tucker Decomposition on GPU Clusters
Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)
Paper · Applications, Graph Algorithms, Security, Tech Program Reg Pass
Graph Algorithms and Systems
iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees
Yuede Ji (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Adaptive Anonymization of Data with b-Edge Covers
Arif Khan (Pacific Northwest National Laboratory), Krzysztof Choromanski (Google LLC), Alex Pothen and S M Ferdous (Purdue University), and Mahantesh Halappanavar and Antonino Tumeo (Pacific Northwest National Laboratory)
faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU
Martin Winter and Daniel Mlakar (Graz University of Technology); Rhaleb Zayer and Hans-Peter Seidel (Max Planck Institute for Informatics); and Markus Steinberger (Graz University of Technology, Max Planck Institute for Informatics)
Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass
Deep Learning
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Astrophysics Applications
Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows
Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)
Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver
Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)

Architectures
Paper · Architectures, Data Analytics, Networks, Tech Program Reg Pass
Next-Generation Networking
Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage
Best Paper Finalists
Matthias A. Blumrich, Nan Jiang, and Larry R. Dennison (Nvidia Corporation)
Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences
Qiao Xiang (Yale University); J. Jensen Zhang, X. Tony Wang, and Y. Jace Liu (Tongji University); Chin Guok (Lawrence Berkeley National Laboratory); Franck Le (IBM); John MacAuley (Lawrence Berkeley National Laboratory); Harvey Newman (California Institute of Technology); and Y. Richard Yang (Yale University)
Light-Weight Protocols for Wire-Speed Ordering
Hans Eberle and Larry Dennison (Nvidia Corporation)
Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass
Large-Scale Algorithms
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
Liandeng Li (Tsinghua University; National Supercomputing Center, Wuxi); Teng Yu (University of St Andrews); Wenlai Zhao and Haohuan Fu (Tsinghua University; National Supercomputing Center, Wuxi); Chenyu Wang (University of St Andrews; National Supercomputing Center, Wuxi); Li Tan (Beijing Technology and Business University); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and John Thomson (University of St Andrews)
TriCore: Parallel Triangle Counting on GPUs
Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Distributed-Memory Hierarchical Compression of Dense SPD Matrices
Best Student Paper Finalists
Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)
Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass
MPI Optimization and Characterization
Cooperative Rendezvous Protocols for Improved Performance and Overlap
Best Student Paper Finalists
S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)
Framework for Scalable Intra-Node Collective Operations Using Shared Memory
Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)
Characterization of MPI Usage on a Production Supercomputer
Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)
Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass
Task-Based Programming
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)
Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass
Large Scale System Deployments
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
Sudharshan S. Vazhkudai (Oak Ridge National Laboratory); Bronis R. de Supinski (Lawrence Livermore National Laboratory); Arthur S. Bland and Al Geist (Oak Ridge National Laboratory); James Sexton and Jim Kahle (IBM); Christopher J. Zimmer, Scott Atchley, Sarp H. Oral, Don E. Maxwell, and Veronica G. Vergara Larrea (Oak Ridge National Laboratory); Adam Bertsch and Robin Goldstone (Lawrence Livermore National Laboratory); Wayne Joubert (Oak Ridge National Laboratory); Chris Chambreau (Lawrence Livermore National Laboratory); David Appelhans and Robert Blackmore (IBM); Ben Casses (Lawrence Livermore National Laboratory); George Chochia and Gene Davison (IBM); Matthew A. Ezell (Oak Ridge National Laboratory); Tom Gooding (IBM); Elsa Gonsiorowski (Lawrence Livermore National Laboratory); Leopold Grinberg, Bill Hanson, and Bill Hartner (IBM); Ian Karlin and Matthew L. Leininger (Lawrence Livermore National Laboratory); Dustin Leverman (Oak Ridge National Laboratory); Chris Marroquin (IBM); Adam Moody (Lawrence Livermore National Laboratory); Martin Ohmacht (IBM); Ramesh Pankajakshan (Lawrence Livermore National Laboratory); Fernando Pizzano (IBM); James H. Rogers (Oak Ridge National Laboratory); Bryan Rosenburg (IBM); Drew Schmidt, Mallikarjun Shankar, and Feiyi Wang (Oak Ridge National Laboratory); Py Watson (Lawrence Livermore National Laboratory); Bob Walkup (IBM); Lance D. Weems (Lawrence Livermore National Laboratory); and Junqi Yin (Oak Ridge National Laboratory)
Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience
Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)
Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA
Kazuhiko Komatsu (Tohoku University); Shintaro Momose, Yoko Isobe, Osamu Watanabe, and Akihiro Musa (Tohoku University, NEC Corporation); Mitsuo Yokokawa (Kobe University, NEC Corporation); Toshikazu Aoyama (NEC Corporation); and Masayuki Sato and Hiroaki Kobayashi (Tohoku University)
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)
Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass
Resilience III: GPUs
Optimizing Software-Directed Instruction Replication for GPU Error Detection
Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)
Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs
Jieyang Chen, Hongbo Li, Sihuan Li, and Xin Liang (University of California, Riverside); Panruo Wu (University of Houston); Dingwen Tao (University of Alabama); Kaiming Ouyang, Yuanlai Liu, and Kai Zhao (University of California, Riverside); Qiang Guan (Kent State University); and Zizhong Chen (University of California, Riverside)
PRISM: Predicting Resilience of GPU Applications Using Statistical Methods
Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)
Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass
File Systems: Data Movement and Provenance
Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)
Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows
Pradeep Subedi, Philip Davis, and Shaohua Duan (Rutgers University); Scott Klasky (Oak Ridge National Laboratory); Hemanth Kolla (Sandia National Laboratories); and Manish Parashar (Rutgers University)
A Year in the Life of a Parallel File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Shane Snyder (Argonne National Laboratory), Teng Wang and Suren Byna (Lawrence Berkeley National Laboratory), Philip Carns (Argonne National Laboratory), and Nicholas J. Wright (Lawrence Berkeley National Laboratory)

Clouds and Distributed Computing
Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass
Data and Storage
SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition
Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)
BESPOKV: Application Tailored Scale-Out Key-Value Stores
Ali Anwar (IBM), Yue Cheng (George Mason University), Hai Huang (IBM), Jingoo Han (Virginia Tech), Hyogi Sim (Oak Ridge National Laboratory), Dongyoon Lee (Virginia Tech), Fred Douglis (Perspecta Labs), and Ali R. Butt (Virginia Tech)
Scaling Embedded In Situ Indexing with DeltaFS
Qing Zheng, Charles D. Cranor, Danhao Guo, Gregory R. Ganger, George Amvrosiadis, and Garth A. Gibson (Carnegie Mellon University) and Bradley W. Settlemyer, Gary Grider, and Fan Guo (Los Alamos National Laboratory)
Paper · Clouds and Distributed Computing, Resource Management, Scheduling, Tech Program Reg Pass
Clouds and Distributed Computing
A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments
Georgios Andreadis (Delft University of Technology, Vrije University Amsterdam); Laurens Versluis (Vrije University Amsterdam); Fabian Mastenbroek (Delft University of Technology); and Alexandru Iosup (Vrije University Amsterdam, Delft University of Technology)
Dynamically Negotiating Capacity Between On-Demand and Batch Clusters
Feng Liu (University of Minnesota), Kate Keahey (Argonne National Laboratory), Pierre Riteau (University of Chicago), and Jon Weissman (University of Minnesota)
A Lightweight Model for Right-Sizing Master-Worker Applications
Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain (University of Notre Dame)

Compiler Analysis and Optimization
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)

Computational Biology
Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass
Biology Applications
Extreme Scale De Novo Metagenome Assembly
Best Paper Finalists
Evangelos Georganas (Intel Corporation) and Rob Egan, Steven Hofmeyr, Eugene Goltsman, Bill Arndt, Andrew Tritt, Aydin Buluc, Leonid Oliker, and Katherine Yelick (Lawrence Berkeley National Laboratory)
Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting
Tony C. Pan (Georgia Institute of Technology, School of Computational Science and Engineering); Sanchit Misra (Intel Corporation, Parallel Computing Lab); and Srinivas Aluru (Georgia Institute of Technology, School of Computational Science and Engineering)
Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight
Xiaohui Duan, Ping Gao, Tingjian Zhang, Meng Zhang, and Weiguo Liu (Shandong University); Wusheng Zhang, Wei Xue, Haohuan Fu, Lin Gan, and Dexun Chen (Tsinghua University); Xiangxu Meng (Shandong University); and Guangwen Yang (Tsinghua University)

Computational Physics
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Physics and Tensor Applications
Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight
Bingwei Chen, Haohuan Fu, Yanwen Wei, and Conghui He (Tsinghua University; National Supercomputing Center, Wuxi); Wenqiang Zhang (University of Science and Technology of China); Yuxuan Li (Tsinghua University; National Supercomputing Center, Wuxi); Wubin Wan and Wei Zhang (National Supercomputing Center, Wuxi); Lin Gan (Tsinghua University; National Supercomputing Center, Wuxi); Wei Zhang and Zhenguo Zhang (Southern University of Science and Technology, China); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and Xiaofei Chen (Southern University of Science and Technology, China)
Accelerating Quantum Chemistry with Vectorized and Batched Integrals
Hua Huang and Edmond Chow (Georgia Institute of Technology)
High-Performance Dense Tucker Decomposition on GPU Clusters
Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Astrophysics Applications
Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows
Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)
Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver
Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)

Cosmology
Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass
Deep Learning
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)
Abstract
pdf

Data Analytics
Paper · Architectures, Data Analytics, Networks, Tech Program Reg Pass
Next-Generation Networking
Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage
Best Paper Finalists
Matthias A. Blumrich, Nan Jiang, and Larry R. Dennison (Nvidia Corporation)
Abstract
pdf
Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences
Qiao Xiang (Yale University); J. Jensen Zhang, X. Tony Wang, and Y. Jace Liu (Tongji University); Chin Guok (Lawrence Berkeley National Laboratory); Franck Le (IBM); John MacAuley (Lawrence Berkeley National Laboratory); Harvey Newman (California Institute of Technology); and Y. Richard Yang (Yale University)
Abstract
pdf
Light-Weight Protocols for Wire-Speed Ordering
Hans Eberle and Larry Dennison (Nvidia Corporation)
Abstract
pdf
Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass
Large-Scale Algorithms
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
Liandeng Li (Tsinghua University; National Supercomputing Center, Wuxi); Teng Yu (University of St Andrews); Wenlai Zhao and Haohuan Fu (Tsinghua University; National Supercomputing Center, Wuxi); Chenyu Wang (University of St Andrews; National Supercomputing Center, Wuxi); Li Tan (Beijing Technology and Business University); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and John Thomson (University of St Andrews)
Abstract
pdf
TriCore: Parallel Triangle Counting on GPUs
Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Abstract
pdf
Distributed-Memory Hierarchical Compression of Dense SPD Matrices
Best Student Paper Finalists
Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)
Abstract
pdf
Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass
Performance Optimization Studies
Many-Core Graph Workload Analysis
Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)
Abstract
pdf
Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading
Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)
Abstract
pdf
Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations
Preeti Malakar (Indian Institute of Technology Kanpur); Todd Munson, Christopher Knight, and Venkatram Vishwanath (Argonne National Laboratory); and Michael E. Papka (Argonne National Laboratory, Northern Illinois University)
Abstract
pdf
Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass
Deep Learning
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)
Abstract
pdf
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Mr Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)
Abstract
pdf
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)
Abstract
pdf

Data Management
Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass
File Systems: Data Movement and Provenance
Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)
Abstract
pdf
Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows
Pradeep Subedi, Philip Davis, and Shaohua Duan (Rutgers University); Scott Klasky (Oak Ridge National Laboratory); Hemanth Kolla (Sandia National Laboratories); and Manish Parashar (Rutgers University)
Abstract
pdf
A Year in the Life of a Parallel File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Shane Snyder (Argonne National Laboratory), Teng Wang and Suren Byna (Lawrence Berkeley National Laboratory), Philip Carns (Argonne National Laboratory), and Nicholas J. Wright (Lawrence Berkeley National Laboratory)
Abstract
pdf

Deep Learning
Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass
Large-Scale Algorithms
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
Liandeng Li (Tsinghua University; National Supercomputing Center, Wuxi); Teng Yu (University of St Andrews); Wenlai Zhao and Haohuan Fu (Tsinghua University; National Supercomputing Center, Wuxi); Chenyu Wang (University of St Andrews; National Supercomputing Center, Wuxi); Li Tan (Beijing Technology and Business University); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and John Thomson (University of St Andrews)
Abstract
pdf
TriCore: Parallel Triangle Counting on GPUs
Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Abstract
pdf
Distributed-Memory Hierarchical Compression of Dense SPD Matrices
Best Student Paper Finalists
Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)
Abstract
pdf
Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass
Deep Learning
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)
Abstract
pdf
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Mr Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)
Abstract
pdf
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)
Abstract
pdf

File Systems
Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass
Data and Storage
SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition
Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)
Abstract
pdf
BESPOKV: Application Tailored Scale-Out Key-Value Stores
Ali Anwar (IBM), Yue Cheng (George Mason University), Hai Huang (IBM), Jingoo Han (Virginia Tech), Hyogi Sim (Oak Ridge National Laboratory), Dongyoon Lee (Virginia Tech), Fred Douglis (Perspecta Labs), and Ali R. Butt (Virginia Tech)
Abstract
pdf
Scaling Embedded In Situ Indexing with DeltaFS
Qing Zheng, Charles D. Cranor, Danhao Guo, Gregory R. Ganger, George Amvrosiadis, and Garth A. Gibson (Carnegie Mellon University) and Bradley W. Settlemyer, Gary Grider, and Fan Guo (Los Alamos National Laboratory)
Abstract
pdf
Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass
File Systems: Data Movement and Provenance
Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)
Abstract
pdf
Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows
Pradeep Subedi, Philip Davis, and Shaohua Duan (Rutgers University); Scott Klasky (Oak Ridge National Laboratory); Hemanth Kolla (Sandia National Laboratories); and Manish Parashar (Rutgers University)
Abstract
pdf
A Year in the Life of a Parallel File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Shane Snyder (Argonne National Laboratory), Teng Wang and Suren Byna (Lawrence Berkeley National Laboratory), Philip Carns (Argonne National Laboratory), and Nicholas J. Wright (Lawrence Berkeley National Laboratory)
Abstract
pdf

Floating Point
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Abstract
pdf
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
Abstract
pdf
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)
Abstract
pdf

GPUs
Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass
Resilience
GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan
Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)
Abstract
pdf
FlipTracker: Understanding Natural Error Resilience in HPC Applications
Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)
Abstract
pdf
Doomsday: Predicting Which Node Will Fail When on Supercomputers
Best Student Paper Finalists
Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)
Abstract
pdf
Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass
Non-Volatile Memory
Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs
Kai Wu, Jie Ren, and Dong Li (University of California, Merced)
Abstract
pdf
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access
Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)
Abstract
pdf
Siena: Exploring the Design Space of Heterogeneous Memory Systems
Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)
Abstract
pdf
Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass
Resilience III: GPUs
Optimizing Software-Directed Instruction Replication for GPU Error Detection
Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)
Abstract
pdf
Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs
Jieyang Chen, Hongbo Li, Sihuan Li, and Xin Liang (University of California, Riverside); Panruo Wu (University of Houston); Dingwen Tao (University of Alabama); Kaiming Ouyang, Yuanlai Liu, and Kai Zhao (University of California, Riverside); Qiang Guan (Kent State University); and Zizhong Chen (University of California, Riverside)
Abstract
pdf
PRISM: Predicting Resilience of GPU Applications Using Statistical Methods
Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)
Abstract
pdf

Graph Algorithms
Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass
Algorithms on Sparse Data
HiCOO: Hierarchical Storage of Sparse Tensors
Best Student Paper Finalists
Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)
Abstract
pdf
Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures
Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)
Abstract
pdf
PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution
Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Applications, Graph Algorithms, Security, Tech Program Reg Pass
Graph Algorithms and Systems
iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees
Yuede Ji (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Abstract
pdf
Adaptive Anonymization of Data with b-Edge Covers
Arif Khan (Pacific Northwest National Laboratory), Krzysztof Choromanski (Google LLC), Alex Pothen and S M Ferdous (Purdue University), and Mahantesh Halappanavar and Antonino Tumeo (Pacific Northwest National Laboratory)
Abstract
pdf
faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU
Martin Winter and Daniel Mlakar (Graz University of Technology); Rhaleb Zayer and Hans-Peter Seidel (Max Planck Institute for Informatics); and Markus Steinberger (Graz University of Technology, Max Planck Institute for Informatics)
Abstract
pdf

I/O
Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass
Data and Storage
SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition
Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)
Abstract
pdf
BESPOKV: Application Tailored Scale-Out Key-Value Stores
Ali Anwar (IBM), Yue Cheng (George Mason University), Hai Huang (IBM), Jingoo Han (Virginia Tech), Hyogi Sim (Oak Ridge National Laboratory), Dongyoon Lee (Virginia Tech), Fred Douglis (Perspecta Labs), and Ali R. Butt (Virginia Tech)
Abstract
pdf
Scaling Embedded In Situ Indexing with DeltaFS
Qing Zheng, Charles D. Cranor, Danhao Guo, Gregory R. Ganger, George Amvrosiadis, and Garth A. Gibson (Carnegie Mellon University) and Bradley W. Settlemyer, Gary Grider, and Fan Guo (Los Alamos National Laboratory)
Abstract
pdf

Linear Algebra
Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass
Algorithms on Sparse Data
HiCOO: Hierarchical Storage of Sparse Tensors
Best Student Paper Finalists
Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)
Abstract
pdf
Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures
Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)
Abstract
pdf
PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution
Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass
Programming Systems Tools
Dynamic Data Race Detection for OpenMP Programs
Yizi Gu and John Mellor-Crummey (Rice University)
Abstract
pdf
ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism
Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)
Abstract
pdf
Detecting MPI Usage Anomalies via Partial Program Symbolic Execution
Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)
Abstract
pdf
Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass
Resilience III: GPUs
Optimizing Software-Directed Instruction Replication for GPU Error Detection
Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)
Abstract
pdf
Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs
Jieyang Chen, Hongbo Li, Sihuan Li, and Xin Liang (University of California, Riverside); Panruo Wu (University of Houston); Dingwen Tao (University of Alabama); Kaiming Ouyang, Yuanlai Liu, and Kai Zhao (University of California, Riverside); Qiang Guan (Kent State University); and Zizhong Chen (University of California, Riverside)
Abstract
pdf
PRISM: Predicting Resilience of GPU Applications Using Statistical Methods
Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)
Abstract
pdf

Machine Learning
Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass
Algorithms on Sparse Data
HiCOO: Hierarchical Storage of Sparse Tensors
Best Student Paper Finalists
Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)
Abstract
pdf
Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures
Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)
Abstract
pdf
PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution
Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass
Deep Learning
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)
Abstract
pdf
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Mr Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)
Abstract
pdf
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)
Abstract
pdf

Memory
Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass
Non-Volatile Memory
Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs
Kai Wu, Jie Ren, and Dong Li (University of California, Merced)
Abstract
pdf
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access
Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)
Abstract
pdf
Siena: Exploring the Design Space of Heterogeneous Memory Systems
Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)
Abstract
pdf
Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass
Task-Based Programming
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)
Abstract
pdf
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)
Abstract
pdf
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)
Abstract
pdf
Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass
Programming Systems Tools
Dynamic Data Race Detection for OpenMP Programs
Yizi Gu and John Mellor-Crummey (Rice University)
Abstract
pdf
ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism
Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)
Abstract
pdf
Detecting MPI Usage Anomalies via Partial Program Symbolic Execution
Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)
Abstract
pdf

MPI
Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass
MPI Optimization and Characterization
Cooperative Rendezvous Protocols for Improved Performance and Overlap
Best Student Paper Finalists
S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)
Abstract
pdf
Framework for Scalable Intra-Node Collective Operations Using Shared Memory
Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)
Abstract
pdf
Characterization of MPI Usage on a Production Supercomputer
Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)
Abstract
pdf
Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass
Programming Systems Tools
Dynamic Data Race Detection for OpenMP Programs
Yizi Gu and John Mellor-Crummey (Rice University)
Abstract
pdf
ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism
Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)
Abstract
pdf
Detecting MPI Usage Anomalies via Partial Program Symbolic Execution
Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)
Abstract
pdf

NVRAM
Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass
Non-Volatile Memory
Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs
Kai Wu, Jie Ren, and Dong Li (University of California, Merced)
Abstract
pdf
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access
Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)
Abstract
pdf
Siena: Exploring the Design Space of Heterogeneous Memory Systems
Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)
Abstract
pdf

Networks
Paper · Architectures, Data Analytics, Networks, Tech Program Reg Pass
Next-Generation Networking
Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage
Best Paper Finalists
Matthias A. Blumrich, Nan Jiang, and Larry R. Dennison (Nvidia Corporation)
Abstract
pdf
Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences
Qiao Xiang (Yale University); J. Jensen Zhang, X. Tony Wang, and Y. Jace Liu (Tongji University); Chin Guok (Lawrence Berkeley National Laboratory); Franck Le (IBM); John MacAuley (Lawrence Berkeley National Laboratory); Harvey Newman (California Institute of Technology); and Y. Richard Yang (Yale University)
Abstract
pdf
Light-Weight Protocols for Wire-Speed Ordering
Hans Eberle and Larry Dennison (Nvidia Corporation)
Abstract
pdf
Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass
Large-Scale Algorithms
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
Liandeng Li (Tsinghua University; National Supercomputing Center, Wuxi); Teng Yu (University of St Andrews); Wenlai Zhao and Haohuan Fu (Tsinghua University; National Supercomputing Center, Wuxi); Chenyu Wang (University of St Andrews; National Supercomputing Center, Wuxi); Li Tan (Beijing Technology and Business University); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and John Thomson (University of St Andrews)
Abstract
pdf
TriCore: Parallel Triangle Counting on GPUs
Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Abstract
pdf
Distributed-Memory Hierarchical Compression of Dense SPD Matrices
Best Student Paper Finalists
Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)
Abstract
pdf
Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass
Resource Management and Interference
RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management
Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)
Abstract
pdf
Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters
Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing
Best Student Paper Finalists
Staci A. Smith, Clara E. Cromey, and David K. Lowenthal (University of Arizona); Jens Domke (Tokyo Institute of Technology); and Nikhil Jain, Jayaraman J. Thiagarajan, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass
MPI Optimization and Characterization
Cooperative Rendezvous Protocols for Improved Performance and Overlap
Best Student Paper Finalists
S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)
Abstract
pdf
Framework for Scalable Intra-Node Collective Operations Using Shared Memory
Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)
Abstract
pdf
Characterization of MPI Usage on a Production Supercomputer
Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)
Abstract
pdf
Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass
Task-Based Programming
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)
Abstract
pdf
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)
Abstract
pdf
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)
Abstract
pdf
Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass
Large Scale System Deployments
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
Sudharshan S. Vazhkudai (Oak Ridge National Laboratory); Bronis R. de Supinski (Lawrence Livermore National Laboratory); Arthur S. Bland and Al Geist (Oak Ridge National Laboratory); James Sexton and Jim Kahle (IBM); Christopher J. Zimmer, Scott Atchley, Sarp H. Oral, Don E. Maxwell, and Veronica G. Vergara Larrea (Oak Ridge National Laboratory); Adam Bertsch and Robin Goldstone (Lawrence Livermore National Laboratory); Wayne Joubert (Oak Ridge National Laboratory); Chris Chambreau (Lawrence Livermore National Laboratory); David Appelhans and Robert Blackmore (IBM); Ben Casses (Lawrence Livermore National Laboratory); George Chochia and Gene Davison (IBM); Matthew A. Ezell (Oak Ridge National Laboratory); Tom Gooding (IBM); Elsa Gonsiorowski (Lawrence Livermore National Laboratory); Leopold Grinberg, Bill Hanson, and Bill Hartner (IBM); Ian Karlin and Matthew L. Leininger (Lawrence Livermore National Laboratory); Dustin Leverman (Oak Ridge National Laboratory); Chris Marroquin (IBM); Adam Moody (Lawrence Livermore National Laboratory); Martin Ohmacht (IBM); Ramesh Pankajakshan (Lawrence Livermore National Laboratory); Fernando Pizzano (IBM); James H. Rogers (Oak Ridge National Laboratory); Bryan Rosenburg (IBM); Drew Schmidt, Mallikarjun Shankar, and Feiyi Wang (Oak Ridge National Laboratory); Py Watson (Lawrence Livermore National Laboratory); Bob Walkup (IBM); Lance D. Weems (Lawrence Livermore National Laboratory); and Junqi Yin (Oak Ridge National Laboratory)
Abstract
pdf
Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience
Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)
Abstract
pdf
Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA
Kazuhiko Komatsu (Tohoku University); Shintaro Momose, Yoko Isobe, Osamu Watanabe, and Akihiro Musa (Tohoku University, NEC Corporation); Mitsuo Yokokawa (Kobe University, NEC Corporation); Toshikazu Aoyama (NEC Corporation); and Masayuki Sato and Hiroaki Kobayashi (Tohoku University)
Abstract
pdf
Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass
Resilience III: GPUs
Optimizing Software-Directed Instruction Replication for GPU Error Detection
Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)
Abstract
pdf
Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs
Jieyang Chen, Hongbo Li, Sihuan Li, and Xin Liang (University of California, Riverside); Panruo Wu (University of Houston); Dingwen Tao (University of Alabama); Kaiming Ouyang, Yuanlai Liu, and Kai Zhao (University of California, Riverside); Qiang Guan (Kent State University); and Zizhong Chen (University of California, Riverside)
Abstract
pdf
PRISM: Predicting Resilience of GPU Applications Using Statistical Methods
Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)
Abstract
pdf
Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass
File Systems: Data Movement and Provenance
Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)
Abstract
pdf
Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows
Pradeep Subedi, Philip Davis, and Shaohua Duan (Rutgers University); Scott Klasky (Oak Ridge National Laboratory); Hemanth Kolla (Sandia National Laboratories); and Manish Parashar (Rutgers University)
Abstract
pdf
A Year in the Life of a Parallel File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Shane Snyder (Argonne National Laboratory), Teng Wang and Suren Byna (Lawrence Berkeley National Laboratory), Philip Carns (Argonne National Laboratory), and Nicholas J. Wright (Lawrence Berkeley National Laboratory)
Abstract
pdf

OpenMP
Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass
Performance and Energy Analysis
A Parallelism Profiler with What-If Analyses for OpenMP Programs
Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)
Abstract
pdf
Energy Efficiency Modeling of Parallel Applications
Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)
Abstract
pdf
HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor
John D. McCalpin (University of Texas, Texas Advanced Computing Center)
Abstract
pdf
Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass
Programming Systems Tools
Dynamic Data Race Detection for OpenMP Programs
Yizi Gu and John Mellor-Crummey (Rice University)
Abstract
pdf
ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism
Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)
Abstract
pdf
Detecting MPI Usage Anomalies via Partial Program Symbolic Execution
Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)
Abstract
pdf

Parallel Programming Languages, Libraries, and Models
Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass
Task-Based Programming
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)
Abstract
pdf
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)
Abstract
pdf
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)
Abstract
pdf

Performance
Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass
Performance and Energy Analysis
A Parallelism Profiler with What-If Analyses for OpenMP Programs
Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)
Abstract
pdf
Energy Efficiency Modeling of Parallel Applications
Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)
Abstract
pdf
HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor
John D. McCalpin (University of Texas, Texas Advanced Computing Center)
Abstract
pdf
Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass
Performance Optimization Studies
Many-Core Graph Workload Analysis
Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)
Abstract
pdf
Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading
Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)
Abstract
pdf
Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations
Preeti Malakar (Indian Institute of Technology Kanpur); Todd Munson, Christopher Knight, and Venkatram Vishwanath (Argonne National Laboratory); and Michael E. Papka (Argonne National Laboratory, Northern Illinois University)
Abstract
pdf
Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass
MPI Optimization and Characterization
Cooperative Rendezvous Protocols for Improved Performance and Overlap
Best Student Paper Finalists
S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)
Abstract
pdf
Framework for Scalable Intra-Node Collective Operations Using Shared Memory
Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)
Abstract
pdf
Characterization of MPI Usage on a Production Supercomputer
Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)
Abstract
pdf
Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass
Non-Volatile Memory
Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs
Kai Wu, Jie Ren, and Dong Li (University of California, Merced)
Abstract
pdf
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access
Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)
Abstract
pdf
Siena: Exploring the Design Space of Heterogeneous Memory Systems
Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)
Abstract
pdf
Paper · Performance, Resiliency, Tools, Tech Program Reg Pass
Resilience II
Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo
Scott Levy and Kurt B. Ferreira (Sandia National Laboratories), Nathan DeBardeleben (Los Alamos National Laboratory), Taniya Siddiqua and Vilas Sridharan (Advanced Micro Devices Inc), and Elisabeth Baseman (Los Alamos National Laboratory)
Abstract
pdf
Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities
Zaeem Hussain, Taieb Znati, and Rami Melhem (University of Pittsburgh)
Abstract
pdf
Evaluating and Accelerating High-Fidelity Error Injection for HPC
Chun-Kai Chang, Sangkug Lym, and Nicholas Kelly (University of Texas); Michael B. Sullivan (Nvidia Corporation); and Mattan Erez (University of Texas)
Abstract
pdf
Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass
Large Scale System Deployments
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
Sudharshan S. Vazhkudai (Oak Ridge National Laboratory); Bronis R. de Supinski (Lawrence Livermore National Laboratory); Arthur S. Bland and Al Geist (Oak Ridge National Laboratory); James Sexton and Jim Kahle (IBM); Christopher J. Zimmer, Scott Atchley, Sarp H. Oral, Don E. Maxwell, and Veronica G. Vergara Larrea (Oak Ridge National Laboratory); Adam Bertsch and Robin Goldstone (Lawrence Livermore National Laboratory); Wayne Joubert (Oak Ridge National Laboratory); Chris Chambreau (Lawrence Livermore National Laboratory); David Appelhans and Robert Blackmore (IBM); Ben Casses (Lawrence Livermore National Laboratory); George Chochia and Gene Davison (IBM); Matthew A. Ezell (Oak Ridge National Laboratory); Tom Gooding (IBM); Elsa Gonsiorowski (Lawrence Livermore National Laboratory); Leopold Grinberg, Bill Hanson, and Bill Hartner (IBM); Ian Karlin and Matthew L. Leininger (Lawrence Livermore National Laboratory); Dustin Leverman (Oak Ridge National Laboratory); Chris Marroquin (IBM); Adam Moody (Lawrence Livermore National Laboratory); Martin Ohmacht (IBM); Ramesh Pankajakshan (Lawrence Livermore National Laboratory); Fernando Pizzano (IBM); James H. Rogers (Oak Ridge National Laboratory); Bryan Rosenburg (IBM); Drew Schmidt, Mallikarjun Shankar, and Feiyi Wang (Oak Ridge National Laboratory); Py Watson (Lawrence Livermore National Laboratory); Bob Walkup (IBM); Lance D. Weems (Lawrence Livermore National Laboratory); and Junqi Yin (Oak Ridge National Laboratory)
Abstract
pdf
Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience
Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)
Abstract
pdf
Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA
Kazuhiko Komatsu (Tohoku University); Shintaro Momose, Yoko Isobe, Osamu Watanabe, and Akihiro Musa (Tohoku University, NEC Corporation); Mitsuo Yokokawa (Kobe University, NEC Corporation); Toshikazu Aoyama (NEC Corporation); and Masayuki Sato and Hiroaki Kobayashi (Tohoku University)
Abstract
pdf
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Abstract
pdf
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed Up Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
Abstract
pdf
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)
Abstract
pdf

Power
Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass
Performance and Energy Analysis
A Parallelism Profiler with What-If Analyses for OpenMP Programs
Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)
Abstract
pdf
Energy Efficiency Modeling of Parallel Applications
Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)
Abstract
pdf
HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor
John D. McCalpin (University of Texas, Texas Advanced Computing Center)
Abstract
pdf
Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass
Task-Based Programming
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)
Abstract
pdf
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)
Abstract
pdf
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)
Abstract
pdf

Precision
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Abstract
pdf
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed Up Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
Abstract
pdf
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)
Abstract
pdf

Programming Systems
Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass
Performance Optimization Studies
Many-Core Graph Workload Analysis
Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)
Abstract
pdf
Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading
Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)
Abstract
pdf
Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations
Preeti Malakar (Indian Institute of Technology Kanpur); Todd Munson, Christopher Knight, and Venkatram Vishwanath (Argonne National Laboratory); and Michael E. Papka (Argonne National Laboratory, Northern Illinois University)
Abstract
pdf
Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass
MPI Optimization and Characterization
Cooperative Rendezvous Protocols for Improved Performance and Overlap
Best Student Paper Finalists
S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)
Abstract
pdf
Framework for Scalable Intra-Node Collective Operations Using Shared Memory
Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)
Abstract
pdf
Characterization of MPI Usage on a Production Supercomputer
Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)
Abstract
pdf
Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass
Task-Based Programming
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)
Abstract
pdf
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)
Abstract
pdf
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)
Abstract
pdf
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Abstract
pdf
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed Up Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
Abstract
pdf
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass
Programming Systems Tools
Dynamic Data Race Detection for OpenMP Programs
Yizi Gu and John Mellor-Crummey (Rice University)
Abstract
pdf
ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism
Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)
Abstract
pdf
Detecting MPI Usage Anomalies via Partial Program Symbolic Execution
Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)
Abstract
pdf
Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass
Deep Learning
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)
Abstract
pdf
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Mr Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)
Abstract
pdf
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)
Abstract
pdf

Resiliency
Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass
Resilience
GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan
Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)
Abstract
pdf
FlipTracker: Understanding Natural Error Resilience in HPC Applications
Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)
Abstract
pdf
Doomsday: Predicting Which Node Will Fail When on Supercomputers
Best Student Paper Finalists
Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)
Abstract
pdf
Paper · Performance, Resiliency, Tools, Tech Program Reg Pass
Resilience II
Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo
Scott Levy and Kurt B. Ferreira (Sandia National Laboratories), Nathan DeBardeleben (Los Alamos National Laboratory), Taniya Siddiqua and Vilas Sridharan (Advanced Micro Devices Inc), and Elisabeth Baseman (Los Alamos National Laboratory)
Abstract
pdf
Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities
Zaeem Hussain, Taieb Znati, and Rami Melhem (University of Pittsburgh)
Abstract
pdf
Evaluating and Accelerating High-Fidelity Error Injection for HPC
Chun-Kai Chang, Sangkug Lym, and Nicholas Kelly (University of Texas); Michael B. Sullivan (Nvidia Corporation); and Mattan Erez (University of Texas)
Abstract
pdf
Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass
Resilience III: GPUs
Optimizing Software-Directed Instruction Replication for GPU Error Detection
Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)
Abstract
pdf
Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs
Jieyang Chen, Hongbo Li, Sihuan Li, and Xin Liang (University of California, Riverside); Panruo Wu (University of Houston); Dingwen Tao (University of Alabama); Kaiming Ouyang, Yuanlai Liu, and Kai Zhao (University of California, Riverside); Qiang Guan (Kent State University); and Zizhong Chen (University of California, Riverside)
Abstract
pdf
PRISM: Predicting Resilience of GPU Applications Using Statistical Methods
Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)
Abstract
pdf

Resource Management
Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass
Resource Management and Interference
RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management
Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)
Abstract
pdf
Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters
Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing
Best Student Paper Finalists
Staci A. Smith, Clara E. Cromey, and David K. Lowenthal (University of Arizona); Jens Domke (Tokyo Institute of Technology); and Nikhil Jain, Jayaraman J. Thiagarajan, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Clouds and Distributed Computing, Resource Management, Scheduling, Tech Program Reg Pass
Clouds and Distributed Computing
A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments
Georgios Andreadis (Delft University of Technology, Vrije University Amsterdam); Laurens Versluis (Vrije University Amsterdam); Fabian Mastenbroek (Delft University of Technology); and Alexandru Iosup (Vrije University Amsterdam, Delft University of Technology)
Abstract
pdf
Dynamically Negotiating Capacity Between On-Demand and Batch Clusters
Feng Liu (University of Minnesota), Kate Keahey (Argonne National Laboratory), Pierre Riteau (University of Chicago), and Jon Weissman (University of Minnesota)
Abstract
pdf
A Lightweight Model for Right-Sizing Master-Worker Applications
Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain (University of Notre Dame)
Abstract
pdf

Scheduling
Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass
Resource Management and Interference
RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management
Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)
Abstract
pdf
Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters
Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing
Best Student Paper Finalists
Staci A. Smith, Clara E. Cromey, and David K. Lowenthal (University of Arizona); Jens Domke (Tokyo Institute of Technology); and Nikhil Jain, Jayaraman J. Thiagarajan, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass
Task-Based Programming
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)
Abstract
pdf
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)
Abstract
pdf
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)
Abstract
pdf
Paper · Clouds and Distributed Computing, Resource Management, Scheduling, Tech Program Reg Pass
Clouds and Distributed Computing
A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments
Georgios Andreadis (Delft University of Technology, Vrije University Amsterdam); Laurens Versluis (Vrije University Amsterdam); Fabian Mastenbroek (Delft University of Technology); and Alexandru Iosup (Vrije University Amsterdam, Delft University of Technology)
Abstract
pdf
Dynamically Negotiating Capacity Between On-Demand and Batch Clusters
Feng Liu (University of Minnesota), Kate Keahey (Argonne National Laboratory), Pierre Riteau (University of Chicago), and Jon Weissman (University of Minnesota)
Abstract
pdf
A Lightweight Model for Right-Sizing Master-Worker Applications
Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain (University of Notre Dame)
Abstract
pdf

Scientific Computing
Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass
Biology Applications
Extreme Scale De Novo Metagenome Assembly
Best Paper Finalists
Evangelos Georganas (Intel Corporation) and Rob Egan, Steven Hofmeyr, Eugene Goltsman, Bill Arndt, Andrew Tritt, Aydin Buluc, Leonid Oliker, and Katherine Yelick (Lawrence Berkeley National Laboratory)
Abstract
pdf
Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting
Tony C. Pan (Georgia Institute of Technology, School of Computational Science and Engineering); Sanchit Misra (Intel Corporation, Parallel Computing Lab); and Srinivas Aluru (Georgia Institute of Technology, School of Computational Science and Engineering)
Abstract
pdf
Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight
Xiaohui Duan, Ping Gao, Tingjian Zhang, Meng Zhang, and Weiguo Liu (Shandong University); Wusheng Zhang, Wei Xue, Haohuan Fu, Lin Gan, and Dexun Chen (Tsinghua University); Xiangxu Meng (Shandong University); and Guangwen Yang (Tsinghua University)
Abstract
pdf
Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass
Large-Scale Algorithms
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
Liandeng Li (Tsinghua University; National Supercomputing Center, Wuxi); Teng Yu (University of St Andrews); Wenlai Zhao and Haohuan Fu (Tsinghua University; National Supercomputing Center, Wuxi); Chenyu Wang (University of St Andrews; National Supercomputing Center, Wuxi); Li Tan (Beijing Technology and Business University); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and John Thomson (University of St Andrews)
Abstract
pdf
TriCore: Parallel Triangle Counting on GPUs
Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Abstract
pdf
Distributed-Memory Hierarchical Compression of Dense SPD Matrices
Best Student Paper Finalists
Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)
Abstract
pdf
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Physics and Tensor Applications
Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight
Bingwei Chen, Haohuan Fu, Yanwen Wei, and Conghui He (Tsinghua University; National Supercomputing Center, Wuxi); Wenqiang Zhang (University of Science and Technology of China); Yuxuan Li (Tsinghua University; National Supercomputing Center, Wuxi); Wubin Wan and Wei Zhang (National Supercomputing Center, Wuxi); Lin Gan (Tsinghua University; National Supercomputing Center, Wuxi); Wei Zhang and Zhenguo Zhang (Southern University of Science and Technology, China); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and Xiaofei Chen (Southern University of Science and Technology, China)
Abstract
pdf
Accelerating Quantum Chemistry with Vectorized and Batched Integrals
Hua Huang and Edmond Chow (Georgia Institute of Technology)
Abstract
pdf
High-Performance Dense Tucker Decomposition on GPU Clusters
Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)
Abstract
pdf
Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass
Large Scale System Deployments
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
Sudharshan S. Vazhkudai (Oak Ridge National Laboratory); Bronis R. de Supinski (Lawrence Livermore National Laboratory); Arthur S. Bland and Al Geist (Oak Ridge National Laboratory); James Sexton and Jim Kahle (IBM); Christopher J. Zimmer, Scott Atchley, Sarp H. Oral, Don E. Maxwell, and Veronica G. Vergara Larrea (Oak Ridge National Laboratory); Adam Bertsch and Robin Goldstone (Lawrence Livermore National Laboratory); Wayne Joubert (Oak Ridge National Laboratory); Chris Chambreau (Lawrence Livermore National Laboratory); David Appelhans and Robert Blackmore (IBM); Ben Casses (Lawrence Livermore National Laboratory); George Chochia and Gene Davison (IBM); Matthew A. Ezell (Oak Ridge National Laboratory); Tom Gooding (IBM); Elsa Gonsiorowski (Lawrence Livermore National Laboratory); Leopold Grinberg, Bill Hanson, and Bill Hartner (IBM); Ian Karlin and Matthew L. Leininger (Lawrence Livermore National Laboratory); Dustin Leverman (Oak Ridge National Laboratory); Chris Marroquin (IBM); Adam Moody (Lawrence Livermore National Laboratory); Martin Ohmacht (IBM); Ramesh Pankajakshan (Lawrence Livermore National Laboratory); Fernando Pizzano (IBM); James H. Rogers (Oak Ridge National Laboratory); Bryan Rosenburg (IBM); Drew Schmidt, Mallikarjun Shankar, and Feiyi Wang (Oak Ridge National Laboratory); Py Watson (Lawrence Livermore National Laboratory); Bob Walkup (IBM); Lance D. Weems (Lawrence Livermore National Laboratory); and Junqi Yin (Oak Ridge National Laboratory)
Abstract
pdf
Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience
Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)
Abstract
pdf
Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA
Kazuhiko Komatsu (Tohoku University); Shintaro Momose, Yoko Isobe, Osamu Watanabe, and Akihiro Musa (Tohoku University, NEC Corporation); Mitsuo Yokokawa (Kobe University, NEC Corporation); Toshikazu Aoyama (NEC Corporation); and Masayuki Sato and Hiroaki Kobayashi (Tohoku University)
Abstract
pdf
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Astrophysics Applications
Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows
Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)
Abstract
pdf
Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver
Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)
Abstract
pdf

Security
Paper · Applications, Graph Algorithms, Security, Tech Program Reg Pass
Graph Algorithms and Systems
iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees
Yuede Ji (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Abstract
pdf
Adaptive Anonymization of Data with b-Edge Covers
Arif Khan (Pacific Northwest National Laboratory), Krzysztof Choromanski (Google LLC), Alex Pothen and S M Ferdous (Purdue University), and Mahantesh Halappanavar and Antonino Tumeo (Pacific Northwest National Laboratory)
Abstract
pdf
faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU
Martin Winter and Daniel Mlakar (Graz University of Technology); Rhaleb Zayer and Hans-Peter Seidel (Max Planck Institute for Informatics); and Markus Steinberger (Graz University of Technology, Max Planck Institute for Informatics)
Abstract
pdf

Sparse Computation
Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass
Algorithms on Sparse Data
HiCOO: Hierarchical Storage of Sparse Tensors
Best Student Paper Finalists
Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)
Abstract
pdf
Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures
Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)
Abstract
pdf
PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution
Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)
Abstract
pdf

State of the Practice
Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass
Resilience
GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan
Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)
Abstract
pdf
FlipTracker: Understanding Natural Error Resilience in HPC Applications
Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)
Abstract
pdf
Doomsday: Predicting Which Node Will Fail When on Supercomputers
Best Student Paper Finalists
Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)
Abstract
pdf
Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass
Resource Management and Interference
RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management
Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)
Abstract
pdf
Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters
Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing
Best Student Paper Finalists
Staci A. Smith, Clara E. Cromey, and David K. Lowenthal (University of Arizona); Jens Domke (Tokyo Institute of Technology); and Nikhil Jain, Jayaraman J. Thiagarajan, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass
MPI Optimization and Characterization
Cooperative Rendezvous Protocols for Improved Performance and Overlap
Best Student Paper Finalists
S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)
Abstract
pdf
Framework for Scalable Intra-Node Collective Operations Using Shared Memory
Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)
Abstract
pdf
Characterization of MPI Usage on a Production Supercomputer
Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)
Abstract
pdf
Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass
Large Scale System Deployments
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
Sudharshan S. Vazhkudai (Oak Ridge National Laboratory); Bronis R. de Supinski (Lawrence Livermore National Laboratory); Arthur S. Bland and Al Geist (Oak Ridge National Laboratory); James Sexton and Jim Kahle (IBM); Christopher J. Zimmer, Scott Atchley, Sarp H. Oral, Don E. Maxwell, and Veronica G. Vergara Larrea (Oak Ridge National Laboratory); Adam Bertsch and Robin Goldstone (Lawrence Livermore National Laboratory); Wayne Joubert (Oak Ridge National Laboratory); Chris Chambreau (Lawrence Livermore National Laboratory); David Appelhans and Robert Blackmore (IBM); Ben Casses (Lawrence Livermore National Laboratory); George Chochia and Gene Davison (IBM); Matthew A. Ezell (Oak Ridge National Laboratory); Tom Gooding (IBM); Elsa Gonsiorowski (Lawrence Livermore National Laboratory); Leopold Grinberg, Bill Hanson, and Bill Hartner (IBM); Ian Karlin and Matthew L. Leininger (Lawrence Livermore National Laboratory); Dustin Leverman (Oak Ridge National Laboratory); Chris Marroquin (IBM); Adam Moody (Lawrence Livermore National Laboratory); Martin Ohmacht (IBM); Ramesh Pankajakshan (Lawrence Livermore National Laboratory); Fernando Pizzano (IBM); James H. Rogers (Oak Ridge National Laboratory); Bryan Rosenburg (IBM); Drew Schmidt, Mallikarjun Shankar, and Feiyi Wang (Oak Ridge National Laboratory); Py Watson (Lawrence Livermore National Laboratory); Bob Walkup (IBM); Lance D. Weems (Lawrence Livermore National Laboratory); and Junqi Yin (Oak Ridge National Laboratory)
Abstract
pdf
Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience
Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)
Abstract
pdf
Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA
Kazuhiko Komatsu (Tohoku University); Shintaro Momose, Yoko Isobe, Osamu Watanabe, and Akihiro Musa (Tohoku University, NEC Corporation); Mitsuo Yokokawa (Kobe University, NEC Corporation); Toshikazu Aoyama (NEC Corporation); and Masayuki Sato and Hiroaki Kobayashi (Tohoku University)
Abstract
pdf
Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass
File Systems: Data Movement and Provenance
Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)
Abstract
pdf
Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows
Pradeep Subedi, Philip Davis, and Shaohua Duan (Rutgers University); Scott Klasky (Oak Ridge National Laboratory); Hemanth Kolla (Sandia National Laboratories); and Manish Parashar (Rutgers University)
Abstract
pdf
A Year in the Life of a Parallel File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Shane Snyder (Argonne National Laboratory), Teng Wang and Suren Byna (Lawrence Berkeley National Laboratory), Philip Carns (Argonne National Laboratory), and Nicholas J. Wright (Lawrence Berkeley National Laboratory)
Abstract
pdf

Storage
Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass
Data and Storage
SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition
Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)
Abstract
pdf
BESPOKV: Application Tailored Scale-Out Key-Value Stores
Ali Anwar (IBM), Yue Cheng (George Mason University), Hai Huang (IBM), Jingoo Han (Virginia Tech), Hyogi Sim (Oak Ridge National Laboratory), Dongyoon Lee (Virginia Tech), Fred Douglis (Perspecta Labs), and Ali R. Butt (Virginia Tech)
Abstract
pdf
Scaling Embedded In Situ Indexing with DeltaFS
Qing Zheng, Charles D. Cranor, Danhao Guo, Gregory R. Ganger, George Amvrosiadis, and Garth A. Gibson (Carnegie Mellon University) and Bradley W. Settlemyer, Gary Grider, and Fan Guo (Los Alamos National Laboratory)
Abstract
pdf
Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass
Performance Optimization Studies
Many-Core Graph Workload Analysis
Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)
Abstract
pdf
Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading
Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)
Abstract
pdf
Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations
Preeti Malakar (Indian Institute of Technology Kanpur); Todd Munson, Christopher Knight, and Venkatram Vishwanath (Argonne National Laboratory); and Michael E. Papka (Argonne National Laboratory, Northern Illinois University)
Abstract
pdf
Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass
Deep Learning
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)
Abstract
pdf
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Mr Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)
Abstract
pdf
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)
Abstract
pdf

System Software
Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass
Resilience
GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan
Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)
Abstract
pdf
FlipTracker: Understanding Natural Error Resilience in HPC Applications
Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)
Abstract
pdf
Doomsday: Predicting Which Node Will Fail When on Supercomputers
Best Student Paper Finalists
Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)
Abstract
pdf
Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass
Resource Management and Interference
RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management
Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)
Abstract
pdf
Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters
Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing
Best Student Paper Finalists
Staci A. Smith, Clara E. Cromey, and David K. Lowenthal (University of Arizona); Jens Domke (Tokyo Institute of Technology); and Nikhil Jain, Jayaraman J. Thiagarajan, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass
Non-Volatile Memory
Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs
Kai Wu, Jie Ren, and Dong Li (University of California, Merced)
Abstract
pdf
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access
Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)
Abstract
pdf
Siena: Exploring the Design Space of Heterogeneous Memory Systems
Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)
Abstract
pdf
Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass
File Systems: Data Movement and Provenance
Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)
Abstract
pdf
Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows
Pradeep Subedi, Philip Davis, and Shaohua Duan (Rutgers University); Scott Klasky (Oak Ridge National Laboratory); Hemanth Kolla (Sandia National Laboratories); and Manish Parashar (Rutgers University)
Abstract
pdf
A Year in the Life of a Parallel File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Shane Snyder (Argonne National Laboratory), Teng Wang and Suren Byna (Lawrence Berkeley National Laboratory), Philip Carns (Argonne National Laboratory), and Nicholas J. Wright (Lawrence Berkeley National Laboratory)
Abstract
pdf

Tools
Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass
Performance and Energy Analysis
A Parallelism Profiler with What-If Analyses for OpenMP Programs
Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)
Abstract
pdf
Energy Efficiency Modeling of Parallel Applications
Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)
Abstract
pdf
HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor
John D. McCalpin (University of Texas, Texas Advanced Computing Center)
Abstract
pdf
Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass
Performance Optimization Studies
Many-Core Graph Workload Analysis
Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)
Abstract
pdf
Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading
Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)
Abstract
pdf
Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations
Preeti Malakar (Indian Institute of Technology Kanpur); Todd Munson, Christopher Knight, and Venkatram Vishwanath (Argonne National Laboratory); and Michael E. Papka (Argonne National Laboratory, Northern Illinois University)
Abstract
pdf
Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass
Non-Volatile Memory
Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs
Kai Wu, Jie Ren, and Dong Li (University of California, Merced)
Abstract
pdf
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access
Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)
Abstract
pdf
Siena: Exploring the Design Space of Heterogeneous Memory Systems
Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)
Abstract
pdf
Paper · Performance, Resiliency, Tools, Tech Program Reg Pass
Resilience II
Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo
Scott Levy and Kurt B. Ferreira (Sandia National Laboratories), Nathan DeBardeleben (Los Alamos National Laboratory), Taniya Siddiqua and Vilas Sridharan (Advanced Micro Devices Inc), and Elisabeth Baseman (Los Alamos National Laboratory)
Abstract
pdf
Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities
Zaeem Hussain, Taieb Znati, and Rami Melhem (University of Pittsburgh)
Abstract
pdf
Evaluating and Accelerating High-Fidelity Error Injection for HPC
Chun-Kai Chang, Sangkug Lym, and Nicholas Kelly (University of Texas); Michael B. Sullivan (Nvidia Corporation); and Mattan Erez (University of Texas)
Abstract
pdf
Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass
Large Scale System Deployments
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
Sudharshan S. Vazhkudai (Oak Ridge National Laboratory); Bronis R. de Supinski (Lawrence Livermore National Laboratory); Arthur S. Bland and Al Geist (Oak Ridge National Laboratory); James Sexton and Jim Kahle (IBM); Christopher J. Zimmer, Scott Atchley, Sarp H. Oral, Don E. Maxwell, and Veronica G. Vergara Larrea (Oak Ridge National Laboratory); Adam Bertsch and Robin Goldstone (Lawrence Livermore National Laboratory); Wayne Joubert (Oak Ridge National Laboratory); Chris Chambreau (Lawrence Livermore National Laboratory); David Appelhans and Robert Blackmore (IBM); Ben Casses (Lawrence Livermore National Laboratory); George Chochia and Gene Davison (IBM); Matthew A. Ezell (Oak Ridge National Laboratory); Tom Gooding (IBM); Elsa Gonsiorowski (Lawrence Livermore National Laboratory); Leopold Grinberg, Bill Hanson, and Bill Hartner (IBM); Ian Karlin and Matthew L. Leininger (Lawrence Livermore National Laboratory); Dustin Leverman (Oak Ridge National Laboratory); Chris Marroquin (IBM); Adam Moody (Lawrence Livermore National Laboratory); Martin Ohmacht (IBM); Ramesh Pankajakshan (Lawrence Livermore National Laboratory); Fernando Pizzano (IBM); James H. Rogers (Oak Ridge National Laboratory); Bryan Rosenburg (IBM); Drew Schmidt, Mallikarjun Shankar, and Feiyi Wang (Oak Ridge National Laboratory); Py Watson (Lawrence Livermore National Laboratory); Bob Walkup (IBM); Lance D. Weems (Lawrence Livermore National Laboratory); and Junqi Yin (Oak Ridge National Laboratory)
Abstract
pdf
Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience
Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)
Abstract
pdf
Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA
Kazuhiko Komatsu (Tohoku University); Shintaro Momose, Yoko Isobe, Osamu Watanabe, and Akihiro Musa (Tohoku University, NEC Corporation); Mitsuo Yokokawa (Kobe University, NEC Corporation); Toshikazu Aoyama (NEC Corporation); and Masayuki Sato and Hiroaki Kobayashi (Tohoku University)
Abstract
pdf
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Abstract
pdf
Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
Abstract
pdf
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass
Programming Systems Tools
Dynamic Data Race Detection for OpenMP Programs
Yizi Gu and John Mellor-Crummey (Rice University)
Abstract
pdf
ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism
Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)
Abstract
pdf
Detecting MPI Usage Anomalies via Partial Program Symbolic Execution
Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)
Abstract
pdf

Visualization
Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass
Large-Scale Algorithms
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
Liandeng Li (Tsinghua University; National Supercomputing Center, Wuxi); Teng Yu (University of St Andrews); Wenlai Zhao and Haohuan Fu (Tsinghua University; National Supercomputing Center, Wuxi); Chenyu Wang (University of St Andrews; National Supercomputing Center, Wuxi); Li Tan (Beijing Technology and Business University); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and John Thomson (University of St Andrews)
Abstract
pdf
TriCore: Parallel Triangle Counting on GPUs
Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Abstract
pdf
Distributed-Memory Hierarchical Compression of Dense SPD Matrices
Best Student Paper Finalists
Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)
Abstract
pdf
Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass
Performance Optimization Studies
Many-Core Graph Workload Analysis
Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)
Abstract
pdf
Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading
Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)
Abstract
pdf
Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations
Preeti Malakar (Indian Institute of Technology Kanpur); Todd Munson, Christopher Knight, and Venkatram Vishwanath (Argonne National Laboratory); and Michael E. Papka (Argonne National Laboratory, Northern Illinois University)
Abstract
pdf
Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass
Deep Learning
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)
Abstract
pdf
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Mr Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)
Abstract
pdf
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)
Abstract
pdf

Workflows
Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass
File Systems: Data Movement and Provenance
Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)
Abstract
pdf
Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows
Pradeep Subedi, Philip Davis, and Shaohua Duan (Rutgers University); Scott Klasky (Oak Ridge National Laboratory); Hemanth Kolla (Sandia National Laboratories); and Manish Parashar (Rutgers University)
Abstract
pdf
A Year in the Life of a Parallel File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Shane Snyder (Argonne National Laboratory), Teng Wang and Suren Byna (Lawrence Berkeley National Laboratory), Philip Carns (Argonne National Laboratory), and Nicholas J. Wright (Lawrence Berkeley National Laboratory)
Abstract
pdf

Tech Program Reg Pass
Paper · Architectures, Data Analytics, Networks, Tech Program Reg Pass
Next-Generation Networking
Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage
Best Paper Finalists
Matthias A. Blumrich, Nan Jiang, and Larry R. Dennison (Nvidia Corporation)
Abstract
pdf
Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences
Qiao Xiang (Yale University); J. Jensen Zhang, X. Tony Wang, and Y. Jace Liu (Tongji University); Chin Guok (Lawrence Berkeley National Laboratory); Franck Le (IBM); John MacAuley (Lawrence Berkeley National Laboratory); Harvey Newman (California Institute of Technology); and Y. Richard Yang (Yale University)
Abstract
pdf
Light-Weight Protocols for Wire-Speed Ordering
Hans Eberle and Larry Dennison (Nvidia Corporation)
Abstract
pdf
Paper · GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass
Resilience
GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan
Christopher Zimmer, Don Maxwell, Stephen McNally, Scott Atchley, and Sudharshan S. Vazhkudai (Oak Ridge National Laboratory)
Abstract
pdf
FlipTracker: Understanding Natural Error Resilience in HPC Applications
Luanzheng Guo and Dong Li (University of California, Merced); Ignacio Laguna (Lawrence Livermore National Laboratory); and Martin Schulz (Technical University Munich)
Abstract
pdf
Doomsday: Predicting Which Node Will Fail When on Supercomputers
Best Student Paper Finalists
Anwesha Das and Frank Mueller (North Carolina State University) and Paul Hargrove, Eric Roman, and Scott Baden (Lawrence Berkeley National Laboratory)
Abstract
pdf
Paper · Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass
Data and Storage
SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition
Yinghao Yu, Renfei Huang, Wei Wang, Jun Zhang, and Khaled Ben Letaief (Hong Kong University of Science and Technology)
Abstract
pdf
BESPOKV: Application Tailored Scale-Out Key-Value Stores
Ali Anwar (IBM), Yue Cheng (George Mason University), Hai Huang (IBM), Jingoo Han (Virginia Tech), Hyogi Sim (Oak Ridge National Laboratory), Dongyoon Lee (Virginia Tech), Fred Douglis (Perspecta Labs), and Ali R. Butt (Virginia Tech)
Abstract
pdf
Scaling Embedded In Situ Indexing with DeltaFS
Qing Zheng, Charles D. Cranor, Danhao Guo, Gregory R. Ganger, George Amvrosiadis, and Garth A. Gibson (Carnegie Mellon University) and Bradley W. Settlemyer, Gary Grider, and Fan Guo (Los Alamos National Laboratory)
Abstract
pdf
Paper · Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass
Biology Applications
Extreme Scale De Novo Metagenome Assembly
Best Paper Finalists
Evangelos Georganas (Intel Corporation) and Rob Egan, Steven Hofmeyr, Eugene Goltsman, Bill Arndt, Andrew Tritt, Aydin Buluc, Leonid Oliker, and Katherine Yelick (Lawrence Berkeley National Laboratory)
Abstract
pdf
Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting
Tony C. Pan (Georgia Institute of Technology, School of Computational Science and Engineering); Sanchit Misra (Intel Corporation, Parallel Computing Lab); and Srinivas Aluru (Georgia Institute of Technology, School of Computational Science and Engineering)
Abstract
pdf
Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight
Xiaohui Duan, Ping Gao, Tingjian Zhang, Meng Zhang, and Weiguo Liu (Shandong University); Wusheng Zhang, Wei Xue, Haohuan Fu, Lin Gan, and Dexun Chen (Tsinghua University); Xiangxu Meng (Shandong University); and Guangwen Yang (Tsinghua University)
Abstract
pdf
Paper · OpenMP, Performance, Power, Tools, Tech Program Reg Pass
Performance and Energy Analysis
A Parallelism Profiler with What-If Analyses for OpenMP Programs
Nader Boushehrinejadmoradi, Adarsh Yoga, and Santosh Nagarakatte (Rutgers University)
Abstract
pdf
Energy Efficiency Modeling of Parallel Applications
Mark Endrei, Chao Jin, Minh Ngoc Dinh, and David Abramson (University of Queensland); Heidi Poxon and Luiz DeRose (Cray Inc); and Bronis R. de Supinski (Lawrence Livermore National Laboratory)
Abstract
pdf
HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor
John D. McCalpin (University of Texas, Texas Advanced Computing Center)
Abstract
pdf
Paper · Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass
Large-Scale Algorithms
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
Liandeng Li (Tsinghua University; National Supercomputing Center, Wuxi); Teng Yu (University of St Andrews); Wenlai Zhao and Haohuan Fu (Tsinghua University; National Supercomputing Center, Wuxi); Chenyu Wang (University of St Andrews; National Supercomputing Center, Wuxi); Li Tan (Beijing Technology and Business University); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and John Thomson (University of St Andrews)
Abstract
pdf
TriCore: Parallel Triangle Counting on GPUs
Yang Hu (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Abstract
pdf
Distributed-Memory Hierarchical Compression of Dense SPD Matrices
Best Student Paper Finalists
Chenhan D. Yu (University of Texas), Severin Reiz (Technical University Munich), and George Biros (University of Texas)
Abstract
pdf
Paper · Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass
Resource Management and Interference
RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management
Maxime Martinasso, Miguel Gila, Mauro Bianco, Sadaf R. Alam, Colin McMurtrie, and Thomas C. Schulthess (Swiss National Supercomputing Centre)
Abstract
pdf
Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters
Samuel D. Pollard (University of Oregon) and Nikhil Jain, Stephen Herbein, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing
Best Student Paper Finalists
Staci A. Smith, Clara E. Cromey, and David K. Lowenthal (University of Arizona); Jens Domke (Tokyo Institute of Technology); and Nikhil Jain, Jayaraman J. Thiagarajan, and Abhinav Bhatele (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass
Algorithms on Sparse Data
HiCOO: Hierarchical Storage of Sparse Tensors
Best Student Paper Finalists
Jiajia Li, Jimeng Sun, and Richard Vuduc (Georgia Institute of Technology)
Abstract
pdf
Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures
Aryan Eftekhari (University of Lugano), Matthias Bollhöfer (Braunschweig University of Technology), and Olaf Schenk (University of Lugano)
Abstract
pdf
PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution
Tahsin Reza, Matei Ripeanu, and Nicolas Tripoul (University of British Columbia) and Geoffrey Sanders and Roger Pearce (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass
Performance Optimization Studies
Many-Core Graph Workload Analysis
Stijn Eyerman, Wim Heirman, Kristof Du Bois, Joshua B. Fryman, and Ibrahim Hur (Intel Corporation)
Abstract
pdf
Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading
Shintaro Iwasaki (University of Tokyo), Abdelhalim Amer (Argonne National Laboratory), Kenjiro Taura (University of Tokyo), and Pavan Balaji (Argonne National Laboratory)
Abstract
pdf
Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations
Preeti Malakar (Indian Institute of Technology Kanpur); Todd Munson, Christopher Knight, and Venkatram Vishwanath (Argonne National Laboratory); and Michael E. Papka (Argonne National Laboratory, Northern Illinois University)
Abstract
pdf
Paper · Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass
MPI Optimization and Characterization
Cooperative Rendezvous Protocols for Improved Performance and Overlap
Best Student Paper Finalists
S. Chakraborty, M. Bayatpour, J. Hashmi, H. Subramoni, and D. K. Panda (Ohio State University)
Abstract
pdf
Framework for Scalable Intra-Node Collective Operations Using Shared Memory
Surabhi Jain, Rashid Kaleem, Marc Gamell Balmana, Akhil Langer, Dmitry Durnov, Alexander Sannikov, and Maria Garzaran (Intel Corporation)
Abstract
pdf
Characterization of MPI Usage on a Production Supercomputer
Sudheer Chunduri, Scott Parker, Pavan Balaji, Kevin Harms, and Kalyan Kumaran (Argonne National Laboratory)
Abstract
pdf
Paper · GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass
Non-Volatile Memory
Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs
Kai Wu, Jie Ren, and Dong Li (University of California, Merced)
Abstract
pdf
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access
Pak Markthub (Tokyo Institute of Technology); Mehmet E. Belviranli, Seyong Lee, and Jeffrey S. Vetter (Oak Ridge National Laboratory); and Satoshi Matsuoka (RIKEN, Tokyo Institute of Technology)
Abstract
pdf
Siena: Exploring the Design Space of Heterogeneous Memory Systems
Ivy B. Peng and Jeffrey S. Vetter (Oak Ridge National Laboratory)
Abstract
pdf
Paper · Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass
Task-Based Programming
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Wonchan Lee (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), Michael Bauer and Sean Treichler (Nvidia Corporation), Todd Warszawski (Stanford University), Michael Garland (Nvidia Corporation), and Alex Aiken (Stanford University)
Abstract
pdf
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
Paul Caheny (Barcelona Supercomputing Center, Polytechnic University of Catalonia); Lluc Alvarez (Barcelona Supercomputing Center); Mateo Valero and Miquel Moretó (Barcelona Supercomputing Center, Polytechnic University of Catalonia); and Marc Casas (Barcelona Supercomputing Center)
Abstract
pdf
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Gökalp Demirci, Ivana Marincic, and Henry Hoffmann (University of Chicago)
Abstract
pdf
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Physics and Tensor Applications
Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight
Bingwei Chen, Haohuan Fu, Yanwen Wei, and Conghui He (Tsinghua University; National Supercomputing Center, Wuxi); Wenqiang Zhang (University of Science and Technology of China); Yuxuan Li (Tsinghua University; National Supercomputing Center, Wuxi); Wubin Wan and Wei Zhang (National Supercomputing Center, Wuxi); Lin Gan (Tsinghua University; National Supercomputing Center, Wuxi); Wei Zhang and Zhenguo Zhang (Southern University of Science and Technology, China); Guangwen Yang (Tsinghua University; National Supercomputing Center, Wuxi); and Xiaofei Chen (Southern University of Science and Technology, China)
Abstract
pdf
Accelerating Quantum Chemistry with Vectorized and Batched Integrals
Hua Huang and Edmond Chow (Georgia Institute of Technology)
Abstract
pdf
High-Performance Dense Tucker Decomposition on GPU Clusters
Jee Choi (IBM), Xing Liu (Intel Corporation), and Venkatesan Chakaravarthy (IBM)
Abstract
pdf
Paper · Clouds and Distributed Computing, Resource Management, Scheduling, Tech Program Reg Pass
Clouds and Distributed Computing
A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments
Georgios Andreadis (Delft University of Technology, Vrije University Amsterdam); Laurens Versluis (Vrije University Amsterdam); Fabian Mastenbroek (Delft University of Technology); and Alexandru Iosup (Vrije University Amsterdam, Delft University of Technology)
Abstract
pdf
Dynamically Negotiating Capacity Between On-Demand and Batch Clusters
Feng Liu (University of Minnesota), Kate Keahey (Argonne National Laboratory), Pierre Riteau (University of Chicago), and Jon Weissman (University of Minnesota)
Abstract
pdf
A Lightweight Model for Right-Sizing Master-Worker Applications
Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain (University of Notre Dame)
Abstract
pdf
Paper · Performance, Resiliency, Tools, Tech Program Reg Pass
Resilience II
Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo
Scott Levy and Kurt B. Ferreira (Sandia National Laboratories), Nathan DeBardeleben (Los Alamos National Laboratory), Taniya Siddiqua and Vilas Sridharan (Advanced Micro Devices Inc), and Elisabeth Baseman (Los Alamos National Laboratory)
Abstract
pdf
Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities
Zaeem Hussain, Taieb Znati, and Rami Melhem (University of Pittsburgh)
Abstract
pdf
Evaluating and Accelerating High-Fidelity Error Injection for HPC
Chun-Kai Chang, Sangkug Lym, and Nicholas Kelly (University of Texas); Michael B. Sullivan (Nvidia Corporation); and Mattan Erez (University of Texas)
Abstract
pdf
Paper · Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass
Large Scale System Deployments
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
Sudharshan S. Vazhkudai (Oak Ridge National Laboratory); Bronis R. de Supinski (Lawrence Livermore National Laboratory); Arthur S. Bland and Al Geist (Oak Ridge National Laboratory); James Sexton and Jim Kahle (IBM); Christopher J. Zimmer, Scott Atchley, Sarp H. Oral, Don E. Maxwell, and Veronica G. Vergara Larrea (Oak Ridge National Laboratory); Adam Bertsch and Robin Goldstone (Lawrence Livermore National Laboratory); Wayne Joubert (Oak Ridge National Laboratory); Chris Chambreau (Lawrence Livermore National Laboratory); David Appelhans and Robert Blackmore (IBM); Ben Casses (Lawrence Livermore National Laboratory); George Chochia and Gene Davison (IBM); Matthew A. Ezell (Oak Ridge National Laboratory); Tom Gooding (IBM); Elsa Gonsiorowski (Lawrence Livermore National Laboratory); Leopold Grinberg, Bill Hanson, and Bill Hartner (IBM); Ian Karlin and Matthew L. Leininger (Lawrence Livermore National Laboratory); Dustin Leverman (Oak Ridge National Laboratory); Chris Marroquin (IBM); Adam Moody (Lawrence Livermore National Laboratory); Martin Ohmacht (IBM); Ramesh Pankajakshan (Lawrence Livermore National Laboratory); Fernando Pizzano (IBM); James H. Rogers (Oak Ridge National Laboratory); Bryan Rosenburg (IBM); Drew Schmidt, Mallikarjun Shankar, and Feiyi Wang (Oak Ridge National Laboratory); Py Watson (Lawrence Livermore National Laboratory); Bob Walkup (IBM); Lance D. Weems (Lawrence Livermore National Laboratory); and Junqi Yin (Oak Ridge National Laboratory)
Abstract
pdf
Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience
Gregory H. Bauer, Brett Bode, Jeremy Enos, William T. Kramer, Scott Lathrop, Celso L. Mendes, and Roberto R. Sisneros (University of Illinois, National Center for Supercomputing Applications)
Abstract
pdf
Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA
Kazuhiko Komatsu (Tohoku University); Shintaro Momose, Yoko Isobe, Osamu Watanabe, and Akihiro Musa (Tohoku University, NEC Corporation); Mitsuo Yokokawa (Kobe University, NEC Corporation); Toshikazu Aoyama (NEC Corporation); and Masayuki Sato and Hiroaki Kobayashi (Tohoku University)
Abstract
pdf
Paper · Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass
Arithmetic and Optimization
Associative Instruction Reordering to Alleviate Register Pressure
Prashant Singh Rawat, Aravind Sukumaran-Rajam, and Atanas Rountev (Ohio State University); Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)); Louis-Noel Pouchet (Colorado State University); and P. Sadayappan (Ohio State University)
Abstract
pdf
Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed Up Mixed-Precision Iterative Refinement Solvers
Azzam Haidar (University of Tennessee, Innovative Computing Laboratory); Stan Tomov and Jack Dongarra (University of Tennessee); and Nicholas Higham (University of Manchester, School of Mathematics)
Abstract
pdf
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Harshitha Menon (Lawrence Livermore National Laboratory); Michael O. Lam (James Madison University, Lawrence Livermore National Laboratory); and Daniel Osei-Kuffuor, Markus Schordan, Scott Lloyd, Kathryn Mohror, and Jeffrey Hittinger (Lawrence Livermore National Laboratory)
Abstract
pdf
Paper · Applications, Graph Algorithms, Security, Tech Program Reg Pass
Graph Algorithms and Systems
iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees
Yuede Ji (George Washington University); Hang Liu (University of Massachusetts, Lowell); and H. Howie Huang (George Washington University)
Abstract
pdf
Adaptive Anonymization of Data with b-Edge Covers
Arif Khan (Pacific Northwest National Laboratory), Krzysztof Choromanski (Google LLC), Alex Pothen and S M Ferdous (Purdue University), and Mahantesh Halappanavar and Antonino Tumeo (Pacific Northwest National Laboratory)
Abstract
pdf
faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU
Martin Winter and Daniel Mlakar (Graz University of Technology); Rhaleb Zayer and Hans-Peter Seidel (Max Planck Institute for Informatics); and Markus Steinberger (Graz University of Technology, Max Planck Institute for Informatics)
Abstract
pdf
Paper · Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass
Programming Systems Tools
Dynamic Data Race Detection for OpenMP Programs
Yizi Gu and John Mellor-Crummey (Rice University)
Abstract
pdf
ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism
Kazem Cheshmi (University of Toronto), Shoaib Kamil (Adobe Research), Michelle Mills Strout (University of Arizona), and Maryam Mehri Dehnavi (University of Toronto)
Abstract
pdf
Detecting MPI Usage Anomalies via Partial Program Symbolic Execution
Fangke Ye, Jisheng Zhao, and Vivek Sarkar (Georgia Institute of Technology)
Abstract
pdf
Paper · Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass
Deep Learning
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Randall Pittman, Hui Guan, and Xipeng Shen (North Carolina State University) and Seung-Hwan Lim and Robert M. Patton (Oak Ridge National Laboratory)
Abstract
pdf
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya (Intel Corporation); Deborah Bard (National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory); Pete Mendygral (Cray Inc); Lawrence Meadows (Intel Corporation); James Arnemann (University of California, Berkeley); Lei Shao (Intel Corporation); Siyu He (Carnegie Mellon University); Tuomas Karna (Intel Corporation); Diana Moise (Cray Inc); Simon J. Pennycook (Intel Corporation); Kristyn Maschhoff (Cray Inc); Jason Sewall and Nalini Kumar (Intel Corporation); Shirley Ho (Lawrence Berkeley National Laboratory, Carnegie Mellon University); Michael F. Ringenburg (Cray Inc); Mr Prabhat (Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC)); and Victor Lee (Intel Corporation)
Abstract
pdf
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke (Intel Corporation)
Abstract
pdf
Paper · Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass
Resilience III: GPUs
Optimizing Software-Directed Instruction Replication for GPU Error Detection
Abdulrahman Mahmoud (University of Illinois) and Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen W. Keckler (Nvidia Corporation)
Abstract
pdf
Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs
Jieyang Chen, Hongbo Li, Sihuan Li, and Xin Liang (University of California, Riverside); Panruo Wu (University of Houston); Dingwen Tao (University of Alabama); Kaiming Ouyang, Yuanlai Liu, and Kai Zhao (University of California, Riverside); Qiang Guan (Kent State University); and Zizhong Chen (University of California, Riverside)
Abstract
pdf
PRISM: Predicting Resilience of GPU Applications Using Statistical Methods
Charu Kalra, Fritz Previlon, and Xiangyu Li (Northeastern University); Norman Rubin (Nvidia Corporation); and David Kaeli (Northeastern University)
Abstract
pdf
Paper · Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass
Astrophysics Applications
Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows
Muhammad Nufail Farooqi (Koc University); Tan Nguyen, Weiqun Zhang, Ann S. Almgren, and John Shalf (Lawrence Berkeley National Laboratory); and Didem Unat (Koc University)
Abstract
pdf
Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver
Jia Shi (Rice University), Ruipeng Li (Lawrence Livermore National Laboratory), Yuanzhe Xi and Yousef Saad (University of Minnesota), and Maarten V. de Hoop (Rice University)
Abstract
pdf
Paper · Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass
File Systems: Data Movement and Provenance
Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Devarshi Ghoshal, Lavanya Ramakrishnan, and Deborah Agarwal (Lawrence Berkeley National Laboratory)
Abstract
pdf
Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows
Pradeep Subedi, Philip Davis, and Shaohua Duan (Rutgers University); Scott Klasky (Oak Ridge National Laboratory); Hemanth Kolla (Sandia National Laboratories); and Manish Parashar (Rutgers University)
Abstract
pdf
A Year in the Life of a Parallel File System
Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Shane Snyder (Argonne National Laboratory), Teng Wang and Suren Byna (Lawrence Berkeley National Laboratory), Philip Carns (Argonne National Laboratory), and Nicholas J. Wright (Lawrence Berkeley National Laboratory)
Abstract
pdf

Other
ACM Gordon Bell Finalist
Gordon Bell Prize Finalist #1
A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing
Tsuyoshi Ichimura, Kohei Fujita, and Takuma Yamaguchi (University of Tokyo); Akira Naruse (Nvidia Corporation); Jack C. Wells (Oak Ridge National Laboratory); Thomas C. Schulthess (Swiss National Supercomputing Centre); Tjerk P. Straatsma and Christopher J. Zimmer (Oak Ridge National Laboratory); Maxime Martinasso (Swiss National Supercomputing Centre); and Kengo Nakajima, Muneo Hori, and Lalith Maddegedara (University of Tokyo)
Abstract
pdf
167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation
Robert M. Patton, J. Travis Johnston, Steven R. Young, Catherine D. Schuman, Don D. March, Thomas E. Potok, Derek C. Rose, Seung-Hwan Lim, Thomas P. Karnowski, Maxim A. Ziatdinov, and Sergei V. Kalinin (Oak Ridge National Laboratory)
Abstract
pdf
Exascale Deep Learning for Climate Analytics
Thorsten Kurth (Lawrence Berkeley National Laboratory), Sean Treichler and Joshua Romero (Nvidia Corporation), Mayur Mudigonda (Lawrence Berkeley National Laboratory), Nathan Luehr and Everett Phillips (Nvidia Corporation), Ankur Mahesh (Lawrence Berkeley National Laboratory), Michael Matheson (Oak Ridge National Laboratory), Jack Deslippe (Lawrence Berkeley National Laboratory), Massimiliano Fatica (Nvidia Corporation), Mr Prabhat (Lawrence Berkeley National Laboratory), and Michael Houston (Nvidia Corporation)
Abstract
pdf
ACM Gordon Bell Finalist
Gordon Bell Prize Finalist #2
Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing
Evan Berkowitz (Forschungszentrum Juelich); M.A. Clark (Nvidia Corporation); Arjun Gambhir (Lawrence Livermore National Laboratory, Lawrence Berkeley National Laboratory); Ken McElvain (University of California, Berkeley; Lawrence Berkeley National Laboratory); Amy Nicholson (University of North Carolina); Enrico Rinaldi (RIKEN BNL Research Center, Lawrence Berkeley National Laboratory); Pavlos Vranas (Lawrence Livermore National Laboratory, Lawrence Berkeley National Laboratory); André Walker-Loud (Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory); Chia Cheng Chang (Lawrence Berkeley National Laboratory, RIKEN); Bálint Joó (Thomas Jefferson National Accelerator Facility); Thorsten Kurth (Lawrence Berkeley National Laboratory); and Kostas Orginos (College of William & Mary, Thomas Jefferson National Accelerator Facility)
Abstract
pdf
ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds
Heng Lin (Tsinghua University, Fma Technology); Xiaowei Zhu (Tsinghua University, Qatar Computing Research Institute); Bowen Yu (Tsinghua University); Xiongchao Tang (Tsinghua University, Qatar Computing Research Institute); Wei Xue and Wenguang Chen (Tsinghua University); Lufei Zhang (State Key Laboratory of Mathematical Engineering and Advanced Computing); Torsten Hoefler (ETH Zurich); Xiaosong Ma (Qatar Computing Research Institute); Xin Liu (National Research Centre of Parallel Computer Engineering and Technology); Weimin Zheng (Tsinghua University); and Jingfang Xu (Beijing Sogou Technology Development Company)
Abstract
pdf
Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction
Wayne Joubert (Oak Ridge National Laboratory); Deborah Weighill (Oak Ridge National Laboratory, University of Tennessee); David Kainer (Oak Ridge National Laboratory); Sharlee Climer (University of Missouri, St Louis); Amy Justice (Yale University, US Department of Veterans Affairs); Kjiersten Fagnan (Lawrence Berkeley National Laboratory, US Department of Energy Joint Genome Institute); and Daniel Jacobson (Oak Ridge National Laboratory)
Abstract
pdf

Created 2018-10-17 20:24