SC18 Proceedings

A

Abramson, David · more

Energy Efficiency Modeling of Parallel Applications · pdf

Agarwal, Deborah · more

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems · pdf

Aiken, Alex · more

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes · pdf

Alam, Sadaf R. · more

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management · pdf

Almgren, Ann S. · more

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows · pdf

Aluru, Srinivas · more

Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting · pdf

Alvarez, Lluc · more

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs · pdf

Amer, Abdelhalim · more

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading · pdf

Amvrosiadis, George · more

Scaling Embedded In Situ Indexing with DeltaFS · pdf

Andreadis, Georgios · more

A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments · pdf

Anwar, Ali · more

BESPOKV: Application Tailored Scale-Out Key-Value Stores · pdf

Aoyama, Toshikazu · more

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA · pdf

Appelhans, David · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Arndt, Bill · more

Extreme Scale De Novo Metagenome Assembly · pdf

Arnemann, James · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Atchley, Scott · more

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan · pdf
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Avancha, Sasikanth · more

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures · pdf

B

Baden, Scott · more

Doomsday: Predicting Which Node Will Fail When on Supercomputers · pdf

Balaji, Pavan · more

Characterization of MPI Usage on a Production Supercomputer · pdf
Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading · pdf

Balmana, Marc Gamell · more

Framework for Scalable Intra-Node Collective Operations Using Shared Memory · pdf

Banerjee, Kunal · more

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures · pdf

Bard, Deborah · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Baseman, Elisabeth · more

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo · pdf

Bauer, Gregory H. · more

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience · pdf

Bauer, Michael · more

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes · pdf

Bayatpour, M. · more

Cooperative Rendezvous Protocols for Improved Performance and Overlap · pdf

Belviranli, Mehmet E. · more

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access · pdf

Berkowitz, Evan · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

Bertsch, Adam · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Bhatele, Abhinav · more

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters · pdf
Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing · pdf

Bianco, Mauro · more

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management · pdf

Biros, George · more

Distributed-Memory Hierarchical Compression of Dense SPD Matrices · pdf

Blackmore, Robert · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Bland, Arthur S. · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Blumrich, Matthias A. · more

Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage · pdf

Bode, Brett · more

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience · pdf

Bollhöfer, Matthias · more

Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures · pdf

Boushehrinejadmoradi, Nader · more

A Parallelism Profiler with What-If Analyses for OpenMP Programs · pdf

Buluc, Aydin · more

Extreme Scale De Novo Metagenome Assembly · pdf

Byna, Suren · more

A Year in the Life of a Parallel File System · pdf

C

Caheny, Paul · more

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs · pdf

Carns, Philip · more

A Year in the Life of a Parallel File System · pdf

Casas, Marc · more

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs · pdf

Casses, Ben · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Chakaravarthy, Venkatesan · more

High-Performance Dense Tucker Decomposition on GPU Clusters · pdf

Chakraborty, S. · more

Cooperative Rendezvous Protocols for Improved Performance and Overlap · pdf

Chambreau, Chris · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Chang, Chia Cheng · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

Chang, Chun-Kai · more

Evaluating and Accelerating High-Fidelity Error Injection for HPC · pdf

Chen, Bingwei · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf

Chen, Dexun · more

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Chen, Jieyang · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

Chen, Wenguang · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Chen, Xiaofei · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf

Chen, Zizhong · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

Cheng, Yue · more

BESPOKV: Application Tailored Scale-Out Key-Value Stores · pdf

Cheshmi, Kazem · more

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism · pdf

Chochia, George · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Choi, Jee · more

High-Performance Dense Tucker Decomposition on GPU Clusters · pdf

Choromanski, Krzysztof · more

Adaptive Anonymization of Data with b-Edge Covers · pdf

Chow, Edmond · more

Accelerating Quantum Chemistry with Vectorized and Batched Integrals · pdf

Chunduri, Sudheer · more

Characterization of MPI Usage on a Production Supercomputer · pdf

Clark, M.A. · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

Climer, Sharlee · more

Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction · pdf

Cranor, Charles D. · more

Scaling Embedded In Situ Indexing with DeltaFS · pdf

Cromey, Clara E. · more

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing · pdf

D

Das, Anwesha · more

Doomsday: Predicting Which Node Will Fail When on Supercomputers · pdf

Davis, Philip · more

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows · pdf

Davison, Gene · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

de Hoop, Maarten V. · more

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver · pdf

de Supinski, Bronis R. · more

Energy Efficiency Modeling of Parallel Applications · pdf
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

DeBardeleben, Nathan · more

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo · pdf

Demirci, Gökalp · more

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints · pdf

Dennison, Larry · more

Light-Weight Protocols for Wire-Speed Ordering · pdf

Dennison, Larry R. · more

Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage · pdf

DeRose, Luiz · more

Energy Efficiency Modeling of Parallel Applications · pdf

Deslippe, Jack · more

Exascale Deep Learning for Climate Analytics · pdf

Dinh, Minh Ngoc · more

Energy Efficiency Modeling of Parallel Applications · pdf

Domke, Jens · more

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing · pdf

Dongarra, Jack · more

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers · pdf

Douglis, Fred · more

BESPOKV: Application Tailored Scale-Out Key-Value Stores · pdf

Du Bois, Kristof · more

Many-Core Graph Workload Analysis · pdf

Duan, Shaohua · more

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows · pdf

Duan, Xiaohui · more

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Durnov, Dmitry · more

Framework for Scalable Intra-Node Collective Operations Using Shared Memory · pdf

E

Eberle, Hans · more

Light-Weight Protocols for Wire-Speed Ordering · pdf

Eftekhari, Aryan · more

Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures · pdf

Egan, Rob · more

Extreme Scale De Novo Metagenome Assembly · pdf

Endrei, Mark · more

Energy Efficiency Modeling of Parallel Applications · pdf

Enos, Jeremy · more

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience · pdf

Erez, Mattan · more

Evaluating and Accelerating High-Fidelity Error Injection for HPC · pdf

Eyerman, Stijn · more

Many-Core Graph Workload Analysis · pdf

Ezell, Matthew A. · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

F

Fagnan, Kjiersten · more

Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction · pdf

Farooqi, Muhammad Nufail · more

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows · pdf

Fatica, Massimiliano · more

Exascale Deep Learning for Climate Analytics · pdf

Ferdous, S M · more

Adaptive Anonymization of Data with b-Edge Covers · pdf

Ferreira, Kurt B. · more

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo · pdf

Fryman, Joshua B. · more

Many-Core Graph Workload Analysis · pdf

Fu, Haohuan · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers · pdf
Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Fujita, Kohei · more

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

G

Gambhir, Arjun · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

Gan, Lin · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf
Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Ganger, Gregory R. · more

Scaling Embedded In Situ Indexing with DeltaFS · pdf

Gao, Ping · more

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Garland, Michael · more

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes · pdf

Garzaran, Maria · more

Framework for Scalable Intra-Node Collective Operations Using Shared Memory · pdf

Geist, Al · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Georganas, Evangelos · more

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures · pdf
Extreme Scale De Novo Metagenome Assembly · pdf

Ghoshal, Devarshi · more

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems · pdf

Gibson, Garth A. · more

Scaling Embedded In Situ Indexing with DeltaFS · pdf

Gila, Miguel · more

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management · pdf

Goldstone, Robin · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Goltsman, Eugene · more

Extreme Scale De Novo Metagenome Assembly · pdf

Gonsiorowski, Elsa · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Gooding, Tom · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Grider, Gary · more

Scaling Embedded In Situ Indexing with DeltaFS · pdf

Grinberg, Leopold · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Gu, Yizi · more

Dynamic Data Race Detection for OpenMP Programs · pdf

Guan, Hui · more

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines · pdf

Guan, Qiang · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

Guo, Danhao · more

Scaling Embedded In Situ Indexing with DeltaFS · pdf

Guo, Fan · more

Scaling Embedded In Situ Indexing with DeltaFS · pdf

Guo, Luanzheng · more

FlipTracker: Understanding Natural Error Resilience in HPC Applications · pdf

Guok, Chin · more

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences · pdf

H

Haidar, Azzam · more

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers · pdf

Halappanavar, Mahantesh · more

Adaptive Anonymization of Data with b-Edge Covers · pdf

Han, Jingoo · more

BESPOKV: Application Tailored Scale-Out Key-Value Stores · pdf

Hanson, Bill · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Hargrove, Paul · more

Doomsday: Predicting Which Node Will Fail When on Supercomputers · pdf

Hari, Siva Kumar Sastry · more

Optimizing Software-Directed Instruction Replication for GPU Error Detection · pdf

Harms, Kevin · more

Characterization of MPI Usage on a Production Supercomputer · pdf

Hartner, Bill · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Hashmi, J. · more

Cooperative Rendezvous Protocols for Improved Performance and Overlap · pdf

He, Conghui · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf

He, Siyu · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Heinecke, Alexander · more

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures · pdf

Heirman, Wim · more

Many-Core Graph Workload Analysis · pdf

Henry, Greg · more

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures · pdf

Herbein, Stephen · more

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters · pdf

Higham, Nicholas · more

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers · pdf

Hittinger, Jeffrey · more

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning · pdf

Ho, Shirley · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Hoefler, Torsten · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Hoffmann, Henry · more

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints · pdf

Hofmeyr, Steven · more

Extreme Scale De Novo Metagenome Assembly · pdf

Hori, Muneo · more

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Houston, Michael · more

Exascale Deep Learning for Climate Analytics · pdf

Hu, Yang · more

TriCore: Parallel Triangle Counting on GPUs · pdf

Huang, H. Howie · more

iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees · pdf
TriCore: Parallel Triangle Counting on GPUs · pdf

Huang, Hai · more

BESPOKV: Application Tailored Scale-Out Key-Value Stores · pdf

Huang, Hua · more

Accelerating Quantum Chemistry with Vectorized and Batched Integrals · pdf

Huang, Renfei · more

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition · pdf

Hur, Ibrahim · more

Many-Core Graph Workload Analysis · pdf

Hussain, Zaeem · more

Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities · pdf

I

Ichimura, Tsuyoshi · more

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Iosup, Alexandru · more

A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments · pdf

Isobe, Yoko · more

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA · pdf

Iwasaki, Shintaro · more

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading · pdf

J

Jacobson, Daniel · more

Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction · pdf

Jain, Nikhil · more

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters · pdf
Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing · pdf

Jain, Surabhi · more

Framework for Scalable Intra-Node Collective Operations Using Shared Memory · pdf

Ji, Yuede · more

iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees · pdf

Jiang, Nan · more

Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage · pdf

Jin, Chao · more

Energy Efficiency Modeling of Parallel Applications · pdf

Johnston, J. Travis · more

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Joubert, Wayne · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf
Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction · pdf

Joó, Bálint · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

Justice, Amy · more

Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction · pdf

K

Kaeli, David · more

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods · pdf

Kahle, Jim · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Kainer, David · more

Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction · pdf

Kalamkar, Dhiraj · more

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures · pdf

Kaleem, Rashid · more

Framework for Scalable Intra-Node Collective Operations Using Shared Memory · pdf

Kalinin, Sergei V. · more

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Kalra, Charu · more

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods · pdf

Kamil, Shoaib · more

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism · pdf

Karlin, Ian · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Karna, Tuomas · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Karnowski, Thomas P. · more

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Keahey, Kate · more

Dynamically Negotiating Capacity Between On-Demand and Batch Clusters · pdf

Keckler, Stephen W. · more

Optimizing Software-Directed Instruction Replication for GPU Error Detection · pdf

Kelly, Nicholas · more

Evaluating and Accelerating High-Fidelity Error Injection for HPC · pdf

Khan, Arif · more

Adaptive Anonymization of Data with b-Edge Covers · pdf

Klasky, Scott · more

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows · pdf

Knight, Christopher · more

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations · pdf

Kobayashi, Hiroaki · more

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA · pdf

Kolla, Hemanth · more

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows · pdf

Komatsu, Kazuhiko · more

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA · pdf

Kramer, William T. · more

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience · pdf

Kremer-Herman, Nathaniel · more

A Lightweight Model for Right-Sizing Master-Worker Applications · pdf

Kumar, Nalini · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Kumaran, Kalyan · more

Characterization of MPI Usage on a Production Supercomputer · pdf

Kurth, Thorsten · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf
Exascale Deep Learning for Climate Analytics · pdf

L

Laguna, Ignacio · more

FlipTracker: Understanding Natural Error Resilience in HPC Applications · pdf

Lam, Michael O. · more

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning · pdf

Langer, Akhil · more

Framework for Scalable Intra-Node Collective Operations Using Shared Memory · pdf

Lathrop, Scott · more

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience · pdf

Le, Franck · more

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences · pdf

Lee, Dongyoon · more

BESPOKV: Application Tailored Scale-Out Key-Value Stores · pdf

Lee, Seyong · more

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access · pdf

Lee, Victor · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Lee, Wonchan · more

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes · pdf

Leininger, Matthew L. · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Letaief, Khaled Ben · more

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition · pdf

Leverman, Dustin · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Levy, Scott · more

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo · pdf

Li, Dong · more

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs · pdf
FlipTracker: Understanding Natural Error Resilience in HPC Applications · pdf

Li, Hongbo · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

Li, Jiajia · more

HiCOO: Hierarchical Storage of Sparse Tensors · pdf

Li, Liandeng · more

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers · pdf

Li, Ruipeng · more

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver · pdf

Li, Sihuan · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

Li, Xiangyu · more

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods · pdf

Li, Yuxuan · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf

Liang, Xin · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

Lim, Seung-Hwan · more

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines · pdf
167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Lin, Heng · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Liu, Feng · more

Dynamically Negotiating Capacity Between On-Demand and Batch Clusters · pdf

Liu, Hang · more

iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees · pdf
TriCore: Parallel Triangle Counting on GPUs · pdf

Liu, Weiguo · more

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Liu, Xin · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Liu, Xing · more

High-Performance Dense Tucker Decomposition on GPU Clusters · pdf

Liu, Y. Jace · more

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences · pdf

Liu, Yuanlai · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

Lloyd, Scott · more

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning · pdf

Lockwood, Glenn K. · more

A Year in the Life of a Parallel File System · pdf

Lowenthal, David K. · more

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing · pdf

Luehr, Nathan · more

Exascale Deep Learning for Climate Analytics · pdf

Lym, Sangkug · more

Evaluating and Accelerating High-Fidelity Error Injection for HPC · pdf

M

Ma, Xiaosong · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

MacAuley, John · more

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences · pdf

Maddegedara, Lalith · more

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Mahesh, Ankur · more

Exascale Deep Learning for Climate Analytics · pdf

Mahmoud, Abdulrahman · more

Optimizing Software-Directed Instruction Replication for GPU Error Detection · pdf

Malakar, Preeti · more

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations · pdf

March, Don D. · more

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Marincic, Ivana · more

A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints · pdf

Markthub, Pak · more

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access · pdf

Marroquin, Chris · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Martinasso, Maxime · more

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management · pdf
A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Maschhoff, Kristyn · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Mastenbroek, Fabian · more

A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments · pdf

Matheson, Michael · more

Exascale Deep Learning for Climate Analytics · pdf

Mathuriya, Amrita · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Matsuoka, Satoshi · more

DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access · pdf

Maxwell, Don · more

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan · pdf

Maxwell, Don E. · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

McCalpin, John D. · more

HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor · pdf

McElvain, Ken · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

McMurtrie, Colin · more

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management · pdf

McNally, Stephen · more

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan · pdf

Meadows, Lawrence · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Mehri Dehnavi, Maryam · more

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism · pdf

Melhem, Rami · more

Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities · pdf

Mellor-Crummey, John · more

Dynamic Data Race Detection for OpenMP Programs · pdf

Mendes, Celso L. · more

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience · pdf

Mendygral, Pete · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Meng, Xiangxu · more

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Menon, Harshitha · more

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning · pdf

Mills Strout, Michelle · more

ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism · pdf

Misra, Sanchit · more

Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting · pdf

Mlakar, Daniel · more

faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU · pdf

Mohror, Kathryn · more

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning · pdf

Moise, Diana · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Momose, Shintaro · more

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA · pdf

Moody, Adam · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Moretó, Miquel · more

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs · pdf

Mudigonda, Mayur · more

Exascale Deep Learning for Climate Analytics · pdf

Mueller, Frank · more

Doomsday: Predicting Which Node Will Fail When on Supercomputers · pdf

Munson, Todd · more

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations · pdf

Musa, Akihiro · more

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA · pdf

N

Nagarakatte, Santosh · more

A Parallelism Profiler with What-If Analyses for OpenMP Programs · pdf

Nakajima, Kengo · more

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Naruse, Akira · more

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Newman, Harvey · more

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences · pdf

Nguyen, Tan · more

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows · pdf

Nicholson, Amy · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

O

Ohmacht, Martin · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Oliker, Leonid · more

Extreme Scale De Novo Metagenome Assembly · pdf

Oral, Sarp H. · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Orginos, Kostas · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

Osei-Kuffuor, Daniel · more

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning · pdf

Ouyang, Kaiming · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

P

Pabst, Hans · more

Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures · pdf

Pan, Tony C. · more

Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting · pdf

Panda, D. K. · more

Cooperative Rendezvous Protocols for Improved Performance and Overlap · pdf

Pankajakshan, Ramesh · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Papka, Michael E. · more

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations · pdf

Parashar, Manish · more

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows · pdf

Parker, Scott · more

Characterization of MPI Usage on a Production Supercomputer · pdf

Patton, Robert M. · more

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines · pdf
167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Pearce, Roger · more

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution · pdf

Peng, Ivy B. · more

Siena: Exploring the Design Space of Heterogeneous Memory Systems · pdf

Pennycook, Simon J. · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Phillips, Everett · more

Exascale Deep Learning for Climate Analytics · pdf

Pittman, Randall · more

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines · pdf

Pizzano, Fernando · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Pollard, Samuel D. · more

Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters · pdf

Pothen, Alex · more

Adaptive Anonymization of Data with b-Edge Covers · pdf

Potok, Thomas E. · more

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Pouchet, Louis-Noel · more

Associative Instruction Reordering to Alleviate Register Pressure · pdf

Poxon, Heidi · more

Energy Efficiency Modeling of Parallel Applications · pdf

Prabhat, Mr · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf
Exascale Deep Learning for Climate Analytics · pdf

Previlon, Fritz · more

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods · pdf

R

R. Butt, Ali · more

BESPOKV: Application Tailored Scale-Out Key-Value Stores · pdf

Ramakrishnan, Lavanya · more

Dac-Man: Data Change Management for Scientific Datasets on HPC Systems · pdf

Rastello, Fabrice · more

Associative Instruction Reordering to Alleviate Register Pressure · pdf

Rawat, Prashant Singh · more

Associative Instruction Reordering to Alleviate Register Pressure · pdf

Reiz, Severin · more

Distributed-Memory Hierarchical Compression of Dense SPD Matrices · pdf

Ren, Jie · more

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs · pdf

Reza, Tahsin · more

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution · pdf

Rinaldi, Enrico · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

Ringenburg, Michael F. · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Ripeanu, Matei · more

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution · pdf

Riteau, Pierre · more

Dynamically Negotiating Capacity Between On-Demand and Batch Clusters · pdf

Rogers, James H. · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Roman, Eric · more

Doomsday: Predicting Which Node Will Fail When on Supercomputers · pdf

Romero, Joshua · more

Exascale Deep Learning for Climate Analytics · pdf

Rose, Derek C. · more

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Rosenburg, Bryan · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Rountev, Atanas · more

Associative Instruction Reordering to Alleviate Register Pressure · pdf

Rubin, Norman · more

PRISM: Predicting Resilience of GPU Applications Using Statistical Methods · pdf

S

Saad, Yousef · more

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver · pdf

Sadayappan, P. · more

Associative Instruction Reordering to Alleviate Register Pressure · pdf

Sanders, Geoffrey · more

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution · pdf

Sannikov, Alexander · more

Framework for Scalable Intra-Node Collective Operations Using Shared Memory · pdf

Sarkar, Vivek · more

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution · pdf

Sato, Masayuki · more

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA · pdf

Schenk, Olaf · more

Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures · pdf

Schmidt, Drew · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Schordan, Markus · more

ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning · pdf

Schulthess, Thomas C. · more

RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management · pdf
A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Schulz, Martin · more

FlipTracker: Understanding Natural Error Resilience in HPC Applications · pdf

Schuman, Catherine D. · more

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Seidel, Hans-Peter · more

faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU · pdf

Settlemyer, Bradley W. · more

Scaling Embedded In Situ Indexing with DeltaFS · pdf

Sewall, Jason · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Sexton, James · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Shalf, John · more

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows · pdf

Shankar, Mallikarjun · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Shao, Lei · more

CosmoFlow: Using Deep Learning to Learn the Universe at Scale · pdf

Shen, Xipeng · more

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines · pdf

Shi, Jia · more

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver · pdf

Siddiqua, Taniya · more

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo · pdf

Sim, Hyogi · more

BESPOKV: Application Tailored Scale-Out Key-Value Stores · pdf

Sisneros, Roberto R. · more

Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience · pdf

Slaughter, Elliott · more

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes · pdf

Smith, Staci A. · more

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing · pdf

Snyder, Shane · more

A Year in the Life of a Parallel File System · pdf

Sridharan, Vilas · more

Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo · pdf

Steinberger, Markus · more

faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU · pdf

Straatsma, Tjerk P. · more

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Subedi, Pradeep · more

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows · pdf

Subramoni, H. · more

Cooperative Rendezvous Protocols for Improved Performance and Overlap · pdf

Sukumaran-Rajam, Aravind · more

Associative Instruction Reordering to Alleviate Register Pressure · pdf

Sullivan, Michael B. · more

Evaluating and Accelerating High-Fidelity Error Injection for HPC · pdf
Optimizing Software-Directed Instruction Replication for GPU Error Detection · pdf

Sun, Jimeng · more

HiCOO: Hierarchical Storage of Sparse Tensors · pdf

T

Tan, Li · more

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers · pdf

Tang, Xiongchao · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Tao, Dingwen · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

Taura, Kenjiro · more

Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading · pdf

Thain, Douglas · more

A Lightweight Model for Right-Sizing Master-Worker Applications · pdf

Thiagarajan, Jayaraman J. · more

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing · pdf

Thomson, John · more

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers · pdf

Tomov, Stan · more

Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers · pdf

Tovar, Benjamin · more

A Lightweight Model for Right-Sizing Master-Worker Applications · pdf

Treichler, Sean · more

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes · pdf
Exascale Deep Learning for Climate Analytics · pdf

Tripoul, Nicolas · more

PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution · pdf

Tritt, Andrew · more

Extreme Scale De Novo Metagenome Assembly · pdf

Tsai, Timothy · more

Optimizing Software-Directed Instruction Replication for GPU Error Detection · pdf

Tumeo, Antonino · more

Adaptive Anonymization of Data with b-Edge Covers · pdf

U

Unat, Didem · more

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows · pdf

V

Valero, Mateo · more

Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs · pdf

Vazhkudai, Sudharshan S. · more

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan · pdf
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Vergara Larrea, Veronica G. · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Versluis, Laurens · more

A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments · pdf

Vetter, Jeffrey S. · more

Siena: Exploring the Design Space of Heterogeneous Memory Systems · pdf
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access · pdf

Vishwanath, Venkatram · more

Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations · pdf

Vranas, Pavlos · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

Vuduc, Richard · more

HiCOO: Hierarchical Storage of Sparse Tensors · pdf

W

Walker-Loud, André · more

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing · pdf

Walkup, Bob · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Wan, Wubin · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf

Wang, Chenyu · more

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers · pdf

Wang, Feiyi · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Wang, Teng · more

A Year in the Life of a Parallel File System · pdf

Wang, Wei · more

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition · pdf

Wang, X. Tony · more

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences · pdf

Warszawski, Todd · more

Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes · pdf

Watanabe, Osamu · more

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA · pdf

Watson, Py · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Weems, Lance D. · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Wei, Yanwen · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf

Weighill, Deborah · more

Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction · pdf

Weissman, Jon · more

Dynamically Negotiating Capacity Between On-Demand and Batch Clusters · pdf

Wells, Jack C. · more

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Winter, Martin · more

faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU · pdf

Wright, Nicholas J. · more

A Year in the Life of a Parallel File System · pdf

Wu, Kai · more

Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs · pdf

Wu, Panruo · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

X

Xi, Yuanzhe · more

Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver · pdf

Xiang, Qiao · more

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences · pdf

Xu, Jingfang · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Xue, Wei · more

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf
ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Y

Yamaguchi, Takuma · more

A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Yang, Guangwen · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers · pdf
Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Yang, Y. Richard · more

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences · pdf

Ye, Fangke · more

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution · pdf

Yelick, Katherine · more

Extreme Scale De Novo Metagenome Assembly · pdf

Yin, Junqi · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf

Yoga, Adarsh · more

A Parallelism Profiler with What-If Analyses for OpenMP Programs · pdf

Yokokawa, Mitsuo · more

Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA · pdf

Young, Steven R. · more

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Yu, Bowen · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Yu, Chenhan D. · more

Distributed-Memory Hierarchical Compression of Dense SPD Matrices · pdf

Yu, Teng · more

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers · pdf

Yu, Yinghao · more

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition · pdf

Z

Zayer, Rhaleb · more

faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU · pdf

Zhang, J. Jensen · more

Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences · pdf

Zhang, Jun · more

SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition · pdf

Zhang, Lufei · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Zhang, Meng · more

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Zhang, Tingjian · more

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Zhang, Wei · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf

Zhang, Weiqun · more

Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows · pdf

Zhang, Wenqiang · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf

Zhang, Wusheng · more

Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight · pdf

Zhang, Zhenguo · more

Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight · pdf

Zhao, Jisheng · more

Detecting MPI Usage Anomalies via Partial Program Symbolic Execution · pdf

Zhao, Kai · more

Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs · pdf

Zhao, Wenlai · more

Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers · pdf

Zheng, Qing · more

Scaling Embedded In Situ Indexing with DeltaFS · pdf

Zheng, Weimin · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Zhu, Xiaowei · more

ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds · pdf

Ziatdinov, Maxim A. · more

167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation · pdf

Zimmer, Christopher · more

GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan · pdf

Zimmer, Christopher J. · more

The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems · pdf
A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing · pdf

Znati, Taieb · more

Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities · pdf
