SC18 Proceedings


Tuesday, November 13th


10:30am-12:00pm

Data and Storage
C146
SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition
BESPOKV: Application Tailored Scale-Out Key-Value Stores
Scaling Embedded In Situ Indexing with DeltaFS
Paper
Clouds and Distributed Computing, File Systems, I/O, Storage, Tech Program Reg Pass

Next-Generation Networking
C140/142
Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage
Fine-Grained, Multi-Domain Network Resource Abstraction as a Fundamental Primitive to Enable High-Performance, Collaborative Data Sciences
Light-Weight Protocols for Wire-Speed Ordering
Paper
Architectures, Data Analytics, Networks, Tech Program Reg Pass

Resilience
C141/143/149
GPU Age-Aware Scheduling to Improve the Reliability of Leadership Jobs on Titan
FlipTracker: Understanding Natural Error Resilience in HPC Applications
Doomsday: Predicting Which Node Will Fail When on Supercomputers
Paper
GPUs, Resiliency, State of the Practice, System Software, Tech Program Reg Pass

1:30pm-3:00pm

Biology Applications
C140/142
Extreme Scale De Novo Metagenome Assembly
Optimizing High Performance Distributed Memory Parallel Hash Tables for DNA k-mer Counting
Redesigning LAMMPS for Petascale and Hundred-Billion-Atom Simulation on Sunway TaihuLight
Paper
Algorithms, Applications, Computational Biology, Scientific Computing, Tech Program Reg Pass

Large-Scale Algorithms
C146
Large-Scale Hierarchical K-Means for Heterogeneous Many-Core Supercomputers
TriCore: Parallel Triangle Counting on GPUs
Distributed-Memory Hierarchical Compression of Dense SPD Matrices
Paper
Algorithms, Architectures, Data Analytics, Deep Learning, Networks, Scientific Computing, Visualization, Tech Program Reg Pass

Performance and Energy Analysis
C141/143/149
A Parallelism Profiler with What-If Analyses for OpenMP Programs
Energy Efficiency Modeling of Parallel Applications
HPL and DGEMM Performance Variability on the Xeon Platinum 8160 Processor
Paper
OpenMP, Performance, Power, Tools, Tech Program Reg Pass

3:30pm-5:00pm

Algorithms on Sparse Data
C141/143/149
HiCOO: Hierarchical Storage of Sparse Tensors
Distributed Memory Sparse Inverse Covariance Matrix Estimation on High-Performance Computing Architectures
PruneJuice: Pruning Trillion-Edge Graphs to a Precise Pattern-Matching Solution
Paper
Algorithms, Graph Algorithms, Linear Algebra, Machine Learning, Sparse Computation, Tech Program Reg Pass

Performance Optimization Studies
C146
Many-Core Graph Workload Analysis
Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading
Topology-Aware Space-Shared Co-Analysis of Large-Scale Molecular Dynamics Simulations
Paper
Data Analytics, Performance, Programming Systems, Storage, Tools, Visualization, Tech Program Reg Pass

Resource Management and Interference
C140/142
RM-Replay: A High-Fidelity Tuning, Optimization and Exploration Tool for Resource Management
Evaluation of an Interference-Free Node Allocation Policy on Fat-Tree Clusters
Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing
Paper
Networks, Resource Management, Scheduling, State of the Practice, System Software, Tech Program Reg Pass

Wednesday, November 14th


10:30am-12:00pm

MPI Optimization and Characterization
C140/142
Cooperative Rendezvous Protocols for Improved Performance and Overlap
Framework for Scalable Intra-Node Collective Operations Using Shared Memory
Characterization of MPI Usage on a Production Supercomputer
Paper
Architectures, MPI, Networks, Performance, Programming Systems, State of the Practice, Tech Program Reg Pass

Non-Volatile Memory
C141/143/149
Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task-Parallel Programs
DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access
Siena: Exploring the Design Space of Heterogeneous Memory Systems
Paper
GPUs, Memory, NVRAM, Performance, System Software, Tools, Tech Program Reg Pass

Task-Based Programming
C146
Dynamic Tracing: Memoization of Task Graphs for Dynamic Task-Based Runtimes
Runtime-Assisted Cache Coherence Deactivation in Task Parallel Programs
A Divide and Conquer Algorithm for DAG Scheduling Under Power Constraints
Paper
Algorithms, Architectures, Memory, Networks, Parallel Programming Languages, Libraries, and Models, Power, Programming Systems, Scheduling, Tech Program Reg Pass

1:30pm-3:00pm

Clouds and Distributed Computing
C141/143/149
A Reference Architecture for Datacenter Scheduling: Design, Validation, and Experiments
Dynamically Negotiating Capacity Between On-Demand and Batch Clusters
A Lightweight Model for Right-Sizing Master-Worker Applications
Paper
Clouds and Distributed Computing, Resource Management, Scheduling, Tech Program Reg Pass

Physics and Tensor Applications
C140/142
Simulating the Wenchuan Earthquake with Accurate Surface Topography on Sunway TaihuLight
Accelerating Quantum Chemistry with Vectorized and Batched Integrals
High-Performance Dense Tucker Decomposition on GPU Clusters
Paper
Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

Resilience II
C146
Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo
Partial Redundancy in HPC Systems with Non-Uniform Node Reliabilities
Evaluating and Accelerating High-Fidelity Error Injection for HPC
Paper
Performance, Resiliency, Tools, Tech Program Reg Pass

3:30pm-5:00pm

Arithmetic and Optimization
C141/143/149
Associative Instruction Reordering to Alleviate Register Pressure
Harnessing GPU's Tensor Cores Fast FP16 Arithmetic to Speedup Mixed-Precision Iterative Refinement Solvers
ADAPT: Algorithmic Differentiation Applied to Floating-Point Precision Tuning
Paper
Algorithms, Applications, Architectures, Compiler Analysis and Optimization, Floating Point, Performance, Precision, Programming Systems, Tools, Tech Program Reg Pass

Gordon Bell Prize Finalist #1
A2 Ballroom
A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing
167-PFlops Deep Learning for Electron Microscopy: From Learning Physics to Atomic Manipulation
Exascale Deep Learning for Climate Analytics
ACM Gordon Bell Finalist

Large Scale System Deployments
C140/142
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems
Best Practices and Lessons from Deploying and Operating a Sustained-Petascale System: The Blue Waters Experience
Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA
Paper
Architectures, Networks, Performance, Scientific Computing, State of the Practice, Tools, Tech Program Reg Pass

Thursday, November 15th


10:30am-12:00pm

Gordon Bell Prize Finalist #2
A2 Ballroom
Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing
ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds
Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction
ACM Gordon Bell Finalist

Graph Algorithms and Systems
C140/142
iSpan: Parallel Identification of Strongly Connected Components with Spanning Trees
Adaptive Anonymization of Data with b-Edge Covers
faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU
Paper
Applications, Graph Algorithms, Security, Tech Program Reg Pass

Programming Systems Tools
C141/143/149
Dynamic Data Race Detection for OpenMP Programs
ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism
Detecting MPI Usage Anomalies via Partial Program Symbolic Execution
Paper
Linear Algebra, Memory, MPI, OpenMP, Programming Systems, Tools, Tech Program Reg Pass

1:30pm-3:00pm

Deep Learning
C140/142
Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures
Paper
Applications, Cosmology, Data Analytics, Deep Learning, Machine Learning, Programming Systems, Storage, Visualization, Tech Program Reg Pass

Resilience III: GPUs
C141/143/149
Optimizing Software-Directed Instruction Replication for GPU Error Detection
Fault Tolerant One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs
PRISM: Predicting Resilience of GPU Applications Using Statistical Methods
Paper
Algorithms, Architectures, GPUs, Linear Algebra, Networks, Resiliency, Tech Program Reg Pass

3:30pm-5:00pm

Astrophysics Applications
C140/142
Phase Asynchronous AMR Execution for Productive and Performant Astrophysical Flows
Computing Planetary Interior Normal Modes with a Highly Parallel Polynomial Filtering Eigensolver
Paper
Algorithms, Applications, Computational Physics, Scientific Computing, Tech Program Reg Pass

File Systems: Data Movement and Provenance
C141/143/149
Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In Situ Workflows
A Year in the Life of a Parallel File System
Paper
Architectures, Data Management, File Systems, Networks, State of the Practice, System Software, Workflows, Tech Program Reg Pass

Created 2018-10-17 20:24