GCRIS

Evaluating Performance and Reliability of Selective Redundant Multithreading for GPGPU Applications
(CEUR-WS, 2021) Kaya, E.; Karadaş, O. F.; Öz, I.
With the widespread use of GPU architectures in general-purpose computations, evaluating the soft error vulnerability of GPGPU programs and employing efficient fault tolerance techniques for more reliable execution become increasingly important. Full redundancy, which executes the complete program redundantly, consumes additional resources and causes performance loss and energy inefficiency. Determining the most error-prone regions of the target program code and replicating only those parts can therefore maintain both high performance and acceptable error rates. In this study, we propose a partial redundant multithreading mechanism based on the soft error vulnerability of GPGPU applications and perform a trade-off analysis between performance and reliability. First, we perform fault injection experiments to evaluate the silent data corruption (SDC) rate of each kernel function. Then, based on the outcome of the fault injection experiments, we determine the kernel functions to be replicated. Guided by pragmas that mark the redundancy points in the source code, our custom LLVM pass generates the code that enables redundant execution of the specified code region. We evaluate both the reliability and the performance of the redundant execution scenarios by measuring the execution time of the redundant program generated by our compiler-managed redundancy technique. Our results demonstrate that protecting only the most vulnerable kernel functions enables high reliability without significantly hurting performance. © 2021 The Authors.
Citation - WoS: 1
Citation - Scopus: 1
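The selective redundancy idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation, which relies on pragmas and a custom LLVM pass over GPU kernels. The sketch is plain Python; the function names (`select_kernels`, `run_redundant`, `run_program`), the toy "kernels", and the SDC rates are all hypothetical and chosen only for illustration. It ranks kernels by a measured SDC rate, runs the most vulnerable ones twice, and compares the two results.

```python
from typing import Callable, Dict, List

Kernel = Callable[[List[float]], List[float]]


def select_kernels(sdc_rates: Dict[str, float], budget: int) -> List[str]:
    """Return the `budget` kernel names with the highest SDC rate."""
    ranked = sorted(sdc_rates, key=lambda name: sdc_rates[name], reverse=True)
    return ranked[:budget]


def run_redundant(kernel: Kernel, data: List[float]) -> List[float]:
    """Execute a kernel twice and compare the two outputs."""
    first = kernel(data)
    second = kernel(data)
    if first != second:
        # A mismatch signals a possible silent data corruption.
        raise RuntimeError("redundant copies disagree")
    return first


def run_program(kernels: Dict[str, Kernel], sdc_rates: Dict[str, float],
                data: List[float], budget: int = 1) -> List[float]:
    """Run kernels in order, replicating only the most vulnerable ones."""
    protected = set(select_kernels(sdc_rates, budget))
    for name, kernel in kernels.items():
        if name in protected:
            data = run_redundant(kernel, data)
        else:
            data = kernel(data)
    return data


# Hypothetical stand-ins for GPU kernels and for fault injection results.
kernels = {
    "scale": lambda xs: [2 * x for x in xs],
    "shift": lambda xs: [x + 1 for x in xs],
}
sdc_rates = {"scale": 0.31, "shift": 0.04}  # made-up values

result = run_program(kernels, sdc_rates, [1.0, 2.0])
```

With a budget of one, only `scale` is executed twice, so the cost of redundancy is paid only where the assumed SDC rate is highest.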
GPPRMon: GPU Runtime Memory Performance and Power Monitoring Tool
(Springer Science and Business Media Deutschland GmbH, 2024) Topçu, B.; Öz, I.
Graphics Processing Units (GPUs) perform highly efficient parallel execution for high-performance computation and embedded system domains. While performance concerns drive the main optimization efforts, power issues become important for energy-efficient GPU executions. Although performance profilers and architectural simulators offer statistics about the target execution, they either present only performance metrics at a coarse kernel-function level or lack visualization support for performance bottleneck analysis and for comparing performance against power consumption. Evaluating both performance and power consumption dynamically at runtime and across GPU memory components enables a comprehensive trade-off analysis for GPU architects and software developers. This paper presents a novel memory performance and power monitoring tool for GPU programs, GPPRMon, which performs systematic metric collection and offers visualization views to track power and performance optimizations. Our simulation-based framework dynamically collects microarchitectural metrics by monitoring individual instructions and reports the achieved performance and power consumption at runtime. Our visualization interface presents spatial and temporal views of the execution. The former shows the performance and power metrics across GPU memory components, while the latter shows the corresponding information at instruction granularity on a timeline. Our case study reveals the potential uses of our tool in bottleneck identification and power consumption analysis for a memory-intensive graph workload. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
Compiler-Managed Replication of CUDA Kernels for Reliable Execution of GPGPU Applications
(World Scientific, 2024) Kaya, E.; Öz, I.
As Graphics Processing Units (GPUs) evolve to serve general-purpose computations in addition to inherently fault-tolerant graphics programs, soft error reliability becomes a first-class citizen in program design. In particular, safety-critical systems that use GPU devices need to employ fault-tolerance techniques to recover from errors in hardware components. Software-level redundancy approaches, based on the replication of the application code, offer high reliability for safe program execution, but it is essential to perform redundancy on the parallel execution units of the target architecture so that the redundant computations do not hurt performance. In this work, we propose redundancy approaches using the parallel GPU cores and implement a compiler-level redundancy framework that enables the programmer to configure the target GPGPU program for redundant execution. We run redundant executions of GPGPU programs from the PolyBench benchmark suite by applying our kernel-level redundancy approaches and evaluate their performance by considering the parallelism level of the programs. Our results reveal that redundancy approaches utilizing the parallelism offered by GPU cores yield higher performance for redundant executions, while programs that already make use of the parallel GPU cores in their original form suffer from overhead caused by contention among redundant threads. © World Scientific Publishing Company.

Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
