Compiler-Managed Replication of Cuda Kernels for Reliable Execution of Gpgpu Applications

Loading...

Date

Journal Title

Journal ISSN

Volume Title

Open Access Color

Green Open Access

Yes

OpenAIRE Downloads

OpenAIRE Views

Publicly Funded

No
Impulse
Average
Influence
Average
Popularity
Average

relationships.isProjectOf

relationships.isJournalIssueOf

Abstract

As Graphics Processing Units (GPUs) evolve for general-purpose computations besides inherently fault-tolerant graphics programs, soft error reliability becomes a first-class citizen in program design. Especially, safety-critical systems utilizing GPU devices need to employ fault-tolerance techniques to recover from errors in hardware components. While software-level redundancy approaches, based on the replication of the application code, offer high reliability for safe program execution, it is essential to perform redundancy by utilizing parallel execution units in the target architecture not to hurt performance with redundant computations. In this work, we propose redundancy approaches using the parallel GPU cores and implement a compiler-level redundancy framework that enables the programmer to configure the target GPGPU program for redundant execution. We run redundant executions for GPGPU programs from the PolyBench benchmark suite by applying our kernel-level redundancy approaches and evaluate their performance by considering the parallelism level of the programs. Our results reveal that redundancy approaches utilizing parallelism offered by GPU cores yield higher performance for redundant executions, while the programs that already make use of parallel GPU cores in their original form suffer from overhead caused by contention among redundant threads. © World Scientific Publishing Company.

Description

Kaya, Ercument/0000-0001-5073-8159; Oz, Isil/0000-0002-8310-1143

Keywords

compiler support, GPU computing, redundancy, soft errors

Fields of Science

0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology, 01 natural sciences

Citation

WoS Q

Scopus Q

OpenCitations Logo
OpenCitations Citation Count
N/A

Volume

33

Issue

14

Start Page

End Page

PlumX Metrics
Citations

Scopus : 0

Captures

Mendeley Readers : 1

Page Views

85

checked on May 01, 2026

Google Scholar Logo
Google Scholar™
OpenAlex Logo
OpenAlex FWCI
0.0

Sustainable Development Goals

SDG data is not available