Compiler-Managed Replication of CUDA Kernels for Reliable Execution of GPGPU Applications

dc.contributor.author Kaya,E.
dc.contributor.author Öz,I.
dc.date.accessioned 2024-05-05T14:59:31Z
dc.date.available 2024-05-05T14:59:31Z
dc.date.issued 2024
dc.description Kaya, Ercument/0000-0001-5073-8159; Oz, Isil/0000-0002-8310-1143 en_US
dc.description.abstract As Graphics Processing Units (GPUs) evolve to support general-purpose computations in addition to inherently fault-tolerant graphics programs, soft error reliability becomes a first-class concern in program design. In particular, safety-critical systems that use GPU devices need to employ fault-tolerance techniques to recover from errors in hardware components. Software-level redundancy approaches, based on replicating the application code, offer high reliability for safe program execution, but the redundant computations must exploit the parallel execution units of the target architecture so that they do not hurt performance. In this work, we propose redundancy approaches that use the parallel GPU cores and implement a compiler-level redundancy framework that enables the programmer to configure the target GPGPU program for redundant execution. We run redundant executions of GPGPU programs from the PolyBench benchmark suite by applying our kernel-level redundancy approaches and evaluate their performance with respect to the parallelism level of the programs. Our results reveal that redundancy approaches that exploit the parallelism offered by GPU cores yield higher performance for redundant executions, while programs that already make use of the parallel GPU cores in their original form suffer overhead caused by contention among redundant threads. © World Scientific Publishing Company. en_US
dc.description.sponsorship Scientific and Technological Research Council of Turkey (TUBITAK) [119E011] en_US
dc.description.sponsorship This work was supported by the Scientific and Technological Research Council of Turkey (TUBITAK), Grant No: 119E011. We thank Martin Ruefenacht for his valuable comments on the paper. en_US
dc.identifier.doi 10.1142/S0218126624502542
dc.identifier.issn 0218-1266
dc.identifier.issn 1793-6454
dc.identifier.scopus 2-s2.0-85190833876
dc.identifier.uri https://doi.org/10.1142/S0218126624502542
dc.identifier.uri https://hdl.handle.net/11147/14403
dc.language.iso en en_US
dc.publisher World Scientific en_US
dc.relation.ispartof Journal of Circuits, Systems and Computers en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject compiler support en_US
dc.subject GPU computing en_US
dc.subject redundancy en_US
dc.subject soft errors en_US
dc.title Compiler-Managed Replication of CUDA Kernels for Reliable Execution of GPGPU Applications en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id Kaya, Ercument/0000-0001-5073-8159
gdc.author.id Oz, Isil/0000-0002-8310-1143
gdc.author.scopusid 57727235800
gdc.author.scopusid 37097877800
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access metadata only access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Izmir Institute of Technology en_US
gdc.description.departmenttemp Kaya E., Izmir Institute of Technology, Computer Engineering Department, Urla, Izmir, 35433, Turkey; Öz I., Izmir Institute of Technology, Computer Engineering Department, Urla, Izmir, 35433, Turkey en_US
gdc.description.issue 14 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q3
gdc.description.volume 33 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q4
gdc.identifier.openalex W4392753217
gdc.identifier.wos WOS:001205493700001
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 0.0
gdc.oaire.influence 2.635068E-9
gdc.oaire.isgreen true
gdc.oaire.popularity 3.0009937E-9
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0103 physical sciences
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.oaire.sciencefields 01 natural sciences
gdc.openalex.collaboration National
gdc.openalex.fwci 0.0
gdc.openalex.normalizedpercentile 0.02
gdc.opencitations.count 0
gdc.plumx.mendeley 1
gdc.plumx.scopuscites 0
gdc.scopus.citedcount 0
gdc.wos.citedcount 0
relation.isAuthorOfPublication.latestForDiscovery e0de33d0-b187-47e9-bae7-9b17aaabeb67
relation.isOrgUnitOfPublication.latestForDiscovery 9af2b05f-28ac-4003-8abe-a4dfe192da5e
