Evaluating Performance and Reliability of Selective Redundant Multithreading for GPGPU Applications

dc.contributor.author Kaya,E.
dc.contributor.author Karadaş,O.F.
dc.contributor.author Öz,I.
dc.date.accessioned 2024-10-25T23:27:20Z
dc.date.available 2024-10-25T23:27:20Z
dc.date.issued 2021
dc.description.abstract With the widespread use of GPU architectures in general-purpose computations, evaluating the soft error vulnerability of GPGPU programs and employing efficient fault tolerance techniques for more reliable execution have become increasingly important. Full redundancy, which executes the complete program redundantly, incurs additional resource consumption, performance loss, and energy inefficiency. Therefore, determining the most error-prone regions of the target program code and replicating only those parts maintains both high performance and acceptable error rates. In this study, we propose a partial redundant multithreading mechanism based on the soft error vulnerability of GPGPU applications and perform a trade-off analysis between performance and reliability. First, we perform fault injection experiments to evaluate the silent data corruption (SDC) rate of each kernel function. Then, based on the outcome of these experiments, we determine the kernel function to be replicated. Guided by pragmas that mark the redundancy points in the source code, our custom LLVM pass generates the code that enables redundant execution of the specified code region. We evaluate both the reliability and the performance of the redundant execution scenarios by measuring the execution time of the redundant program generated by our compiler-managed redundancy technique. Our results demonstrate that protecting only the most vulnerable kernel functions enables high reliability without hurting performance significantly. © 2021 The Authors. en_US
dc.description.sponsorship CERCIRAS COST, (CA19135); COST Association; TÜBİTAK; Türkiye Bilimsel ve Teknolojik Araştırma Kurumu, TÜBİTAK, (119E011) en_US
dc.identifier.issn 1613-0073
dc.identifier.scopus 2-s2.0-85131328680
dc.identifier.uri https://hdl.handle.net/11147/14911
dc.language.iso en en_US
dc.publisher CEUR-WS en_US
dc.relation.ispartof CEUR Workshop Proceedings -- 1st Workshop on Connecting Education and Research Communities for an Innovative Resource Aware Society, CERCIRAS 2021 -- 2 September 2021 -- Novi Sad -- 179621 en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Fault injection en_US
dc.subject GPGPU programs en_US
dc.subject Redundant execution en_US
dc.subject Soft error reliability en_US
dc.title Evaluating Performance and Reliability of Selective Redundant Multithreading for GPGPU Applications en_US
dc.type Conference Object en_US
dspace.entity.type Publication
gdc.author.scopusid 57727235800
gdc.author.scopusid 57236778300
gdc.author.scopusid 37097877800
gdc.coar.access metadata only access
gdc.coar.type text::conference output
gdc.description.department Izmir Institute of Technology en_US
gdc.description.departmenttemp Kaya E., Computer Engineering Department, Izmir Institute of Technology, Izmir, Turkey; Karadaş O.F., Electrical Electronics Engineering Department, Izmir Institute of Technology, Izmir, Turkey; Öz I., Computer Engineering Department, Izmir Institute of Technology, Izmir, Turkey en_US
gdc.description.publicationcategory Conference Item - International - Institutional Faculty Member en_US
gdc.description.scopusquality Q4
gdc.description.volume 3145 en_US
gdc.description.wosquality N/A
gdc.index.type Scopus
gdc.scopus.citedcount 0
relation.isAuthorOfPublication.latestForDiscovery e0de33d0-b187-47e9-bae7-9b17aaabeb67
relation.isOrgUnitOfPublication.latestForDiscovery 9af2b05f-28ac-4003-8abe-a4dfe192da5e

Files

Original bundle

Name: paper01.pdf
Size: 1.26 MB
Format: Adobe Portable Document Format
Description: article