Regional Soft Error Vulnerability and Error Propagation Analysis for GPGPU Applications

dc.contributor.author Öz, Işıl
dc.contributor.author Karadaş, Ömer Faruk
dc.date.accessioned 2021-11-06T09:48:29Z
dc.date.available 2021-11-06T09:48:29Z
dc.date.issued 2021
dc.description.abstract The wide use of GPUs for general-purpose computation as well as graphics makes soft errors a critical concern. Evaluating the soft error vulnerability of GPGPU programs and employing efficient fault tolerance techniques for more reliable execution are therefore increasingly important. Protecting only the most error-sensitive program regions maintains an acceptable reliability level while avoiding the large performance overheads of redundant operations. Fine-grained regional soft error vulnerability analysis is thus crucial for systems that target both performance and reliability. In this work, we present a regional fault injection framework and perform a detailed error propagation analysis to evaluate the soft error vulnerability of GPGPU applications. We evaluate both intra-kernel and inter-kernel vulnerabilities for a set of programs and quantify the severity of data corruptions using metrics beyond silent data corruption (SDC) rates. Our experimental study demonstrates that code regions inside GPGPU programs differ in their soft error vulnerability and that soft errors corrupting program variables propagate into the program output in several ways. After compiling the observations from our empirical work, we discuss usage scenarios that illustrate the potential impact of our analysis. en_US
dc.description.sponsorship This work was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK), Grant No: 119E011. en_US
dc.identifier.doi 10.1007/s11227-021-04026-6
dc.identifier.issn 0920-8542
dc.identifier.issn 1573-0484
dc.identifier.scopus 2-s2.0-85113722511
dc.identifier.uri https://doi.org/10.1007/s11227-021-04026-6
dc.identifier.uri https://hdl.handle.net/11147/11400
dc.language.iso en en_US
dc.publisher Springer en_US
dc.relation.ispartof Journal of Supercomputing en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Soft error reliability en_US
dc.subject GPGPU programs en_US
dc.subject Fault injection en_US
dc.title Regional Soft Error Vulnerability and Error Propagation Analysis for GPGPU Applications en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id 0000-0002-8310-1143
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C4
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department İzmir Institute of Technology. Computer Engineering en_US
gdc.description.endpage 4130
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q1
gdc.description.startpage 4095
gdc.description.volume 78
gdc.description.wosquality Q2
gdc.identifier.openalex W3196277062
gdc.identifier.wos WOS:000687489300002
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 4.0
gdc.oaire.influence 2.8773148E-9
gdc.oaire.isgreen true
gdc.oaire.popularity 5.6769425E-9
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.openalex.collaboration National
gdc.openalex.fwci 0.45840711
gdc.openalex.normalizedpercentile 0.64
gdc.opencitations.count 4
gdc.plumx.mendeley 9
gdc.plumx.scopuscites 5
gdc.scopus.citedcount 5
gdc.wos.citedcount 5
relation.isAuthorOfPublication.latestForDiscovery e0de33d0-b187-47e9-bae7-9b17aaabeb67
relation.isOrgUnitOfPublication.latestForDiscovery 9af2b05f-28ac-4014-8abe-a4dfe192da5e

Files

Original bundle

Name: 10.1007@s11227-021-04026-6.pdf
Size: 3 MB
Format: Adobe Portable Document Format