Soft Error Vulnerability Prediction of GPGPU Applications

Authors

Topçu, Burak
Öz, Işıl

Abstract

As graphics processing units (GPUs) evolve to offer high performance for general-purpose computations in addition to inherently fault-tolerant graphics applications, soft error reliability becomes a significant concern. Fault injection provides a method of evaluating the soft error vulnerability of target programs. Because fault injection experiments on complex GPU hardware structures take an impractical amount of time, prediction-based techniques that evaluate the soft error vulnerability of general-purpose GPU (GPGPU) programs using metrics from different domains become crucial for both HPC developers and GPU vendors. In this work, we propose machine learning (ML)-based prediction frameworks for the soft error vulnerability evaluation of GPGPU programs. We consider program characteristics, hardware usage, and performance metrics collected from simulation and profiling tools. We use regression models to predict the masked fault rates, and we build classification models to specify the vulnerability level of GPGPU programs based on their silent data corruption (SDC) and crash rates. Our prediction models achieve maximum prediction accuracy rates of 95.9%, 88.46%, and 85.7% for masked fault rates, SDCs, and crashes, respectively.
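
The abstract names three fault injection outcomes (masked, SDC, crash) and distinguishes predicting a rate from predicting a vulnerability level. As a rough illustration of those targets only, the Python sketch below computes per-outcome rates from a list of injection results and buckets a rate into a coarse level. The function names, the outcome labels, the sample campaign, and the 0.10/0.30 thresholds are all assumptions made for this example; the record does not state how the authors define their classes or which tools produce the data.

```python
from collections import Counter

# Outcome labels for a single fault injection run (illustrative names).
OUTCOMES = ("masked", "sdc", "crash")


def outcome_rates(outcomes):
    """Return the fraction of runs that ended in each outcome."""
    if not outcomes:
        raise ValueError("need at least one fault injection run")
    counts = Counter(outcomes)
    unknown = set(counts) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    total = len(outcomes)
    return {name: counts[name] / total for name in OUTCOMES}


def vulnerability_level(rate, low=0.10, high=0.30):
    """Map a rate to a coarse class label.

    The thresholds are hypothetical; they are not taken from the paper.
    """
    if rate < low:
        return "low"
    if rate < high:
        return "medium"
    return "high"


# Hypothetical campaign: 20 injections into one kernel.
runs = ["masked"] * 14 + ["sdc"] * 5 + ["crash"] * 1
rates = outcome_rates(runs)
sdc_class = vulnerability_level(rates["sdc"])
crash_class = vulnerability_level(rates["crash"])
```

In a prediction setting, the masked rate would be a regression target and the SDC and crash levels would be classification targets, with program, hardware usage, and performance metrics as the input features.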

Description

This work was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK), Grant No: 119E011.

Keywords

Computer graphics, Computer hardware, Error correction, Graphics processing unit

Fields of Science

0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology

Volume

79

Start Page

6965

End Page

6990