Computer Engineering / Bilgisayar Mühendisliği

Soft Error Vulnerability Prediction of GPGPU Applications
(Springer, 2022) Topçu, Burak; Öz, Işıl
As graphics processing units (GPUs) evolve to offer high performance for general-purpose computations in addition to inherently fault-tolerant graphics applications, soft error reliability becomes a significant concern. Fault injection provides a method of evaluating the soft error vulnerability of target programs. Since performing fault injection experiments for complex GPU hardware structures takes an impractical amount of time, prediction-based techniques that evaluate the soft error vulnerability of general-purpose GPU (GPGPU) programs based on metrics from different domains become crucial for both HPC developers and GPU vendors. In this work, we propose machine learning (ML)-based prediction frameworks for the soft error vulnerability evaluation of GPGPU programs. We consider program characteristics, hardware usage, and performance metrics collected from simulation and profiling tools. While we utilize regression models to predict the masked fault rates, we build classification models to specify the vulnerability level of the GPGPU programs based on their silent data corruption (SDC) and crash rates. Our prediction models achieve maximum prediction accuracy rates of 95.9%, 88.46%, and 85.7% for masked fault rates, SDCs, and crashes, respectively.
Citation - Scopus: 1
Model-Based Ideal Testing of Hardware Description Language (HDL) Programs
(Springer, 2021) Kılınççeker, Onur; Türk, Ercüment; Belli, Fevzi; Challenger, Moharram
An ideal test is supposed to show not only the presence of bugs but also their absence. Based on the Fundamental Test Theory of Goodenough and Gerhart (IEEE Trans Softw Eng SE-1(2):156–173, 1975), this paper proposes an approach to model-based ideal testing of hardware description language (HDL) programs based on their behavioral model. Test sequences are generated from both original (fault-free) and mutant (faulty) models in the sense of positive and negative testing, forming a holistic test view. These test sequences are then executed on original (fault-free) and mutant (faulty) HDL programs, in the sense of mutation testing. Using techniques known from automata theory, test selection criteria are developed and formally shown to fulfill the major requirements of Fundamental Test Theory, that is, reliability and validity. The approach comprises a preparation step (consisting of the sub-steps model construction, model mutation, model conversion, and test generation) and a composition step (consisting of the sub-steps pre-selection and construction of ideal test suites). All the steps are supported by a toolchain that is already implemented and available online. To critically validate the proposed approach, three case studies (a sequence detector, a traffic light controller, and a RISC-V processor) are used, and the strengths and weaknesses of the approach are discussed. The proposed approach achieves the highest mutation score in positive and negative testing for all case studies in comparison with two existing methods (regular expression-based test generation and context-based random test generation), using four different techniques.
Citation - WoS: 5
Citation - Scopus: 5
Regional Soft Error Vulnerability and Error Propagation Analysis for GPGPU Applications
(Springer, 2021) Öz, Işıl; Karadaş, Ömer Faruk
The wide use of GPUs for general-purpose computations as well as graphics programs makes soft errors a critical concern. Evaluating the soft error vulnerability of GPGPU programs and employing efficient fault tolerance techniques for more reliable execution are becoming more important. Protecting only the most error-sensitive program regions maintains an acceptable reliability level while eliminating the large performance overheads due to redundant operations. Therefore, fine-grained regional soft error vulnerability analysis is crucial for systems targeting both performance and reliability. In this work, we present a regional fault injection framework and perform a detailed error propagation analysis to evaluate the soft error vulnerability of GPGPU applications. We evaluate both intra-kernel and inter-kernel vulnerabilities for a set of programs and quantify the severity of the data corruptions by considering metrics other than SDC rates. Our experimental study demonstrates that the code regions inside GPGPU programs exhibit different characteristics in terms of soft error vulnerability and that soft errors corrupting the variables propagate into the program output in several ways. We compile the observations from our empirical work and present the potential impact of our analysis by discussing usage scenarios.
