WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/11147/7150
Article
Citation - WoS: 4; Citation - Scopus: 4
Predicting the Soft Error Vulnerability of Parallel Applications Using Machine Learning (Springer, 2021)
Öz, Işıl; Arslan, Sanem
With the widespread use of multicore systems with smaller transistor sizes, soft errors have become an important issue for parallel program execution. Fault injection is a prevalent method to quantify the soft error rates of applications. However, detailed fault injection experiments are very time consuming. Therefore, prediction-based techniques have been proposed to evaluate soft error vulnerability faster. In this work, we present a soft error vulnerability prediction approach for parallel applications using machine learning algorithms. We define a set of features including thread communication, data sharing, parallel programming, and performance characteristics, and train our models based on three ML algorithms. This study is the first to use the parallel programming features, as well as the combination of all features, in vulnerability prediction of parallel programs. We propose two models for soft error vulnerability prediction: (1) a regression model with rigorous feature selection analysis that estimates correct execution rates, and (2) a novel classification model that predicts the vulnerability level of the target programs. We obtain a maximum prediction accuracy of 73.2% for the regression-based model and an F-score of 89% for our classification model.

Article
Citation - WoS: 5; Citation - Scopus: 7
Fast Texture Classification of Denoised SAR Image Patches Using GLCM on Spark (Türkiye Klinikleri Journal of Medical Sciences, 2020)
Özcan, Caner; Ersoy, Okan; Oğul, İskender Ülgen
Classification of a synthetic aperture radar (SAR) image is an essential process for SAR image analysis and interpretation. Recent advances in imaging technologies have allowed data sizes to grow, and a large number of applications in many areas have been generated.
However, analysis of high-resolution SAR images, such as classification, is a time-consuming process, and high-speed algorithms are needed. In this study, high-speed classification of denoised SAR image patches using the Apache Spark cluster-computing framework is presented. Spark is preferred because it is a powerful open-source cluster-computing framework offering fast, easy-to-use, in-memory analytics. Classification of SAR images is realized at the patch level by using the supervised learning algorithms embedded in the Spark machine learning library. The feature vectors used as the classifier input are obtained using the gray-level co-occurrence matrix (GLCM), which is chosen to quantitatively evaluate textural parameters and representations. The SAR image patches used to construct the feature vectors are first passed through a noise reduction algorithm to improve classification accuracy. Experimental studies were carried out using naive Bayes, decision tree, and random forest algorithms to provide comparative results, and significant accuracies were achieved. The results were also compared with a state-of-the-art deep learning method. High-resolution real-world TerraSAR-X SAR images were used as data.

Article
Citation - WoS: 4; Citation - Scopus: 4
Improving the Quality of Positive Datasets for the Establishment of Machine Learning Models for Pre-MicroRNA Detection (Informationsmanagement in der Biotechnologie e.V. (IMBio e.V.), 2017)
Saçar Demirci, Müşerref Duygu; Allmer, Jens
MicroRNAs (miRNAs) are involved in the post-transcriptional regulation of protein abundance and thus have a great impact on the resulting phenotype. It is, therefore, no wonder that they have been implicated in many diseases ranging from virus infections to cancer. This impact on the phenotype leads to a great interest in establishing the miRNAs of an organism. Experimental methods are complicated, which has led to the development of computational methods for pre-miRNA detection.
Such methods generally employ machine learning to establish models for the discrimination between miRNAs and other sequences. Positive training data for model establishment, for the most part, stems from miRBase, the miRNA registry. The quality of the entries in miRBase has been questioned, though. This unknown quality led to the development of filtering strategies in attempts to produce high-quality positive datasets, which can lead to a scarcity of positive data. To analyze the quality of filtered data, we developed a machine learning model and found that it is well able to establish data quality based on intrinsic measures. Additionally, we analyzed which features describing pre-miRNAs could discriminate between low- and high-quality data. Both models are applicable to data from miRBase and can be used for establishing high-quality positive data. This will facilitate the development of better miRNA detection tools, which will make the prediction of miRNAs in disease states more accurate. Finally, we applied both models to all miRBase data and provide the list of high-quality hairpins.

Article
Citation - WoS: 11; Citation - Scopus: 14
Categorization of Species Based on Their MicroRNAs Employing Sequence Motifs, Information-Theoretic Sequence Feature Extraction, and K-Mers (Springer Verlag, 2017)
Yousef, Malik; Nigatu, Dawit; Levy, Dalit; Allmer, Jens; Henkel, Werner
Background: Diseases like cancer can manifest themselves through changes in protein abundance, and microRNAs (miRNAs) play a key role in the modulation of protein quantity. MicroRNAs are used throughout all kingdoms and have been shown to be exploited by viruses to modulate their host environment. Since the experimental detection of miRNAs is difficult, computational methods have been developed. Many such tools employ machine learning for pre-miRNA detection, and many features for miRNA parameterization have been proposed.
To train machine learning models, negative data is of importance yet hard to come by; therefore, we recently started to employ pre-miRNAs from one species as positive data versus another species’ pre-miRNAs as negative examples based on sequence motifs and k-mers. Here, we introduce the additional usage of information-theoretic (IT) features. Results: Pre-miRNAs from one species were used as positive and another species’ pre-miRNAs as negative training data for machine learning. The categorization capability of IT and k-mer features was investigated. Both feature sets and their combinations yielded a very high accuracy, which is as good as that of the previously suggested sequence motif and k-mer based method. However, to obtain high performance, a sufficiently large phylogenetic distance between the species and a sufficiently high number of pre-miRNAs in the training set are required. To examine the contribution of the IT and k-mer features, an information gain-based feature ranking was performed. Although the top 3 are IT features, 80% of the top 100 features are k-mers. The comparison of all three individual approaches (motifs, IT, and k-mers) shows that k-mers alone are sufficient for distinguishing species based on their pre-miRNAs. Conclusions: IT sequence feature extraction enables the distinction among species and is less computationally expensive than motif calculations. However, since IT features need larger amounts of data to have enough statistics for producing highly accurate results, future categorization into species can be effectively done using k-mers only. The biological reasoning for this is the existence of a codon bias between species which can, at least, be observed in exonic miRNAs. Future work in this direction will be the ab initio detection of pre-miRNA. In addition, prediction of pre-miRNA from RNA-seq can be done.
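
The k-mer features referred to in the abstract above can be illustrated with a minimal sketch. The function name `kmer_frequencies`, the default `k=2`, and the four-letter RNA alphabet are assumptions made for illustration only; the abstract does not state which values of k or which normalization the authors used.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=2, alphabet="ACGU"):
    """Return the relative frequency of every possible k-mer in `sequence`.

    Hypothetical helper: DNA-style input is mapped to RNA (T -> U), and
    windows containing characters outside `alphabet` are skipped.
    """
    seq = sequence.upper().replace("T", "U")
    allowed = set(alphabet)
    # Slide a window of length k over the sequence, keeping valid windows only.
    windows = [
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= allowed
    ]
    counts = Counter(windows)
    total = len(windows)
    # Emit a fixed-length feature vector: one entry per possible k-mer.
    features = {}
    for letters in product(alphabet, repeat=k):
        kmer = "".join(letters)
        features[kmer] = counts[kmer] / total if total else 0.0
    return features


if __name__ == "__main__":
    feats = kmer_frequencies("ACGUACGU", k=2)
    print(len(feats), feats["AC"])
```

Because every sequence yields a vector with the same keys in the same order, the values can be passed directly to a standard classifier as a feature row.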
