Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/11147/7148

Search Results

Now showing 1 - 10 of 10

Citation - Scopus: 1
A Novel Feature To Predict Buggy Changes in a Software System
(Springer, 2022) Yılmaz, Rahime; Nalçakan, Yağız; Haktanır, Elif
Researchers have successfully implemented machine learning classifiers to predict bugs in a change file for years. Change classification focuses on determining if a new software change is clean or buggy. In the literature, several bug prediction methods at change level have been proposed to improve software reliability. This paper proposes a classification-based bug prediction model. Four supervised machine learning classifiers (Support Vector Machine, Decision Tree, Random Forest, and Naive Bayes) are applied to predict the bugs in software changes, and the performance of these four classifiers is characterized. We considered a public dataset and downloaded the corresponding source code and its metrics. Thereafter, we produced new software metrics by analyzing source code at class level and unified these metrics with the existing set. We obtained a new dataset to apply machine learning algorithms and compared the bug prediction accuracy of the newly defined metrics. Results showed that our merged dataset is practical for bug prediction based experiments. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Citation - Scopus: 13
Feature Selection for MicroRNA Target Prediction: Comparison of One-Class Feature Selection Methodologies
(Hindawi Publishing Corporation, 2016) Yousef, Malik; Allmer, Jens; Khalifa, Waleed
Traditionally, machine learning algorithms build classification models from positive and negative examples. Recently, one-class classification (OCC) has received increasing attention in machine learning for problems where the negative class cannot be defined unambiguously. This is specifically problematic in bioinformatics since for some important biological problems the target class (positive class) is easy to obtain while the negative one cannot be measured. Artificially generating the negative class data can be based on unreliable assumptions. Several studies have applied two-class machine learning to predict microRNAs (miRNAs) and their targets. Different approaches for the generation of an artificial negative class have been applied, but may lead to a biased performance estimate. Feature selection has been well studied for the two-class classification problem, while fewer methods are available for feature selection with respect to OCC. In this study, we present a feature selection approach for applying one-class classification to the prediction of miRNA targets. A comparison between one-class and two-class approaches is presented to highlight that their performance is similar, while one-class classification is not based on questionable artificial data for training and performance evaluation. We further show that the feature selection method we tried works to a degree but needs improvement in the future. Perhaps it could be combined with other approaches.
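To make the idea of training from positive examples only more concrete, the following is a minimal sketch of a one-class classifier. It is not the method used in the paper: the class name CentroidOneClass, the centroid-distance rule, and the 0.95 quantile are all assumptions chosen for illustration. The model is fitted on positive examples alone and accepts a new sample if it lies within a distance threshold of their centroid.

```python
import math
from typing import List, Sequence


class CentroidOneClass:
    """Illustrative one-class classifier (hypothetical, not the paper's method).

    It learns the centroid of the positive examples and accepts any sample
    whose Euclidean distance to that centroid is within a threshold derived
    from the training distances.
    """

    def __init__(self, quantile: float = 0.95) -> None:
        # Fraction of training positives that should fall inside the boundary.
        self.quantile = quantile
        self.centroid: List[float] = []
        self.threshold = 0.0

    def fit(self, positives: Sequence[Sequence[float]]) -> "CentroidOneClass":
        n = len(positives)
        dim = len(positives[0])
        # Mean of each feature over the positive class only.
        self.centroid = [sum(x[i] for x in positives) / n for i in range(dim)]
        distances = sorted(math.dist(x, self.centroid) for x in positives)
        # Index of the distance below which `quantile` of the positives lie.
        idx = max(0, min(n - 1, math.ceil(self.quantile * n) - 1))
        self.threshold = distances[idx]
        return self

    def predict(self, sample: Sequence[float]) -> bool:
        """Return True if the sample is accepted as a member of the target class."""
        return math.dist(sample, self.centroid) <= self.threshold


# No negative examples are needed for training.
model = CentroidOneClass().fit([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
near = model.predict((0.5, 0.5))   # close to the positive cluster
far = model.predict((5.0, 5.0))    # far from the positive cluster
```

Because no artificial negative class is generated, the decision boundary depends only on the measured positives, which is the property the abstract emphasizes.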
Citation - Scopus: 19
Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants
(Hindawi Publishing Corporation, 2016) Yousef, Malik; Demirci, Müşerref Duygu Saçar; Khalifa, Waleed; Allmer, Jens
MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently, computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems advisable to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of 95.6%, thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
Citation - WoS: 299
Citation - Scopus: 406
Introduction To Machine Learning
(Humana Press, 2014) Baştanlar, Yalın; Özuysal, Mustafa
The machine learning field, which can be briefly defined as enabling computers to make successful predictions using past experiences, has exhibited an impressive development recently with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning approaches for this application area. In this chapter, we first review the fundamental concepts of machine learning such as feature assessment, unsupervised versus supervised learning and types of classification. Then, we point out the main issues of designing machine learning experiments and their performance evaluation. Finally, we introduce some supervised learning methods.
Citation - WoS: 30
Machine Learning Methods for MicroRNA Gene Prediction
(Humana Press, 2014) Saçar, Müşerref Duygu; Allmer, Jens
MicroRNAs (miRNAs) are single-stranded, small, noncoding RNAs of about 22 nucleotides in length, which control gene expression at the posttranscriptional level through translational inhibition, degradation, adenylation, or destabilization of their target mRNAs. Although hundreds of miRNAs have been identified in various species, many more may still remain unknown. Therefore, discovery of new miRNA genes is an important step for understanding miRNA-mediated posttranscriptional regulation mechanisms. It seems that biological approaches to identify miRNA genes might be limited in their ability to detect rare miRNAs and are further limited to the tissues examined and the developmental stage of the organism under examination. These limitations have led to the development of sophisticated computational approaches attempting to identify possible miRNAs in silico. In this chapter, we discuss computational problems in miRNA prediction studies and review some of the many machine learning methods that have been tried to address the issues.
Citation - Scopus: 19
Data Mining for MicroRNA Gene Prediction: On the Impact of Class Imbalance and Feature Number for MicroRNA Gene Prediction
(Institute of Electrical and Electronics Engineers Inc., 2013) Saçar, Müşerref Duygu; Allmer, Jens
MicroRNAs (miRNAs) are small, non-coding RNAs which are involved in the posttranscriptional modulation of gene expression. Their short (18-24 nucleotides), single-stranded mature sequences are involved in targeting specific genes. It turns out that experimental methods are limited and that it is difficult, if not impossible, to establish all miRNAs and their targets experimentally. Therefore, many tools for the prediction of miRNA genes and miRNA targets have been proposed. Most of these tools are based on machine learning methods and within that area mostly two-class classification is employed. Unfortunately, truly negative data is impossible to attain and only approximations of negative data are currently available. Also, we recently showed that the available positive data is not flawless. Here we investigate the impact of class imbalance on the learner accuracy and find that there is a difference of up to 50% between the best and worst precision and recall values. In addition, we looked at an increasing number of features and found a curve maximizing at 0.97 recall and 0.91 precision, with quickly decaying performance after inclusion of more than 100 features. © 2013 IEEE.
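A small worked example may help show why class imbalance can move precision so strongly while recall stays fixed. This is a sketch under stated assumptions, not a reproduction of the study: the true-positive rate (0.9), false-positive rate (0.05), and class sizes are hypothetical values chosen only to illustrate the arithmetic.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) computed from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def expected_counts(n_pos: int, n_neg: int, tpr: float, fpr: float) -> Tuple[int, int, int]:
    """Expected (tp, fp, fn) for a classifier with fixed per-class error rates."""
    tp = round(n_pos * tpr)
    fn = n_pos - tp
    fp = round(n_neg * fpr)
    return tp, fp, fn


# Hypothetical classifier quality, held constant across all scenarios.
TPR, FPR = 0.9, 0.05
N_POS = 1000

results = {}
for ratio in (1, 10, 100):  # negatives per positive
    tp, fp, fn = expected_counts(N_POS, N_POS * ratio, TPR, FPR)
    results[ratio] = precision_recall(tp, fp, fn)
    print(f"neg:pos = {ratio}:1  precision = {results[ratio][0]:.3f}  "
          f"recall = {results[ratio][1]:.3f}")
```

With the same classifier, recall is unchanged in every scenario, whereas precision falls as the number of negatives grows, because false positives scale with the size of the negative class.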
Citation - WoS: 18
Citation - Scopus: 25
Adaptation and Use of Subjectivity Lexicons for Domain Dependent Sentiment Classification
(Institute of Electrical and Electronics Engineers Inc., 2012) Dehkharghani, Rahim; Yanıkoğlu, Berrin; Tapucu, Dilek; Saygın, Yücel
Sentiment analysis refers to the automatic extraction of sentiments from a natural language text. We study the effect of subjectivity-based features on sentiment classification on two lexicons and also propose new subjectivity-based features for sentiment classification. The subjectivity-based features we experiment with are based on the average word polarity, and the new features that we propose are based on the occurrence of subjective words in review texts. Experimental results on hotel and movie reviews show an overall accuracy of about 84% and 71% in the hotel and movie review domains, respectively, improving the baseline that uses just the average word polarities by about 2 percentage points. © 2012 IEEE.
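The two kinds of features mentioned above can be sketched as follows. This is an illustrative example only: TOY_LEXICON and its polarity values are invented, the tokenizer is deliberately simple, and subjective_word_ratio is one assumed way of encoding "occurrence of subjective words", not necessarily the feature definition used in the paper.

```python
import re
from typing import Dict, List

# Hypothetical toy lexicon: word -> polarity in [-1, 1]. Values are made up.
TOY_LEXICON: Dict[str, float] = {
    "clean": 0.6,
    "friendly": 0.8,
    "great": 0.9,
    "dirty": -0.7,
    "noisy": -0.5,
    "rude": -0.8,
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def average_word_polarity(text: str, lexicon: Dict[str, float]) -> float:
    """Mean polarity of the tokens found in the lexicon (0.0 if none match)."""
    scores = [lexicon[t] for t in tokenize(text) if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def subjective_word_ratio(text: str, lexicon: Dict[str, float]) -> float:
    """Share of tokens that appear in the lexicon (assumed occurrence feature)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in lexicon) / len(tokens)


review = "The room was clean and the staff friendly"
features = [
    average_word_polarity(review, TOY_LEXICON),
    subjective_word_ratio(review, TOY_LEXICON),
]
```

A feature vector like this one would then be passed to a standard classifier; the choice of classifier is independent of how the features are computed.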
Citation - WoS: 22
Citation - Scopus: 29
Learning Domain-Specific Polarity Lexicons
(Institute of Electrical and Electronics Engineers Inc., 2012) Demiröz, Gülşen; Yanıkoğlu, Berrin; Tapucu, Dilek; Saygın, Yücel
Sentiment analysis aims to automatically estimate the sentiment in a given text as positive or negative. Polarity lexicons, often used in sentiment analysis, indicate how positive or negative each term in the lexicon is. However, since creating domain-specific polarity lexicons is expensive and time-consuming, researchers often use a general purpose or domain-independent lexicon. In this work, we address the problem of adapting a general purpose polarity lexicon to a specific domain and propose a simple yet effective adaptation algorithm. We experimented with two sets of reviews from the hotel and movie domains and observed that while our adaptation techniques changed the polarity values for only a small set of words, the overall test accuracy increased significantly: 77% to 83% in the hotel dataset and 61% to 66% in the movie dataset. © 2012 IEEE.
Citation - Scopus: 18
New Features for Sentiment Analysis: Do Sentences Matter?
(CEUR Workshop Proceedings, 2012) Gezici, Gizem; Yanıkoğlu, Berrin; Tapucu, Dilek; Saygın, Yücel
In this work, we propose and evaluate new features to be used in a word polarity based approach to sentiment classification. In particular, we analyze sentences as the first step before estimating the overall review polarity. We consider different aspects of sentences, such as length, purity, irrealis content, subjectivity, and position within the opinionated text. This analysis is then used to find sentences that may convey better information about the overall review polarity. The TripAdvisor dataset is used to evaluate the effect of sentence level features on polarity classification. Our initial results indicate a small improvement in classification accuracy when using the newly proposed features. However, the benefit of these features is not limited to improving sentiment classification accuracy since sentence level features can be used for other important tasks such as review summarization.
Citation - WoS: 2
Citation - Scopus: 5
Machine Learning Based Learner Modeling for Adaptive Web-Based Learning
(Springer Verlag, 2007) Aslan, Burak Galip; İnceoğlu, Mustafa Murat
Especially in the first decade of this century, learner-adapted interaction and learner modeling are becoming more important in the area of web-based learning systems. The complicated nature of the problem is a serious challenge, given the vast amount of data available about the learners. Machine learning approaches have been used effectively in both user modeling and learner modeling implementations. This paper explains recent studies on the challenges and solutions in learner modeling and proposes a learner modeling framework to be used in a web-based learning system. The proposed system adopts a hybrid approach combining three machine learning techniques in three stages.