Molecular Biology and Genetics / Moleküler Biyoloji ve Genetik

Permanent URI for this collection: https://hdl.handle.net/11147/9

Search Results

  • Conference Object
    Citation - WoS: 3
    Citation - Scopus: 8
    Distinguishing Between MicroRNA Targets From Diverse Species Using Sequence Motifs and K-Mers
    (SCITEPRESS, 2017) Yousef, Malik; Khalifa, Waleed; Acar, İlhan Erkin; Allmer, Jens
    A disease phenotype is often due to dysregulation of gene expression. Post-transcriptional regulation of protein abundance by microRNAs (miRNAs) is, therefore, of high importance in, for example, cancer studies. MicroRNAs provide a complementary sequence to their target messenger RNA (mRNA) as part of a complex molecular machinery. Known miRNAs and targets are listed in miRTarBase for a variety of organisms. The experimental detection of such pairs is convoluted and, therefore, their computational detection is desired, although it is complicated by missing negative data. For machine learning, many features for parameterization of the miRNA targets are available, and k-mers and sequence motifs have previously been used. Unrelated organisms like intracellular pathogens and their hosts may communicate via miRNAs and, therefore, we investigated whether miRNA targets from one species can be differentiated from miRNA targets of another. To this end, we employed target information of one species as positive and the other as negative training and testing data. Models of species with higher evolutionary distance generally achieved better results of up to 97% average accuracy (mouse versus Caenorhabditis elegans), while more closely related species did not lead to successful models (human versus mouse; 60%). In the future, when more targeting data becomes available, models can be established that will be able to determine miRNA targets in host-pathogen systems more precisely using this approach.
  • Conference Object
    Citation - Scopus: 13
    Feature Selection for MicroRNA Target Prediction: Comparison of One-Class Feature Selection Methodologies
    (Hindawi Publishing Corporation, 2016) Yousef, Malik; Allmer, Jens; Khalifa, Waleed
    Traditionally, machine learning algorithms build classification models from positive and negative examples. Recently, one-class classification (OCC) has received increasing attention in machine learning for problems where the negative class cannot be defined unambiguously. This is specifically problematic in bioinformatics since, for some important biological problems, the target class (positive class) is easy to obtain while the negative one cannot be measured. Artificially generating the negative class data can be based on unreliable assumptions. Several studies have applied two-class machine learning to predict microRNAs (miRNAs) and their targets. Different approaches for the generation of an artificial negative class have been applied, but may lead to a biased performance estimate. Feature selection has been well studied for the two-class classification problem, while fewer methods are available for feature selection with respect to OCC. In this study, we present a feature selection approach for applying one-class classification to the prediction of miRNA targets. A comparison between one-class and two-class approaches is presented to highlight that their performance is similar, while one-class classification is not based on questionable artificial data for training and performance evaluation. We further show that the feature selection method works to a degree but needs improvement, possibly by combining it with other approaches.
  • Article
    Citation - Scopus: 19
    Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants
    (Hindawi Publishing Corporation, 2016) Yousef, Malik; Demirci, Müşerref Duygu Saçar; Khalifa, Waleed; Allmer, Jens
    MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently, computational miRNA detection is mainly performed using machine learning and, in particular, two-class classification. For machine learning, the miRNAs need to be parametrized, and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems appropriate to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of 95.6%, thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
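
The abstracts above mention k-mers as sequence features for machine learning. As a rough illustration only, the sketch below computes normalized k-mer frequencies for an RNA sequence. It is not the authors' implementation: the function name `kmer_frequencies`, the four-letter `ACGU` alphabet, the conversion of `T` to `U`, and the choice to skip windows containing other characters are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Assumed alphabet for this sketch: the four RNA nucleotides.
ALPHABET = "ACGU"


def kmer_frequencies(sequence, k=2):
    """Return a dict mapping every possible k-mer to its relative frequency.

    Hypothetical helper for illustration; DNA input is converted to RNA,
    and windows containing characters outside ACGU are ignored.
    """
    seq = sequence.upper().replace("T", "U")
    # Slide a window of length k over the sequence.
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    valid = [w for w in windows if set(w) <= set(ALPHABET)]
    counts = Counter(valid)
    total = len(valid)
    # Emit all 4**k k-mers so that every sequence yields a vector of equal length.
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product(ALPHABET, repeat=k)
    }


features = kmer_frequencies("AUGCAUGC", k=2)
print(len(features))   # number of possible 2-mers
print(features["AU"])  # relative frequency of the 2-mer "AU"
```

Because each sequence maps to a fixed-length vector, the resulting dictionaries could be fed to a one-class or two-class classifier, with targets from one species labeled positive and those from another labeled negative, as the first abstract describes.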