Molecular Biology and Genetics / Moleküler Biyoloji ve Genetik

Permanent URI for this collection: https://hdl.handle.net/11147/9

Search Results

Now showing 1 - 3 of 3
  • Conference Object
    Citation - WoS: 3
    Citation - Scopus: 8
    Distinguishing Between MicroRNA Targets From Diverse Species Using Sequence Motifs and K-Mers
    (SCITEPRESS, 2017) Yousef, Malik; Khalifa, Waleed; Acar, İlhan Erkin; Allmer, Jens
    A disease phenotype is often due to dysregulation of gene expression. Post-transcriptional regulation of protein abundance by microRNAs (miRNAs) is, therefore, of high importance in, for example, cancer studies. MicroRNAs provide a complementary sequence to their target messenger RNA (mRNA) as part of a complex molecular machinery. Known miRNAs and targets are listed in miRTarBase for a variety of organisms. The experimental detection of such pairs is convoluted; their computational detection is therefore desirable, but it is complicated by missing negative data. For machine learning, many features for parameterization of the miRNA targets are available, and k-mers and sequence motifs have previously been used. Unrelated organisms like intracellular pathogens and their hosts may communicate via miRNAs and, therefore, we investigated whether miRNA targets from one species can be differentiated from miRNA targets of another. To this end, we employed target information of one species as positive and the other as negative training and testing data. Models of species with higher evolutionary distance generally achieved better results of up to 97% average accuracy (mouse versus Caenorhabditis elegans) while more closely related species did not lead to successful models (human versus mouse; 60%). In the future, when more targeting data become available, models can be established that determine miRNA targets in host-pathogen systems more precisely using this approach.
  • Article
    Citation - WoS: 20
    Citation - Scopus: 25
    MicroRNA Categorization Using Sequence Motifs and K-Mers
    (BioMed Central Ltd., 2017) Yousef, Malik; Khalifa, Waleed; Acar, İlhan Erkin; Allmer, Jens
    Background: Post-transcriptional gene dysregulation can be a hallmark of diseases like cancer, and microRNAs (miRNAs) play a key role in the modulation of translation efficiency. Known pre-miRNAs are listed in miRBase, and they have been discovered in a variety of organisms ranging from viruses and microbes to eukaryotic organisms. The computational detection of pre-miRNAs is of great interest, and such approaches usually employ machine learning to discriminate between miRNAs and other sequences. Many features have been proposed describing pre-miRNAs, and we have previously introduced the use of sequence motifs and k-mers as useful ones. There have been reports of xeno-miRNAs detected via next generation sequencing. However, they may be contaminations, and to aid in deciding this, we aimed to establish a means to differentiate pre-miRNAs from different species. Results: To achieve distinction into species, we used one species' pre-miRNAs as the positive and another species' pre-miRNAs as the negative training and test data for the establishment of machine-learned models based on sequence motifs and k-mers as features. This approach resulted in higher accuracy values between distantly related species while species with closer relation produced lower accuracy values. Conclusions: We were able to differentiate among species with increasing success as the evolutionary distance increases. This conclusion is supported by previous reports of fast evolutionary changes in miRNAs since even in relatively closely related species a fairly good discrimination was possible.
  • Article
    Citation - WoS: 14
    Citation - Scopus: 12
    The Impact of Feature Selection on One and Two-Class Classification Performance for Plant MicroRNAs
    (PeerJ Inc., 2016) Khalifa, Waleed; Yousef, Malik; Saçar Demirci, Müşerref Duygu; Allmer, Jens
    MicroRNAs (miRNAs) are short nucleotide sequences that form a typical hairpin structure which is recognized by a complex enzyme machinery. This ultimately leads to the incorporation of 18-24 nt long mature miRNAs into RISC, where they act as recognition keys to aid in regulation of target mRNAs. Determining miRNAs experimentally is involved and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although two-class classification (TCC) is generally used in the field, one-class classification (OCC) has been tried for pre-miRNA detection because negative examples are hard to come by. Since both positive and negative examples are currently somewhat limited, feature selection can prove to be vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC, where the best feature selection method achieved an average accuracy of 95.6%, thereby being ~29% better than the worst method, which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection and its largest performance gap is ~13%, which only occurs for two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features.
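
The 2017 abstracts above describe using k-mers as features and labeling one species' sequences as positive and another's as negative. A minimal sketch of that idea follows. It is not the authors' pipeline: the function names `kmer_frequencies` and `build_dataset`, the choice of relative frequencies, and the default `k=2` are assumptions made for illustration, and the sequence motif features and the classifier itself are omitted.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style input is converted below


def kmer_frequencies(sequence, k=2):
    """Return relative frequencies of all 4**k k-mers in a fixed order.

    Hypothetical helper: windows containing other symbols (e.g. N)
    are ignored, which is an assumption of this sketch.
    """
    seq = sequence.upper().replace("T", "U")
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def build_dataset(positive_seqs, negative_seqs, k=2):
    """Label one species' sequences 1 and the other's 0 (hypothetical helper)."""
    X = [kmer_frequencies(s, k) for s in positive_seqs + negative_seqs]
    y = [1] * len(positive_seqs) + [0] * len(negative_seqs)
    return X, y
```

The resulting `X` and `y` could then be passed to any two-class learner; which learner and which value of `k` work best is not stated in the abstracts.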