Molecular Biology and Genetics / Moleküler Biyoloji ve Genetik

Permanent URI for this collection: https://hdl.handle.net/11147/9

  • Editorial
    Citation - WoS: 2
    Computational miRNomics - Integrative Approaches
    (Informationsmanagement in der Biotechnologie e.V. (IMBio e.V.), 2017) Hofestaedt, Ralf; Schreiber, Falk; Sommer, Bjoern; Allmer, Jens
    With this special issue on Computational miRNomics, we would like to start a new generation of publications in the Journal of Integrative Bioinformatics (JIB). From 2017 onwards, JIB will be published by De Gruyter, one of the largest Open Access publishers in Germany. De Gruyter was established in 1918, with roots reaching even further back, and the JIB editorial board decided that it is the perfect partner to increase the level of professionalism of our publication processing and journal development.
  • Article
    Citation - WoS: 4
    Citation - Scopus: 9
    PGMiner Reloaded, Fully Automated Proteogenomic Annotation Tool Linking Genomes To Proteomes
    (Informationsmanagement in der Biotechnologie e.V. (IMBio e.V.), 2016) Has, Canan; Lashin, Sergey A.; Kochetov, Alexey; Allmer, Jens
    Improvements in genome sequencing technology have increased the availability of full genomes and transcriptomes of many organisms. However, the major benefit of massively parallel sequencing is a better understanding of the organization and function of genes, which in turn leads to an understanding of phenotypes. Several tools are currently available for interpreting genomic data through automated gene annotation. Even though the accuracy of computational gene annotation is increasing, multiple lines of experimental evidence should still be gathered. Mass spectrometry allows the identification and sequencing of proteins, the major gene products, and it is only these proteins that conclusively show whether a part of a genome is a coding region that contributes to a phenotype. Therefore, in the field of proteogenomics, computational methods are validated by exploiting mass spectrometric data. As a result, novel protein coding regions can be identified, current gene models can be validated, and upstream and downstream regions of genes can be determined. In this paper, we present new functionality for our proteogenomic tool, PGMiner, which performs all proteogenomic steps: acquisition of mass spectrometric data, peptide identification against preprocessed sequence databases, assignment of statistical confidence to identified peptides, mapping of confident peptides to gene models, and result visualization. The extensions cover the determination of proteotypic peptides and thus unambiguous protein identification. Furthermore, peptides conflicting with gene models can now be assessed automatically within the context of predicted alternative open reading frames.
  • Article
    Citation - WoS: 7
    Citation - Scopus: 5
    A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-Transcribing Virus Genomes
    (Informationsmanagement in der Biotechnologie e.V. (IMBio e.V.), 2016) Saçar Demirci, Müşerref Duygu; Toprak, Mustafa; Allmer, Jens
    Identification of microRNA (miRNA) precursors has seen increased efforts in recent years. The difficulty of detecting pre-miRNAs experimentally has increased the use of computational approaches. Most of these approaches rely on machine learning, especially classification. To achieve successful classification, many parameters need to be considered, such as data quality, choice of classifier settings, and feature selection. For the latter, we developed a distributed genetic algorithm on HTCondor to perform feature selection. Moreover, we employed two widely used classification algorithms, libSVM and random forest, with different settings to analyze their influence on the overall classification performance. In this study we analyzed 5 human retro virus genomes: Human endogenous retrovirus K113, Hepatitis B virus (strain ayw), Human T lymphotropic virus 1, Human T lymphotropic virus 2, Human immunodeficiency virus 2, and Human immunodeficiency virus 1. We then predicted pre-miRNAs by using the information from known virus and human pre-miRNAs. Our results indicate that these viruses produce novel, previously unknown miRNA precursors, which warrant further experimental validation.
  • Article
    Citation - WoS: 4
    Citation - Scopus: 4
    Improving the Quality of Positive Datasets for the Establishment of Machine Learning Models for Pre-microRNA Detection
    (Informationsmanagement in der Biotechnologie e.V. (IMBio e.V.), 2017) Saçar Demirci, Müşerref Duygu; Allmer, Jens
    MicroRNAs (miRNAs) are involved in the post-transcriptional regulation of protein abundance and thus have a great impact on the resulting phenotype. It is, therefore, no wonder that they have been implicated in many diseases ranging from virus infections to cancer. This impact on the phenotype has led to great interest in establishing the miRNAs of an organism. Experimental methods are complicated, which led to the development of computational methods for pre-miRNA detection. Such methods generally employ machine learning to establish models for the discrimination between miRNAs and other sequences. Positive training data for model establishment, for the most part, stems from miRBase, the miRNA registry. The quality of the entries in miRBase has been questioned, though. This unknown quality led to the development of filtering strategies in attempts to produce high-quality positive datasets, which can lead to a scarcity of positive data. To analyze the quality of filtered data, we developed a machine learning model and found that it is well able to establish data quality based on intrinsic measures. Additionally, we analyzed which features describing pre-miRNAs could discriminate between low- and high-quality data. Both models are applicable to data from miRBase and can be used for establishing high-quality positive data. This will facilitate the development of better miRNA detection tools, which will make the prediction of miRNAs in disease states more accurate. Finally, we applied both models to all miRBase data and provide the list of high-quality hairpins.
  • Article
    Citation - WoS: 25
    Citation - Scopus: 21
    Can miRBase Provide Positive Data for Machine Learning for the Detection of miRNA Hairpins?
    (Informationsmanagement in der Biotechnologie e.V. (IMBio e.V.), 2013) Demirci, Müşerref Duygu Saçar; Hamzeiy, Hamid; Allmer, Jens
    Experimental detection and validation of miRNAs is a tedious, time-consuming, and expensive process. Computational methods for miRNA gene detection are being developed so that the number of candidates that need experimental validation can be reduced to a manageable number. Computational methods involve homology-based and ab initio algorithms. Both approaches depend on positive and negative training examples. Positive examples are usually derived from miRBase, the main resource for experimentally validated miRNAs. We encountered some problems with miRBase which we would like to report here. Among others, the folds presented in miRBase are not always the folds with the minimum free energy, some entries do not seem to conform to expectations of miRNAs, and some external accession numbers are not valid. In addition, we compared the prediction accuracy for the same negative dataset when the positive data came from miRBase or miRTarBase and found that the latter led to more precise prediction models. We suggest that miRBase should introduce some automated facilities for ensuring data quality to overcome these problems.