MicroRNA Categorization Using Sequence Motifs and k-mers

dc.contributor.author Yousef, Malik
dc.contributor.author Khalifa, Waleed
dc.contributor.author Acar, İlhan Erkin
dc.contributor.author Allmer, Jens
dc.coverage.doi 10.1186/s12859-017-1584-1
dc.date.accessioned 2017-10-16T08:35:26Z
dc.date.available 2017-10-16T08:35:26Z
dc.date.issued 2017
dc.description.abstract Background: Post-transcriptional gene dysregulation can be a hallmark of diseases like cancer, and microRNAs (miRNAs) play a key role in the modulation of translation efficiency. Known pre-miRNAs are listed in miRBase, and they have been discovered in a variety of organisms ranging from viruses and microbes to eukaryotic organisms. The computational detection of pre-miRNAs is of great interest, and such approaches usually employ machine learning to discriminate between miRNAs and other sequences. Many features have been proposed to describe pre-miRNAs, and we have previously introduced sequence motifs and k-mers as useful ones. There have been reports of xeno-miRNAs detected via next-generation sequencing. However, these may be contamination, and to aid that important decision-making process, we aimed to establish a means to differentiate among pre-miRNAs of different species. Results: To distinguish among species, we used one species' pre-miRNAs as the positive and another species' pre-miRNAs as the negative training and test data to establish machine-learned models based on sequence motifs and k-mers as features. This approach yielded higher accuracy for distantly related species and lower accuracy for more closely related species. Conclusions: We were able to differentiate among species with increasing success as the evolutionary distance increases. This conclusion is supported by previous reports of fast evolutionary changes in miRNAs, since fairly good discrimination was possible even between relatively closely related species. en_US
dc.description.sponsorship Scientific and Technological Research Council of Turkey (113E326); Zefat Academic College en_US
dc.identifier.citation Yousef, M., Khalifa, W., Acar, İ. E., and Allmer, J. (2017). MicroRNA categorization using sequence motifs and k-mers. BMC Bioinformatics, 18(1). doi:10.1186/s12859-017-1584-1 en_US
dc.identifier.doi 10.1186/s12859-017-1584-1 en_US
dc.identifier.issn 1471-2105
dc.identifier.scopus 2-s2.0-85015613147
dc.identifier.uri http://doi.org/10.1186/s12859-017-1584-1
dc.identifier.uri https://hdl.handle.net/11147/6359
dc.language.iso en en_US
dc.publisher BioMed Central Ltd. en_US
dc.relation info:eu-repo/grantAgreement/TUBITAK/EEEAG/113E326 en_US
dc.relation.ispartof BMC Bioinformatics en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Machine learning en_US
dc.subject MicroRNAs en_US
dc.subject MiRNA categorization en_US
dc.subject Sequence motifs en_US
dc.subject Differentiate miRNAs among species en_US
dc.title MicroRNA Categorization Using Sequence Motifs and k-mers en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.institutional Acar, İlhan Erkin
gdc.author.institutional Allmer, Jens
gdc.author.yokid 107974
gdc.bip.impulseclass C4
gdc.bip.influenceclass C4
gdc.bip.popularityclass C4
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department İzmir Institute of Technology. Molecular Biology and Genetics en_US
gdc.description.issue 1 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q2
gdc.description.volume 18 en_US
gdc.description.wosquality Q1
gdc.identifier.openalex W2602538792
gdc.identifier.pmid 28292266
gdc.identifier.wos WOS:000397508500004
gdc.index.type WoS
gdc.index.type Scopus
gdc.index.type PubMed
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.downloads 0
gdc.oaire.impulse 13.0
gdc.oaire.influence 3.628509E-9
gdc.oaire.isgreen true
gdc.oaire.keywords MiRNA categorization
gdc.oaire.keywords Base Sequence
gdc.oaire.keywords High-Throughput Nucleotide Sequencing
gdc.oaire.keywords Fabaceae
gdc.oaire.keywords Biochemistry
gdc.oaire.keywords Differentiate miRNAs among species
gdc.oaire.keywords Computer Science Applications
gdc.oaire.keywords Sequence motifs
gdc.oaire.keywords MicroRNAs
gdc.oaire.keywords Machine learning
gdc.oaire.keywords RNA Precursors
gdc.oaire.keywords Animals
gdc.oaire.keywords Humans
gdc.oaire.keywords Molecular Biology
gdc.oaire.keywords Phylogeny
gdc.oaire.keywords Research Article
gdc.oaire.popularity 1.3956782E-8
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0301 basic medicine
gdc.oaire.sciencefields 0206 medical engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.oaire.sciencefields 03 medical and health sciences
gdc.oaire.views 2
gdc.openalex.collaboration International
gdc.openalex.fwci 2.00422064
gdc.openalex.normalizedpercentile 0.83
gdc.opencitations.count 26
gdc.plumx.crossrefcites 8
gdc.plumx.mendeley 59
gdc.plumx.pubmedcites 9
gdc.plumx.scopuscites 25
gdc.scopus.citedcount 25
gdc.wos.citedcount 20
relation.isAuthorOfPublication.latestForDiscovery bf9f97a4-6d62-49cd-a7c8-1bc8463d14d2
relation.isOrgUnitOfPublication.latestForDiscovery 9af2b05f-28ac-4003-8abe-a4dfe192da5e

Files

Original bundle

Name: 6359.pdf
Size: 773.31 KB
Format: Adobe Portable Document Format
Description: Article
