Data Mining for MicroRNA Gene Prediction: On the Impact of Class Imbalance and Feature Number for MicroRNA Gene Prediction
| dc.contributor.author | Saçar, Müşerref Duygu | |
| dc.contributor.author | Allmer, Jens | |
| dc.coverage.doi | 10.1109/HIBIT.2013.6661685 | |
| dc.date.accessioned | 2017-04-17T11:24:11Z | |
| dc.date.available | 2017-04-17T11:24:11Z | |
| dc.date.issued | 2013 | |
| dc.description | 8th International Symposium on Health Informatics and Bioinformatics, HIBIT 2013; Ankara; Turkey; 25 September 2013 through 27 September 2013 | en_US |
| dc.description.abstract | MicroRNAs (miRNAs) are small, non-coding RNAs that are involved in the post-transcriptional modulation of gene expression. Their short (18-24 nucleotide), single-stranded mature sequences are involved in targeting specific genes. Experimental methods are limited, and it is difficult, if not impossible, to establish all miRNAs and their targets experimentally. Therefore, many tools for the prediction of miRNA genes and miRNA targets have been proposed. Most of these tools are based on machine learning methods, and within that area two-class classification is mostly employed. Unfortunately, truly negative data are impossible to obtain, and only approximations of negative data are currently available. We also recently showed that the available positive data are not flawless. Here we investigate the impact of class imbalance on learner accuracy and find a difference of up to 50% between the best and worst precision and recall values. In addition, we examined an increasing number of features and found a performance curve that peaks at 0.97 recall and 0.91 precision and decays quickly once more than 100 features are included. © 2013 IEEE. | en_US |
| dc.identifier.citation | Saçar, M. D., and Allmer, J. (2013, September 25-27). Data mining for microrna gene prediction: On the impact of class imbalance and feature number for microrna gene prediction. Paper presented at the 8th International Symposium on Health Informatics and Bioinformatics. doi:10.1109/HIBIT.2013.6661685 | en_US |
| dc.identifier.doi | 10.1109/HIBIT.2013.6661685 | en_US |
| dc.identifier.isbn | 9781479907014 | |
| dc.identifier.scopus | 2-s2.0-84892650223 | |
| dc.identifier.uri | http://doi.org/10.1109/HIBIT.2013.6661685 | |
| dc.identifier.uri | https://hdl.handle.net/11147/5322 | |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US |
| dc.relation.ispartof | 8th International Symposium on Health Informatics and Bioinformatics, HIBIT 2013 | en_US |
| dc.rights | info:eu-repo/semantics/openAccess | en_US |
| dc.subject | Class imbalance | en_US |
| dc.subject | Data mining | en_US |
| dc.subject | Feature selection | en_US |
| dc.subject | Machine learning | en_US |
| dc.subject | MicroRNAs | en_US |
| dc.subject | MiRNA gene prediction | en_US |
| dc.title | Data Mining for MicroRNA Gene Prediction: On the Impact of Class Imbalance and Feature Number for MicroRNA Gene Prediction | en_US |
| dc.type | Conference Object | en_US |
| dspace.entity.type | Publication | |
| gdc.author.institutional | Saçar, Müşerref Duygu | |
| gdc.author.institutional | Allmer, Jens | |
| gdc.author.yokid | 114170 | |
| gdc.author.yokid | 107974 | |
| gdc.bip.impulseclass | C4 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C4 | |
| gdc.coar.access | open access | |
| gdc.coar.type | text::conference output | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | İzmir Institute of Technology. Molecular Biology and Genetics | en_US |
| gdc.description.endpage | 6 | |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | N/A | |
| gdc.description.startpage | 1 | |
| gdc.description.wosquality | N/A | |
| gdc.identifier.openalex | W2081525771 | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 5.0 | |
| gdc.oaire.influence | 3.352486E-9 | |
| gdc.oaire.isgreen | true | |
| gdc.oaire.keywords | MicroRNAs | |
| gdc.oaire.keywords | Class imbalance | |
| gdc.oaire.keywords | Feature selection | |
| gdc.oaire.keywords | Machine learning | |
| gdc.oaire.keywords | Data mining | |
| gdc.oaire.keywords | MiRNA gene prediction | |
| gdc.oaire.popularity | 5.9006227E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.oaire.sciencefields | 0301 basic medicine | |
| gdc.oaire.sciencefields | 03 medical and health sciences | |
| gdc.oaire.sciencefields | 0206 medical engineering | |
| gdc.oaire.sciencefields | 02 engineering and technology | |
| gdc.openalex.collaboration | National | |
| gdc.openalex.fwci | 1.04579925 | |
| gdc.openalex.normalizedpercentile | 0.78 | |
| gdc.openalex.toppercent | TOP 10% | |
| gdc.opencitations.count | 12 | |
| gdc.plumx.crossrefcites | 1 | |
| gdc.plumx.mendeley | 15 | |
| gdc.plumx.scopuscites | 19 | |
| gdc.scopus.citedcount | 19 | |
| relation.isAuthorOfPublication.latestForDiscovery | bf9f97a4-6d62-49cd-a7c8-1bc8463d14d2 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 9af2b05f-28ac-4013-8abe-a4dfe192da5e |
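
The abstract reports that class imbalance shifts precision and recall substantially. The following is a minimal sketch of why that can happen in a two-class setting, not the authors' method or data: it assumes a hypothetical classifier whose per-class rates (true positive rate 0.9, false positive rate 0.05) stay fixed and varies only the number of negatives per positive. The function names and all numbers are illustrative assumptions.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def metrics_at_ratio(n_pos, neg_per_pos, tpr, fpr):
    """Expected precision and recall for a classifier with fixed
    true positive rate (tpr) and false positive rate (fpr), evaluated
    on n_pos positives and n_pos * neg_per_pos negatives.

    All inputs are hypothetical; none are taken from the paper.
    """
    n_neg = n_pos * neg_per_pos
    tp = tpr * n_pos   # positives correctly predicted
    fn = n_pos - tp    # positives missed
    fp = fpr * n_neg   # negatives wrongly predicted as positive
    return precision_recall(tp, fp, fn)


if __name__ == "__main__":
    # Vary only the class ratio; the classifier itself is unchanged.
    for ratio in (1, 10, 100):
        p, r = metrics_at_ratio(1000, ratio, tpr=0.9, fpr=0.05)
        print(f"negatives per positive = {ratio:>3}: "
              f"precision = {p:.3f}, recall = {r:.3f}")
```

In this sketch recall is unaffected by the class ratio, while precision falls as negatives outnumber positives, because false positives scale with the size of the negative class. Real learners trained on imbalanced data can also lose recall, which this simplified model does not capture.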
