Quasi-Supervised Strategies for Compound-Protein Interaction Prediction [article]

dc.contributor.author Çakı, Onur
dc.contributor.author Karaçalı, Bilge
dc.date.accessioned 2021-12-14T11:55:19Z
dc.date.available 2021-12-14T11:55:19Z
dc.date.issued 2021
dc.description.abstract In silico compound-protein interaction prediction helps prioritize drug candidates for experimental biochemical validation, because wet-lab experiments are time-consuming, laborious, and costly. Most machine learning methods proposed to that end treat the problem as supervised learning, labeling known interactions as positive and all remaining pairs as negative. However, treating all unknown interactions as negative instances may lead to inaccuracies in practice, since some of the unknown interactions are bound to be positive interactions that have not yet been identified. In this study, we propose to address this problem using the Quasi-Supervised Learning (QSL) algorithm. In this framework, potential interactions are predicted by estimating the overlap between a true positive dataset of compound-protein pairs with known interactions and an unknown dataset of all the remaining compound-protein pairs. The potential interactions are then identified as those pairs in the unknown dataset that overlap with the interacting pairs in the true positive dataset in terms of the associated similarity structure. We also address the class-imbalance problem by modifying the conventional cost function of the QSL algorithm. Experimental results on GPCR and Nuclear Receptor datasets show that the proposed method can identify actual interactions from all possible combinations. en_US
dc.identifier.doi 10.1002/minf.202100118
dc.identifier.issn 1868-1743 en_US
dc.identifier.issn 1868-1751
dc.identifier.scopus 2-s2.0-85119954344
dc.identifier.uri https://hdl.handle.net/11147/11861
dc.identifier.uri https://doi.org/10.1002/minf.202100118
dc.language.iso en en_US
dc.publisher Wiley-VCH Verlag en_US
dc.relation.ispartof Molecular Informatics en_US
dc.relation.uri https://hdl.handle.net/11147/11684
dc.rights info:eu-repo/semantics/embargoedAccess en_US
dc.subject Machine learning en_US
dc.subject Chemoinformatics en_US
dc.subject Drug discovery en_US
dc.subject Compound similarity en_US
dc.title Quasi-Supervised Strategies for Compound-Protein Interaction Prediction [article] en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id 0000-0002-5068-1356
gdc.author.id 0000-0002-7765-6329
gdc.author.institutional Çakı, Onur
gdc.author.institutional Karaçalı, Bilge
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access embargoed access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.contributor.affiliation Izmir Institute of Technology en_US
gdc.contributor.affiliation Izmir Institute of Technology en_US
gdc.description.department İzmir Institute of Technology. Electrical and Electronics Engineering en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q2
gdc.description.volume 41
gdc.description.wosquality Q1
gdc.identifier.openalex W3216479650
gdc.identifier.pmid 34837345
gdc.identifier.wos WOS:000722820300001
gdc.index.type WoS
gdc.index.type Scopus
gdc.index.type PubMed
gdc.oaire.accesstype BRONZE
gdc.oaire.diamondjournal false
gdc.oaire.impulse 2.0
gdc.oaire.influence 2.7798779E-9
gdc.oaire.isgreen false
gdc.oaire.keywords Machine Learning
gdc.oaire.keywords Proteins
gdc.oaire.keywords Algorithms
gdc.oaire.popularity 3.7766426E-9
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0301 basic medicine
gdc.oaire.sciencefields 0303 health sciences
gdc.oaire.sciencefields 03 medical and health sciences
gdc.openalex.collaboration National
gdc.openalex.fwci 0.32018424
gdc.openalex.normalizedpercentile 0.63
gdc.opencitations.count 3
gdc.plumx.crossrefcites 1
gdc.plumx.facebookshareslikecount 1
gdc.plumx.mendeley 12
gdc.plumx.pubmedcites 1
gdc.plumx.scopuscites 4
gdc.scopus.citedcount 4
gdc.wos.citedcount 3
relation.isAuthorOfPublication.latestForDiscovery a081f8c3-cd7b-40d5-a9ca-74707d1b4dc7
relation.isOrgUnitOfPublication.latestForDiscovery 9af2b05f-28ac-4018-8abe-a4dfe192da5e

Files

Original bundle

Name: Molecular Informatics.pdf
Size: 1.91 MB
Format: Adobe Portable Document Format
Description: Article

License bundle

Name: license.txt
Size: 3.2 KB
Format: Item-specific license agreed upon to submission
Description: