Identifying Promoter and Enhancer Sequences by Graph Convolutional Networks
| dc.contributor.author | Tenekeci, S. | |
| dc.contributor.author | Tekir, S. | |
| dc.date.accessioned | 2024-05-05T14:59:36Z | |
| dc.date.available | 2024-05-05T14:59:36Z | |
| dc.date.issued | 2024 | |
| dc.description.abstract | Identification of promoters, enhancers, and their interactions helps understand genetic regulation. This study proposes a graph-based semi-supervised learning model (GCN4EPI) for the enhancer-promoter classification problem. We adopt a graph convolutional network (GCN) architecture to integrate interaction information with sequence features. Nodes of the constructed graph hold word embeddings of DNA sequences while edges hold the Enhancer-Promoter Interaction (EPI) information. By means of semi-supervised learning, much less data (16%) and time are needed in model training. Comparisons on a benchmark dataset of six human cell lines show that the proposed approach outperforms the state-of-the-art methods by a large margin (10% higher F1 score) and has the fastest training time (up to 3 times faster). Moreover, GCN4EPI's performance on cross-cell line data is also better than the baselines (3% higher F1 score). Our qualitative analyses with graph explainability models indicate that GCN4EPI learns from both text and graph structure. The results suggest that integrating interaction information with sequence features improves predictive performance and compensates for the number of training instances. © 2024 | en_US |
| dc.identifier.doi | 10.1016/j.compbiolchem.2024.108040 | |
| dc.identifier.issn | 1476-9271 | |
| dc.identifier.scopus | 2-s2.0-85186593399 | |
| dc.identifier.uri | https://doi.org/10.1016/j.compbiolchem.2024.108040 | |
| dc.identifier.uri | https://hdl.handle.net/11147/14420 | |
| dc.language.iso | en | en_US |
| dc.publisher | Elsevier Ltd | en_US |
| dc.relation.ispartof | Computational Biology and Chemistry | en_US |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | Enhancer | en_US |
| dc.subject | Graph convolutional networks | en_US |
| dc.subject | Natural language processing | en_US |
| dc.subject | Promoter | en_US |
| dc.subject | Sequence analysis | en_US |
| dc.title | Identifying Promoter and Enhancer Sequences by Graph Convolutional Networks | en_US |
| dc.type | Article | en_US |
| dspace.entity.type | Publication | |
| gdc.author.scopusid | 57340107000 | |
| gdc.author.scopusid | 16234844500 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C5 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::journal::journal article | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | Izmir Institute of Technology | en_US |
| gdc.description.departmenttemp | Tenekeci S., Department of Computer Engineering, Izmir Institute of Technology, Izmir, 35430, Turkey; Tekir S., Department of Computer Engineering, Izmir Institute of Technology, Izmir, 35430, Turkey | en_US |
| gdc.description.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | Q3 | |
| gdc.description.volume | 110 | en_US |
| gdc.description.wosquality | Q1 | |
| gdc.identifier.openalex | W4392301339 | |
| gdc.identifier.pmid | 38430611 | |
| gdc.identifier.wos | WOS:001209523700001 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 1.0 | |
| gdc.oaire.influence | 2.6398488E-9 | |
| gdc.oaire.isgreen | false | |
| gdc.oaire.keywords | Enhancer Elements, Genetic | |
| gdc.oaire.keywords | Humans | |
| gdc.oaire.keywords | Neural Networks, Computer | |
| gdc.oaire.keywords | Promoter Regions, Genetic | |
| gdc.oaire.popularity | 3.0011467E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.openalex.collaboration | National | |
| gdc.openalex.fwci | 0.96049445 | |
| gdc.openalex.normalizedpercentile | 0.64 | |
| gdc.opencitations.count | 0 | |
| gdc.plumx.mendeley | 6 | |
| gdc.plumx.pubmedcites | 1 | |
| gdc.plumx.scopuscites | 2 | |
| gdc.scopus.citedcount | 2 | |
| gdc.wos.citedcount | 1 | |
| relation.isAuthorOfPublication.latestForDiscovery | 57639474-3954-4f77-a84c-db8a079648a8 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 9af2b05f-28ac-4003-8abe-a4dfe192da5e |
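
The abstract describes a graph whose nodes hold sequence embeddings and whose edges hold enhancer-promoter interactions, trained with labels on only a fraction of the nodes. The sketch below illustrates that general idea with the standard two-layer GCN propagation rule (symmetrically normalized adjacency with self-loops). It is not the authors' GCN4EPI implementation: the toy graph, the random vectors standing in for word embeddings, the layer sizes, and all function names are assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}: add self-loops, then scale by node degree."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gcn_forward(adj, features, w1, w2):
    """Two-layer GCN: softmax(A_norm * relu(A_norm * X * W1) * W2)."""
    a_norm = normalize_adjacency(adj)
    hidden = [[max(v, 0.0) for v in row] for row in matmul(matmul(a_norm, features), w1)]
    logits = matmul(matmul(a_norm, hidden), w2)
    return [softmax(row) for row in logits]


def masked_cross_entropy(probs, labels, labeled_nodes):
    """Average negative log-likelihood over the labeled nodes only (semi-supervised)."""
    return -sum(math.log(probs[i][labels[i]]) for i in labeled_nodes) / len(labeled_nodes)


# Hypothetical toy graph: nodes 0-2 play the role of enhancers, 3-5 promoters.
rng = random.Random(0)
num_nodes, embed_dim, hidden_dim, num_classes = 6, 8, 4, 2

# Random vectors stand in for sequence embeddings (assumption, not real data).
features = [[rng.gauss(0.0, 1.0) for _ in range(embed_dim)] for _ in range(num_nodes)]

# Undirected edges stand in for enhancer-promoter interactions.
adj = [[0.0] * num_nodes for _ in range(num_nodes)]
for i, j in [(0, 3), (1, 3), (2, 4), (2, 5)]:
    adj[i][j] = adj[j][i] = 1.0

# Only two nodes carry labels; the remaining nodes contribute through the graph.
labels = {0: 0, 3: 1}  # 0 = enhancer, 1 = promoter
labeled_nodes = sorted(labels)

# Untrained weights, small random initialization.
w1 = [[rng.gauss(0.0, 0.1) for _ in range(hidden_dim)] for _ in range(embed_dim)]
w2 = [[rng.gauss(0.0, 0.1) for _ in range(num_classes)] for _ in range(hidden_dim)]

probs = gcn_forward(adj, features, w1, w2)
loss = masked_cross_entropy(probs, labels, labeled_nodes)
print(f"class probabilities for node 1: {probs[1]}")
print(f"loss on labeled nodes: {loss:.4f}")
```

Because every node's representation is averaged with those of its neighbors before classification, unlabeled nodes still shape the predictions, which is one way a graph model can work with few labeled instances. Training (gradient updates to `w1` and `w2`) is omitted here; in practice a library such as PyTorch Geometric would normally handle it.
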
Files
Original bundle
- Name: 1-s2.0-S1476927124000288-main.pdf
- Size: 1.46 MB
- Format: Adobe Portable Document Format
- Description: Article
