Machine-learning-assisted de novo design of molybdenum disulfide binding peptides

Öğüt, Alp Deniz; Yücesoy, Deniz Tanıl; Apaydın, Mehmet Serkan

Files

14984.pdf (3.23 MB)

Date

2024

Authors

Öğüt, Alp Deniz

Yücesoy, Deniz Tanıl

Apaydın, Mehmet Serkan

Publisher

01. Izmir Institute of Technology

Abstract

Peptides, short chains of amino acids, are indispensable molecules for biological processes and high-technology applications. Among their wide range of uses, building bio-nano interfaces through molecular recognition has become a research topic of growing interest. Earlier studies established directed evolution methodologies and made it possible to identify functional peptides that bind to various targets (enzymes, antigens, or inorganic structures), but this traditional approach is weak in scalability and in revealing relationships within sequence space. These weaknesses, together with advances in high-throughput sequencing and computational efficiency, have motivated the development of more powerful technologies such as deep-directed evolution. The large data sets produced by this method have paved the way for modeling sequence-function relationships with machine learning. The aim of this thesis is to build a machine learning workflow suited to these data sets. To this end, the Random Forest algorithm and deep neural networks were used; when the binding score predictions of the trained models were combined, a mean absolute error of 0.0304 and a Pearson correlation coefficient of 0.904 were obtained. Using these models, a strongly binding example peptide was designed through random search and iterative optimization. The findings highlight the role of domain knowledge in training machine learning models, and the sample weights and semantic amino acid vectors used were observed to contribute substantially to performance. This work lays out the key points for building a platform capable of designing peptides with various functions.
Peptides are molecular entities with a diverse set of functionalities vital for biological processes and biotechnological applications. Among their roles, the ability of peptides to bind to solid materials has gathered attention, particularly as building blocks in constructing bio-nano interfaces and molecular linkers. Directed evolution techniques, such as iterative phage display, have emerged as capable tools for identifying peptides and proteins with specific affinities for various targets, despite their constraints, particularly their low-throughput nature. These limits have motivated work on more advanced methodologies such as deep-directed evolution, which integrates high-throughput sequencing. By collecting massive amounts of data, deep-directed evolution provides a broad landscape of sequence information, thus enabling computational modeling and optimization of peptide sequences. This thesis aims to develop machine learning workflows that capture the sequence-function relationship from the data, allowing the design of peptides with desired functionalities. Two machine learning approaches were employed: the Random Forest algorithm (RF) and deep neural networks (DNN). By aggregating binding score predictions from the two models, the predictor achieved a Pearson correlation coefficient of 0.904 and a mean absolute error of 0.0304 on the high-confidence test set and was employed to design a candidate peptide as a proof of principle. Our findings emphasize the importance of including domain knowledge, via peptide abundance weighting and the choice of amino acid encoding, when designing training strategies. The procedures outlined in this work demonstrate key steps towards designing a peptide sequence-function prediction platform with broad implications for bio-nanotechnology and engineering.
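
As a rough illustration of the evaluation described in the abstract, the sketch below combines two models' binding score predictions and computes the Pearson correlation coefficient and mean absolute error. It is not the thesis code: the abstract does not state how the RF and DNN predictions were aggregated, so the unweighted mean used here is an assumption, and the function names and the toy arrays are hypothetical.

```python
import numpy as np


def aggregate_predictions(rf_pred, dnn_pred):
    """Combine two models' predictions by an unweighted mean.

    Assumption: simple averaging; the source does not specify the rule.
    """
    rf_pred = np.asarray(rf_pred, dtype=float)
    dnn_pred = np.asarray(dnn_pred, dtype=float)
    if rf_pred.shape != dnn_pred.shape:
        raise ValueError("prediction arrays must have the same shape")
    return (rf_pred + dnn_pred) / 2.0


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between targets and predictions."""
    yt = np.asarray(y_true, dtype=float)
    yp = np.asarray(y_pred, dtype=float)
    yt = yt - yt.mean()
    yp = yp - yp.mean()
    return float((yt @ yp) / np.sqrt((yt @ yt) * (yp @ yp)))


def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    yt = np.asarray(y_true, dtype=float)
    yp = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(yt - yp)))


# Made-up binding scores for four peptides (illustrative only, not thesis data).
y_true = np.array([0.10, 0.40, 0.35, 0.80])
rf_scores = np.array([0.15, 0.35, 0.30, 0.70])   # hypothetical RF output
dnn_scores = np.array([0.05, 0.45, 0.50, 0.90])  # hypothetical DNN output

ensemble = aggregate_predictions(rf_scores, dnn_scores)
r = pearson_r(y_true, ensemble)
mae = mean_absolute_error(y_true, ensemble)
```

Averaging is only one possible aggregation rule; a weighted mean or a stacked model would fit the same interface by replacing `aggregate_predictions`.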

Description

Includes bibliographical references (leaves 48-56)
Thesis (Master)--İzmir Institute of Technology, Bioengineering, Izmir, 2024
Text in English; Abstract: Turkish and English

ORCID

0009-0004-0577-5195

Keywords

Two-dimensional materials, Peptides, Neural networks, Deep learning, Biomimetic materials, Machine learning methods, Artificial intelligence

End Page

75

URI

https://hdl.handle.net/11147/14984

Collections

Master Degree / Yüksek Lisans Tezleri
