Vision-Language Model Approach for Few-Shot Learning of Attention Deficit Hyperactivity Disorder Using EEG Connectivity-Based Featured Images

Catal, Mehmet Sergen; Gumus, Abdurrahman; Karabiber Cura, Ozlem; Aydin, Ocan; Zubeyir Unlu, Mehmet

doi:10.1088/2632-2153/ae15e5

dc.contributor.author	Catal, Mehmet Sergen
dc.contributor.author	Gumus, Abdurrahman
dc.contributor.author	Karabiber Cura, Ozlem
dc.contributor.author	Aydin, Ocan
dc.contributor.author	Zubeyir Unlu, Mehmet
dc.date.accessioned	2025-11-25T15:11:00Z
dc.date.available	2025-11-25T15:11:00Z
dc.date.issued	2025
dc.description.abstract	Traditional medical diagnosis approaches have predominantly relied on single-modality analysis, limiting clinicians to interpreting isolated data streams such as images or time series. The integration of vision-language models (VLMs) into neurophysiological analysis represents a shift toward multimodal diagnostic frameworks, enabling clinicians to interact with diagnosis models through diverse modalities, including text, audio, and visual inputs. This multimodal interaction capability extends beyond conventional label-based classification, offering clinicians flexibility in diagnostic reasoning and decision-making. Building on this foundation, this study explores the application of VLMs to electroencephalography (EEG)-based attention deficit hyperactivity disorder (ADHD) classification, addressing a gap in neurophysiological diagnostics. The proposed framework performs VLM-based few-shot ADHD classification by converting raw EEG data into EEG connectivity-based featured images compatible with the image encoder of contrastive language-image pre-training (CLIP). The adapter-based CLIP approach (Tip-Adapter and Tip-Adapter-F) for few-shot learning improves on CLIP's zero-shot classification performance, achieving 78.73% accuracy with 1-shot and 98.30% accuracy with 128-shot using the RN50x16 backbone. Experiments investigate prompt engineering effects, CLIP backbone architectures, patient-based classification, and combinations of EEG connectivity features. A comparative analysis on two datasets evaluates the approach across different data sources. By adapting pre-trained VLMs to neurophysiological data, the approach achieves effective ADHD classification with minimal training data while establishing foundations for applying VLMs in clinical neuroscience, where interactions through text, visual, and audio inputs can enhance diagnostic workflows. The code is publicly available on GitHub to facilitate further research in the field: https://github.com/miralab-ai/vlm-few-shot-eeg.	en_US
dc.identifier.doi	10.1088/2632-2153/ae15e5
dc.identifier.issn	2632-2153
dc.identifier.scopus	2-s2.0-105020797018
dc.identifier.uri	https://doi.org/10.1088/2632-2153/ae15e5
dc.identifier.uri	https://hdl.handle.net/11147/18657
dc.language.iso	en	en_US
dc.publisher	IOP Publishing Ltd	en_US
dc.relation.ispartof	Machine Learning: Science and Technology	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Vision Language Models	en_US
dc.subject	Few-Shot Learning	en_US
dc.subject	Electroencephalography	en_US
dc.subject	Attention Deficit Hyperactivity Disorder	en_US
dc.subject	Connectivity-Based Features	en_US
dc.title	Vision-Language Model Approach for Few-Shot Learning of Attention Deficit Hyperactivity Disorder Using EEG Connectivity-Based Featured Images
dc.type	Article	en_US
dspace.entity.type	Publication
gdc.author.scopusid	57350690900
gdc.author.scopusid	35315599800
gdc.author.scopusid	57195223021
gdc.author.scopusid	60171574400
gdc.author.scopusid	55411870500
gdc.author.wosid	Gumus, Abdurrahman/KGL-2848-2024
gdc.coar.type	text::journal::journal article
gdc.collaboration.industrial	false
gdc.description.department	İzmir Institute of Technology	en_US
gdc.description.departmenttemp	[Catal, Mehmet Sergen; Aydin, Ocan; Zubeyir Unlu, Mehmet] Izmir Inst Technol, Dept Elect & Elect Engn, Izmir, Turkiye; [Gumus, Abdurrahman] Isparta Univ Appl Sci, Dept Comp Engn, Isparta, Turkiye; [Karabiber Cura, Ozlem] Izmir Katip Celebi Univ, Dept Biomed Engn, Izmir, Turkiye	en_US
gdc.description.issue	4	en_US
gdc.description.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member	en_US
gdc.description.scopusquality	Q2
gdc.description.volume	6	en_US
gdc.description.woscitationindex	Science Citation Index Expanded
gdc.description.wosquality	Q1
gdc.identifier.openalex	W4415391613
gdc.identifier.wos	WOS:001607545900001
gdc.index.type	WoS
gdc.index.type	Scopus
gdc.openalex.fwci	0.0
gdc.openalex.normalizedpercentile	0.41
gdc.opencitations.count	0
gdc.plumx.mendeley	4
gdc.plumx.scopuscites	0
gdc.scopus.citedcount	0
gdc.wos.citedcount	0
relation.isAuthorOfPublication.latestForDiscovery	ce5ce1e2-17ef-4da2-946d-b7a26e44e461
relation.isOrgUnitOfPublication.latestForDiscovery	9af2b05f-28ac-4003-8abe-a4dfe192da5e
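
The abstract names Tip-Adapter as the few-shot adapter placed on top of CLIP. The following is a minimal sketch of the training-free Tip-Adapter scoring rule as it is generally described (a key-value cache of few-shot image features blended with CLIP's zero-shot logits), not the authors' implementation, which is in the linked GitHub repository. Everything here is an assumption made for illustration: the feature vectors are random stand-ins rather than CLIP embeddings of EEG connectivity images, and the function name, the 8-dimensional features, and the values of alpha, beta, and logit_scale are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def tip_adapter_logits(test_feats, cache_keys, cache_labels, text_weights,
                       alpha=1.0, beta=5.5, logit_scale=100.0):
    """Training-free Tip-Adapter style scoring (illustrative sketch).

    test_feats   : (n_test, d)   unit-norm image features to classify
    cache_keys   : (n_shots, d)  unit-norm features of the few-shot images
    cache_labels : (n_shots, c)  one-hot labels of the few-shot images
    text_weights : (c, d)        unit-norm class text embeddings
    alpha, beta, logit_scale are illustrative values, not the paper's settings.
    """
    # Cosine similarity between each test image and each cached few-shot image.
    affinity = test_feats @ cache_keys.T
    # Sharpened similarities vote for the labels of the cached images.
    cache_logits = np.exp(-beta * (1.0 - affinity)) @ cache_labels
    # Zero-shot logits from comparing image features with text embeddings.
    clip_logits = logit_scale * (test_feats @ text_weights.T)
    return clip_logits + alpha * cache_logits


# Synthetic stand-in data: two classes with orthogonal prototype directions.
rng = np.random.default_rng(0)
dim, n_classes, shots, n_test_per_class = 8, 2, 4, 3
prototypes = np.eye(dim)[:n_classes]

# Few-shot cache: noisy copies of each class prototype plus one-hot labels.
cache_class = np.repeat(np.arange(n_classes), shots)
cache_keys = l2_normalize(
    prototypes[cache_class] + 0.05 * rng.standard_normal((n_classes * shots, dim))
)
cache_labels = np.eye(n_classes)[cache_class]

# Test features drawn the same way.
test_labels = np.repeat(np.arange(n_classes), n_test_per_class)
test_feats = l2_normalize(
    prototypes[test_labels] + 0.05 * rng.standard_normal((test_labels.size, dim))
)

# Deliberately uninformative text embeddings (identical for both classes),
# so in this toy setup the few-shot cache alone determines the prediction.
text_weights = l2_normalize(np.ones((n_classes, dim)))

logits = tip_adapter_logits(test_feats, cache_keys, cache_labels, text_weights)
preds = logits.argmax(axis=1)
print(preds, test_labels)
```

In this sketch nothing is trained. Tip-Adapter-F, also named in the abstract, is generally described as additionally fine-tuning the cached keys on the few-shot set; that step is omitted here.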

Collections

WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
