Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/11147/7148

Search Results

Now showing 1 - 8 of 8
  • Article
    Vision-Language Model Approach for Few-Shot Learning of Attention Deficit Hyperactivity Disorder Using EEG Connectivity-Based Featured Images
    (IOP Publishing Ltd, 2025) Catal, Mehmet Sergen; Gumus, Abdurrahman; Karabiber Cura, Ozlem; Aydin, Ocan; Zubeyir Unlu, Mehmet
    Traditional medical diagnosis approaches have predominantly relied on single-modality analysis, limiting clinicians to interpreting isolated data streams such as images or time series. The integration of vision-language models (VLMs) into neurophysiological analysis represents a paradigm shift toward multimodal diagnostic frameworks, enabling clinicians to interact with diagnosis models through diverse modalities, including text, audio, and visual inputs. This multimodal interaction capability extends beyond conventional label-based classification, offering clinicians flexibility in diagnostic reasoning and decision-making. Building on this foundation, this study explores the application of VLMs to electroencephalography (EEG)-based attention deficit hyperactivity disorder (ADHD) classification, addressing a gap in neurophysiological diagnostics. The proposed framework performs VLM-based few-shot ADHD classification by converting raw EEG data into EEG connectivity-based featured images compatible with the image encoder of contrastive language-image pre-training (CLIP). The adapter-based CLIP approach (Tip-Adapter and Tip-Adapter-F) for few-shot learning improves CLIP's zero-shot classification performance, achieving 78.73% accuracy with 1-shot and 98.30% accuracy with 128-shot learning using the RN50x16 backbone. Experiments investigate prompt engineering effects, CLIP backbone architectures, patient-based classification, and combinations of EEG connectivity features. A comparative analysis with two datasets evaluates the approach across different data sources. By adapting pre-trained VLMs to neurophysiological data, this technique demonstrates the potential of multimodal diagnostic frameworks that enable flexible clinician-model interactions beyond conventional label-based classification systems.
The approach achieves effective ADHD classification with minimal training data while establishing foundations for applying VLMs in clinical neuroscience, where diverse modality interactions through text, visual, and audio inputs can enhance diagnostic workflows. The code is publicly available on GitHub to facilitate further research in the field: https://github.com/miralab-ai/vlm-few-shot-eeg.
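    The cache-based Tip-Adapter mechanism referenced in the abstract can be sketched as follows. This is a minimal NumPy illustration of the published training-free Tip-Adapter formulation, not the study's actual implementation (which is linked above); the feature dimension, `alpha`, and `beta` values are placeholders.

    ```python
    import numpy as np

    def tip_adapter_logits(query, cache_keys, cache_values, text_weights,
                           alpha=1.0, beta=5.5):
        """Tip-Adapter-style few-shot logits (training-free variant).

        query        : (d,)   L2-normalized image feature of the test sample
        cache_keys   : (N, d) L2-normalized features of the N few-shot supports
        cache_values : (N, C) one-hot labels of the supports
        text_weights : (C, d) L2-normalized class prompt embeddings (zero-shot head)
        """
        # Affinity between the test feature and each cached support feature
        affinity = np.exp(-beta * (1.0 - cache_keys @ query))   # (N,)
        # Blend the cache vote with CLIP's zero-shot prediction
        cache_logits = affinity @ cache_values                  # (C,)
        zero_shot_logits = text_weights @ query                 # (C,)
        return zero_shot_logits + alpha * cache_logits
    ```

    In the learnable Tip-Adapter-F variant, `cache_keys` would additionally be fine-tuned as the weights of a linear layer.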
  • Conference Object
    Semantic Guided Autoregressive Diffusion Based Data Augmentation Using Visual Instructions
    (Institute of Electrical and Electronics Engineers Inc., 2025) Yavuzcan, Ege; Kus, Omer; Gumus, Abdurrahman
    Recent breakthroughs in generative image models, especially those based on diffusion techniques, have radically transformed the landscape of text-guided image synthesis by delivering exceptional fidelity and detailed semantic control. In this study, we present an iterative editing framework that harnesses the inherent strengths of these generative models to progressively refine images with precision. Our approach begins by generating diverse textual descriptions from an initial image, from which the most effective prompt is selected to drive further refinement through a fine-tuned Stable Diffusion process. This pipeline, as detailed in our flow diagram, orchestrates a series of controlled image modifications that preserve the original context while accommodating deliberate stylistic and semantic adjustments. By cycling the augmented output back into the system, our method achieves a harmonious balance between innovation and consistency, paving the way for high-quality, context-aware visual transformations. This dynamic, auto-regressive strategy underscores the transformative potential of modern image generation models for applications that require detailed, controlled creative expression. The code is available on GitHub.
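    The "most effective prompt" selection step described above can be sketched as a similarity ranking between the source image and the candidate captions. In practice the embeddings would come from an encoder such as CLIP; here they are plain arrays, and the function name is a hypothetical stand-in rather than the study's actual code.

    ```python
    import numpy as np

    def select_best_prompt(image_emb, prompt_embs, prompts):
        """Pick the caption whose embedding best matches the image.

        image_emb   : (d,)   embedding of the source image
        prompt_embs : (K, d) embeddings of K candidate textual descriptions
        prompts     : list of the K caption strings
        """
        # Cosine similarity after L2 normalization
        img = image_emb / np.linalg.norm(image_emb)
        txt = prompt_embs / np.linalg.norm(prompt_embs, axis=1, keepdims=True)
        scores = txt @ img
        return prompts[int(np.argmax(scores))], scores
    ```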
  • Conference Object
    Iterative Semantic Refinement: A Vision Language Model-Driven Approach to Auto-Regressive Image Editing
    (Institute of Electrical and Electronics Engineers Inc., 2025) Yavuzcan, Ege; Kus, Omer; Gumus, Abdurrahman
    Recent advancements in Visual Language Models (VLMs) have significantly improved text-to-image generation by enabling more nuanced and semantically rich textual prompts, highlighting the transformative impact of these models on image synthesis. In this work, we leverage these robust capabilities to develop an auto-regressive editing framework that systematically refines images through careful, step-by-step modifications. Our method carefully balances subtle adjustments with meaningful semantic shifts, ensuring that each editing stage preserves the core context while introducing precise variations. By integrating improvements from controllable image editing models, we enhance the precision and stability of our edits and demonstrate the effectiveness of our approach in maintaining visual coherence. This integration results in a powerful strategy for producing diverse, high-quality outputs that align with finely tuned semantic goals. Centered on the strength of VLMs, this framework opens up a new paradigm for image synthesis, offering a blend of creative flexibility and consistent contextual fidelity that holds promise for a variety of applications requiring intricate and controlled visual transformations.
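    The auto-regressive loop described in this abstract (describe, select a prompt, edit, feed the result back) can be sketched generically. The three callables stand in for the VLM captioner, the prompt selector, and the controllable editing model; they are illustrative placeholders, not the paper's components.

    ```python
    def autoregressive_edit(image, rounds, describe, choose, edit):
        """Iteratively refine an image: describe -> choose prompt -> edit -> repeat.

        `describe`, `choose`, and `edit` are caller-supplied callables standing
        in for the VLM captioner, prompt selector, and editing model.
        Returns the final image and the (prompt, image) history of each round.
        """
        history = []
        for _ in range(rounds):
            captions = describe(image)          # VLM generates candidate prompts
            prompt = choose(image, captions)    # keep the most faithful one
            image = edit(image, prompt)         # controlled semantic edit
            history.append((prompt, image))     # output is fed back next round
        return image, history
    ```

    Keeping the per-round history makes it easy to stop early or roll back when an edit drifts too far from the original context.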
  • Article
    Vision Transformers-Based Deep Feature Generation Framework for Hydatid Cyst Classification in Computed Tomography Images
    (Springer, 2025) Sagik, Metin; Gumus, Abdurrahman
    Hydatid cysts, caused by Echinococcus granulosus, form progressively enlarging fluid-filled cysts in organs like the liver and lungs, posing significant public health risks through severe complications or death. This study presents a novel deep feature generation framework utilizing vision transformer models (ViT-DFG) to enhance the classification accuracy of hydatid cyst types. The proposed framework consists of four phases: image preprocessing, feature extraction using vision transformer models, feature selection through iterative neighborhood component analysis, and classification, where the performance of the ViT-DFG model was evaluated and compared across different classifiers such as k-nearest neighbor and multi-layer perceptron (MLP). Both classifiers were evaluated independently to assess classification performance from different approaches. The dataset, comprising five cyst types, was analyzed for both five-class and three-class classification by grouping the cyst types into active, transition, and inactive categories. Experimental results showed that the proposed ViT-DFG method achieves higher accuracy than existing methods. Specifically, the ViT-DFG framework attained an overall classification accuracy of 98.10% for the three-class and 95.12% for the five-class classification using 5-fold cross-validation. Statistical analysis through one-way analysis of variance (ANOVA), conducted to evaluate significant differences between models, confirmed significant differences between the proposed framework and individual vision transformer models (p < 0.05).
These results highlight the effectiveness of combining multiple vision transformer architectures with advanced feature selection techniques in improving classification performance. The findings underscore the ViT-DFG framework's potential to advance medical image analysis, particularly in hydatid cyst classification, while offering clinical promise through automated diagnostics and improved decision-making.
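    The rank-then-sweep pattern behind the feature selection phase can be sketched in NumPy. This is a simplified stand-in: feature ranking here uses a Fisher-style variance ratio rather than actual NCA weights, and a leave-one-out 1-NN score replaces the study's classifiers and 5-fold protocol.

    ```python
    import numpy as np

    def rank_features(X, y):
        """Rank features by between/within class variance ratio
        (a simplified proxy for NCA-derived feature weights)."""
        overall = X.mean(axis=0)
        between = np.zeros(X.shape[1])
        within = np.zeros(X.shape[1])
        for c in np.unique(y):
            Xc = X[y == c]
            between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
            within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
        return np.argsort(-(between / (within + 1e-12)))

    def loo_1nn_accuracy(X, y):
        """Leave-one-out 1-nearest-neighbour accuracy."""
        D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
        np.fill_diagonal(D, np.inf)          # a sample may not match itself
        return float((y[np.argmin(D, axis=1)] == y).mean())

    def iterative_select(X, y, max_k=None):
        """Sweep top-k ranked features, keep the best-scoring subset."""
        order = rank_features(X, y)
        max_k = max_k or X.shape[1]
        best_k, best_acc = 1, -1.0
        for k in range(1, max_k + 1):
            acc = loo_1nn_accuracy(X[:, order[:k]], y)
            if acc > best_acc:
                best_k, best_acc = k, acc
        return order[:best_k], best_acc
    ```

    In the actual framework, `X` would hold concatenated deep features from the vision transformer backbones rather than raw measurements.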
  • Article
    Citation - WoS: 2
    Vis-Assist: Computer Vision and Haptic Feedback-Based Wearable Assistive Device for Visually Impaired
    (Springer, 2025) Dede, Ibrahim; Gumus, Abdurrahman
    Visual impairment affects millions of people worldwide, posing significant challenges in their daily lives and personal safety. While assistive technologies, both wearable and non-wearable, can help mitigate these challenges, wearable devices offer the advantage of hands-free operation. In this context, we present Vis-Assist, a novel wearable visual assistive device capable of detecting and classifying objects, measuring their distances, and providing real-time haptic feedback through a vibration motor array, all using an integrated low-cost computational unit without the need for external servers. Our study distinguishes itself by utilizing haptic feedback to convey object information, allowing visually impaired individuals to discern between 19 different object classes following a brief training period. Haptic feedback offers an alternative to audio that doesn't block hearing and can be used alongside it, serving as a complementary solution. The performance of the developed wearable device was evaluated through two types of experiments with four participants. The results demonstrate that users can identify the location of objects and thereby prevent collisions with obstacles. The experiments conducted demonstrate that users, on average, can locate a predefined object, such as a chair, within a 40 m² vacant space in under 94 seconds. Furthermore, users exhibit proficiency in finding objects while navigating around obstacles in the same environment, achieving this task in less than 121 seconds on average. The system developed here has high potential to help the self-navigation of visually impaired people and make their daily lives easier.
To facilitate further research in this field, the complete source code for this study has been made publicly available on GitHub.
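    The detection-to-haptics mapping at the heart of such a device can be sketched as below. The motor count, frame width, and sensing range are illustrative placeholders, not the parameters of the actual Vis-Assist hardware.

    ```python
    def haptic_command(bbox_center_x, frame_width, distance_m,
                       n_motors=4, max_range_m=3.0):
        """Map a detected object to a vibration-motor command.

        The horizontal position of the bounding-box centre selects which
        motor in a linear array fires; vibration intensity grows as the
        measured distance shrinks.
        """
        # Which motor: split the frame into n_motors equal vertical bands
        motor = min(int(bbox_center_x / frame_width * n_motors), n_motors - 1)
        # Intensity in [0, 1]: 1 at the wearer, 0 at or beyond max range
        intensity = max(0.0, 1.0 - min(distance_m, max_range_m) / max_range_m)
        return motor, round(intensity, 2)
    ```

    Encoding the object class on top of this (e.g. as a vibration pattern) is what allows users to discern between classes after training.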
  • Article
    Understanding the Synthesis Mechanism of Arginine Functionalized Silver/Silver Chloride Nanoparticles Using Sugar Ligands
    (Elsevier, 2025) Bolat, Suheda; Degirmenci, Suna; Gumus, Abdurrahman; Sancak, Zafer; Yazgan, Dris
    In this study, we performed a mechanistic study to understand how sugar ligand chemistry affected the morphology, size, and surface chemistry of Ag/AgCl_NPs synthesized in the presence of L-Arginine hydrochloride and an L-Arginine/KCl mixture. The sugar ligands Lactose p-methoxyaniline (LMA) and Galactose 5-aminosalicylic acid (G5AS) resulted in the formation of sheet-like Ag/AgCl_NPs, while Lactose sulfanilic acid (LSA) and Lactose p-sulfonyldianiline (LPSA) caused the formation of anisotropic and film-like Ag/AgCl_NPs. UV-Vis-based mechanistic studies showed that the presence of Arginine had a strong effect on how the G5AS and LMA ligands interact with silver ions, while the effect was more complicated for the LSA and LPSA ligands because they form complexes with Ag+ ions. The mechanism was further investigated using infrared (IR) studies, which showed that increases in Arginine and chloride ion concentrations resulted in differentiation of the surface chemistry of the Ag/AgCl_NPs, and Arginine-related IR bands became clearer when Arginine and the sugar ligands were co-introduced. The characterized nanoparticles were then used as antibacterial agents against multidrug-resistant Escherichia coli species, for which minimum inhibitory concentrations below 10 µM were obtained. The promising antibacterial activity, which could be attributed to the presence of Arginine, was independent of the sugar ligand chemistry and the nanoparticles' morphology and size. In particular, the capacity to form large Ag/AgCl_NP films calls for further research on their use as coating materials for antibacterial applications.
  • Article
    Citation - WoS: 4
    Citation - Scopus: 5
    Diffusion-Based Data Augmentation Methodology for Improved Performance in Ocular Disease Diagnosis Using Retinography Images
    (Springer Heidelberg, 2024) Aktas, Burak; Ates, Doga Deniz; Duzyel, Okan; Gumus, Abdurrahman
    Deep learning models, integral components of contemporary technological landscapes, exhibit enhanced learning capabilities with larger datasets. Traditional data augmentation techniques, while effective in generating new data, have limitations, especially in fields like ocular disease diagnosis. In response, alternative augmentation approaches, including the use of generative AI, have emerged. In our study, we employed a diffusion-based model (Stable Diffusion) on the Ocular Disease Intelligent Recognition dataset to synthesize data by faithfully recreating the crucial vascular structures of the retina, which are vital for detecting eye diseases. Our goal was to augment retinography images for ocular disease diagnosis using diffusion-based models, optimizing the outputs of the fine-tuned Stable Diffusion model and ensuring the generated data closely resembles real-world scenarios. This strategic approach improved the performance of classification models, and the augmentation outperformed traditional methods, with precision ranging from 76.2% to 85% and recall from 75% to 86% in the five-class case. Beyond performance enhancement, we demonstrated that the inclusion of synthetic data, coupled with data reduction using the t-SNE method, effectively addressed dataset imbalance. With the addition of synthetic data, notable increases of 3.4% in precision and 12.8% in recall were observed in the seven-class case. Strategically synthesizing data addressed underrepresented classes, creating a balanced dataset for comprehensive model learning. Beyond the performance improvements, this approach underscores synthetic data's ability to overcome the limitations of traditional methods, particularly in sensitive medical domains like ocular disease diagnosis, ensuring accurate classification.
The codes of the study will be shared on GitHub in a way that benefits everyone interested: https://github.com/miralab-ai/generative-data-augmentation.
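    The balancing logic behind "synthesizing data for underrepresented classes" can be sketched as a simple per-class quota. This is one plausible policy (fill every class up to the size of the largest), not necessarily the study's exact augmentation budget, and the class names below are illustrative.

    ```python
    def synthetic_quota(class_counts):
        """How many synthetic images to generate per class so that every
        class reaches the size of the largest one.

        class_counts : dict mapping class name -> number of real images
        """
        target = max(class_counts.values())
        return {cls: target - n for cls, n in class_counts.items()}
    ```

    The returned quotas would then drive how many samples the fine-tuned diffusion model generates for each class before retraining the classifier.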
  • Conference Object
    Development of Low-Cost Portable Blood Vessel Imaging System
    (IEEE, 2021) Altay, Ayse; Gumus, Abdurrahman
    As an alternative to high-cost near-infrared (NIR) vascular imaging devices on the market [1], a microcomputer-based, real-time, low-cost, non-contact, and safe vascular imaging system has been developed. The higher absorption coefficient of blood compared to skin and fat, as well as the differences between the oxy- and deoxyhemoglobin spectra in blood, make the NIR region well suited for acquiring vessel images. A device using NIR LED light operating at 850 nm was designed from optical and electronic components. Image analysis was performed using OpenCV, an open-source software library, and data visualization libraries. Tests were carried out to determine the optimal imaging conditions for the device. This study presents a portable device design with improved vessel image quality that could potentially assist health professionals in investigating abnormalities in superficial vascular structures at different times during patients' treatments.
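    The core image-analysis idea (veins absorb more 850 nm light, so they appear darker than surrounding tissue) can be sketched as a local adaptive threshold. The study used OpenCV; this is a NumPy-only stand-in for that step, and the `block` and `offset` values are illustrative, not the device's tuned parameters.

    ```python
    import numpy as np

    def enhance_veins(nir_frame, block=15, offset=8):
        """Crude vein map for an 8-bit NIR frame: flag pixels darker than
        their local neighbourhood mean, since blood absorbs more 850 nm
        light than skin or fat."""
        img = nir_frame.astype(np.float64)
        pad = block // 2
        padded = np.pad(img, pad, mode="edge")
        # Integral image gives every block-sized window sum in O(1)
        S = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
        S[1:, 1:] = padded.cumsum(0).cumsum(1)
        win = (S[block:, block:] - S[:-block, block:]
               - S[block:, :-block] + S[:-block, :-block])
        local_mean = win / (block * block)
        # Vein pixels sit clearly below the local background level
        return (img < local_mean - offset).astype(np.uint8) * 255
    ```

    In OpenCV the same step is a single `cv2.adaptiveThreshold` call; it is spelled out here to show where the `block` and `offset` parameters act.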