Computer Engineering / Bilgisayar Mühendisliği
Permanent URI for this collection: https://hdl.handle.net/11147/10
23 results
Search Results
Now showing 1 - 10 of 23
Book Part · Citation - Scopus: 2
Dementia Detection With Deep Networks Using Multi-Modal Image Data (CRC Press, 2023) Yiğit, Altuğ; Işık, Zerrin; Baştanlar, Yalın
Neurodegenerative diseases cause irreversible neural damage in the brain, and by the time the disease is diagnosed, it may have progressed. Although there is no complete treatment for many types of neurodegenerative diseases, detecting the disease in its early stages allows treatments that relieve some symptoms or slow disease progression. Many invasive and non-invasive methods are employed for the diagnosis of dementia. Computer-assisted diagnostic systems make the diagnosis based on volumetric features (structural or functional) or on two-dimensional brain perspectives obtained from a single image modality. This chapter first presents a broad review of multi-modal imaging approaches proposed for dementia diagnosis. It then presents deep neural networks that extract structural and functional features from multi-modal imaging data and are employed to diagnose Alzheimer's disease and mild cognitive impairment. While MRI scans are safer than most types of scans and provide structural information about the human body, PET scans provide information about functional activity in the brain. Thus, the setup was designed to run experiments using both MRI and FDG-PET scans. The performance of the multi-modal models was compared with single-modal solutions. The multi-modal solution outperformed the single-modal ones thanks to its ability to focus on assorted features. © 2023 selection and editorial matter, Jyotismita Chaki; individual chapters, the contributors.

Article · Citation - Scopus: 3
Cut-In Maneuver Detection With Self-Supervised Contrastive Video Representation Learning (Springer, 2023) Nalçakan, Yağız; Baştanlar, Yalın
Detecting the maneuvers of surrounding vehicles is important for autonomous vehicles so that they can act accordingly and avoid possible accidents.
This study proposes a framework based on contrastive representation learning to detect potentially dangerous cut-in maneuvers that can happen in front of the ego vehicle. First, the encoder network is trained in a self-supervised fashion with a contrastive loss, where two augmented versions of the same video clip stay close to each other in the embedding space while augmentations from different videos stay far apart. Since no maneuver labeling is required in this step, a relatively large dataset can be used. After this self-supervised training, the encoder is fine-tuned with our cut-in/lane-pass labeled datasets. Instead of using the original video frames, we simplified the scene by highlighting the surrounding vehicles and the ego lane. We investigated the use of several classification heads, augmentation types, and scene simplification alternatives. The most successful model outperforms the best fully supervised model by ~2%, with an accuracy of 92.52%.

Conference Object · Citation - WoS: 7 · Citation - Scopus: 6
Semantic Pose Verification for Outdoor Visual Localization With Self-Supervised Contrastive Learning (IEEE, 2022) Guerrero, Jose J.; Orhan, Semih; Baştanlar, Yalın
Any city-scale visual localization system has to overcome long-term appearance changes, such as varying illumination conditions or seasonal changes between query and database images. Since semantic content is more robust to such changes, we exploit semantic information to improve visual localization. In our scenario, the database consists of gnomonic views generated from panoramic images (e.g. Google Street View), and query images are collected with a standard field-of-view camera at a different time. To improve localization, we check the semantic similarity between query and database images, which is not trivial since the positions and viewpoints of the cameras do not exactly match.
To learn similarity, we propose training a CNN in a self-supervised fashion with contrastive learning on a dataset of semantically segmented images. Our experiments showed that this semantic similarity estimation approach works better than measuring similarity at the pixel level. Finally, we used the semantic similarity scores to verify the retrievals obtained by a state-of-the-art visual localization method and observed that contrastive learning-based pose verification increases the top-1 recall value to 0.90, which corresponds to a 2% improvement.

Article · Citation - WoS: 8 · Citation - Scopus: 9
Dementia diagnosis by ensemble deep neural networks using FDG-PET scans (Springer, 2022) Yiğit, Altuğ; Baştanlar, Yalın; Işık, Zerrin
Dementia is a type of brain disease that affects mental abilities. Various studies utilize PET features or two-dimensional brain perspectives to diagnose dementia. In this study, we propose an ensemble approach that employs volumetric and axial-perspective features for the diagnosis of Alzheimer's disease and mild cognitive impairment. We employed deep learning models and constructed two disparate networks. The first network evaluates volumetric features, and the second network assesses grid-based brain scan features. The decisions of these networks were combined by an adaptive majority voting algorithm to create an ensemble learner. In our evaluations, we compared the ensemble network with the single networks as well as with feature fusion networks to identify possible improvements; as a result, the ensemble method turned out to be promising for making a diagnostic decision.
The proposed ensemble network achieved an average accuracy of 91.83% for the diagnosis of Alzheimer's disease; to the best of our knowledge, this is the highest diagnosis performance in the literature.

Article · Citation - WoS: 7 · Citation - Scopus: 8
Long-Term Image-Based Vehicle Localization Improved With Learnt Semantic Descriptors (Elsevier, 2022) Çınaroğlu, İbrahim; Baştanlar, Yalın
Vision-based solutions for the localization of vehicles have recently become popular. In this study, we employ an image-retrieval-based visual localization approach, in which database images are kept with GPS coordinates and the location of the retrieved database image serves as the position estimate of the query image in a city-scale driving scenario. Most existing studies of this approach only use descriptors extracted from RGB images and do not exploit semantic content. We show that localization can be improved via descriptors extracted from semantically segmented images, especially when the environment is subject to severe illumination, seasonal, or other long-term changes. We worked on two separate visual localization datasets, one of which (Malaga Streetview Challenge) was generated by us and made publicly available. Following the extraction of semantic labels in images, we trained a CNN model for localization in a weakly supervised fashion with a triplet ranking loss. The optimized semantic descriptor can be used on its own for localization or, preferably, together with a state-of-the-art RGB-image-based descriptor in a hybrid fashion to improve accuracy. Our experiments reveal that the proposed hybrid method increases the localization performance of the standard (RGB-image-based) approach by up to 7.7% in terms of Top-1 Recall.

Article · Citation - WoS: 43 · Citation - Scopus: 47
Semantic Segmentation of Outdoor Panoramic Images (Springer, 2021) Orhan, Semih; Baştanlar, Yalın
Omnidirectional cameras are capable of providing a 360°
field-of-view in a single shot. This comprehensive view makes them preferable for many computer vision applications. An omnidirectional view is generally represented as a panoramic image with equirectangular projection, which suffers from distortions. Thus, standard camera approaches must be mathematically modified to work effectively with panoramic images. In this work, we built a semantic segmentation CNN model that handles the distortions in panoramic images using equirectangular convolutions. The proposed model, which we call UNet-equiconv, outperforms an equivalent CNN model with standard convolutions. To the best of our knowledge, ours is the first work on the semantic segmentation of real outdoor panoramic images. Experimental results reveal that using a distortion-aware CNN with equirectangular convolution increases semantic segmentation performance (a 4% increase in mIoU). We also released a pixel-level annotated outdoor panoramic image dataset which can be used for various computer vision applications such as autonomous driving and visual localization. The source code of the project and the dataset are available at the project page (https://github.com/semihorhan/semseg-outdoor-pano). © 2021, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.

Article · Citation - WoS: 3 · Citation - Scopus: 3
Catadioptric Hyperspectral Imaging, an Unmixing Approach (Institution of Engineering and Technology, 2020) Özışık Başkurt, Didem; Baştanlar, Yalın; Yardımcı Çetin, Yasemin
Hyperspectral imaging systems provide dense spectral information on the scene under investigation by collecting data from a large number of contiguous bands of the electromagnetic spectrum. The low spatial resolution of these sensors frequently gives rise to the mixing problem in remote sensing applications. Several unmixing approaches have been developed to handle the challenging mixing problem on perspective images.
On the other hand, omnidirectional imaging systems provide a 360-degree field of view in a single image at the expense of lower spatial resolution. In this study, we propose a novel imaging system that integrates hyperspectral cameras with mirrors to yield catadioptric omnidirectional imaging systems, benefiting from the advantages of both modes. Catadioptric images, formed by combining a camera with a reflecting device, exhibit radial warping that depends on the structure of the mirror used in the system. This warping causes non-uniformity in the spatial resolution, which further complicates the unmixing problem. In this context, a novel spatial-contextual unmixing algorithm designed specifically for the large field of view of the hyperspectral imaging system is developed. The proposed algorithm is evaluated on various real-world and simulated cases. The experimental results show that the proposed approach outperforms the compared methods.

Conference Object
Vehicle Classification Using Time-Averaged Binary Foreground Images [Zamanda ortalaması alınmış ikili önplan imgeleri kullanarak taşıt sınıflandırması] (IEEE, 2015) Karaimer, Hakkı Can; Baştanlar, Yalın
We describe a shape-based method for the classification of vehicles from omnidirectional videos. Unlike similar approaches, the binary images of vehicles obtained by background subtraction in a sequence of frames are averaged over time. We show with experiments that using the average shape of the object results in more accurate classification than using a single frame. The vehicle types we classify are motorcycle, car, and van. We created an omnidirectional video dataset and repeated the experiments with shuffled train-test sets to ensure randomization.

Article · Citation - WoS: 3 · Citation - Scopus: 4
Affordable person detection in omnidirectional cameras using radial integral channel features (Springer Verlag, 2019) Demiröz, Barış Evrim; Salah, Albert Ali; Baştanlar, Yalın; Akarun, Lale
Omnidirectional cameras cover more ground than perspective cameras, at the expense of resolution.
Their comprehensive field of view makes omnidirectional cameras appealing for security and ambient intelligence applications, and person detection is usually a core part of such applications. Conventional methods fail on omnidirectional images due to their different image geometry and formation. In this study, we propose a method for person detection in omnidirectional images based on the integral channel features approach. Features are extracted from various channels, such as LUV and gradient magnitude, and classified using boosted decision trees. The features are pixel sums inside annular sectors (doughnut-slice shapes) contained within the detection window. We also propose a novel data structure called the radial integral image, which allows sums inside annular sectors to be calculated efficiently. We have shown with experiments that our method outperforms the previous state of the art while using significantly fewer computational resources.

Article · Citation - WoS: 3 · Citation - Scopus: 3
Elimination of Useless Images From Raw Camera-Trap Data (Türkiye Klinikleri Journal of Medical Sciences, 2019) Tekeli, Ulaş; Baştanlar, Yalın
Camera-traps are motion-triggered cameras used to observe animals in nature. The number of images collected from camera-traps has increased significantly with their widening use, thanks to advances in digital technology. Grouping and labeling these images requires a great deal of work from wildlife researchers. We propose a system to reduce the time spent by researchers by eliminating useless images from raw camera-trap data: images that are too bright, too dark, blurred, or that contain no animals. To eliminate bright, dark, and blurred images, we employ techniques based on image histograms and the fast Fourier transform. To eliminate images without animals, we propose a system combining convolutional neural networks and background subtraction.
We experimentally show that the proposed approach keeps 99% of the photos with animals while eliminating more than 50% of the photos without animals. We also present a software prototype that employs the developed algorithms to eliminate useless images.
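The intensity-histogram and Fourier-transform checks mentioned in the last abstract can be sketched as follows. This is a minimal illustration only: the function names and all thresholds are assumptions for the sake of the example, not the values or exact criteria used in the paper.

```python
import numpy as np

def is_too_bright_or_dark(gray, low=0.15, high=0.85):
    # Flag an image whose normalized mean intensity is extreme.
    # `low` and `high` are illustrative thresholds, not the paper's values.
    mean = gray.mean() / 255.0
    return mean < low or mean > high

def is_blurred(gray, cutoff=0.25, min_high_ratio=0.05):
    # Flag a blurred image by the share of 2-D FFT magnitude that lies
    # outside a central low-frequency band; sharp images keep noticeable
    # high-frequency energy. Thresholds are illustrative.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray.astype(float))))
    h, w = gray.shape
    cy, cx = h // 2, w // 2
    ry, rx = int(h * cutoff / 2), int(w * cutoff / 2)
    low_band = spectrum[cy - ry:cy + ry, cx - rx:cx + rx].sum()
    high_ratio = 1.0 - low_band / spectrum.sum()
    return high_ratio < min_high_ratio
```

An image passing both checks would then move on to the animal-presence stage (CNN plus background subtraction) described in the abstract.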
