Computer Engineering / Bilgisayar Mühendisliği

Permanent URI for this collection: https://hdl.handle.net/11147/10

Search Results

Now showing 1 - 10 of 20
  • Article
    Citation - Scopus: 3
    Cut-In Maneuver Detection With Self-Supervised Contrastive Video Representation Learning
    (Springer, 2023) Nalçakan, Yağız; Baştanlar, Yalın
    Detecting the maneuvers of surrounding vehicles is important for autonomous vehicles, so that they can act accordingly and avoid possible accidents. This study proposes a framework based on contrastive representation learning to detect potentially dangerous cut-in maneuvers that can happen in front of the ego vehicle. First, the encoder network is trained in a self-supervised fashion with a contrastive loss, where two augmented views of the same video clip stay close to each other in the embedding space, while augmentations from different videos stay far apart. Since no maneuver labeling is required in this step, a relatively large dataset can be used. After this self-supervised training, the encoder is fine-tuned with our cut-in/lane-pass labeled datasets. Instead of using original video frames, we simplified the scene by highlighting surrounding vehicles and the ego-lane. We have investigated the use of several classification heads, augmentation types, and scene simplification alternatives. The most successful model outperforms the best fully supervised model by ~2%, reaching an accuracy of 92.52%.
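The two-view contrastive pre-training step described above can be sketched with an NT-Xent-style loss, in which each clip's two augmented embeddings form the positive pair and all other clips in the batch serve as negatives. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation; the function name and temperature value are hypothetical.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss.

    z1, z2: (N, D) embeddings of two augmented views of the same N clips.
    Positive pairs are (z1[i], z2[i]); all other batch pairs are negatives.
    """
    z = np.concatenate([z1, z2], axis=0)                # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # cosine similarity
    sim = z @ z.T / temperature
    n = z1.shape[0]
    sim[np.eye(2 * n, dtype=bool)] = -np.inf            # drop self-similarity
    # index of each row's positive: i <-> i + n
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

The loss is low when the two views of each clip align in the embedding space and high when positives are farther from the anchor than the negatives.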
  • Conference Object
    Citation - WoS: 7
    Citation - Scopus: 6
    Semantic Pose Verification for Outdoor Visual Localization With Self-Supervised Contrastive Learning
    (IEEE, 2022) Guerrero, Jose J.; Orhan, Semih; Baştanlar, Yalın
    Any city-scale visual localization system has to overcome long-term appearance changes, such as varying illumination conditions or seasonal changes between query and database images. Since semantic content is more robust to such changes, we exploit semantic information to improve visual localization. In our scenario, the database consists of gnomonic views generated from panoramic images (e.g. Google Street View), and query images are collected with a standard field-of-view camera at a different time. To improve localization, we check the semantic similarity between query and database images, which is not trivial since the positions and viewpoints of the cameras do not exactly match. To learn similarity, we propose training a CNN in a self-supervised fashion with contrastive learning on a dataset of semantically segmented images. Experiments showed that this semantic similarity estimation approach works better than measuring similarity at the pixel level. Finally, we used the semantic similarity scores to verify the retrievals obtained by a state-of-the-art visual localization method and observed that contrastive learning-based pose verification increases the top-1 recall to 0.90, which corresponds to a 2% improvement.
  • Article
    Citation - WoS: 8
    Citation - Scopus: 9
    Dementia diagnosis by ensemble deep neural networks using FDG-PET scans
    (Springer, 2022) Yiğit, Altuğ; Baştanlar, Yalın; Işık, Zerrin
    Dementia is a type of brain disease that affects mental abilities. Various studies utilize PET features or two-dimensional brain perspectives to diagnose dementia. In this study, we have proposed an ensemble approach that employs volumetric and axial-perspective features for diagnosing Alzheimer's disease and identifying patients with mild cognitive impairment. We have employed deep learning models and constructed two disparate networks. The first network evaluates volumetric features, and the second network assesses grid-based brain scan features. The decisions of these networks were combined by an adaptive majority voting algorithm to create an ensemble learner. In the evaluations, we compared the ensemble network with single networks as well as feature-fusion networks to identify possible improvement; the ensemble method proved the most promising for making a diagnostic decision. The proposed ensemble network achieved an average accuracy of 91.83% for the diagnosis of Alzheimer's disease; to the best of our knowledge, this is the highest diagnostic performance reported in the literature.
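The decision-combination step can be illustrated with a simple weighted vote. The paper's adaptive majority voting algorithm is not detailed here, so this sketch, weighting each model's vote by a per-model score (e.g. validation accuracy), is an assumption rather than the authors' exact procedure.

```python
def weighted_vote(preds, weights):
    """Combine per-model class predictions with per-model weights.

    preds:   list of predicted class labels, one per model
    weights: matching list of model weights (e.g. validation accuracies)
    Returns the class label with the largest total weight.
    """
    totals = {}
    for label, weight in zip(preds, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)
```

With equal weights this reduces to plain majority voting; unequal weights let a more reliable network break ties in its favor.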
  • Article
    Citation - WoS: 7
    Citation - Scopus: 8
    Long-Term Image-Based Vehicle Localization Improved With Learnt Semantic Descriptors
    (Elsevier, 2022) Çınaroğlu, İbrahim; Baştanlar, Yalın
    Vision-based solutions for the localization of vehicles have become popular recently. In this study, we employ an image-retrieval-based visual localization approach, in which database images are kept with GPS coordinates and the location of the retrieved database image serves as the position estimate of the query image in a city-scale driving scenario. Regarding this approach, most existing studies only use descriptors extracted from RGB images and do not exploit semantic content. We show that localization can be improved via descriptors extracted from semantically segmented images, especially when the environment is subjected to severe illumination, seasonal, or other long-term changes. We worked on two separate visual localization datasets, one of which (Malaga Streetview Challenge) has been generated by us and made publicly available. Following the extraction of semantic labels in images, we trained a CNN model for localization in a weakly supervised fashion with a triplet ranking loss. The optimized semantic descriptor can be used on its own for localization or, preferably, together with a state-of-the-art RGB-image-based descriptor in hybrid fashion to improve accuracy. Our experiments reveal that the proposed hybrid method increases the localization performance of the standard (RGB-image-based) approach by up to 7.7% in top-1 recall.
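The triplet ranking loss mentioned above can be sketched as follows: an anchor image's descriptor is pulled closer to a positive (same place) than to a negative (different place) by at least a margin. This is a minimal NumPy illustration; the function name and margin value are assumptions, not taken from the paper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Triplet ranking loss on descriptors: the anchor should be closer
    to the positive (same place) than to the negative (different place)
    by at least `margin`; otherwise a penalty is incurred."""
    def dist(a, b):
        return np.linalg.norm(a - b)
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)
```

Training minimizes this loss over many (anchor, positive, negative) descriptor triplets, shaping the embedding so that retrieval by nearest neighbor returns images of the same place.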
  • Article
    Citation - WoS: 43
    Citation - Scopus: 47
    Semantic Segmentation of Outdoor Panoramic Images
    (Springer, 2021) Orhan, Semih; Baştanlar, Yalın
    Omnidirectional cameras are capable of providing a 360° field of view in a single shot. This comprehensive view makes them preferable for many computer vision applications. An omnidirectional view is generally represented as a panoramic image with equirectangular projection, which suffers from distortions. Thus, standard camera approaches should be mathematically modified to be used effectively with panoramic images. In this work, we built a semantic segmentation CNN model that handles distortions in panoramic images using equirectangular convolutions. The proposed model, which we call UNet-equiconv, outperforms an equivalent CNN model with standard convolutions. To the best of our knowledge, ours is the first work on the semantic segmentation of real outdoor panoramic images. Experimental results reveal that using a distortion-aware CNN with equirectangular convolution increases the semantic segmentation performance (4% increase in mIoU). We also released a pixel-level annotated outdoor panoramic image dataset which can be used for various computer vision applications such as autonomous driving and visual localization. Source code of the project and the dataset were made available at the project page (https://github.com/semihorhan/semseg-outdoor-pano).
  • Article
    Citation - WoS: 3
    Citation - Scopus: 3
    Catadioptric Hyperspectral Imaging, an Unmixing Approach
    (Institution of Engineering and Technology, 2020) Özışık Başkurt, Didem; Baştanlar, Yalın; Yardımcı Çetin, Yasemin
    Hyperspectral imaging systems provide dense spectral information on the scene under investigation by collecting data from a high number of contiguous bands of the electromagnetic spectrum. The low spatial resolution of these sensors frequently gives rise to the mixing problem in remote sensing applications. Several unmixing approaches have been developed to handle this challenging problem on perspective images. On the other hand, omnidirectional imaging systems provide a 360-degree field of view in a single image at the expense of lower spatial resolution. In this study, we propose a novel imaging system that integrates a hyperspectral camera with a mirror so as to yield a catadioptric omnidirectional imaging system, benefiting from the advantages of both modes. Catadioptric images, obtained by combining a camera with a reflecting device, exhibit radial warping that depends on the structure of the mirror used in the system. This warping causes non-uniformity in the spatial resolution, which further complicates the unmixing problem. In this context, we develop a novel spatial-contextual unmixing algorithm designed specifically for the large field of view of the proposed hyperspectral imaging system. The proposed algorithm is evaluated on various real-world and simulated cases. The experimental results show that the proposed approach outperforms the compared methods.
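The mixing problem referred to above is commonly posed with the linear mixing model: each pixel spectrum is approximated as a weighted sum of endmember spectra, and unmixing recovers the weights (abundances). As a rough illustration only (the paper's spatial-contextual algorithm is more involved), the sketch below solves the unconstrained least-squares problem and then clips and renormalizes as a crude stand-in for the usual non-negativity and sum-to-one constraints; the function name is hypothetical.

```python
import numpy as np

def unmix(pixel, endmembers):
    """Estimate abundances for the linear mixing model pixel ~= E @ a.

    pixel:      (bands,) observed spectrum
    endmembers: (bands, n_endmembers) matrix E of endmember spectra
    Returns abundances clipped to >= 0 and normalized to sum to one.
    """
    E = np.asarray(endmembers, dtype=float)
    a, *_ = np.linalg.lstsq(E, np.asarray(pixel, dtype=float), rcond=None)
    a = np.clip(a, 0.0, None)          # enforce non-negativity
    s = a.sum()
    return a / s if s > 0 else a       # enforce sum-to-one
```

Fully constrained least-squares unmixing solves the constrained problem directly instead of post-hoc clipping, but the sketch conveys the idea of recovering sub-pixel material fractions.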
  • Conference Object
    Vehicle Classification Using Time-Averaged Binary Foreground Images
    (IEEE, 2015) Karaimer, Hakkı Can; Baştanlar, Yalın
    We describe a shape-based method for classification of vehicles from omnidirectional videos. Different from similar approaches, the binary images of vehicles obtained by background subtraction in a sequence of frames are averaged over time. We show with experiments that using the average shape of the object results in a more accurate classification than using a single frame. The vehicle types we classify are motorcycle, car and van. We created an omnidirectional video dataset and repeated experiments with shuffled train-test sets to ensure randomization.
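The time-averaging step described above can be sketched simply: stack the aligned binary foreground masks from consecutive frames, take the per-pixel mean, and re-binarize. This is a minimal NumPy illustration under the assumption that the masks are already aligned on the object; the function name and threshold are hypothetical.

```python
import numpy as np

def average_shape(binary_masks, threshold=0.5):
    """Average aligned binary foreground masks over time and re-binarize.

    binary_masks: (T, H, W) array of 0/1 masks from background subtraction,
    assumed already aligned on the tracked vehicle.
    Returns a single 0/1 mask keeping pixels foreground in most frames.
    """
    mean = np.mean(binary_masks, axis=0)
    return (mean >= threshold).astype(np.uint8)
```

Averaging suppresses per-frame segmentation noise, which is why the averaged shape classifies more accurately than a single-frame silhouette.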
  • Article
    Citation - WoS: 3
    Citation - Scopus: 4
    Affordable person detection in omnidirectional cameras using radial integral channel features
    (Springer Verlag, 2019) Demiröz, Barış Evrim; Salah, Albert Ali; Baştanlar, Yalın; Akarun, Lale
    Omnidirectional cameras cover more ground than perspective cameras, at the expense of resolution. Their comprehensive field of view makes omnidirectional cameras appealing for security and ambient intelligence applications. Person detection is usually a core part of such applications. Conventional methods fail for omnidirectional images due to different image geometry and formation. In this study, we propose a method for person detection in omnidirectional images, which is based on the integral channel features approach. Features are extracted from various channels, such as LUV and gradient magnitude, and classified using boosted decision trees. Features are pixel sums inside annular sectors (doughnut-slice shapes) contained by the detection window. We also propose a novel data structure called the radial integral image, which allows sums inside annular sectors to be calculated efficiently. We have shown with experiments that our method outperforms the previous state of the art while using significantly fewer computational resources.
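The idea behind the radial integral image can be sketched as follows, by analogy with the classic integral image: bin pixels into (radius, angle) cells around the image center, then take cumulative sums along both axes so that any annular sector sum reduces to four lookups. This is a hypothetical reconstruction of the data structure, not the authors' code; the function names and binning scheme are assumptions.

```python
import numpy as np

def radial_integral_image(img, center, n_r, n_theta):
    """Bin pixels of `img` into (radius, angle) cells around `center`,
    then cumulative-sum along both axes (cf. a Cartesian integral image)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = ys - center[0], xs - center[1]
    r = np.sqrt(dy ** 2 + dx ** 2)
    theta = np.arctan2(dy, dx)                     # (-pi, pi]
    r_bin = np.minimum((r / (r.max() + 1e-9) * n_r).astype(int), n_r - 1)
    t_bin = np.minimum(((theta + np.pi) / (2 * np.pi) * n_theta).astype(int),
                       n_theta - 1)
    hist = np.zeros((n_r, n_theta))
    np.add.at(hist, (r_bin.ravel(), t_bin.ravel()), img.ravel())
    return hist.cumsum(axis=0).cumsum(axis=1)

def sector_sum(rii, r0, r1, t0, t1):
    """Sum of pixels with radius bin in [r0, r1) and angle bin in [t0, t1),
    recovered from the double cumulative sum in O(1)."""
    total = rii[r1 - 1, t1 - 1]
    if r0 > 0:
        total -= rii[r0 - 1, t1 - 1]
    if t0 > 0:
        total -= rii[r1 - 1, t0 - 1]
    if r0 > 0 and t0 > 0:
        total += rii[r0 - 1, t0 - 1]
    return total
```

As with a standard integral image, the precomputation is linear in the number of pixels and every subsequent annular-sector query is constant time, which is what makes the detector affordable.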
  • Article
    Citation - WoS: 3
    Citation - Scopus: 3
    Elimination of Useless Images From Raw Camera-Trap Data
    (Türkiye Klinikleri Journal of Medical Sciences, 2019) Tekeli, Ulaş; Baştanlar, Yalın
    Camera-traps are motion-triggered cameras that are used to observe animals in nature. The number of images collected from camera-traps has increased significantly with their widening use, thanks to advances in digital technology. A great workload is required for wildlife researchers to group and label these images. We propose a system to decrease the amount of time spent by researchers by eliminating useless images from raw camera-trap data. These images are too bright, too dark, blurred, or contain no animals. To eliminate bright, dark, and blurred images, we employ techniques based on image histograms and the fast Fourier transform. To eliminate images without animals, we propose a system combining convolutional neural networks and background subtraction. We experimentally show that the proposed approach keeps 99% of photos with animals while eliminating more than 50% of photos without animals. We also present a software prototype that employs the developed algorithms to eliminate useless images.
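The histogram- and FFT-based filtering can be sketched as below: brightness is judged from the histogram (grayscale) mean, and blur from the remaining high-frequency energy after zeroing a low-frequency block of the 2D FFT. This is an illustrative sketch, not the authors' implementation; the function name and all thresholds are assumptions and would need tuning on real camera-trap data.

```python
import numpy as np

def is_useless(gray, dark_t=30, bright_t=225, blur_t=10.0):
    """Flag a grayscale image (H x W) as too dark, too bright, or blurred.

    Brightness: mean intensity outside [dark_t, bright_t].
    Blur: mean magnitude of high-frequency FFT coefficients below blur_t.
    """
    mean = gray.mean()
    if mean < dark_t or mean > bright_t:
        return True
    # Zero out a low-frequency block around the (shifted) spectrum center,
    # leaving only high-frequency content, which blur suppresses.
    f = np.fft.fftshift(np.fft.fft2(gray))
    h, w = gray.shape
    ch, cw = h // 2, w // 2
    f[ch - h // 8:ch + h // 8, cw - w // 8:cw + w // 8] = 0
    high_freq = np.abs(f).mean()
    return high_freq < blur_t
```

Images passing this cheap filter would then go to the CNN-plus-background-subtraction stage, which handles the harder "no animal" case.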
  • Conference Object
    Citation - WoS: 2
    Citation - Scopus: 8
    Image-Based Localization With Semantic Segmentation for Unmanned Vehicles
    (Institute of Electrical and Electronics Engineers Inc., 2019) Çınaroğlu, İbrahim; Baştanlar, Yalın
    Localization and positioning of unmanned vehicles remains a popular research topic in computer vision. It is a known fact that the GPS systems used for vehicle localization are unavailable in some situations, and this shortcoming has accelerated work on image-based localization. In our study, image-based localization was performed using a database of Malaga city-center images captured from inside a vehicle. First, a semantic descriptor was constructed from the results of semantic segmentation, and localization was performed using an approximate nearest-neighbor search. The performance of this method was then compared with that of the local-descriptor-based method commonly used in the literature. In addition, a hybrid method combining the two was proposed. Experimental results show that the proposed hybrid image-based localization outperforms both the local-descriptor-only and the semantic-descriptor-only methods, and therefore that supporting local-descriptor-based methods with semantic descriptors improves performance.