PhD Degree / Doktora

Permanent URI for this collection: https://hdl.handle.net/11147/2869

Search Results

Now showing 1 - 3 of 3
  • Doctoral Thesis
    Classification of Maneuvers of Vehicles in Front for Driver Assistance Systems
    (01. Izmir Institute of Technology, 2023) Nalçakan, Yağız; Baştanlar, Yalın
    Predicting vehicle maneuvers is a critical task in the development of autonomous driving. Such maneuvers have been identified as leading causes of fatal accidents, underscoring the need for robust and reliable detection systems. This thesis addresses the issue by developing and evaluating novel methodologies for classifying maneuvers, especially lane change and cut-in maneuvers in front of the vehicle. Two methods are proposed, and their effectiveness is evaluated on two datasets: the Prevention Lane Change Prediction dataset and the BDD-100K Cut-in/Lane-pass Classification Subset. The first method extracts features from the bounding boxes of the target vehicle and feeds them into a single-layer LSTM network for cut-in/lane-pass classification. The second method trains a 3-dimensional residual neural network in a self-supervised manner using contrastive video representation learning. For the self-supervised training phase, a novel scene representation is proposed to highlight vehicle motions. Afterward, the same model is fine-tuned using labeled video data. Lastly, an ensemble learning approach is introduced that combines the predictions of the LSTM-based model and the self-supervised contrastive video representation learning model, leveraging the strengths of both to enhance overall maneuver classification performance. The proposed methods contribute to the field in three ways. The LSTM-based model achieved high classification accuracy compared with other studies in the literature. The self-supervised video representation learning model is the first application of contrastive learning to maneuver classification. The ensemble learning approach significantly improved the performance of the maneuver detection system.
  • Doctoral Thesis
    Semantic Segmentation of Panoramic Images and Panoramic Image Based Outdoor Visual Localization
    (01. Izmir Institute of Technology, 2022) Orhan, Semih; Baştanlar, Yalın
    360-degree views are captured by full omnidirectional cameras and are generally represented as panoramic images. Unfortunately, these images suffer heavily from spherical distortion at the poles of the sphere. In previous studies on Convolutional Neural Networks (CNNs), several methods (e.g. equirectangular convolution) have been proposed to alleviate spherical distortion. Inspired by these efforts, we developed an equirectangular version of the UNet model. We evaluated the semantic segmentation performance of the UNet model and its equirectangular version on an outdoor panoramic dataset. Experimental results showed that the equirectangular version of UNet performed better than UNet. In addition, we released the pixel-level annotated dataset, which is one of the first semantic segmentation datasets of outdoor panoramic images. In visual localization, localizing perspective query images in a panoramic image dataset can alleviate the non-overlapping view problem between cameras. Generally, perspective query images are localized in a panoramic image database by generating 4 or 8 virtual gnomonic views of each panoramic image, which amounts to deforming the sphere into cube faces. Doing so simplifies the search to a perspective-to-perspective problem, but a non-overlapping view problem may remain between query and gnomonic database images. Therefore, we propose directly localizing perspective query images in panoramic images by applying sliding windows on the last convolution layer of CNNs. Features are extracted with R-MAC, GeM, and SFRS. Experimental results showed that the sliding window approach outperformed 4 gnomonic views and achieved competitive results compared with 8 and 12 gnomonic views. Any city-scale visual localization system has to be robust against long-term changes. Semantic information is more robust to such changes (e.g. changes to the surface of a building), and depth maps provide geometric clues. In our work, we utilized semantic and depth information during pose verification, that is, we checked semantic and depth similarity to verify the poses (retrievals) obtained with the approach that uses only RGB image features. Semantic and depth information are represented with a self-supervised contrastive learning approach (SimCLR). Experimental results showed that pose verification with semantic and depth features improved the visual localization performance of the RGB-only model.
  • Doctoral Thesis
    Improved Image Based Localization Using Semantic Descriptors
    (Izmir Institute of Technology, 2021) Çınaroğlu, İbrahim; Baştanlar, Yalın
    Place recognition and Visual Localization (VL) for autonomous driving remain popular topics in the field of Computer Vision. In this study, semantically improved Hybrid-VL approaches that use localization-aware semantic information in street-level driving images are proposed. Initially, a Semantic Descriptor (SD) is extracted from semantically segmented images with a Convolutional Neural Network (CNN) trained for the localization task. Then, the image-retrieval-based VL task is performed using approximate nearest neighbor search (ANNS) in a 2D-2D matching context. The proposed method is named SD-VL, and its performance is compared with that of the state-of-the-art Local Descriptor (LD) based VL method (LD-VL), which is frequently used in the literature. Furthermore, to alleviate the shortcomings of both methods, a novel decision-level Hybrid-VL (Hybrid-VL_DL) method is proposed that combines SD-VL and LD-VL in the post-processing stage. A feature-level Hybrid-VL (Hybrid-VL_FL) method is also proposed in order to produce an automatically tuned hybrid result. The proposed VL methods are examined on two challenging benchmarks: the RobotCar Seasons and Malaga Downtown data sets. Moreover, a new VL data set, Malaga Streetview Challenge, is generated by collecting Google Streetview images along the same path as Malaga Downtown in order to observe the impact of environmental and wide-baseline changes. This newly generated test set will be useful for researchers working in this field. Overall, the proposed semantically boosted Hybrid-VL_DL method increases localization performance on the RobotCar Seasons and Malaga Streetview Challenge data sets by 11.6% and 4.5% in Top-1 recall@5, and by 4% and 5.4% in recall@1, respectively. Additionally, the reliability of our hyper-parameter (W) based Hybrid-VL_DL approach is supported by the very close performance of the Hybrid-VL_FL method.
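
The last abstract above describes a decision-level hybrid that combines two retrieval methods through a hyper-parameter W and reports recall@N scores. The abstract does not state the actual fusion rule, so the following is only a minimal sketch under an assumed rule: a weighted sum of min-max normalized similarity scores. All function names, the parameter `w` (standing in for W), and the sample scores are hypothetical and are not taken from the thesis.

```python
def min_max_normalize(scores):
    """Scale similarity scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(sd_scores, ld_scores, w):
    """Assumed fusion rule: weighted sum of normalized SD and LD scores.

    w is a hypothetical weight in [0, 1]: w = 1 uses only the semantic
    descriptor scores, w = 0 uses only the local descriptor scores.
    """
    sd = min_max_normalize(sd_scores)
    ld = min_max_normalize(ld_scores)
    return [w * s + (1.0 - w) * l for s, l in zip(sd, ld)]


def rank_candidates(fused):
    """Return database indices sorted from best to worst fused score."""
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


def recall_at_n(rankings, ground_truth, n):
    """Fraction of queries whose correct index appears in the top n results."""
    hits = sum(1 for ranked, gt in zip(rankings, ground_truth) if gt in ranked[:n])
    return hits / len(rankings)


# Made-up similarity scores of one query against three database images.
sd_scores = [0.2, 0.9, 0.4]
ld_scores = [0.8, 0.3, 0.7]
ranking = rank_candidates(fuse_scores(sd_scores, ld_scores, w=0.5))
```

With `w` at either extreme the ranking reduces to that of a single method, which makes the weight easy to sanity-check before tuning it on a validation set.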