Master Degree / Yüksek Lisans Tezleri

Permanent URI for this collection: https://hdl.handle.net/11147/3008

  • Master Thesis
    Drum Accompaniment Generation Using MIDI Music Database and Sequence To Sequence Neural Network
    (Izmir Institute of Technology, 2022) Akyüz, Yavuz Batuhan; Gümüştekin, Şevket
    This thesis aims to create an artificial intelligence model to reinterpret the drum parts of musical pieces and/or to accompany music with new, uniquely generated drum patterns. Besides providing rhythmic indicators, drum parts are essential for emphasizing emotions. The instruments in a musical composition are in harmony with each other so that the piece is meaningful as a whole. Based on this observation, in this thesis, a MIDI dataset and an LSTM-based Seq2Seq model were used to create a link between different instruments and drums. Before training, we created a dataset of MIDI pieces with drum parts and split each piece into an input (the non-drum instruments) and an output (the drum parts). The model was trained on six different genres, and the teacher forcing method was used to improve training. At the generation stage, we made it possible to adjust the complexity of the generated drum parts through temperature sampling: the temperature value, which we call the complexity value, can be changed by the user. We also created a user interface with an instrument selection pane to give users control over the drum instruments generated. Moreover, we proposed a novel approach that generalizes the idea from MIDI data to WAV data. To accomplish this task, Mel-spectrogram, MFCC, and tempogram features were used. Both proposed methods are shown to produce high-quality, unique drum accompaniments for different genres, with adjustable complexity and the freedom to choose the desired drum instruments.
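As a rough illustration of the temperature sampling mentioned in the abstract above, the sketch below divides a model's output scores by a temperature before applying softmax. It is not the thesis code: the function names and the example scores are assumptions made for illustration only. Lower temperatures concentrate probability on the most likely drum event, while higher temperatures flatten the distribution and so tend to give more varied patterns.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_with_temperature(logits, temperature, rng=random):
    """Draw one index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical scores for three candidate drum events.
    scores = [2.0, 1.0, 0.1]
    for t in (0.5, 1.0, 2.0):
        print(t, softmax_with_temperature(scores, t))
```

Running the script prints the distribution for three temperatures, which shows how the same scores become more or less peaked as the temperature changes.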
  • Master Thesis
    Real Time Texture Mapped 3D Reconstruction Using a Setup With Mirrors and Controlled Lighting
    (Izmir Institute of Technology, 2021) Yazar, Barış; Gümüştekin, Şevket
    The purpose of this thesis is to create a 3D reconstruction framework that can be used in real time. This is accomplished using a parallel implementation of a shape from silhouette (SFS) algorithm. Using more silhouettes in the reconstruction improves quality at the expense of speed. In order to keep this number to a minimum without extensively sacrificing quality, a novel system is introduced. This system is based on evenly distributed viewpoints arranged as a regular tetrahedron. In order to reduce cost and simplify camera calibration, we used a single-camera setup with three mirrors, thus creating virtual cameras for three of the four viewpoints. Besides taking advantage of a minimal number of viewpoints, parallel hardware is utilized to achieve real-time speed. A volume-based SFS algorithm is implemented using the CUDA parallel computing platform.
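The core idea of volume-based shape from silhouette can be sketched in a few lines: a voxel survives only if it projects inside the foreground of every silhouette. The toy below is not the thesis implementation. It assumes three axis-aligned orthographic views rather than the calibrated tetrahedral mirror setup, runs serially rather than on CUDA, and all names are hypothetical.

```python
def silhouette(voxels, axis):
    """Project a set of (x, y, z) voxels orthographically along one axis."""
    drop = "xyz".index(axis)
    return {tuple(c for i, c in enumerate(v) if i != drop) for v in voxels}


def carve(size, silhouettes):
    """Keep every voxel of a size^3 grid that lies inside all three silhouettes."""
    kept = set()
    for x in range(size):
        for y in range(size):
            for z in range(size):
                if ((y, z) in silhouettes["x"]
                        and (x, z) in silhouettes["y"]
                        and (x, y) in silhouettes["z"]):
                    kept.add((x, y, z))
    return kept


if __name__ == "__main__":
    # A 2x2x2 cube inside a 4x4x4 grid stands in for the observed object.
    cube = {(x, y, z) for x in (1, 2) for y in (1, 2) for z in (1, 2)}
    views = {axis: silhouette(cube, axis) for axis in "xyz"}
    hull = carve(4, views)
    print(len(cube), len(hull))
```

Each voxel is tested independently of the others, which is why this kind of algorithm maps naturally onto parallel hardware.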
  • Master Thesis
    Location Aware Multimedia Content Production Using Geotagged Scenes in Conjunction With Maps and Aerial Imagery
    (Izmir Institute of Technology, 2019) Çelik, Lütfi Sefa; Gümüştekin, Şevket
    The availability of various web services and applications has been attracting a large number of internet users, who also contribute to the production of a massive amount of accessible content. Although multimedia makes up a considerable share of this data pool, searching for the desired multimedia content can still be a challenging task. One of the most frequent motivations of internet users seeking still photographs or videos is to create a feeling of remote presence. These users need to put significant effort into collecting different types of files, such as action videos, aerial scenes, and still photographs. When image and video files are geotagged (tagged with location data), they can be associated with maps. Current technologies allow us to determine and record information about camera position and orientation. This thesis aims to solve problems related to the production and visualization of location-aware multimedia content. Our approach uses available devices and custom-designed hardware for positioning, based on Global Positioning System sensors and Inertial Measurement Unit sensors, to create query-ready multimedia content. Images and videos, along with location and orientation data, are organized in a database. The designed web application reorganizes the queried multimedia content using the relative location data, then creates animations that guide travelers along a path. Visualizing the multimedia content on the map in step with the movement animation along the trip path thus provides a remote presence experience along with information about the trip.
  • Master Thesis
    Multimedia Player Implementation on Embedded Systems
    (Izmir Institute of Technology, 2008) Tetik, Yusuf Engin; Gümüştekin, Şevket
    There has been a surge in the amount of digital audio and video content in recent years. Advances in compression and storage technologies and improvements in internet connection speeds have enabled widespread use of multimedia content. A wide variety of devices have been introduced to decode and play this content. Initially designed as mere voice communication devices, mobile phones nowadays come equipped with a variety of multimedia capabilities, including media players, despite their limited system resources. Large servers now host a dramatically increased amount of audio and video content. Users prefer to watch this content while streaming rather than downloading it first, so streaming media players are responsible for presenting multimedia content without annoying interruptions. This thesis first introduces the challenges in the design and implementation of a streaming media player and then proposes solutions. The main challenges are keeping audio-video synchronization and server-client synchronization, detecting the stream type, handling multithreaded operations, and managing buffers. The audio-video synchronization problem is solved by using audio as the master stream. The server-client synchronization problem is solved by designing a playback mechanism that stays synchronized with the server by tuning the playback rate of the streaming media without losing lip-sync between audio and video. The proposed streaming player can also identify the type of a media stream very rapidly without using a discrete stream inspector module. The presented design is heavily multithreaded and is implemented on the Linux platform; it can also be implemented on any multithreaded platform.
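Using audio as the master stream, as the abstract above describes, generally means that each video frame's presentation timestamp is compared with the audio clock before the frame is displayed. The sketch below shows one common form of that decision. It is not taken from the thesis: the function name, the return labels, and the 40 ms tolerance are assumptions chosen for illustration.

```python
def schedule_video_frame(video_pts, audio_clock, threshold=0.04):
    """Decide what to do with a video frame when audio is the master clock.

    video_pts and audio_clock are in seconds; threshold is the tolerated
    drift (an assumed 40 ms here).
    """
    drift = video_pts - audio_clock
    if drift > threshold:
        return "wait"  # frame is ahead of the audio: hold it back
    if drift < -threshold:
        return "drop"  # frame is behind the audio: skip it to catch up
    return "show"      # frame is close enough to the audio clock


if __name__ == "__main__":
    for pts in (0.80, 1.01, 1.20):
        print(pts, schedule_video_frame(pts, audio_clock=1.00))
```

Audio is usually chosen as the master because gaps or rate changes in audio are far more noticeable than an occasional dropped or delayed video frame.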
  • Master Thesis
    An Approach To Summarize Video Data in Compressed Domain
    (Izmir Institute of Technology, 2007) Şimşek, Gökhan; Gümüştekin, Şevket
    The need to represent digital video and images efficiently and feasibly has attracted great research, development and standardization effort over the past 20 years. These efforts have targeted a vast area of applications such as video on demand, digital TV/HDTV broadcasting, multimedia video databases, and surveillance. Moreover, the applications demand more efficient collections of algorithms to enable lower bit rates with acceptable quality, depending on application requirements. Today, most video content, whether stored or transmitted, is in compressed form. The increase in the amount of video data being shared has attracted the interest of researchers to the interrelated problems of video summarization, indexing and abstraction. In this study, scene cut detection in the emerging ISO/ITU H.264/AVC coded bit stream is realized by extracting spatio-temporal prediction information directly in the compressed domain. The syntax and semantics, and the parsing and decoding processes, of the ISO/ITU H.264/AVC bit stream are analyzed to detect scene information. Various video test data are constructed using the Joint Video Team's test model JM encoder, and implementations are made on the JM decoder. The output of the study is scene information that serves video summarization, skimming, and indexing applications that use the new-generation ISO/ITU H.264/AVC video.
  • Master Thesis
    A Case Study on Logging Visual Activities: Chess Game
    (Izmir Institute of Technology, 2005) Ozan, Şükrü; Gümüştekin, Şevket
    Automatically recognizing and analyzing visual activities in complex environments is a challenging and open-ended problem. In this thesis, this problem domain is visited in a chess game scenario where the rules, actions and environment are well defined. The purpose here is to detect and observe a FIDE (Fédération Internationale des Échecs) compatible chess board and to generate a log file of the moves made by human players. A series of basic image processing operations is applied to perform the desired task. The first step of automatically detecting the chess board is followed by locating the positions of the pieces. After the initial setup is established, every move made by a player is automatically detected and verified. A PC-CAM connected to a PC is used to acquire images, and the corresponding software runs on the PC. For convenience, the Intel® Open Source Computer Vision Library (OpenCV) is used in the current software implementation.
  • Master Thesis
    Automatic Matching of Aerial Coastline Images With Map Data
    (Izmir Institute of Technology, 2005) Kahraman, Metin; Gümüştekin, Şevket
    Matching aerial images with map data is an important task in remote sensing applications such as georeferencing, cartography and autonomous navigation of aerial vehicles. The most distinctive image features that can be used to accomplish this task come from the unique structures of different coastline segments. In recent years, several studies have been conducted on detecting coastlines and matching them to map data. The results reported by these studies are far from being a complete solution, having weak points such as sensitivity to noise, the need for user interaction, and dependence on a fixed scale and orientation. In this thesis, a two-step procedure involving automatic multiresolution coastline extraction and coastline matching using dynamic programming is proposed. In the proposed coastline extraction method, sea and land textures are segmented using co-occurrence and histogram features of the wavelet image representation. The coastlines are identified as the boundaries of the sea regions. For the coastline matching, shape descriptors are investigated and a shape matching method using dynamic programming is adapted. The proposed automatic coastline extraction and coastline matching methods are tested using a vector map of the Aegean coast of Turkey.
  • Master Thesis
    European Terrestrial Digital Television Receiver Performance Comparison Study Under Strong Multipath Interference
    (Izmir Institute of Technology, 2011) Karakuş, Oktay; Gümüştekin, Şevket
    The main purpose of this thesis is to implement a complete simulation of the European digital terrestrial broadcasting standards known as "Digital Video Broadcasting - Terrestrial" (DVB-T) and "Second Generation Digital Video Broadcasting - Terrestrial" (DVB-T2). These standards were developed primarily for Europe (DVB-T in particular dates from the early 1990s), but both are gaining wider acceptance in several countries in Africa, Asia, the Middle East and Oceania as well as Europe. One of the most important aspects of these standards is Orthogonal Frequency Division Multiplexing (OFDM), which has been simulated in this thesis. A detailed comparative study of DVB-T and DVB-T2 is provided for the case where both transmission standards are exposed to strong multipath interference. Various strong multipath scenarios have been created and simulated with respect to different power delay profiles and mobility conditions. Here, strong multipath refers to a channel profile whose scattered paths have strong power and a relatively high delay spread. It is shown that the DVB-T2 standard outperforms the DVB-T standard under strong multipath interference, achieving a power gain of roughly three to nine decibels depending on the applied channel profile, code rate and modulation parameters.
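To illustrate why OFDM copes with multipath, the sketch below sends one OFDM symbol with a cyclic prefix through a two-path channel and recovers the data with a single division per subcarrier. It is a minimal, noise-free toy and not the thesis simulator: it omits the coding, interleaving, pilots and framing defined by DVB-T and DVB-T2, it assumes the receiver knows the channel taps exactly, and all function names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (the inverse includes the 1/N factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def ofdm_modulate(symbols, cp_len):
    """Map subcarrier symbols to time samples and prepend a cyclic prefix."""
    time = dft(symbols, inverse=True)
    return time[len(time) - cp_len:] + time


def multipath(signal, taps):
    """Apply a static multipath channel given by its tap gains (one per sample delay)."""
    return [sum(taps[d] * signal[n - d] for d in range(len(taps)) if n - d >= 0)
            for n in range(len(signal))]


def ofdm_demodulate(received, cp_len, taps, n_sub):
    """Drop the cyclic prefix, transform, and equalize each subcarrier."""
    body = received[cp_len:cp_len + n_sub]
    freq = dft(body)
    channel = dft(list(taps) + [0] * (n_sub - len(taps)))
    return [y / h for y, h in zip(freq, channel)]


if __name__ == "__main__":
    qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
    taps = [1.0, 0.0, 0.5]  # direct path plus a strong echo two samples later
    rx = multipath(ofdm_modulate(qpsk, cp_len=2), taps)
    print(ofdm_demodulate(rx, cp_len=2, taps=taps, n_sub=len(qpsk)))
```

The recovery is exact here only because the cyclic prefix is at least as long as the channel's delay spread; when the echo delay exceeds the prefix, neighbouring symbols interfere and this simple equalizer no longer suffices.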