Electrical - Electronic Engineering / Elektrik - Elektronik Mühendisliği
Permanent URI for this collection: https://hdl.handle.net/11147/11
15 results
Search Results
Research Project
Klasik Türk Müziği Kayıtlarının Otomatik Olarak Notaya Dökülmesi ve Otomatik Makam Tanıma [Automatic Transcription of Classical Turkish Music Recordings and Automatic Makam Recognition] (2010)
Bozkurt, Barış; Savacı, Ferit Acar; Karaosmanoğlu, Mustafa Kemal
This project proposes methods and techniques, used for the first time in the literature, for automatically transcribing recordings of classical Turkish music and automatically recognizing makams, and implements them in software. To reach these goals, a series of problems was studied in depth. First, existing techniques from the literature for fundamental frequency (f0) analysis were tested and the most suitable algorithm was selected. Several filters were designed to improve the output of this algorithm, and they yielded significant improvements. Next, f0 distributions (frequencies of use) were derived from the f0 data, and original tools were designed that use these distributions for tonic (karar) detection, measurement of the level of agreement between theory and performance, and automatic makam recognition. For the first time in the literature, the level of agreement between theory and performance was examined in detail for 5 different theories, on datasets of reliable recordings in 9 frequently used makams. Also for the first time, a symbolic Turkish music database usable in many computational musicology studies was prepared and made publicly available. An onset detection algorithm, an f0 quantization method, and MIDI conversion tools needed for the automatic transcription application were developed.

Conference Object. Citation - WoS: 2
Türk Makam Müziği Notaları için Otomatik Ezgi Bölütleme [Automatic Melodic Segmentation for Turkish Makam Music Scores] (Institute of Electrical and Electronics Engineers Inc., 2014)
Bozkurt, Barış; Karaçalı, Bilge; Karaosmanoğlu, M. Kemal; Ünal, Erdem
Automatic melodic segmentation is one of the important steps in computational analysis of melodic content from symbolic data. This widely studied research problem has been very rarely considered for Turkish makam music.
In this paper, we first present test results for state-of-the-art techniques from the literature on Turkish makam music data. Then, we present a statistical classification-based segmentation system that exploits the link between makam melodies and usul and makam scale hierarchies, together with the well-known features in the literature. We show through tests on a large dataset that the proposed system has a higher accuracy.

Conference Object. Citation - Scopus: 6
Klasik Türk Müziği İçin Otomatik Notaya Dökme Sistemi [An Automatic Transcription System for Classical Turkish Music] (Institute of Electrical and Electronics Engineers, 2011)
Bozkurt, Barış; Gedik, Ali Cenk; Karaosmanoğlu, M. Kemal
This study presents an automatic transcription system for Turkish music for the first time in the literature. We first discuss the characteristics of Turkish music that are taken into consideration in the design of the system. Then, the following signal processing components of the system are briefly described, along with how they relate to each other and their function in the system: f0 estimation, automatic tonic detection and makam recognition based on pitch distributions, and frequency and duration quantization. © 2011 IEEE.

Article. Citation - WoS: 67; Citation - Scopus: 78
Chirp Group Delay Analysis of Speech Signals (Elsevier, 2007)
Bozkurt, Barış; Couvreur, Laurent; Dutoit, Thierry
This study proposes new group delay estimation techniques that can be used for analyzing resonance patterns of short-term discrete-time signals and more specifically speech signals. Phase processing, or equivalently group delay processing, of speech signals is known to be difficult due to large spikes in the phase/group delay functions that mask the formant structure. In this study, we first analyze in detail the z-transform zero patterns of short-term speech signals in the z-plane and discuss the sources of spikes on group delay functions, namely the zeros located close to the unit circle.
We show that windowing largely influences these patterns, and therefore short-term phase processing. Through a systematic study, we then show that reliable phase/group delay estimation for speech signals can be achieved by appropriate windowing, and that group delay functions can reveal formant information as well as some of the characteristics of the glottal flow component in speech signals. However, such phase estimation is highly sensitive to noise, and robust extraction of group delay based parameters remains difficult in real acoustic conditions even with appropriate windowing. As an alternative, we propose processing of chirp group delay functions, i.e. group delay functions computed on a circle other than the unit circle in the z-plane, which can be guaranteed to be spike-free. We finally present one application in feature extraction for automatic speech recognition (ASR). We show that chirp group delay representations are potentially useful for improving ASR performance. © 2007 Elsevier B.V. All rights reserved.

Article. Citation - WoS: 2; Citation - Scopus: 2
A Computational Analysis of Turkish Makam Music Based on a Probabilistic Characterization of Segmented Phrases (Taylor and Francis Ltd., 2015)
Bozkurt, Barış; Karaçalı, Bilge
This study targets automatic analysis of Turkish makam music pieces on the phrase level. While makam is most simply defined as an organization of melodic phrases, there has been very little effort to computationally study melodic structure in makam music pieces. In this work, we propose an automatic analysis algorithm that takes as input symbolic data in the form of machine-readable scores that are segmented into phrases. Using a measure of makam membership for phrases, our method outputs for each phrase the most likely makam the phrase comes from. The proposed makam membership definition is based on Bayesian classification and the algorithm is specifically designed to process the data with overlapping classes.
The proposed analysis system is trained and tested on a large data set of phrases obtained by transferring phrase boundaries, manually written by experts of makam music on printed scores, to machine-readable data. For the task of classifying all phrases, or only the beginning phrases, as coming from the main makam of the piece, the corresponding F-measures are 0.52 and 0.60 respectively.

Article. Citation - WoS: 5; Citation - Scopus: 8
Usul and Makam Driven Automatic Melodic Segmentation for Turkish Music (Taylor and Francis Ltd., 2014)
Bozkurt, Barış; Karaosmanoglu, M. Kemal; Karaçalı, Bilge; Ünal, Erdem
Automatic melodic segmentation is an extensively studied topic that aims at developing systems that group musical events. Here, we consider the problem of automatic segmentation via supervised learning from a dataset containing segmentation labels of an expert. We present a statistical classification-based segmentation system developed specifically for Turkish makam music. The proposed system uses two novel features, a makam-based and an usul-based feature, together with features commonly used in the literature. The makam-based feature is defined as the probability of a note appearing at the phrase boundary, computed from the distributions of boundaries with respect to the piece’s makam pitches. Likewise, the usul-based feature is computed from the distributions of boundaries with respect to beats in the rhythmic cycle (usul) of the piece. Several experimental setups using different feature groups are designed to test the contribution of the proposed features on three datasets.
The results show that the new features carry information complementary to existing features in the literature within the Turkish makam music segmentation context, and that the inclusion of the new features resulted in a statistically significant performance improvement.

Article. Citation - WoS: 43; Citation - Scopus: 59
Causal-Anticausal Decomposition of Speech Using Complex Cepstrum for Glottal Source Estimation (Elsevier Ltd., 2011)
Drugman, Thomas; Bozkurt, Barış; Dutoit, Thierry
Complex cepstrum is known in the literature for linearly separating causal and anticausal components. Relying on advances achieved by the Zeros of the Z-Transform (ZZT) technique, we here investigate the possibility of using complex cepstrum for glottal flow estimation on a large-scale database. Via a systematic study of the windowing effects on the deconvolution quality, we show that the complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation when specific windowing criteria are met. It is also shown that this complex cepstral decomposition gives glottal estimates similar to those obtained with the ZZT method. However, as complex cepstrum uses FFT operations instead of requiring the factoring of high-degree polynomials, the method benefits from a much higher speed. Finally, in our tests on a large corpus of real expressive speech, we show that the proposed method has the potential to be used for voice quality analysis.

Article. Citation - WoS: 86; Citation - Scopus: 101
A Comparative Study of Glottal Source Estimation Techniques (Elsevier Ltd., 2012)
Drugman, Thomas; Bozkurt, Barış; Dutoit, Thierry
Source-tract decomposition (or glottal flow estimation) is one of the basic problems of speech processing. For this, several techniques have been proposed in the literature. However, studies comparing different approaches are almost nonexistent. Besides, experiments have been systematically performed either on synthetic speech or on sustained vowels.
In this study we compare three of the main representative state-of-the-art methods of glottal flow estimation: closed-phase inverse filtering, iterative and adaptive inverse filtering, and mixed-phase decomposition. These techniques are first submitted to an objective assessment test on synthetic speech signals. Their sensitivity to various factors affecting the estimation quality, as well as their robustness to noise are studied. In a second experiment, their ability to label voice quality (tensed, modal, soft) is studied on a large corpus of real connected speech. It is shown that changes of voice quality are reflected by significant modifications in glottal feature distributions. Techniques based on the mixed-phase decomposition and on a closed-phase inverse filtering process turn out to give the best results on both clean synthetic and real speech signals. On the other hand, iterative and adaptive inverse filtering is recommended in noisy environments for its high robustness. © 2011 Elsevier Ltd. All rights reserved.

Article. Citation - WoS: 13; Citation - Scopus: 20
Weighing Diverse Theoretical Models on Turkish Maqam Music Against Pitch Measurements: a Comparison of Peaks Automatically Derived From Frequency Histograms With Proposed Scale Tones (Taylor and Francis Ltd., 2009)
Bozkurt, Barış; Yarman, Ozan; Karaosmanoğlu, M. Kemal; Akkoç, Can
Since the early 20th century, various theories have been advanced in order to mathematically explain and notate modes of Traditional Turkish music known as maqams. In this article, maqam scales according to various theoretical models based on different tunings are compared with pitch measurements obtained from select recordings of master Turkish performers in order to study their level of match with analysed data.
Chosen recordings are subjected to a fully computerized sequence of signal processing algorithms for the automatic determination of the set of relative pitches for each maqam scale: f0 estimation, histogram computation, tonic detection + histogram alignment, and peak picking. For nine well-recognized maqams, automatically derived relative pitches are compared with scale tones defined by theoretical models using quantitative distance measures. We analyse and interpret histogram peaks based on these measures to find the theoretical models most conforming with all the recordings, and hence, with the quotidian performance trends influenced by them.

Article. Citation - WoS: 36; Citation - Scopus: 64
Pitch-Frequency Histogram-Based Music Information Retrieval for Turkish Music (Elsevier Ltd., 2010)
Gedik, Ali Cenk; Bozkurt, Barış
This study reviews the use of pitch histograms in music information retrieval studies for western and non-western music. The problems in applying the pitch-class histogram-based methods developed for western music to non-western music, and specifically to Turkish music, are discussed in detail. The main problems are the assumptions used to reduce the dimension of the pitch histogram space, such as mapping to a low and fixed dimensional pitch-class space, the hard-coded use of western music theory, the use of the standard diapason (A4=440 Hz), and analysis based on tonality and tempered tuning. We argue that it is more appropriate to use higher dimensional pitch-frequency histograms without such assumptions for Turkish music. We show in two applications, automatic tonic detection and makam recognition, that high dimensional pitch-frequency histogram representations can be successfully used in Music Information Retrieval (MIR) applications without such pre-assumptions, using data-driven models. © 2009 Elsevier B.V. All rights reserved.
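Illustrative note on the last abstract above: the idea of a pitch-frequency histogram that is not folded into twelve tempered pitch classes can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the 10-cent bin width, the convention that unvoiced frames carry f0 <= 0, and the choice of reference frequency are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def hz_to_cents(f0_hz, ref_hz):
    """Convert a frequency in Hz to cents relative to ref_hz (1200 cents per octave)."""
    return 1200.0 * math.log2(f0_hz / ref_hz)


def pitch_frequency_histogram(f0_track_hz, ref_hz, bin_cents=10.0):
    """Count voiced f0 values in fixed-width cent bins, with no octave folding.

    Assumption: unvoiced frames are marked with f0 <= 0 and are skipped.
    Returns {bin_index: count}; bin k covers [k * bin_cents, (k + 1) * bin_cents).
    """
    counts = Counter()
    for f0 in f0_track_hz:
        if f0 > 0:
            counts[math.floor(hz_to_cents(f0, ref_hz) / bin_cents)] += 1
    return dict(counts)


def histogram_peak_cents(hist, bin_cents=10.0):
    """Return the centre (in cents) of the most populated bin."""
    k = max(hist, key=hist.get)
    return (k + 0.5) * bin_cents


# Toy f0 track: the reference pitch, one unvoiced frame, the octave above,
# and an interval of about 155 cents that falls between tempered semitones.
ref = 220.0
track = [220.0, 220.0, 0.0, 440.0, 220.0 * 2 ** (155 / 1200)]
hist = pitch_frequency_histogram(track, ref)
peak = histogram_peak_cents(hist)
```

In this toy example the interval of about 155 cents keeps its own bin instead of being rounded to the nearest 100-cent semitone, which is the kind of detail a 12-bin pitch-class histogram would discard. Measuring cents against a reference taken from the recording itself, rather than against A4 = 440 Hz, is likewise an illustrative choice here.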
