Master Degree / Yüksek Lisans Tezleri
Permanent URI for this collection: https://hdl.handle.net/11147/3008
Master Thesis
Comparison of Document Classification Approaches for Turkish Texts
(Izmir Institute of Technology, 2015) Çobanoğlu, Özlem Ece; Aslan, Burak Galip
Internet usage is growing exponentially, and this rapid growth has led to an explosion in the number of electronic documents produced daily. The sheer volume of documents makes it difficult to access necessary and relevant information. Because the documents lack logical organization, retrieving and processing the desired information from such a large collection by human effort is complex and time-consuming. Document classification is therefore an important task for managing and processing documents. In this thesis, the performance of different classification approaches built from several algorithms is thoroughly evaluated. The main goal of the thesis is to determine the best combination of document preprocessing steps and classification algorithms. Different feature weighting, construction, and selection methods are tested on Turkish documents. Stemmed and original words, together with their bi-gram and tri-gram forms, are used to construct the features that represent the documents. The effects of several weighting algorithms, and of combinations of feature selection and weighting algorithms, on 3 different classification approaches are interpreted. The performance of 216 different classification process combinations is analyzed. Experimental results show that the C4.5 (C4.5 Decision Tree) classification algorithm achieves the highest accuracy in 95% of the results. The SVM (Support Vector Machine) algorithm produces the results closest to C4.5 and provides the highest accuracy in 5% of the experimental results. The NB (Naive Bayes) algorithm always has the lowest accuracy among these 3 classification algorithms.

Master Thesis
News Story Analysis With Credibility Assessment by Opinion Mining
(Izmir Institute of Technology, 2015) Sezerer, Erhan; Tekir, Selma
With the growing influence of media and the popularity and widespread use of social networks, the credibility of news sources has become an important subject that needs more attention. The biggest problem in finding credible sources is that, instead of presenting every aspect of an incident, news sources tend to accept one party's view as a whole while rejecting all others or, even worse, focus on only one side of the incident and ignore the rest. Credibility is defined as "the quality of being believable and trustworthy". The notion of trustworthiness can be further decomposed into components such as bias, fairness, factual/opinionated, etc. In this thesis, credibility is measured using the fact/opinion ratio of the articles. Two methods are proposed: the traditional Naive Bayes method and the Relativistic method. The intuition behind the Relativistic method comes from the theory of relativity: the sentiment of an article is determined relative to the ordinary context people use in daily speech. We tested our methods on four different types of data (hand-written articles, editorials, New York Times articles, and Reuters articles) and aimed to show that the proposed models are able to differentiate the sentiments in the articles. In the experimental work, we provide a detailed evaluation of the results.
