Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/11147/7148
4 results
Search Results
Article
An Alternative Software Benchmarking Dataset: Effort Estimation With Machine Learning (Elsevier Science Inc, 2026)
Yurum, Ozan Rasit; Unlu, Huseyin; Demirors, Onur
Effort estimation plays a vital role in software project planning, as accurate estimates of required human resources are essential for success. Traditional estimation models often depend on historical size and effort data, yet organizations frequently struggle to access reliable effort records. Public benchmarking datasets like ISBSG offer useful data but may lack coverage or involve licensing fees. To address this issue, we previously introduced a free, extendable benchmarking dataset that integrates functional size and effort data extracted from 18 studies. In this study, we examine the effectiveness of our dataset for predictive effort estimation and compare it with the widely used ISBSG dataset. Our analysis includes 337 records from our dataset and 732 ISBSG projects, focusing on those with COSMIC size data. We first developed and compared models using linear regression and nine machine learning algorithms: Bayesian Ridge, Ridge Regression, Decision Tree, Random Forest, XGBoost, LightGBM, k-Nearest Neighbors, Multi-Layer Perceptron, and Support Vector Regression. Then, we selected the best-performing models and applied them to an unseen evaluation dataset to assess their generalization performance. The results show that machine learning performance varies with the evaluation method and dataset characteristics. Despite having fewer records, our dataset enabled more accurate predictions than ISBSG in most cases, highlighting its potential for effort estimation. This study demonstrates the viability of our dataset for building predictive models and supports the use of machine learning in improving estimation accuracy.
Expanding this dataset could offer a valuable, open-access resource for organizations seeking effective and low-cost estimation solutions.

Conference Object (Citation - WoS: 1; Citation - Scopus: 1)
Towards the Construction of a Software Benchmarking Dataset Via Systematic Literature Review (IEEE, 2024)
Yurum, Ozan Rasit; Unlu, Huseyin; Demirors, Onur
Effort estimation is a fundamental task during the planning of software projects. Prediction models usually rely on two essential factors: software size and effort data. Software size can be measured at various stages of a project with the desired accuracy. Nevertheless, the industry faces challenges in collecting reliable actual effort data, and organizations consequently encounter difficulties in establishing effort prediction models. Benchmarking datasets are available, but in most cases they have huge variances that make them less useful for effort prediction. In this study, we aimed to answer whether a software benchmarking dataset can be created by gathering data from the literature. To the best of our knowledge, no comprehensive dataset exists that gathers the functional size and effort data reported in published studies. For this purpose, we performed a systematic literature review to find studies that include projects measured with the COSMIC Functional Size Measurement (FSM) method together with the related effort. As a result, we formed a dataset of 337 records from 18 studies that shared the corresponding size and effort data. Although we performed a limited search, we created a larger dataset than many in the literature. Our review also found that most studies did not share their dataset, and many lacked case details such as the implementation environment and the scope of software development life cycle activities included in the effort data.
We also compared the dataset with the ISBSG repository and found that our dataset has less variation in productivity. Our review showed that creating a software benchmarking dataset by gathering data from the literature is feasible. In conclusion, this study addresses gaps in the literature through a cost-free and easily extendable dataset.

Article (Citation - WoS: 1; Citation - Scopus: 2)
The Crucial Role of Personal Values on Well-Being and Resilience in the Software Industry (IEEE Computer Soc, 2024)
Yurum, Ozan Rasit; Ozcan-Top, Ozden
Personal values play a pivotal role in shaping individuals' behaviors and decisions. This research aims to determine how alignment with personal values in both professional and personal life influences an individual's resilience and well-being in the software industry.

Review (Citation - WoS: 3; Citation - Scopus: 4)
Predictive Video Analytics in Online Courses: A Systematic Literature Review (Springer, 2023)
Yurum, Ozan Rasit; Taskaya-Temizel, Tuğba; Yildirim, Soner
The purpose of this study was to investigate the use of predictive video analytics in online courses in the literature. A systematic literature review was performed based on a hybrid search strategy that included both database searching and backward snowballing. In total, 77 related publications published between 2011 and April 2023 were identified. The findings revealed an increase in the number of publications on predictive video analytics since 2016. In the majority of studies, the edX and Coursera platforms were used to collect learners' video interaction data. In addition, computer science was shown to be the top course domain, whilst data collection from a single course was found to be the most common. The results related to input measures showed that pause, play, backward, and forward were the top in-video interactions, whilst video transcripts and subtitles were the least used.
Learner performance and dropout were the primary output measures, whereas learning variables such as engagement, satisfaction, and motivation were investigated in only a few studies. Furthermore, most of the studies utilized data related to forums, navigation, and exams in addition to video data. The top algorithms used were Support Vector Machine, Random Forest, Logistic Regression, and Recurrent Neural Networks, with Random Forest and Recurrent Neural Networks being two rising algorithms in recent years. The top three evaluation metrics used were Accuracy, Area Under the Curve, and F1 Score. The findings of this study may be used to aid effective learning design and guide future research.
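As an illustrative sketch only (not the authors' actual pipeline or data), the kind of model comparison described in the first abstract, fitting several of the named regressors to size/effort records and scoring them on held-out projects, could look like this in scikit-learn. The COSMIC-size/effort data below are synthetic, and all parameter choices are assumptions:

```python
# Illustrative sketch: comparing a few of the regressors named in the first
# abstract on synthetic size/effort data. Values and settings are assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression, BayesianRidge, Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
# Hypothetical records: COSMIC functional size (CFP) vs. effort (person-hours),
# with noisy productivity mimicking the variance the abstracts discuss.
size = rng.uniform(10, 500, 300).reshape(-1, 1)
effort = size.ravel() * rng.normal(8.0, 2.0, 300)  # ~8 person-hours per CFP

X_train, X_test, y_train, y_test = train_test_split(
    size, effort, test_size=0.25, random_state=0)

models = {
    "LinearRegression": LinearRegression(),
    "BayesianRidge": BayesianRidge(),
    "Ridge": Ridge(),
    "RandomForest": RandomForestRegressor(random_state=0),
    "kNN": KNeighborsRegressor(n_neighbors=5),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    mae = mean_absolute_error(y_test, model.predict(X_test))
    print(f"{name}: MAE = {mae:.1f} person-hours")
```

In practice the studies above also evaluate on an unseen dataset after model selection, which here would amount to scoring the chosen model on records held out from a different source rather than a random split.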
