Scopus Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/11147/7148
6 results
Search Results
Article
Automating Software Size Measurement from Python Code Using Language Models (Springer, 2025)
Tenekeci, Samet; Unlu, Huseyin; Gul, Bedir Arda; Keles, Damla; Kuuk, Murat; Demirors, Onur
Software size is a key input for project planning, effort estimation, and productivity analysis. While pre-trained language models have shown promise in deriving functional size from natural-language requirements, measuring size directly from source code remains under-explored. Yet, code-based size measurement is critical in modern workflows where requirement documents are often incomplete or unavailable, especially in Agile development environments. This exploratory study investigates the use of CodeBERT, a pre-trained bimodal transformer model, for measuring software size directly from Python source code according to two measurement methods: COSMIC Function Points and MicroM. We construct two curated datasets from the Python subset of the CodeSearchNet corpus, and manually annotate each function with its corresponding size. Our experimental results show that CodeBERT can successfully measure COSMIC data movements with up to 91.4% accuracy and generalize to the functional, architectural, and algorithmic event types defined in MicroM, reaching up to 81.5% accuracy. These findings highlight the potential of code-based language models for automated functional size measurement when requirement artifacts are absent or unreliable.

Article
An Alternative Software Benchmarking Dataset: Effort Estimation With Machine Learning (Elsevier Science Inc, 2026)
Yurum, Ozan Rasit; Unlu, Huseyin; Demirors, Onur
Effort estimation plays a vital role in software project planning, as accurate estimates of required human resources are essential for success. Traditional estimation models often depend on historical size and effort data, yet organizations frequently struggle to access reliable effort records.
Public benchmarking datasets like ISBSG offer useful data but may lack coverage or involve licensing fees. To address this issue, we previously introduced a free, extendable benchmarking dataset that integrates functional size and effort data extracted from 18 studies. In this study, we examine the effectiveness of our dataset for predictive effort estimation and compare it with the widely used ISBSG dataset. Our analysis includes 337 records from our dataset and 732 ISBSG projects, focusing on those with COSMIC size data. We first developed and compared models using linear regression and nine machine learning algorithms: Bayesian Ridge, Ridge Regression, Decision Tree, Random Forest, XGBoost, LightGBM, k-Nearest Neighbors, Multi-Layer Perceptron, and Support Vector Regression. Then, we selected the best-performing models and applied them to an unseen evaluation dataset to assess their generalization performance. The results show that machine learning performance varies based on evaluation method and dataset characteristics. Despite having fewer records, our dataset enabled more accurate predictions than ISBSG in most cases, highlighting its potential for effort estimation. This study demonstrates the viability of our dataset for building predictive models and supports the use of machine learning in improving estimation accuracy. Expanding this dataset could offer a valuable, open-access resource for organizations seeking effective and low-cost estimation solutions.

Conference Object
Citation - WoS: 3; Citation - Scopus: 5
Predicting Software Functional Size Using Natural Language Processing: an Exploratory Case Study (IEEE, 2024)
Unlu, Huseyin; Tenekeci, Samet; Ciftci, Can; Oral, Ibrahim Baran; Atalay, Tunahan; Hacaloglu, Tuna; Demirors, Onur
Software Size Measurement (SSM) plays an essential role in software project management as it enables the acquisition of software size, which is the primary input for development effort and schedule estimation.
However, many small and medium-sized companies cannot perform objective SSM and Software Effort Estimation (SEE) due to the lack of resources and an expert workforce. This results in inadequate estimates and projects exceeding the planned time and budget. Therefore, organizations need to perform objective SSM and SEE using minimal resources without an expert workforce. In this research, we conducted an exploratory case study to predict the functional size of software project requirements using state-of-the-art large language models (LLMs). To this end, we fine-tuned BERT and BERT_SE with a set of user stories and their respective functional sizes in COSMIC Function Points (CFP). We gathered the user stories included in different project requirement documents. In total size prediction, we achieved 72.8% accuracy with BERT and 74.4% accuracy with BERT_SE. In data movement-based size prediction, we achieved 87.5% average accuracy with BERT and 88.1% average accuracy with BERT_SE. Although we use relatively small datasets in model training, these results are promising and hold significant value as they demonstrate the practical utility of language models in SSM.

Conference Object
Citation - WoS: 1; Citation - Scopus: 1
Towards the Construction of a Software Benchmarking Dataset Via Systematic Literature Review (IEEE, 2024)
Yurum, Ozan Rasit; Unlu, Huseyin; Demirors, Onur
Effort estimation is a fundamental task during the planning of software projects. Prediction models usually rely on two essential factors: software size and effort data. Measuring the size of the software can be done at various stages of the project with the desired accuracy. Nevertheless, the industry faces challenges when it comes to collecting reliable actual effort data. Consequently, organizations encounter difficulties in establishing effort prediction models. Benchmarking datasets are available, but, in most cases, they have huge variances that make them less useful for effort prediction.
In this study, we aimed to answer whether creating a software benchmarking dataset is possible by gathering the data from the literature. To the best of our knowledge, a comprehensive dataset that gathers the functional size and effort data of the studies from the literature is unavailable. For this purpose, we performed a systematic literature review to find studies that include projects measured with the COSMIC Functional Size Measurement (FSM) method and the related effort. As a result, we formed a dataset including 337 records from 18 studies that shared the corresponding size and effort data. Although we performed a limited search, we created a larger dataset than many datasets in the literature. In light of our review, we found that most studies did not share their dataset, and many lacked case details such as implementation environment and the scope of software development life cycle activities included in the effort data. We also compared the dataset with the ISBSG repository and found that our dataset has less variation in productivity. Our review showed that creating a software benchmarking dataset by gathering data from the literature is possible. In conclusion, this study addresses gaps in the literature through a cost-free and easily extendable dataset.

Conference Object
Citation - WoS: 2; Citation - Scopus: 2
Effort Prediction With Limited Data: a Case Study for Data Warehouse Projects (IEEE, 2022)
Unlu, Huseyin; Yildiz, Ali; Demirors, Onur
Organizations may create a sustainable competitive advantage against competitors by using data warehouse systems with which they can assess the current status of their operations at any moment. They can analyze trends and connections using up-to-date data. However, data warehouse projects tend to fail more often than other projects as it can be tough to estimate the effort required to build a data warehouse system.
Functional size measurement is one of the methods used as an input for estimating the amount of work in a software project. In this study, we formed a measurement basis for data warehouse (DWH) projects in an organization based on the COSMIC Functional Size Measurement Method. We mapped COSMIC rules onto two different architectures used for DWH projects in the organization and measured the size of the projects. We calculated the productivity of the projects and compared them with the organization's previous projects and DWH projects in the ISBSG repository. We could not create an organization-wide effort estimation model as we had a limited number of projects. As an alternative, we evaluated the success of effort estimation using DWH projects in the ISBSG repository. We also reported the challenges we faced during the size measurement process.

Conference Object
Citation - WoS: 7; Citation - Scopus: 12
Utilization of Three Software Size Measures for Effort Estimation in Agile World: a Case Study (IEEE, 2022)
Unlu, Huseyin; Hacaloglu, Tuna; Buber, Fatma; Berrak, Kivilcim; Leblebici, Onur; Demirors, Onur
Functional size measurement (FSM) methods, by being systematic and repeatable, are beneficial in the early phases of the software life cycle for core project management activities such as effort, cost, and schedule estimation. However, in agile projects, requirements are kept minimal in the early phases and are detailed over time as the project progresses. This situation makes it challenging to identify measurement components of FSM methods from requirements in the early phases, and hence complicates applying FSM in agile projects. In addition, the existing FSM methods are not fully compatible with today's architectural styles, which are evolving into event-driven decentralized structures.
In this study, we present the results of a case study to compare the effectiveness of different size measures: functional (COSMIC Function Points, CFP), event-based (Event Points), and code length-based (Lines of Code, LOC), on projects that were developed with agile methods and utilized a microservice-based architecture. For this purpose, we measured the size of the project and created effort estimation models based on the three methods. We found that the event-based method estimated effort with better accuracy than the CFP and LOC-based methods.
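
The comparison of size measures described in the last abstract can be illustrated with a minimal sketch. It assumes a simple linear model (effort = intercept + slope * size) per size measure and uses the mean magnitude of relative error (MMRE) as the accuracy criterion; the abstract does not state which model form or accuracy metric the authors used, so both are assumptions, and all function names are hypothetical.

```python
from statistics import mean


def fit_line(sizes, efforts):
    """Ordinary least squares fit of effort = intercept + slope * size."""
    mean_size, mean_effort = mean(sizes), mean(efforts)
    sxx = sum((x - mean_size) ** 2 for x in sizes)
    sxy = sum((x - mean_size) * (y - mean_effort)
              for x, y in zip(sizes, efforts))
    slope = sxy / sxx
    intercept = mean_effort - slope * mean_size
    return intercept, slope


def mmre(sizes, efforts, intercept, slope):
    """Mean magnitude of relative error of the fitted line (assumed metric)."""
    return mean(abs(y - (intercept + slope * x)) / y
                for x, y in zip(sizes, efforts))


def compare_size_measures(measures, efforts):
    """Fit one model per size measure and return {measure name: MMRE}.

    `measures` maps a measure name (for example "CFP", "Event Points",
    "LOC") to the list of project sizes in that measure; `efforts` holds
    the actual effort of the same projects, in the same order.
    """
    results = {}
    for name, sizes in measures.items():
        intercept, slope = fit_line(sizes, efforts)
        results[name] = mmre(sizes, efforts, intercept, slope)
    return results
```

Given an organization's own project data, `compare_size_measures({"CFP": cfp_sizes, "LOC": loc_sizes}, efforts)` returns one error value per measure, with a lower MMRE indicating a closer fit. Evaluating on the same projects used for fitting, as done here, overstates accuracy; a held-out set or cross-validation would be a more careful design.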
