Automating Software Size Measurement With Language Models: Insights From Industrial Case Studies

Demirörs, Onur; Tenekeci, Samet; Kennouche, Dhia Eddine

doi:10.1016/j.jss.2025.112638

Date

2026

Authors

Demirörs, Onur

Tenekeci, Samet

Kennouche, Dhia Eddine

Publisher

Elsevier Science Inc

Abstract

Objective software size measurement is critical for accurate effort estimation, yet many organizations avoid it because of its high cost, the expertise it requires, and the time-consuming manual effort involved. The result is often vague predictions, poor planning, and project overruns. To address this challenge, we investigate the use of two pre-trained language models, BERT and SE-BERT, to automate size measurement from textual requirements using the COSMIC and MicroM methods. We constructed one heterogeneous dataset and two industrial datasets, each manually measured by experienced analysts. Models were evaluated in three settings: (i) generic model evaluation, where models were trained and tested on heterogeneous data; (ii) internal evaluation, where models were trained and tested on organization-specific data; and (iii) external evaluation, where generic models were tested on organization-specific data. Results show that organization-specific models significantly outperform generic models, indicating that aligning training data with the target organization's requirement style is critical for accuracy. SE-BERT, a domain-adapted variant of BERT, improves performance, particularly in low-resource settings. These findings highlight the practical potential of tailoring training data to enable broader adoption of cost-effective software size measurement in industrial contexts.
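The three evaluation settings named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the mean-size baseline stands in for a fine-tuned BERT or SE-BERT regressor, the error metric (mean absolute error) and the hold-out split are assumptions, and every name below (`train_mean_baseline`, `split`, `evaluate_settings`) is hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[str, float]      # (requirement text, manually measured size)
Model = Callable[[str], float]  # maps a requirement text to a predicted size


def train_mean_baseline(train: Sequence[Sample]) -> Model:
    """Hypothetical stand-in for fine-tuning a language model:
    always predicts the mean size seen in the training data."""
    avg = mean(size for _, size in train)
    return lambda _text: avg


def mean_absolute_error(model: Model, test: Sequence[Sample]) -> float:
    """Average absolute gap between predicted and measured size."""
    return mean(abs(model(text) - size) for text, size in test)


def split(data: Sequence[Sample],
          test_ratio: float = 0.2) -> Tuple[List[Sample], List[Sample]]:
    """Deterministic hold-out split (assumed): the last test_ratio share
    of the samples becomes the test set. Needs at least two samples."""
    n_test = max(1, round(len(data) * test_ratio))
    cut = len(data) - n_test
    return list(data[:cut]), list(data[cut:])


def evaluate_settings(generic: Sequence[Sample],
                      org: Sequence[Sample],
                      train_fn: Callable[[Sequence[Sample]], Model] = train_mean_baseline,
                      ) -> Dict[str, float]:
    """Return one error value per evaluation setting."""
    gen_train, gen_test = split(generic)
    org_train, org_test = split(org)
    generic_model = train_fn(gen_train)  # trained on heterogeneous data
    org_model = train_fn(org_train)      # trained on organization-specific data
    return {
        # (i) generic: trained and tested on heterogeneous data
        "generic": mean_absolute_error(generic_model, gen_test),
        # (ii) internal: trained and tested on organization-specific data
        "internal": mean_absolute_error(org_model, org_test),
        # (iii) external: generic model tested on organization-specific data
        "external": mean_absolute_error(generic_model, org_test),
    }
```

Testing the generic model on the same organization hold-out used for the internal setting is a design choice of this sketch, made so that the internal and external errors are directly comparable; the abstract does not state how the test sets were formed.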

Keywords

Software Size Measurement, COSMIC, MicroM, Natural Language Processing, NLP, BERT, Case Study

WoS Q

Q1

Scopus Q

N/A

OpenCitations Citation Count

N/A

Source

Journal of Systems and Software

Volume

231

URI

https://doi.org/10.1016/j.jss.2025.112638

Collections

WoS Indexed Publications Collection
Scopus Indexed Publications Collection

Metrics

Scopus citations: 1 (checked on Jun 15, 2026)
Web of Science citations: 1 (checked on Jun 15, 2026)
Mendeley readers: 2
Page views: 5 (checked on Jun 15, 2026)

OpenAlex FWCI

0.0
