Automating Modern Code Review Processes With Code Similarity Measurement

Kartal,Y.; Akdeniz,E.K.; Özkan,K.

doi:10.1016/j.infsof.2024.107490

Date

2024

Authors

Kartal,Y.

Akdeniz,E.K.

Özkan,K.

Publisher

Elsevier B.V.

Green Open Access

No

Publicly Funded

No

Impulse

Average

Influence

Average

Popularity

Average

Abstract

Context: Modern code review is a critical component of the software development process, as it ensures security, detects errors early, and improves code quality. However, manual reviews can be time-consuming and unreliable. Automated code review can address these issues. Although deep-learning methods have been used to recommend code review comments, they are expensive to train and deploy. Information retrieval (IR)-based methods for automatic code review, by contrast, show promising results in efficiency, effectiveness, and flexibility. Objective: Our main objective is to determine which combination of vectorization method and similarity measure gives the best results in automatic code review, thereby improving the performance of IR-based methods. Method: Specifically, we investigate vectorization methods (Word2Vec, Doc2Vec, Code2Vec, and Transformer) that differ from those used in previous research (TF-IDF and Bag-of-Words), together with similarity measures (Cosine, Euclidean, and Manhattan), to capture the semantic similarities between code texts. We evaluate the performance of these methods using standard metrics, such as BLEU, METEOR, and ROUGE-L, and include the run-time of the models in our results. Results: Our results demonstrate that the Transformer model outperforms the state-of-the-art method on all standard metrics and similarity measures, achieving a 19.1% improvement in providing exact matches and a 6.2% improvement in recommending reviews closer to human reviews. Conclusion: Our findings suggest that the Transformer model is a highly effective and efficient approach for recommending code review comments that closely resemble those written by humans, providing valuable insight for developing more efficient and effective automated code review systems. © 2024 Elsevier B.V.
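
The IR-based idea summarized in the abstract (vectorize a code change, find the most similar previously reviewed code, and reuse its review comment) can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function below is a toy token-count vectorizer standing in for the Word2Vec, Doc2Vec, Code2Vec, or Transformer embeddings the paper studies, and the names `tokenize`, `embed`, `DISTANCES`, and `recommend_review` are hypothetical. Only the three similarity measures named in the abstract are taken from the source.

```python
import math
import re
from collections import Counter


def tokenize(code):
    # Split code into identifiers/keywords and single non-space symbols.
    return re.findall(r"[A-Za-z_]\w*|\S", code)


def embed(code, vocab):
    # ASSUMPTION: toy token-count vector; a stand-in for a learned embedding.
    counts = Counter(tokenize(code))
    return [float(counts[token]) for token in vocab]


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an empty vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


DISTANCES = {
    "cosine": cosine_distance,
    "euclidean": euclidean_distance,
    "manhattan": manhattan_distance,
}


def recommend_review(query_code, corpus, measure="cosine"):
    """Return the review comment attached to the nearest code in `corpus`.

    `corpus` is a list of (code, review_comment) pairs.
    """
    vocab = sorted(
        {t for code, _ in corpus for t in tokenize(code)}
        | set(tokenize(query_code))
    )
    distance = DISTANCES[measure]
    query_vec = embed(query_code, vocab)
    _, best_comment = min(
        corpus, key=lambda pair: distance(query_vec, embed(pair[0], vocab))
    )
    return best_comment
```

Swapping `embed` for a learned embedding leaves the retrieval step unchanged, which is why the vectorization method and the similarity measure can be varied independently, as the abstract describes.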

WoS Q

Q1

Scopus Q

N/A

OpenCitations Citation Count

N/A

Source

Information and Software Technology

Volume

173

URI

https://doi.org/10.1016/j.infsof.2024.107490
https://hdl.handle.net/11147/14571

Collections

Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

PlumX Metrics

Citations

Scopus : 4

Captures

Mendeley Readers : 25

SCOPUS™ Citations

4

checked on Jun 12, 2026

Web of Science™ Citations

4

checked on Jun 12, 2026

Page Views

53

checked on Jun 12, 2026

OpenAlex FWCI

7.63825132