Toward Reliable Annotation in Low-Resource NLP: A Mixture of Agents Framework and Multi-LLM Benchmarking

Onan, Aytug; Nasution, Arbi Haza; Celikten, Tugba

doi:10.1109/ACCESS.2025.3643829

Date

2025

Authors

Onan, Aytug

Nasution, Arbi Haza

Celikten, Tugba

Publisher

IEEE-Inst Electrical Electronics Engineers Inc

Abstract

This paper introduces the Mixture-of-Agents (MoA) framework, a structured approach for improving the reliability of large language model (LLM)-based text annotation in low-resource NLP contexts. MoA employs coordinated agent interactions to enhance agreement, interpretability, and robustness without manual supervision. Evaluations on Turkish classification benchmarks demonstrate that MoA achieves up to 10-point improvements in macro-F1 over single-model baselines and significantly increases inter-agent consistency. Additionally, three novel reliability metrics, Conflict Rate (CR), Ambiguity Resolution Success Rate (ARSR), and Refinement Correction Rate (RCR), are proposed to quantify annotation stability and correction dynamics. The results indicate that multi-agent coordination can substantially improve label quality, offering a scalable pathway toward trustworthy annotation in low-resource and cross-domain applications. The framework is language-agnostic and adaptable to other low-resource contexts beyond Turkish, including morphologically rich or typologically diverse languages such as Indonesian, Urdu, and Swahili. These findings highlight the scalability of MoA as a generalizable solution for multilingual and cross-domain annotation.

Keywords

Annotations, Reliability, Multilingual, Benchmark Testing, Semantics, Pipelines, Natural Language Processing, Cultural Differences, Cognition, Reviews, Annotation Quality, Large Language Models, Low-Resource Languages, Mixture of Agents, Multilingual Natural Language Processing, Natural Language Understanding, Text Classification

WoS Q

Q2

Scopus Q

Q1

OpenCitations Citation Count

N/A

Source

IEEE Access

Volume

13

Start Page

211620

End Page

211644

URI

https://doi.org/10.1109/ACCESS.2025.3643829

Collections

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
