Search Results

Now showing 1 - 3 of 3

Toward Reliable Annotation in Low-Resource NLP: A Mixture of Agents Framework and Multi-LLM Benchmarking
(IEEE-Inst Electrical Electronics Engineers Inc, 2025) Onan, Aytug; Nasution, Arbi Haza; Celikten, Tugba
This paper introduces the Mixture-of-Agents (MoA) framework, a structured approach for improving the reliability of large language model (LLM)-based text annotation in low-resource NLP contexts. MoA employs coordinated agent interactions to enhance agreement, interpretability, and robustness without manual supervision. Evaluations on Turkish classification benchmarks demonstrate that MoA achieves up to 10-point improvements in macro-F1 over single-model baselines and significantly increases inter-agent consistency. Additionally, three novel reliability metrics-Conflict Rate (CR), Ambiguity Resolution Success Rate (ARSR), and Refinement Correction Rate (RCR)-are proposed to quantify annotation stability and correction dynamics. The results indicate that multi-agent coordination can substantially improve label quality, offering a scalable pathway toward trustworthy annotation in low-resource and cross-domain applications. The framework is language-agnostic and adaptable to other low-resource contexts beyond Turkish, including morphologically rich or typologically diverse languages such as Indonesian, Urdu, and Swahili. These findings highlight the scalability of MoA as a generalizable solution for multilingual and cross-domain annotation.
Citation - WoS: 3
Citation - Scopus: 6
Model-Based Ideal Testing of Gui Programs-Approach and Case Studies
(IEEE-Inst Electrical Electronics Engineers inc, 2021) Kilincceker, Onur; Silistre, Alper; Belli, Fevzi; Challenger, Moharram
Traditionally, software testing is aimed at showing the presence of faults. This paper proposes a novel approach to testing graphical user interfaces (GUI) for showing both the presence and absence of faults in the sense of ideal testing. The approach uses a positive testing concept to show that the GUI under consideration (GUC) does what the user expects; to the contrary, the negative testing concept shows that the GUC does not do anything that the user does not expect, building a holistic view. The first step of the approach models the GUC by a finite state machine (FSM) that enables the model-based generation of test cases. This is always possible as the GUIs are considered as strictly sequential processes. The next step converts the FSM to an equivalent regular expression (RE) that will be analyzed first to construct test selection criteria for excluding redundant test cases and construct test coverage criteria for terminating the positive test process. Both criteria enable us to assess the adequacy and efficiency of the positive tests performed. The negative tests will be realized by systematically mutating the FSM to model faults, the absence of which are to be shown. Those mutant FSMs will be handled and assessed in the same way as in positive testing. Two case studies illustrate and validate the approach; the experiments' results will be analyzed to discuss the pros and cons of the techniques introduced.
Citation - WoS: 1
Citation - Scopus: 1
A User-Assisted Thread-Level Vulnerability Assessment Tool
(Wiley, 2019) Öz, Işıl; Topçuoğlu, Haluk Rahmi; Tosun, Oğuz
The system reliability becomes a critical concern in modern architectures with the scale down of circuits. To deal with soft errors, the replication of system resources has been used at both hardware and software levels. Since the redundancy causes performance degradation, it is required to explore partial redundancy techniques that replicate the most vulnerable parts of the code. The redundancy level of user applications depends on user preferences and may be different for the users with different requirements. In this work, we propose a user-assisted reliability assessment tool based on critical thread analysis for redundancy in parallel architectures. Our analysis evaluates the application threads of a parallel program by considering their criticality in the execution and selects the most critical thread or threads to be replicated. Moreover, we extend our analysis by exploring critical regions of individual threads and execute redundantly only those regions to reduce redundancy overhead. Our experimental evaluation indicates that the replication of the most critical thread improves the system reliability more (up to 10% for blackscholes application) than the replication of any other thread. The partial thread replication based on critical region analysis also reduces the vulnerability of the system by considering a fine-grained approach.

Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection

Browse

Filters

Settings

Sort By

Results per page

Search Results