Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/11147/7148
Conference Object
Iterative Semantic Refinement: A Vision Language Model-Driven Approach to Auto-Regressive Image Editing
(Institute of Electrical and Electronics Engineers Inc., 2025) Yavuzcan, Ege; Kus, Omer; Gumus, Abdurrahman
Recent advancements in Visual Language Models (VLMs) have significantly improved text-to-image generation by enabling more nuanced and semantically rich textual prompts, highlighting the transformative impact of these models on image synthesis. In this work, we leverage these robust capabilities to develop an auto-regressive editing framework that systematically refines images through careful, step-by-step modifications. Our method concisely balances subtle adjustments with meaningful semantic shifts, ensuring that each editing stage preserves the core context while introducing precise variations. By integrating improvements from controllable image editing models, we enhance the precision and stability of our edits and demonstrate the effectiveness of our approach in maintaining visual coherence. This integration results in a powerful strategy for producing diverse, high-quality outputs that align with finely tuned semantic goals. Centered on the strength of VLMs, this framework opens up a new paradigm for image synthesis, offering a blend of creative flexibility and consistent contextual fidelity that holds promise for a variety of applications requiring intricate and controlled visual transformations.
© 2025 Elsevier B.V., All rights reserved.
