Incorporating Concreteness in Multi-Modal Language Models With Curriculum Learning


Authors

Sezerer, Erhan
Tekir, Selma

Open Access Color

GOLD

Green Open Access

No

Publicly Funded

No
Impulse: Average
Influence: Average
Popularity: Average

Abstract

Over the last few years, a growing number of studies have incorporated experiential (visual) information by building multi-modal language models and representations. Several studies show that language acquisition in humans starts with learning concrete concepts through images and continues with learning abstract ideas through text. In this work, curriculum learning is used to teach the model concrete and abstract concepts through images and their corresponding captions, with the goal of multi-modal language modeling and representation. We use BERT for the text modality and ResNet-152 for the image modality, and combine them using attentive pooling to pre-train on a newly constructed dataset collected from Wikimedia Commons based on concrete/abstract words. Downstream tasks and ablation studies are performed to evaluate the proposed model. The contribution of this work is two-fold: a new dataset is constructed from Wikimedia Commons based on concrete/abstract words, and a new multi-modal pre-training approach based on curriculum learning is proposed. The results show that the proposed multi-modal pre-training approach contributes to the success of the model.
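
As a rough illustration of the curriculum idea described in the abstract, the sketch below orders image-caption samples from most concrete to most abstract and splits them into training stages. It is not the authors' implementation: the function name `curriculum_stages`, the `(caption, score)` sample layout, the score values, and the equal-size staging rule are all assumptions made for illustration, and the abstract does not specify the actual schedule.

```python
from typing import List, Tuple

# Hypothetical sample layout: (caption, concreteness score).
# A higher score is assumed to mean a more concrete concept.
Sample = Tuple[str, float]


def curriculum_stages(samples: List[Sample], num_stages: int) -> List[List[Sample]]:
    """Split samples into stages ordered from most concrete to most abstract."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    if not samples:
        return []
    # Most concrete samples come first, so early stages see concrete concepts.
    ordered = sorted(samples, key=lambda s: s[1], reverse=True)
    # Ceiling division so that every sample lands in some stage.
    stage_size = -(-len(ordered) // num_stages)
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


if __name__ == "__main__":
    # Illustrative data only; the scores are made up.
    data = [("freedom", 1.5), ("apple", 5.0), ("idea", 1.6), ("dog", 4.9)]
    for index, stage in enumerate(curriculum_stages(data, num_stages=2), start=1):
        print(f"stage {index}: {[caption for caption, _ in stage]}")
```

In a pre-training loop, each stage would be fed to the model in order, so concrete image-caption pairs are seen before abstract ones. Whether later stages should also replay earlier samples is a design choice this sketch leaves open.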

Keywords

Multi-modal dataset, Wikimedia Commons, Multi-modal language model, Concreteness, Curriculum learning, Technology (T), Engineering (General). Civil engineering (General) (TA1-2040), Biology (General) (QH301-705.5), Physics (QC1-999), Chemistry (QD1-999)

Fields of Science

02 engineering and technology, 0202 electrical engineering, electronic engineering, information engineering

OpenCitations Citation Count
1

Volume

11

Issue

17

PlumX Metrics
Citations

CrossRef : 2

Scopus : 2

Captures

Mendeley Readers : 8

SCOPUS™ Citations

2

checked on May 02, 2026

Web of Science™ Citations

2

checked on May 02, 2026

Page Views

945

checked on May 02, 2026

Downloads

263

checked on May 02, 2026

OpenAlex FWCI
0.20443896

Sustainable Development Goals

SDG 9: Industry, Innovation and Infrastructure