AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text

24 March 2025
Tadesse Destaw Belay
Israel Abebe Azime
Ibrahim Said Ahmad
Idris Abdulmumin
Abinew Ali Ayele
Shamsuddeen Hassan Muhammad
Seid Muhie Yimam
Abstract

Pretrained Language Models (PLMs) built from various sources are the foundation of today's NLP progress. The language representations learned by such models achieve strong performance across many tasks, with datasets of varying sizes drawn from diverse sources. We present a thorough analysis of domain- and task-adaptive continual pretraining approaches for low-resource African languages and show promising results on the evaluated tasks. We create AfriSocial, a corpus designed for domain-adaptive finetuning that undergoes careful quality pre-processing. Continually pretraining PLMs on AfriSocial as domain-adaptive pretraining (DAPT) data consistently improves performance on the fine-grained emotion classification task across 16 targeted languages, by 1% to 28.27% macro F1 score. Likewise, the task-adaptive pretraining (TAPT) approach, which further finetunes on small amounts of unlabeled but task-similar data, shows promising results. For example, using unlabeled sentiment data (source) for the fine-grained emotion classification task (target) improves the base model results by an F1 score ranging from 0.55% to 15.11%. Combining the two methods, DAPT + TAPT, also achieves better results than the base models. All the resources will be made available to improve low-resource NLP tasks generally, as well as other similar-domain tasks such as hate speech and sentiment classification.
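
To make the DAPT setup concrete, below is a minimal sketch of domain-adaptive continual pretraining with a masked language modeling objective, in the style the abstract describes. The file path, hyperparameters, and output directory are hypothetical, and the Davlan/afro-xlmr-base checkpoint is assumed as the starting model; the paper's exact recipe may differ.

```python
# Minimal sketch of domain-adaptive continual pretraining (DAPT) via
# masked language modeling, assuming the AfriSocial corpus is available
# as a local plain-text file (path is hypothetical).
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "Davlan/afro-xlmr-base"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForMaskedLM.from_pretrained(base_model)

# Hypothetical local path to the unlabeled social-media corpus.
corpus = load_dataset("text", data_files={"train": "afrisocial_train.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Standard MLM objective: a fraction of tokens is masked and predicted.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="afroxlmr-social-dapt",
    per_device_train_batch_size=16,
    num_train_epochs=3,        # illustrative hyperparameters, not the paper's
    learning_rate=5e-5,
    save_strategy="epoch",
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
).train()
```

TAPT follows the same recipe, swapping in a small unlabeled corpus drawn from the target task's domain (e.g., unlabeled sentiment or emotion text) before the usual supervised finetuning on labeled data.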

@article{belay2025_2503.18247,
  title={AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text},
  author={Tadesse Destaw Belay and Israel Abebe Azime and Ibrahim Said Ahmad and Idris Abdulmumin and Abinew Ali Ayele and Shamsuddeen Hassan Muhammad and Seid Muhie Yimam},
  journal={arXiv preprint arXiv:2503.18247},
  year={2025}
}