Masader: Metadata Sourcing for Arabic Text and Speech Data Resources


13 October 2021
Zaid Alyafeai
Maraim Masoud
Mustafa Ghaleb
Maged S. Al-Shaibani

Papers citing "Masader: Metadata Sourcing for Arabic Text and Speech Data Resources"

15 / 15 papers shown
MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs
Zaid Alyafeai
Maged S. Al-Shaibani
Bernard Ghanem
11
0
0
26 May 2025
Estimating the Level of Dialectness Predicts Interannotator Agreement in Multi-dialect Arabic Datasets
Amr Keleg
Walid Magdy
Sharon Goldwater
59
3
0
18 May 2024
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Ahmet Üstün
Viraat Aryabumi
Zheng-Xin Yong
Wei-Yin Ko
Daniel D'souza
...
Shayne Longpre
Niklas Muennighoff
Marzieh Fadaee
Julia Kreutzer
Sara Hooker
ALM, ELM, SyDa, LRM
98
230
0
12 Feb 2024
Toxic language detection: a systematic review of Arabic datasets
Imene Bensalem
Paolo Rosso
Hanane Zitouni
73
5
0
12 Dec 2023
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Hugo Laurençon
Lucile Saulnier
Thomas Wang
Christopher Akiki
Albert Villanova del Moral
...
Violette Lepercq
Suzana Ilić
Margaret Mitchell
Sasha Luccioni
Yacine Jernite
AI4CE, AILaw
75
169
0
07 Mar 2023
ClArTTS: An Open-Source Classical Arabic Text-to-Speech Corpus
Ajinkya Kulkarni
Atharva Kulkarni
Sara Shatnawi
Hanan Aldarmaki
37
9
0
28 Feb 2023
In What Languages are Generative Language Models the Most Formal? Analyzing Formality Distribution across Languages
Asim Ersoy
Gerson Vizcarra
T. Mayeesha
Benjamin Muller
67
2
0
23 Feb 2023
SAIDS: A Novel Approach for Sentiment Analysis Informed of Dialect and Sarcasm
Abdelrahman Kaseb
Mona Farouk
26
9
0
06 Jan 2023
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
Samuel Cahyawijaya
Holy Lovenia
Alham Fikri Aji
Genta Indra Winata
Bryan Wilie
...
Timothy Baldwin
Sebastian Ruder
Herry Sujaini
S. Sakti
Ayu Purwarianti
125
50
0
19 Dec 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
453
2,398
0
09 Nov 2022
Maknuune: A Large Open Palestinian Arabic Lexicon
Shahd Dibas
Christian Khairallah
Nizar Habash
Omar Fayez Sadi
Tariq Sairafy
Karmel Sarabta
Abrar Ardah
CVBM
55
4
0
24 Oct 2022
Masader Plus: A New Interface for Exploring +500 Arabic NLP Datasets
Yousef Altaher
A. Fadel
Mazen Alotaibi
Mazen Alyazidi
Mishari Al-Mutairi
...
Mustafa Ghaleb
Nouamane Tazi
Raed Alharbi
Maraim Masoud
Zaid Alyafeai
75
10
0
01 Aug 2022
NusaCrowd: A Call for Open and Reproducible NLP Research in Indonesian Languages
Samuel Cahyawijaya
Alham Fikri Aji
Holy Lovenia
Genta Indra Winata
Bryan Wilie
...
Fajri Koto
David Moeljadi
Karissa Vincentio
Ade Romadhony
Ayu Purwarianti
80
5
0
21 Jul 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
104
106
0
24 Mar 2022
Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources
Angelina McMillan-Major
Zaid Alyafeai
Stella Biderman
Kimbo Chen
F. Toni
...
Aitor Soroa Etxabe
Pedro Ortiz Suarez
Zeerak Talat
Daniel Alexander van Strien
Yacine Jernite
83
14
0
25 Jan 2022