NusaCrowd: Open Source Initiative for Indonesian NLP Resources


19 December 2022
Samuel Cahyawijaya
Holy Lovenia
Alham Fikri Aji
Genta Indra Winata
Bryan Wilie
Rahmad Mahendra
C. Wibisono
Ade Romadhony
Karissa Vincentio
Fajri Koto
Jennifer Santoso
David Moeljadi
Cahya Wirawan
Frederikus Hudi
Ivan Halim Parmonangan
Ika Alfina
Muhammad Satrio Wicaksono
Ilham Firdausi Putra
Samsul Rahmadani
Yulianti Oenang
Ali Akbar Septiandri
James Jaya
Kaustubh D. Dhole
Arie A. Suryani
Rifki Afina Putri
Dan Su
K. Stevens
Made Nindyatama Nityasya
Muhammad Farid Adilazuarda
Ryan Ignatius
Ryandito Diandaru
Tiezheng Yu
Vito Ghifari
Wenliang Dai
Yan Xu
Dyah Damapuspita
C. Tho
I. M. K. Karo
Tirana Noor Fatyanosa
Ziwei Ji
Pascale Fung
Graham Neubig
Timothy Baldwin
Sebastian Ruder
Herry Sujaini
S. Sakti
Ayu Purwarianti

Papers citing "NusaCrowd: Open Source Initiative for Indonesian NLP Resources"

15 / 15 papers shown
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia
Fajri Koto
ELM
33
2
0
13 Sep 2024
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
28
15
0
09 Oct 2023
Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi
Mirac Suzgun
Markus Freitag
Xuezhi Wang
Suraj Srivats
...
Yi Tay
Sebastian Ruder
Denny Zhou
Dipanjan Das
Jason W. Wei
ReLM
LRM
162
320
0
06 Oct 2022
Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation
Olga Majewska
E. Razumovskaia
E. Ponti
Ivan Vulić
Anna Korhonen
30
28
0
31 Jan 2022
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole
Varun Gangal
Sebastian Gehrmann
Aadesh Gupta
Zhenhao Li
...
Tianbao Xie
Usama Yaseen
Michael A. Yee
Jing Zhang
Yue Zhang
153
86
0
06 Dec 2021
Masader: Metadata Sourcing for Arabic Text and Speech Data Resources
Zaid Alyafeai
Maraim Masoud
Mustafa Ghaleb
Maged S. Al-Shaibani
31
21
0
13 Oct 2021
Visually Grounded Reasoning across Languages and Cultures
Fangyu Liu
Emanuele Bugliarello
E. Ponti
Siva Reddy
Nigel Collier
Desmond Elliott
VLM
LRM
92
167
0
28 Sep 2021
IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization
Fajri Koto
Jey Han Lau
Timothy Baldwin
VLM
52
82
0
10 Sep 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
845
0
17 Feb 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
238
254
0
02 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
236
1,508
0
31 Dec 2020
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
138
183
0
31 Dec 2020
PhoBERT: Pre-trained language models for Vietnamese
Dat Quoc Nguyen
A. Nguyen
157
342
0
02 Mar 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
214
505
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018