ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2204.08582
  4. Cited By
MASSIVE: A 1M-Example Multilingual Natural Language Understanding
  Dataset with 51 Typologically-Diverse Languages

MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages

18 April 2022
Jack G. M. FitzGerald
C. Hench
Charith Peris
Scott Mackie
Kay Rottmann
A. Sánchez
Aaron Nash
Liam Urbach
Vishesh Kakarala
Richa Singh
Swetha Ranganath
Laurie Crist
Misha Britan
Wouter Leeuwis
Gökhan Tür
Premkumar Natarajan
ArXivPDFHTML

Papers citing "MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages"

22 / 22 papers shown
Title
Survey of Abstract Meaning Representation: Then, Now, Future
Survey of Abstract Meaning Representation: Then, Now, Future
Behrooz Mansouri
3DV
143
0
0
06 May 2025
Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
Tiansheng Wen
Yifei Wang
Zequn Zeng
Zhong Peng
Yudi Su
Xinyang Liu
Bo Chen
Hongwei Liu
Stefanie Jegelka
Chenyu You
CLL
66
2
0
03 Mar 2025
ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification
ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification
Y. Meena
Vaibhav Singh
Ayush Maheshwari
Amrith Krishna
Ganesh Ramakrishnan
AI4TS
97
0
0
09 Feb 2025
Automatic Labelling with Open-source LLMs using Dynamic Label Schema Integration
Automatic Labelling with Open-source LLMs using Dynamic Label Schema Integration
Thomas Walshe
S. Moon
Chunyang Xiao
Yawwani Gunawardana
Fran Silavong
39
0
0
21 Jan 2025
Text Clustering as Classification with LLMs
Text Clustering as Classification with LLMs
Chen Huang
Guoxiu He
36
2
0
03 Jan 2025
Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond
Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond
Beomseok Lee
Ioan Calapodescu
Marco Gaido
Matteo Negri
Laurent Besacier
AuLLM
34
3
0
07 Aug 2024
The Scandinavian Embedding Benchmarks: Comprehensive Assessment of
  Multilingual and Monolingual Text Embedding
The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding
K. Enevoldsen
Márton Kardos
Niklas Muennighoff
Kristoffer Laigaard Nielbo
37
9
0
04 Jun 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
M. Shoeybi
Bryan Catanzaro
Wei Ping
RALM
54
138
0
27 May 2024
k* Distribution: Evaluating the Latent Space of Deep Neural Networks
  using Local Neighborhood Analysis
k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis
Shashank Kotyan
Tatsuya Ueda
Danilo Vasconcellos Vargas
27
1
0
07 Dec 2023
Primacy Effect of ChatGPT
Primacy Effect of ChatGPT
Yiwei Wang
Yujun Cai
Muhao Chen
Yuxuan Liang
Bryan Hooi
ALM
AI4MH
LRM
31
13
0
20 Oct 2023
CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss
CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss
R. S. Srinivasa
Jaejin Cho
Chouchang Yang
Yashas Malur Saidutta
Ching Hua Lee
Yilin Shen
Hongxia Jin
VLM
29
8
0
26 Sep 2023
Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for
  Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems
Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems
Songbo Hu
Han Zhou
Mete Hergul
Milan Gritta
Guchun Zhang
Ignacio Iacobacci
Ivan Vulić
Anna Korhonen
28
10
0
26 Jul 2023
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual
  Pretrained Language Models
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
Peiqin Lin
Chengzhi Hu
Zheyu Zhang
André F. T. Martins
Hinrich Schütze
27
1
0
23 May 2023
Generalized Multiple Intent Conditioned Slot Filling
Generalized Multiple Intent Conditioned Slot Filling
Harshil Shah
Arthur Wilcke
Marius Cobzarenco
Cristian C Cobzarenco
Edward Challis
David Barber
11
0
0
18 May 2023
Measuring and Mitigating Local Instability in Deep Neural Networks
Measuring and Mitigating Local Instability in Deep Neural Networks
Arghya Datta
Subhrangshu Nandi
Jingcheng Xu
Greg Ver Steeg
He Xie
Anoop Kumar
Aram Galstyan
15
3
0
18 May 2023
The Interpreter Understands Your Meaning: End-to-end Spoken Language
  Understanding Aided by Speech Translation
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation
Mutian He
Philip N. Garner
36
4
0
16 May 2023
RETVec: Resilient and Efficient Text Vectorizer
RETVec: Resilient and Efficient Text Vectorizer
Elie Bursztein
Marina Zhang
Owen Vallis
Xinyu Jia
Alexey Kurakin
VLM
24
4
0
18 Feb 2023
MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for
  Natural Language Understanding in Task-Oriented Dialogue
MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue
Nikita Moghe
E. Razumovskaia
Liane Guillou
Ivan Vulić
Anna Korhonen
Alexandra Birch
32
13
0
20 Dec 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
101
2,306
0
09 Nov 2022
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of
  Downstream Tasks
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks
Colin Leong
Joshua Nemecek
Jacob Mansdorfer
Anna Filighera
A. Owodunni
Daniel Whitenack
VLM
AI4CE
37
24
0
26 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
114
93
0
06 Oct 2022
Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation
Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation
Olga Majewska
E. Razumovskaia
E. Ponti
Ivan Vulić
Anna Korhonen
30
28
0
31 Jan 2022
1