KLEJ: Comprehensive Benchmark for Polish Language Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2020

1 May 2020


Papers citing "KLEJ: Comprehensive Benchmark for Polish Language Understanding"

44 / 44 papers shown

Divide, Cache, Conquer: Dichotomic Prompting for Efficient Multi-Label LLM-Based Classification

112

05 Nov 2025

PL-Guard: Benchmarking Language Model Safety for Polish

Aleksandra Krasnodębska

Karolina Seweryn

Szymon Łukasik

Wojciech Kusa

134

19 Jun 2025

Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection

366

18 May 2025

Bielik 11B v2 Technical Report

456

05 May 2025

Bielik v3 Small: Technical Report

522

05 May 2025

Evaluating Polish linguistic and cultural competency in large language models

217

02 Mar 2025

Polish-ASTE: Aspect-Sentiment Triplet Extraction Datasets for Polish

International Conference on Language Resources and Evaluation (LREC), 2025

380

27 Feb 2025

Behind Closed Words: Creating and Investigating the forePLay Annotated Dataset for Polish Erotic Discourse

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

333

23 Dec 2024

Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation

329

24 Oct 2024

VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks

Shailaja Keyur Sampat

Yezhou Yang

MLLM CoGe ReLM VLM LRM

250

17 Oct 2024

Entity Insertion in Multilingual Linked Corpora: The Case of Wikipedia

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

377

05 Oct 2024

WarCov -- Large multilabel and multimodal dataset from social platform

Weronika Borek-Marciniec

P. Zyblewski

Jakub Klikowski

Pawel Ksieniewicz

336

10 Jun 2024

PL-MTEB: Polish Massive Text Embedding Benchmark

Rafal Poświata

Slawomir Dadas

Michal Perelkiewicz

190

16 May 2024

Evaluation of Few-Shot Learning for Classification Tasks in the Polish Language

Tsimur Hadeliya

D. Kajtoch

280

27 Apr 2024

Efficient Language Adaptive Pre-training: Extending State-of-the-Art Large Language Models for Polish

Szymon Ruciński

256

15 Feb 2024

Pipeline and Dataset Generation for Automated Fact-checking in Almost Any Language

405

15 Dec 2023

From Big to Small Without Losing It All: Text Augmentation with ChatGPT for Efficient Sentiment Analysis

Stanislaw Woźniak

Jan Kocoń

279

07 Dec 2023

The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Chiyu Zhang

Khai Duy Doan

Qisheng Liao

Muhammad Abdul-Mageed

291

23 Oct 2023

BAN-PL: a Novel Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service

International Conference on Language Resources and Evaluation (LREC), 2023

311

21 Aug 2023

Improving Domain-Specific Retrieval by NLI Fine-Tuning

Conference on Computer Science and Information Systems (FedCSIS), 2023

253

06 Aug 2023

Electoral Agitation Data Set: The Use Case of the Polish Election

207

13 Jul 2023

Massively Multilingual Corpus of Sentiment Datasets and Multi-faceted Sentiment Classification Benchmark

Neural Information Processing Systems (NeurIPS), 2023

243

13 Jun 2023

BEIR-PL: Zero Shot Information Retrieval Benchmark for the Polish Language

International Conference on Language Resources and Evaluation (LREC), 2023

307

31 May 2023

GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark

...

Fei Huang

187

11 May 2023

MAUPQA: Massive Automatically-created Polish Question Answering Dataset

Workshop on Balto-Slavic Natural Language Processing (BSNLP), 2023

Piotr Rybak

252

09 May 2023

Going beyond research datasets: Novel intent discovery in the industry setting

Findings (Findings), 2023

379

09 May 2023

ScandEval: A Benchmark for Scandinavian Natural Language Processing

Nordic Conference of Computational Linguistics (NODALIDA), 2023

Dan Saattrup Nielsen

ELM

284

03 Apr 2023

PolQA: Polish Question Answering Dataset

International Conference on Language Resources and Evaluation (LREC), 2022

Piotr Rybak

Piotr Przybyła

M. Ogrodniczuk

317

17 Dec 2022

Beyond Counting Datasets: A Survey of Multilingual Dataset Construction and Necessary Resources

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

282

28 Nov 2022

This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish

Neural Information Processing Systems (NeurIPS), 2022

...

355

23 Nov 2022

State-of-the-art generalisation research in NLP: A taxonomy and review

Nature Machine Intelligence (Nat. Mach. Intell.), 2022

Verna Dankers

...

719

140

06 Oct 2022

Evaluation of Transfer Learning for Polish with a Text-to-Text Model

International Conference on Language Resources and Evaluation (LREC), 2022

266

18 May 2022

mGPT: Few-Shot Learners Go Multilingual

Transactions of the Association for Computational Linguistics (TACL), 2022

Alena Fenogenova

478

200

15 Apr 2022

Assessment of Massively Multilingual Sentiment Classifiers

Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2022

248

11 Apr 2022

Mukayese: Turkish NLP Strikes Back

Findings (Findings), 2022

294

02 Mar 2022

Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models

Alena Fenogenova

187

15 Feb 2022

Polish Natural Language Inference and Factivity -- an Expert-based Dataset and Benchmarks

Natural Language Engineering (NLE), 2022

Daniel Ziembicki

Anna Wróblewska

Karolina Seweryn

206

10 Jan 2022

Detection of Criminal Texts for the Polish State Border Guard

Artur Nowakowski

K. Jassem

265

24 Aug 2021

AMMUS: A Survey of Transformer-based Pretrained Models in Natural Language Processing

Katikapalli Subramanyam Kalyan

A. Rajasekharan

S. Sangeetha

VLM LM&MA

424

322

12 Aug 2021

EENLP: Cross-lingual Eastern European NLP Index

International Conference on Language Resources and Evaluation (LREC), 2021

Md. Sadek Hossain Asif

Matt Sárdi

306

05 Aug 2021

PyEuroVoc: A Tool for Multilingual Legal Document Classification with EuroVoc Descriptors

399

02 Aug 2021

HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish

Workshop on Balto-Slavic Natural Language Processing (BSNLP), 2021

200

04 May 2021

Pre-training Polish Transformer-based Language Models at Scale

Slawomir Dadas

Michal Perelkiewicz

Rafal Poswiata

232

07 Jun 2020

Evaluation of Sentence Representations in Polish

International Conference on Language Resources and Evaluation (LREC), 2019

Slawomir Dadas

Michal Perelkiewicz

Rafal Poswiata

485

25 Oct 2019