arXiv:2005.00630
KLEJ: Comprehensive Benchmark for Polish Language Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
1 May 2020
Piotr Rybak
Robert Mroczkowski
Janusz Tracz
Ireneusz Gawlik
ELM
Papers citing "KLEJ: Comprehensive Benchmark for Polish Language Understanding"
44 / 44 papers shown
Divide, Cache, Conquer: Dichotomic Prompting for Efficient Multi-Label LLM-Based Classification
Mikołaj Langner
Jan Eliasz
Ewa Rudnicka
Jan Kocoń
112
1
0
05 Nov 2025
PL-Guard: Benchmarking Language Model Safety for Polish
Aleksandra Krasnodębska
Karolina Seweryn
Szymon Łukasik
Wojciech Kusa
134
1
0
19 Jun 2025
Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection
Yuwei Zhang
Wenhao Yu
Shangbin Feng
Yifan Zhu
Letian Peng
Jayanth Srinivasa
Gaowen Liu
Jingbo Shang
KELM
366
6
0
18 May 2025
Bielik 11B v2 Technical Report
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
456
2
0
05 May 2025
Bielik v3 Small: Technical Report
Krzysztof Ociepa
Łukasz Flis
Remigiusz Kinas
Krzysztof Wróbel
Adrian Gwoździej
522
4
0
05 May 2025
Evaluating Polish linguistic and cultural competency in large language models
Sławomir Dadas
Małgorzata Grębowiec
Michał Perełkiewicz
Rafał Poświata
ELM
217
5
0
02 Mar 2025
Polish-ASTE: Aspect-Sentiment Triplet Extraction Datasets for Polish
International Conference on Language Resources and Evaluation (LREC), 2025
Marta Lango
Borys Naglik
Mateusz Lango
Iwo Naglik
380
3
0
27 Feb 2025
Behind Closed Words: Creating and Investigating the forePLay Annotated Dataset for Polish Erotic Discourse
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Anna Kołos
Katarzyna Lorenc
Emilia Wisnios
Agnieszka Karlinska
333
0
0
23 Dec 2024
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
329
14
0
24 Oct 2024
VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks
Shailaja Keyur Sampat
Mutsumi Nakamura
Shankar Kailas
Kartik Aggarwal
Mandy Zhou
Yezhou Yang
Chitta Baral
MLLM
CoGe
ReLM
VLM
LRM
250
1
0
17 Oct 2024
Entity Insertion in Multilingual Linked Corpora: The Case of Wikipedia
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Tomás Feith
Akhil Arora
Martin Gerlach
Debjit Paul
Robert West
KELM
377
7
0
05 Oct 2024
WarCov -- Large multilabel and multimodal dataset from social platform
Weronika Borek-Marciniec
P. Zyblewski
Jakub Klikowski
Pawel Ksieniewicz
336
0
0
10 Jun 2024
PL-MTEB: Polish Massive Text Embedding Benchmark
Rafał Poświata
Slawomir Dadas
Michal Perelkiewicz
190
16
0
16 May 2024
Evaluation of Few-Shot Learning for Classification Tasks in the Polish Language
Tsimur Hadeliya
D. Kajtoch
280
2
0
27 Apr 2024
Efficient Language Adaptive Pre-training: Extending State-of-the-Art Large Language Models for Polish
Szymon Ruciński
256
5
0
15 Feb 2024
Pipeline and Dataset Generation for Automated Fact-checking in Almost Any Language
Jan Drchal
Herbert Ullrich
Tomás Mlynár
Václav Moravec
HILM
405
5
0
15 Dec 2023
From Big to Small Without Losing It All: Text Augmentation with ChatGPT for Efficient Sentiment Analysis
Stanisław Woźniak
Jan Kocoń
279
16
0
07 Dec 2023
The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Chiyu Zhang
Khai Duy Doan
Qisheng Liao
Muhammad Abdul-Mageed
291
8
0
23 Oct 2023
BAN-PL: a Novel Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service
International Conference on Language Resources and Evaluation (LREC), 2023
Anna Kołos
Inez Okulska
Kinga Głąbińska
Agnieszka Karlinska
Emilia Wisnios
Paweł Ellerik
Andrzej Prałat
311
5
0
21 Aug 2023
Improving Domain-Specific Retrieval by NLI Fine-Tuning
Conference on Computer Science and Information Systems (FedCSIS), 2023
Roman Dusek
A. Wawer
Christopher Galias
Lidia Wojciechowska
253
2
0
06 Aug 2023
Electoral Agitation Data Set: The Use Case of the Polish Election
Mateusz Baran
Mateusz Wójcik
Piotr Kolebski
Michał Bernaczyk
Krzysztof Rajda
Lukasz Augustyniak
Tomasz Kajdanowicz
207
2
0
13 Jul 2023
Massively Multilingual Corpus of Sentiment Datasets and Multi-faceted Sentiment Classification Benchmark
Neural Information Processing Systems (NeurIPS), 2023
Lukasz Augustyniak
Szymon Woźniak
Marcin Gruza
Piotr Gramacki
Krzysztof Rajda
M. Morzy
Tomasz Kajdanowicz
243
15
0
13 Jun 2023
BEIR-PL: Zero Shot Information Retrieval Benchmark for the Polish Language
International Conference on Language Resources and Evaluation (LREC), 2023
Konrad Wojtasik
Vadim Shishkin
Kacper Wolowiec
Arkadiusz Janz
Maciej Piasecki
307
14
0
31 May 2023
GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark
Dongyang Li
Ruixue Ding
Qiang-Wei Zhang
Zheng Li
Boli Chen
...
Yao Xu
Xin Li
Ning Guo
Fei Huang
Xiaofeng He
ELM
VLM
187
8
0
11 May 2023
MAUPQA: Massive Automatically-created Polish Question Answering Dataset
Workshop on Balto-Slavic Natural Language Processing (BSNLP), 2023
Piotr Rybak
252
13
0
09 May 2023
Going beyond research datasets: Novel intent discovery in the industry setting
Findings (Findings), 2023
Aleksandra Chrabrowa
Tsimur Hadeliya
D. Kajtoch
Robert Mroczkowski
Piotr Rybak
379
2
0
09 May 2023
ScandEval: A Benchmark for Scandinavian Natural Language Processing
Nordic Conference of Computational Linguistics (NODALIDA), 2023
Dan Saattrup Nielsen
ELM
284
21
0
03 Apr 2023
PolQA: Polish Question Answering Dataset
International Conference on Language Resources and Evaluation (LREC), 2022
Piotr Rybak
Piotr Przybyła
M. Ogrodniczuk
317
8
0
17 Dec 2022
Beyond Counting Datasets: A Survey of Multilingual Dataset Construction and Necessary Resources
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Xinyan Velocity Yu
Akari Asai
Trina Chatterjee
Junjie Hu
Eunsol Choi
282
31
0
28 Nov 2022
This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish
Neural Information Processing Systems (NeurIPS), 2022
Lukasz Augustyniak
Kamil Tagowski
Albert Sawczyn
Denis Janiak
Roman Bartusiak
...
Arkadiusz Janz
Piotr Szymański
M. Morzy
Tomasz Kajdanowicz
Maciej Piasecki
355
14
0
23 Nov 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Nature Machine Intelligence (Nat. Mach. Intell.), 2022
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Robert Bamler
Zhijing Jin
719
140
0
06 Oct 2022
Evaluation of Transfer Learning for Polish with a Text-to-Text Model
International Conference on Language Resources and Evaluation (LREC), 2022
Aleksandra Chrabrowa
Lukasz Dragan
Karol Grzegorczyk
D. Kajtoch
Mikołaj Koszowski
Robert Mroczkowski
Piotr Rybak
266
24
0
18 May 2022
mGPT: Few-Shot Learners Go Multilingual
Transactions of the Association for Computational Linguistics (TACL), 2022
Oleh Shliazhko
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Anastasia Kozlova
Tatiana Shavrina
478
200
0
15 Apr 2022
Assessment of Massively Multilingual Sentiment Classifiers
Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2022
Krzysztof Rajda
Lukasz Augustyniak
Piotr Gramacki
Marcin Gruza
Szymon Woźniak
Tomasz Kajdanowicz
248
7
0
11 Apr 2022
Mukayese: Turkish NLP Strikes Back
Findings (Findings), 2022
Ali Safaya
Emirhan Kurtuluş
Arda Göktoğan
Deniz Yuret
294
30
0
02 Mar 2022
Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Tatiana Shavrina
Anton A. Emelyanov
Denis Shevelev
Alexander Kukushkin
Valentin Malykh
Ekaterina Artemova
AAML
VLM
ELM
187
3
0
15 Feb 2022
Polish Natural Language Inference and Factivity -- an Expert-based Dataset and Benchmarks
Natural Language Engineering (NLE), 2022
Daniel Ziembicki
Anna Wróblewska
Karolina Seweryn
206
1
0
10 Jan 2022
Detection of Criminal Texts for the Polish State Border Guard
Artur Nowakowski
K. Jassem
265
1
0
24 Aug 2021
AMMUS: A Survey of Transformer-based Pretrained Models in Natural Language Processing
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
VLM
LM&MA
424
322
0
12 Aug 2021
EENLP: Cross-lingual Eastern European NLP Index
International Conference on Language Resources and Evaluation (LREC), 2021
Alexey Tikhonov
Alex Malkhasov
A. Manoshin
George-Andrei Dima
Réka Cserháti
Md. Sadek Hossain Asif
Matt Sárdi
306
2
0
05 Aug 2021
PyEuroVoc: A Tool for Multilingual Legal Document Classification with EuroVoc Descriptors
Andrei-Marius Avram
V. Pais
D. Tufis
AILaw
VLM
399
20
0
02 Aug 2021
HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish
Workshop on Balto-Slavic Natural Language Processing (BSNLP), 2021
Robert Mroczkowski
Piotr Rybak
Alina Wróblewska
Ireneusz Gawlik
200
99
0
04 May 2021
Pre-training Polish Transformer-based Language Models at Scale
Slawomir Dadas
Michal Perelkiewicz
Rafal Poswiata
232
44
0
07 Jun 2020
Evaluation of Sentence Representations in Polish
International Conference on Language Resources and Evaluation (LREC), 2019
Slawomir Dadas
Michal Perelkiewicz
Rafal Poswiata
485
21
0
25 Oct 2019