Do LLMs Understand Social Knowledge? Evaluating the Sociability of Large Language Models with SocKET Benchmark

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
24 May 2023
Minje Choi
Jiaxin Pei
Sagar Kumar
Chang Shu
David Jurgens
ALM, LLMAG

Papers citing "Do LLMs Understand Social Knowledge? Evaluating the Sociability of Large Language Models with SocKET Benchmark"

49 / 49 papers shown
Spot The Ball: A Benchmark for Visual Social Inference
Neha Balamurugan
Sarah Wu
Adam Chun
Gabe Gaw
Cristobal Eyzaguirre
Tobias Gerstenberg
LRM
115
0
0
31 Oct 2025
From Polyester Girlfriends to Blind Mice: Creating the First Pragmatics Understanding Benchmarks for Slovene
Mojca Brglez
Špela Vintar
73
0
0
24 Oct 2025
Social World Models
Xuhui Zhou
Jiarui Liu
Akhila Yerukola
Hyunwoo J. Kim
Maarten Sap
92
0
0
30 Aug 2025
Assessing how hyperparameters impact Large Language Models' sarcasm detection performance
Montgomery Gole
Andriy Miranskyy
AI4MH
193
0
0
08 Apr 2025
Leveraging LLMs with Iterative Loop Structure for Enhanced Social Intelligence in Video Question Answering
Erika Mori
Yue Qiu
Hirokatsu Kataoka
Y. Aoki
252
0
0
27 Mar 2025
Socially Constructed Treatment Plans: Analyzing Online Peer Interactions to Understand How Patients Navigate Complex Medical Conditions
Madhusudan Basak
Omar Sharif
Jessica Hulsey
Elizabeth C. Saunders
Daisy J. Goodman
Luke J. Archibald
S. Preum
79
0
0
27 Mar 2025
RedditESS: A Mental Health Social Support Interaction Dataset -- Understanding Effective Social Support to Refine AI-Driven Support Tools
Zeyad Alghamdi
Tharindu Kumarage
Garima Agrawal
Mansooreh Karami
Ibrahim Almuteb
Huan Liu
AI4MH
204
1
0
27 Mar 2025
The Call for Socially Aware Language Technologies
Diyi Yang
Dirk Hovy
David Jurgens
Barbara Plank
VLM
346
14
0
24 Feb 2025
Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Maximilian Spliethöver
Tim Knebler
Fabian Fumagalli
Maximilian Muschalik
Barbara Hammer
Eyke Hüllermeier
Henning Wachsmuth
329
1
0
10 Feb 2025
SPRIG: Improving Large Language Model Performance by System Prompt Optimization
Lechen Zhang
Tolga Ergen
Lajanugen Logeswaran
Moontae Lee
David Jurgens
LRM
316
29
0
18 Oct 2024
GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps
Neural Information Processing Systems (NeurIPS), 2024
Muhammad Umair Nasir
Steven D. James
Julian Togelius
ELM, LRM
153
9
0
10 Oct 2024
Knowledge Planning in Large Language Models for Domain-Aligned Counseling Summarization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Aseem Srivastava
Smriti Joshi
Tanmoy Chakraborty
Md. Shad Akhtar
123
9
0
23 Sep 2024
Prompt Refinement or Fine-tuning? Best Practices for using LLMs in Computational Social Science Tasks
Anders Giovanni Møller
L. Aiello
LLMAG
101
5
0
02 Aug 2024
Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack
Xiaoyue Xu
Qinyuan Ye
Xiang Ren
279
15
0
23 Jul 2024
CLEAR: Can Language Models Really Understand Causal Graphs?
Sirui Chen
Mengying Xu
Kun Wang
Xingyu Zeng
Rui Zhao
Shengjie Zhao
Chaochao Lu
LRM, ELM
219
13
0
24 Jun 2024
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Akhila Yerukola
Saujas Vaduguru
Daniel Fried
Maarten Sap
211
1
0
14 May 2024
Akal Badi ya Bias: An Exploratory Study of Gender Bias in Hindi Language Technology
Conference on Fairness, Accountability and Transparency (FAccT), 2024
Rishav Hada
Safiya Husain
Varun Gumma
Harshita Diddee
Aditya Yadavalli
...
Nidhi Kulkarni
U. Gadiraju
Aditya Vashistha
Vivek Seshadri
Kalika Bali
253
14
0
10 May 2024
Can large language models understand uncommon meanings of common words?
Jinyang Wu
Feihu Che
Xinxin Zheng
Shuai Zhang
Ruihan Jin
Shuai Nie
Pengpeng Shao
Jianhua Tao
156
5
0
09 May 2024
Binary Hypothesis Testing for Softmax Models and Leverage Score Models
Yeqi Gao
Yuzhou Gu
Zhao Song
356
1
0
09 May 2024
Language Evolution for Evading Social Media Regulation via LLM-based Multi-agent Simulation
IEEE Congress on Evolutionary Computation (CEC), 2024
Jinyu Cai
Jialong Li
Mingyue Zhang
Munan Li
Chen-Shu Wang
Kenji Tei
LLMAG
179
9
0
05 May 2024
Modeling Empathetic Alignment in Conversation
Jiamin Yang
David Jurgens
155
0
0
02 May 2024
"A good pun is its own reword": Can Large Language Models Understand
  Puns?
Zhijun Xu
Siyu Yuan
Lingjie Chen
Deqing Yang
LRM
227
21
0
21 Apr 2024
Comprehensive Reassessment of Large-Scale Evaluation Outcomes in LLMs: A Multifaceted Statistical Approach
Kun Sun
Rong Wang
Anders Søgaard
240
6
0
22 Mar 2024
Academically intelligent LLMs are not necessarily socially intelligent
Ruoxi Xu
Hongyu Lin
Xianpei Han
Le Sun
Yingfei Sun
ELM
148
11
0
11 Mar 2024
In-Memory Learning: A Declarative Learning Framework for Large Language Models
Bo Wang
Tianxiang Sun
Hang Yan
Siyin Wang
Qingyuan Cheng
Xipeng Qiu
LLMAG
127
1
0
05 Mar 2024
MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery
Feihong Lu
Weiqi Wang
Yangyifei Luo
Ziqin Zhu
Qingyun Sun
...
Haochen Shi
Shiqi Gao
Qian Li
Yangqiu Song
Jianxin Li
VLM
328
7
0
28 Feb 2024
Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future
Minzhi Li
Weiyan Shi
Caleb Ziems
Diyi Yang
230
11
0
28 Feb 2024
MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms
Yiqiao Jin
Minje Choi
Gaurav Verma
Yongfeng Zhang
Srijan Kumar
254
27
0
21 Feb 2024
TreeEval: Benchmark-Free Evaluation of Large Language Models through Tree Planning
Xiang Li
Yunshi Lan
Chao Yang
ELM
103
13
0
20 Feb 2024
SoMeLVLM: A Large Vision Language Model for Social Media Processing
Xinnong Zhang
Haoyu Kuang
Xinyi Mou
Hanjia Lyu
Kun Wu
Siming Chen
Jiebo Luo
Xuanjing Huang
Zhongyu Wei
MLLM
181
12
0
20 Feb 2024
Polarization of Autonomous Generative AI Agents Under Echo Chambers
Masaya Ohagi
LLMAG
135
9
0
19 Feb 2024
Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection
Valeria Pastorino
Jasivan Sivakumar
N. Moosavi
254
4
0
18 Feb 2024
SOCIALITE-LLAMA: An Instruction-Tuned Model for Social Scientific Tasks
Gourab Dey
Adithya Ganesan
Yash Kumar Lal
Manal Shah
Shreyashee Sinha
Matthew Matero
Salvatore Giorgi
Vivek Kulkarni
H. Andrew Schwartz
ALM
239
10
0
03 Feb 2024
Comparing Pre-trained Human Language Models: Is it Better with Human Context as Groups, Individual Traits, or Both?
Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2024
Nikita Soni
Niranjan Balasubramanian
H. Andrew Schwartz
Dirk Hovy
286
5
0
23 Jan 2024
How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation
Yang Xiao
Yi Cheng
Jinlan Fu
Jiashuo Wang
Wenjie Li
Pengfei Liu
LLMAG
262
6
0
28 Dec 2023
On Sarcasm Detection with OpenAI GPT-based Models
Montgomery Gole
Williams-Paul Nwadiugwu
Andriy Miranskyy
92
14
0
07 Dec 2023
FFT: Towards Harmlessness Evaluation and Analysis for LLMs with Factuality, Fairness, Toxicity
Shiyao Cui
Zhenyu Zhang
Yilong Chen
Wenyuan Zhang
Tianyun Liu
Siqi Wang
Tingwen Liu
202
20
0
30 Nov 2023
You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments
Bangzhao Shu
Lechen Zhang
Minje Choi
Lavinia Dunagan
Lajanugen Logeswaran
Moontae Lee
Dallas Card
David Jurgens
214
61
0
16 Nov 2023
Large Human Language Models: A Need and the Challenges
Nikita Soni
H. Andrew Schwartz
João Sedoc
Niranjan Balasubramanian
ALM, AI4CE
203
14
0
09 Nov 2023
DialogBench: Evaluating LLMs as Human-like Dialogue Systems
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Jiao Ou
Junda Lu
Che Liu
Yihong Tang
Fuzheng Zhang
Chen Zhang
Kun Gai
ALM, LM&MA
243
27
0
03 Nov 2023
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yongjin Yang
Joonkee Kim
Yujin Kim
Namgyu Ho
James Thorne
Se-Young Yun
266
44
0
01 Nov 2023
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
Computer Vision and Pattern Recognition (CVPR), 2023
Yaofang Liu
Xiaodong Cun
Xuebo Liu
Xintao Wang
Yong Zhang
Haoxin Chen
Yang Liu
Tieyong Zeng
Raymond H. F. Chan
Ying Shan
VGen, EGVM
282
220
0
17 Oct 2023
Welfare Diplomacy: Benchmarking Language Model Cooperation
Gabriel Mukobi
Hannah Erlebach
Niklas Lauffer
Lewis Hammond
Alan Chan
Jesse Clifton
LM&Ro
266
39
0
13 Oct 2023
DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks
International Conference on Learning Representations (ICLR), 2023
A. Maritan
Jiaao Chen
S. Dey
Luca Schenato
Diyi Yang
Xing Xie
ELM, LRM
379
78
0
29 Sep 2023
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Yeqi Gao
Zhao Song
Weixin Wang
Junze Yin
250
30
0
14 Sep 2023
A Survey on Large Language Model based Autonomous Agents
Lei Wang
Chengbang Ma
Xueyang Feng
Zeyu Zhang
Hao-ran Yang
...
Xu Chen
Yankai Lin
Wayne Xin Zhao
Zhewei Wei
Ji-Rong Wen
LLMAG, AI4CE, LM&Ro
638
2,011
0
22 Aug 2023
A Survey on Evaluation of Large Language Models
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Yu-Chu Chang
Xu Wang
Yongfeng Zhang
Yuanyi Wu
Linyi Yang
...
Yue Zhang
Yi-Ju Chang
Philip S. Yu
Qian Yang
Xingxu Xie
ELM, LM&MA, ALM
700
2,655
0
06 Jul 2023
The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Anders Giovanni Møller
Jacob Aarup Dalsgaard
Arianna Pera
L. Aiello
233
57
0
26 Apr 2023
Fine-Grained Detection of Solidarity for Women and Migrants in 155 Years of German Parliamentary Debates
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Aida Kostikova
Benjamin Paassen
Dominik Beese
Ole Putz
Gregor Wiedemann
Steffen Eger
159
6
0
09 Oct 2022