
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

22 April 2024

Ahmed Hassan Awadallah

Jianmin Bao

Xin Jin

Yunsheng Li

Fan Yang

Jianwei Yang

Lu Yuan

Yue Zhang


Papers citing "Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone"

50 / 966 papers shown

Towards Effective Complementary Security Analysis using Large Language Models

Jonas Wagner

Simon Müller

Christian Näther

Jan-Philipp Steghöfer

Andreas Both

243

20 Jun 2025

A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals

IEEE Access, 2025

Diego Reforgiato Recupero

Angelo Salatino

Luca Secchi

196

18 Jun 2025

Demystifying the Visual Quality Paradox in Multimodal Large Language Models

277

18 Jun 2025

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

327

18 Jun 2025

SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

201

18 Jun 2025

Optimal Embedding Learning Rate in LLMs: The Effect of Vocabulary Size

Soufiane Hayou

Liyuan Liu

159

17 Jun 2025

PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning

...

203

17 Jun 2025

ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM

363

17 Jun 2025

Align-then-Unlearn: Embedding Alignment for LLM Unlearning

237

16 Jun 2025

SPOT: Bridging Natural Language and Geospatial Search for Investigative Journalists

136

16 Jun 2025

Rethinking Explainability in the Era of Multimodal AI

Chirag Agarwal

241

16 Jun 2025

PRISM2: Unlocking Multi-Modal General Pathology AI with Clinical Dialogue

...

271

16 Jun 2025

Enhancing Goal-oriented Proactive Dialogue Systems via Consistency Reflection and Correction

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

219

16 Jun 2025

Assessing the Limits of In-Context Learning beyond Functions using Partially Ordered Relation

Debanjan Dutta

Faizanuddin Ansari

Swagatam Das

140

16 Jun 2025

MotiveBench: How Far Are We From Human-Like Motivational Reasoning in Large Language Models?

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

228

16 Jun 2025

Jailbreak Transferability Emerges from Shared Representations

Rico Angell

Jannik Brinkmann

He He

369

15 Jun 2025

ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies

172

15 Jun 2025

ConsistencyChecker: Tree-based Evaluation of LLM Generalization Capabilities

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Zhaochen Hong

Haofei Yu

Jiaxuan You

204

14 Jun 2025

OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics

358

14 Jun 2025

MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space

206

13 Jun 2025

Curriculum-Guided Layer Scaling for Language Model Pretraining

239

13 Jun 2025

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

...

472

12 Jun 2025

Burn After Reading: Do Multimodal Large Language Models Truly Capture Order of Events in Image Sequences?

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

299

12 Jun 2025

Discovering Hierarchical Latent Capabilities of Language Models via Causal Representation Learning

343

12 Jun 2025

Query-Level Uncertainty in Large Language Models

406

11 Jun 2025

Dataset of News Articles with Provenance Metadata for Media Relevance Assessment

Tomas Peterka

Matyas Bohacek

192

11 Jun 2025

Scaling Laws for Uncertainty in Deep Learning

242

11 Jun 2025

Do LLMs Give Psychometrically Plausible Responses in Educational Assessments?

Workshop on Innovative Use of NLP for Building Educational Applications (BEA), 2025

254

11 Jun 2025

Evaluating LLMs Across Multi-Cognitive Levels: From Medical Knowledge Mastery to Scenario-Based Problem Solving

...

237

10 Jun 2025

Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search

Mirian Hipolito Garcia

207

10 Jun 2025

Beyond Bias Scores: Unmasking Vacuous Neutrality in Small Language Models

Sumanth Manduru

Carlotta Domeniconi


259

10 Jun 2025

WebUIBench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in WebUI-to-Code

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

210

09 Jun 2025

Synthetic Visual Genome

Computer Vision and Pattern Recognition (CVPR), 2025

...

224

09 Jun 2025

A Neurosymbolic Agent System for Compositional Visual Reasoning

249

09 Jun 2025

Snap, Segment, Deploy: A Visual Data and Detection Pipeline for Wearable Industrial Assistants

183

09 Jun 2025

EgoM2P: Egocentric Multimodal Multitask Pretraining

424

09 Jun 2025

Can AI Validate Science? Benchmarking LLMs for Accurate Scientific Claim → Evidence Reasoning

Shashidhar Reddy Javaji

191

09 Jun 2025

Instruction-Tuned Video-Audio Models Elucidate Functional Specialization in the Brain

R. Mamidi

Khushbu Pahwa

Prachi Jindal

Satya Sai Srinath Namburi

167

09 Jun 2025

Chain of Methodologies: Scaling Test Time Computation without Training

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

221

08 Jun 2025

Theorem-of-Thought: A Multi-Agent Framework for Abductive, Deductive, and Inductive Reasoning in Language Models

174

08 Jun 2025

Dual-Priv Pruning: Efficient Differential Private Fine-Tuning in Multimodal Large Language Models

...

164

08 Jun 2025

Vision-EKIPL: External Knowledge-Infused Policy Learning for Visual Reasoning

227

07 Jun 2025

Quantile Regression with Large Language Models for Price Prediction

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

144

07 Jun 2025

VisioMath: Benchmarking Figure-based Mathematical Reasoning in LMMs

228

07 Jun 2025

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

Robotics (RAS), 2025

310

242

07 Jun 2025

Movie Facts and Fibs (MF²): A Benchmark for Long Movie Understanding

...

244

06 Jun 2025

Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models

282

06 Jun 2025

Structured Labeling Enables Faster Vision-Language Models for End-to-End Autonomous Driving

191

05 Jun 2025

A MISMATCHED Benchmark for Scientific Natural Language Inference

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

194

05 Jun 2025

RedDebate: Safer Responses through Multi-Agent Red Teaming Debates

285

04 Jun 2025