
arXiv:2404.14219


Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

22 April 2024

Ahmed Hassan Awadallah

Arash Bakhtiari

Jianmin Bao

Harkirat Singh Behl

Sébastien Bubeck

C. C. T. Mendes

Vishrav Chaudhary

Allison Del Giorno

Gustavo de Rosa

Abhishek Goswami

Suriya Gunasekar

Russell J. Hewett

Mojan Javaheripi

Xin Jin

Piero Kauffmann

Nikos Karampatziakis

Yunsheng Li

Daniel Perez-Becker

Olatunji Ruwase

Michael Santacroce

Swadheen Shukla

Masahiro Tanaka

Philipp A. Witte

Fan Yang

Jianwei Yang

Lu Yuan

Cheng-Yuan Zhang

Yue Zhang


Papers citing "Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone"

50 / 966 papers shown

LoRALib: A Standardized Benchmark for Evaluating LoRA-MoE Methods

154

0

0

14 Sep 2025

LLMAP: LLM-Assisted Multi-Objective Route Planning with User Preferences

Christopher G. Brinton

Sabine Brunswicker

166

2

0

14 Sep 2025

Continually Adding New Languages to Multilingual Language Models

206

2

0

14 Sep 2025

Enhancing Generalization in Vision-Language-Action Models by Preserving Pretrained Representations

Akshay Gopalkrishnan

Henrik I. Christensen

225

4

0

14 Sep 2025

TrueSkin: Towards Fair and Accurate Skin Tone Recognition and Generation

127

1

0

13 Sep 2025

RefactorCoderQA: Benchmarking LLMs for Multi-Domain Coding Question Solutions in Cloud and Edge Deployment

Shadikur Rahman

Gautam Srivastava

Syed Muhammad Danish

191

1

0

12 Sep 2025

GrACE: A Generative Approach to Better Confidence Elicitation in Large Language Models

162

2

0

11 Sep 2025

Towards Better Dental AI: A Multimodal Benchmark and Instruction Dataset for Panoramic X-ray Analysis

190

5

0

11 Sep 2025

Competitive Audio-Language Models with Data-Efficient Single-Stage Training on Public Data

Gokul Karthik Kumar

Ludovick Lepauloux

Billel Mokeddem

153

1

0

09 Sep 2025

Spatial Reasoning with Vision-Language Models in Ego-Centric Multi-View Scenes

Mohammad Akbari

208

17

0

08 Sep 2025

HealthSLM-Bench: Benchmarking Small Language Models for Mobile and Wearable Healthcare Monitoring

Michael J. Witbrock

385

1

0

08 Sep 2025

MoGU V2: Toward a Higher Pareto Frontier Between Model Usability and Security

121

0

0

08 Sep 2025

MedBench-IT: A Comprehensive Benchmark for Evaluating Large Language Models on Italian Medical Entrance Examinations

Ruggero Marino Lazzaroni

Alessandro Angioi

Michelangelo Puliga

149

1

0

08 Sep 2025

Self-Aligned Reward: Towards Effective and Efficient Reasoners

Gerald Friedland

162

1

0

05 Sep 2025

WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning

140

4

0

05 Sep 2025

Strefer: Empowering Video LLMs with Space-Time Referring and Reasoning via Synthetic Instruction Data

Shrikant B. Kendre

Silvio Savarese

Juan Carlos Niebles

130

1

0

03 Sep 2025

Implicit Reasoning in Large Language Models: A Comprehensive Survey

OffRL LRM AI4CE

234

14

0

02 Sep 2025

Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation

Erfan Baghaei Potraghloo

Seyedarmin Azizi

94

4

0

02 Sep 2025

Unlearning That Lasts: Utility-Preserving, Robust, and Almost Irreversible Forgetting in LLMs

Maximilian Müller

Francesco Croce

201

4

0

02 Sep 2025

DaMoC: Efficiently Selecting the Optimal Large Language Model for Fine-tuning Domain Tasks Based on Data and Model Compression

224

0

0

01 Sep 2025

Kwai Keye-VL 1.5 Technical Report

...

333

17

0

01 Sep 2025

Improving Large Vision and Language Models by Learning from a Panel of Peers

Vicente Ordonez

139

1

0

01 Sep 2025

VideoRewardBench: Comprehensive Evaluation of Multimodal Reward Models for Video Understanding

125

1

0

30 Aug 2025

DriveQA: Passing the Driving Knowledge Test

135

1

0

29 Aug 2025

Med-RewardBench: Benchmarking Reward Models and Judges for Medical Multimodal Large Language Models

99

0

0

29 Aug 2025

Leveraging Large Language Models for Generating Research Topic Ontologies: A Multi-Disciplinary Study

Angelo Salatino

Francesco Osborne

97

0

0

28 Aug 2025

MindGuard: Intrinsic Decision Inspection for Securing LLM Agents Against Metadata Poisoning

165

0

0

28 Aug 2025

NLKI: A lightweight Natural Language Knowledge Integration Framework for Improving Small VLMs in Commonsense VQA Tasks

Swapnanil Mukherjee

Deepanway Ghosal

105

0

0

27 Aug 2025

Ensemble Debates with Local Large Language Models for AI Alignment

Ephraiem Sarabamoun

360

0

0

27 Aug 2025

KRETA: A Benchmark for Korean Reading and Reasoning in Text-Rich VQA Attuned to Diverse Visual Contexts

161

1

0

27 Aug 2025

Scalable Object Detection in the Car Interior With Vision Foundation Models

Bálint Mészáros

Ahmet Firintepe

Sebastian Schmidt

Stephan Günnemann

97

0

0

27 Aug 2025

Knowing or Guessing? Robust Medical Visual Question Answering via Joint Consistency and Contrastive Learning

International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025

131

0

0

26 Aug 2025

Hidden Tail: Adversarial Image Causing Stealthy Resource Consumption in Vision-Language Models

84

1

0

26 Aug 2025

PKG-DPO: Optimizing Domain-Specific AI systems with Physics Knowledge Graphs and Direct Preference Optimization

Nitin Nagesh Kulkarni

56

0

0

25 Aug 2025

From Global to Local: Social Bias Transfer in CLIP

122

0

0

25 Aug 2025

MoE-Inference-Bench: Performance Evaluation of Mixture of Expert Large Language and Vision Models

Krishna Teja Chitty-Venkata

Natalia Vassilieva

Siddhisanket Raskar

121

1

0

24 Aug 2025

TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling

228

4

0

22 Aug 2025

WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation

...

Juan A. Rodriguez

Perouz Taslakian

128

10

0

22 Aug 2025

Assess and Prompt: A Generative RL Framework for Improving Engagement in Online Mental Health Communities

Aseem Srivastava

Md. Shad Akhtar

90

0

0

22 Aug 2025

Dynamic Sparse Attention on Mobile SoCs

177

3

0

22 Aug 2025

RoboBuddy in the Classroom: Exploring LLM-Powered Social Robots for Storytelling in Learning and Integration Activities

Daniel Tozadore

Mortadha Abderrahim

57

0

0

22 Aug 2025

Unveiling Trust in Multimodal Large Language Models: Evaluation, Analysis, and Mitigation

...

164

1

0

21 Aug 2025

Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset

Rabeeh Karimi Mahabadi

Shrimai Prabhumoye

Mohammad Shoeybi

Bryan Catanzaro

156

9

0

20 Aug 2025

Evaluating Open-Source Vision Language Models for Facial Emotion Recognition against Traditional Deep Learning Models

Vamsi Krishna Mulukutla

Sai Supriya Pavarala

Srinivasa Raju Rudraraju

82

0

0

19 Aug 2025

The Hidden Cost of Readability: How Code Formatting Silently Consumes Your LLM Budget

114

5

0

19 Aug 2025

Prompt Orchestration Markup Language

131

2

0

19 Aug 2025

Beyond Ethical Alignment: Evaluating LLMs as Artificial Moral Assistants

Alessio Galatolo

Luca Alberto Rappuoli

Meriem Beloucif

149

2

0

18 Aug 2025

Is GPT-OSS Good? A Comprehensive Evaluation of OpenAI's Latest Open Source Models

Chiung-Yi Tseng

...

Junhao Song

230

5

0

17 Aug 2025

Rethinking Safety in LLM Fine-tuning: An Optimization Perspective

David M. Krueger

147

4

0

17 Aug 2025

VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models

184

1

0

16 Aug 2025
