ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2411.17713
59
2

Llama Guard 3-1B-INT4: Compact and Efficient Safeguard for Human-AI Conversations

18 November 2024
Igor Fedorov
Kate Plawiak
Lemeng Wu
Tarek Elgamal
Naveen Suda
Eric Michael Smith
Hongyuan Zhan
Jianfeng Chi
Yuriy Hulovatyy
Kimish Patel
Zechun Liu
Changsheng Zhao
Yangyang Shi
Tijmen Blankevoort
Mahesh Pasupuleti
Bilge Soran
Zacharie Delpierre Coudert
Rachad Alao
Raghuraman Krishnamoorthi
Vikas Chandra
ArXivPDFHTML
Abstract

This paper presents Llama Guard 3-1B-INT4, a compact and efficient Llama Guard model, which has been open-sourced to the community during Meta Connect 2024. We demonstrate that Llama Guard 3-1B-INT4 can be deployed on resource-constrained devices, achieving a throughput of at least 30 tokens per second and a time-to-first-token of 2.5 seconds or less on a commodity Android mobile CPU. Notably, our experiments show that Llama Guard 3-1B-INT4 attains comparable or superior safety moderation scores to its larger counterpart, Llama Guard 3-1B, despite being approximately 7 times smaller in size (440MB).

View on arXiv
Comments on this paper