arXiv:2309.10544

Model Leeching: An Extraction Attack Targeting LLMs

19 September 2023
Lewis Birch
William Hackett
Stefan Trawicki
N. Suri
Peter Garraghan
Abstract

Model Leeching is a novel extraction attack targeting Large Language Models (LLMs), capable of distilling task-specific knowledge from a target LLM into a reduced parameter model. We demonstrate the effectiveness of our attack by extracting task capability from ChatGPT-3.5-Turbo, achieving 73% Exact Match (EM) similarity, and SQuAD EM and F1 accuracy scores of 75% and 87%, respectively, for only $50 in API cost. We further demonstrate the feasibility of adversarial attack transferability from an extracted model extracted via Model Leeching to perform ML attack staging against a target LLM, resulting in an 11% increase to attack success rate when applied to ChatGPT-3.5-Turbo.
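The abstract reports Exact Match (EM) and F1 scores without defining them. As a rough illustration, the sketch below computes both for a single question, assuming the conventions of the widely used SQuAD v1.1 evaluation (lowercasing, stripping punctuation and English articles, then comparing tokens). This page does not show the authors' evaluation code, so the function names `normalize`, `exact_match`, and `f1_score` are hypothetical, and the paper's scoring may differ in detail.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted answer and a reference answer."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

The same two functions can serve both kinds of comparison mentioned in the abstract, under the assumption that "EM similarity" compares the extracted model's answers with the target LLM's answers, while the SQuAD EM and F1 scores compare answers with the dataset's ground-truth answers. Dataset-level scores would then be averages over all questions.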
