Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks

1 October 2022
Zhenhailong Wang
Xiaoman Pan
Dian Yu
Dong Yu
Jianshu Chen
Heng Ji
Abstract

Although large language models have achieved impressive zero-shot ability, their huge model size generally incurs high cost. Recently, semi-parametric language models, which augment a smaller language model with an external retriever, have demonstrated promising language modeling capabilities. However, it remains unclear whether such semi-parametric language models can perform as competitively as their fully-parametric counterparts on zero-shot generalization to downstream tasks. In this work, we introduce Zemi, a zero-shot semi-parametric language model. To the best of our knowledge, this is the first semi-parametric language model that demonstrates strong zero-shot performance on a wide range of held-out unseen tasks. We train Zemi with a novel semi-parametric multitask prompted training paradigm, which shows significant improvement over the parametric multitask training proposed by T0. Specifically, we augment both multitask training and zero-shot evaluation with retrieval from a large-scale task-agnostic unlabeled corpus. To incorporate multiple potentially noisy retrieved augmentations, we further propose a novel augmentation fusion module leveraging a perceiver resampler and gated cross-attention. Notably, our proposed Zemi-LARGE outperforms T0-3B by 16% across all seven evaluation tasks while being 3.9x smaller in model size.
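The abstract describes the fusion architecture only at a high level. Below is a minimal, hypothetical PyTorch-style sketch (Python) of how a perceiver resampler followed by gated cross-attention might fuse multiple noisy retrieved augmentations into a smaller language model's hidden states. This is not the authors' released code; all class names, dimensions, and hyperparameters are illustrative assumptions.

# Minimal sketch (not the authors' code): fuse retrieved augmentations into LM
# hidden states via a perceiver-style resampler plus gated cross-attention.
import torch
import torch.nn as nn


class PerceiverResampler(nn.Module):
    """Compress a variable number of retrieved-token embeddings into a fixed
    set of latent vectors by letting learned latents cross-attend to them."""

    def __init__(self, d_model: int, num_latents: int = 64, num_heads: int = 8):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, d_model))
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, retrieved: torch.Tensor) -> torch.Tensor:
        # retrieved: (batch, num_retrieved_tokens, d_model)
        batch = retrieved.size(0)
        queries = self.latents.unsqueeze(0).expand(batch, -1, -1)
        out, _ = self.attn(queries, retrieved, retrieved)
        return self.norm(out)  # (batch, num_latents, d_model)


class GatedCrossAttentionFusion(nn.Module):
    """Inject resampled augmentation latents into the LM hidden states through
    cross-attention whose output is scaled by a learned tanh gate, initialized
    at zero so training starts from the unaugmented model's behavior."""

    def __init__(self, d_model: int, num_heads: int = 8):
        super().__init__()
        self.resampler = PerceiverResampler(d_model)
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))
        self.norm = nn.LayerNorm(d_model)

    def forward(self, hidden: torch.Tensor, retrieved: torch.Tensor) -> torch.Tensor:
        # hidden:    (batch, seq_len, d_model) from the smaller language model
        # retrieved: (batch, num_retrieved_tokens, d_model), e.g. the
        #            concatenation of several potentially noisy augmentations
        latents = self.resampler(retrieved)
        attended, _ = self.cross_attn(self.norm(hidden), latents, latents)
        return hidden + torch.tanh(self.gate) * attended


# Toy usage with made-up sizes
fusion = GatedCrossAttentionFusion(d_model=512)
hidden = torch.randn(2, 128, 512)        # LM hidden states
retrieved = torch.randn(2, 4 * 96, 512)  # e.g. 4 retrieved passages of 96 tokens
fused = fusion(hidden, retrieved)
print(fused.shape)  # torch.Size([2, 128, 512])

The zero-initialized tanh gate is a common choice for this kind of fusion so that the augmented model initially behaves like the base model; whether Zemi uses this exact initialization is an assumption here.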

View on arXiv (2210.00185)