ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.12311
  4. Cited By
i-Code V2: An Autoregressive Generation Framework over Vision, Language,
  and Speech Data

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

21 May 2023
Ziyi Yang
Mahmoud Khademi
Yichong Xu
Reid Pryzant
Yuwei Fang
Chenguang Zhu
Dongdong Chen
Yao Qian
Mei Gao
Yi-Ling Chen
R. Gmyr
Naoyuki Kanda
Noel Codella
Bin Xiao
Yu Shi
Lu Yuan
Takuya Yoshioka
Michael Zeng
Xuedong Huang
ArXivPDFHTML

Papers citing "i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data"

11 / 11 papers shown
Title
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer
Xiaofei Wang
Manthan Thakker
Zhuo Chen
Naoyuki Kanda
Sefik Emre Eskimez
Sanyuan Chen
M. Tang
Shujie Liu
Jinyu Li
Takuya Yoshioka
6
79
0
14 Aug 2023
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit
Changhan Wang
Wei-Ning Hsu
Yossi Adi
Adam Polyak
Ann Lee
Peng-Jen Chen
Jiatao Gu
J. Pino
VLM
67
32
0
14 Sep 2021
UniMS: A Unified Framework for Multimodal Summarization with Knowledge
  Distillation
UniMS: A Unified Framework for Multimodal Summarization with Knowledge Distillation
Zhengkun Zhang
Xiaojun Meng
Yasheng Wang
Xin Jiang
Qun Liu
Zhenglu Yang
36
32
0
13 Sep 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw
  Video, Audio and Text
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
231
573
0
22 Apr 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
249
518
0
04 Feb 2021
Knowledge-Grounded Dialogue Generation with Pre-trained Language Models
Knowledge-Grounded Dialogue Generation with Pre-trained Language Models
Xueliang Zhao
Wei Yu Wu
Can Xu
Chongyang Tao
Dongyan Zhao
Rui Yan
169
181
0
17 Oct 2020
Text Summarization with Pretrained Encoders
Text Summarization with Pretrained Encoders
Yang Liu
Mirella Lapata
MILM
245
1,417
0
22 Aug 2019
Google's Neural Machine Translation System: Bridging the Gap between
  Human and Machine Translation
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,435
0
26 Sep 2016
1