ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.00952
  4. Cited By
MediSwift: Efficient Sparse Pre-trained Biomedical Language Models

MediSwift: Efficient Sparse Pre-trained Biomedical Language Models

1 March 2024
Vithursan Thangarasa
Mahmoud Salem
Shreyas Saxena
Kevin Leong
Joel Hestness
Sean Lie
    MedIm
ArXivPDFHTML

Papers citing "MediSwift: Efficient Sparse Pre-trained Biomedical Language Models"

10 / 10 papers shown
Title
TorchSparse++: Efficient Training and Inference Framework for Sparse
  Convolution on GPUs
TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs
Haotian Tang
Shang Yang
Zhijian Liu
Ke Hong
Zhongming Yu
Xiuyu Li
Guohao Dai
Yu Wang
Song Han
55
21
0
25 Oct 2023
A Study of Generative Large Language Model for Medical Research and
  Healthcare
A Study of Generative Large Language Model for Medical Research and Healthcare
C.A.I. Peng
Xi Yang
Aokun Chen
Kaleb E. Smith
Nima M. Pournejatian
...
W. Hogan
E. Shenkman
Yi Guo
Jiang Bian
Yonghui Wu
LM&MA
ELM
AI4MH
140
240
0
22 May 2023
PMC-LLaMA: Towards Building Open-source Language Models for Medicine
PMC-LLaMA: Towards Building Open-source Language Models for Medicine
Chaoyi Wu
Weixiong Lin
Xiaoman Zhang
Ya-Qin Zhang
Yanfeng Wang
Weidi Xie
LM&MA
AI4MH
86
75
0
27 Apr 2023
The Power of Scale for Parameter-Efficient Prompt Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,843
0
18 Apr 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
248
1,986
0
31 Dec 2020
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
148
345
0
23 Jul 2020
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,453
0
23 Jan 2020
Fine-Tuning Language Models from Human Preferences
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,587
0
18 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using
  Model Parallelism
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,817
0
17 Sep 2019
PubMedQA: A Dataset for Biomedical Research Question Answering
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu
205
807
0
13 Sep 2019
1