Release of Pre-Trained Models for the Japanese Language

Release of Pre-Trained Models for the Japanese Language

2 April 2024

Toshiaki Wakatsuki

Papers citing "Release of Pre-Trained Models for the Japanese Language"

13 / 13 papers shown

Title
Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions E. Liu Amanda Bertsch Lintang Sutawika Lindia Tjuatja Patrick Fernandes ... S. Carolin (Haas) Lawrence Aditi Raghunathan Kiril Gashteovski Graham Neubig 67 0 0 05 Mar 2025
Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources Issey Sukeda ELM 32 1 0 18 Sep 2024
Interactive DualChecker for Mitigating Hallucinations in Distilling Large Language Models Meiyun Wang Masahiro Suzuki Hiroki Sakaji Kiyoshi Izumi VLM 14 0 0 22 Aug 2024
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling Wataru Nakata Kentaro Seki Hitomi Yanaka Yuki Saito Shinnosuke Takamichi Hiroshi Saruwatari AuLLM 43 0 0 22 Jul 2024
PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems Kentaro Mitsui Koh Mitsuda Toshiaki Wakatsuki Yukiya Hono Kei Sawada 36 4 0 18 Jun 2024
Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition Yukiya Hono Koh Mitsuda Tianyu Zhao Kentaro Mitsui Toshiaki Wakatsuki Kei Sawada AuLLM 29 8 0 06 Dec 2023
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 303 11,881 0 04 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Junnan Li Dongxu Li Caiming Xiong S. Hoi MLLM BDL VLM CLIP 388 4,110 0 28 Jan 2022
$Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information$ Understanding Dataset Difficulty with $\mathcal{V}$ -Usable Information Kawin Ethayarajh Yejin Choi Swabha Swayamdipta 157 157 0 16 Oct 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts Soravit Changpinyo P. Sharma Nan Ding Radu Soricut VLM 273 1,077 0 17 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Leo Gao Stella Biderman Sid Black Laurence Golding Travis Hoppe ... Horace He Anish Thite Noa Nabeshima Shawn Presser Connor Leahy AIMat 245 1,977 0 31 Dec 2020
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 226 4,424 0 23 Jan 2020
U-Net: Convolutional Networks for Biomedical Image Segmentation Olaf Ronneberger Philipp Fischer Thomas Brox SSeg 3DV 232 75,445 0 18 May 2015