Reducing shared memory footprint to leverage high throughput on Tensor Cores and its flexible API extension library

International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia), 2023
29 August 2023
Hiroyuki Ootomo
Rio Yokota
arXiv:2308.15152

Papers citing "Reducing shared memory footprint to leverage high throughput on Tensor Cores and its flexible API extension library"

5 / 5 papers shown
APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2025
Shaobo Ma
Chao Fang
Haikuo Shao
Zhongfeng Wang
72
0
0
26 Aug 2025
Hardware-Efficient Attention for Fast Decoding
Ted Zadouri
Hubert Strauss
Tri Dao
325
8
0
27 May 2025
Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores
Asia and South Pacific Design Automation Conference (ASP-DAC), 2024
Shaobo Ma
Chao Fang
Haikuo Shao
Zhongfeng Wang
277
5
0
26 Sep 2024
Mixed-Precision Random Projection for RandNLA on Tensor Cores
Platform for Advanced Scientific Computing Conference (PASC), 2023
Hiroyuki Ootomo
Rio Yokota
99
4
0
10 Apr 2023
Quantum Circuit Simulation by SGEMM Emulation on Tensor Cores and Automatic Precision Selection
Information Security Conference (IS), 2023
Hiroyuki Ootomo
Hidetaka Manabe
K. Harada
Rio Yokota
67
7
0
15 Mar 2023