NeuralMatrix: Compute the Entire Neural Networks with Linear Matrix Operations for Efficient Inference

23 May 2023

Jie Zhao

Papers citing "NeuralMatrix: Compute the Entire Neural Networks with Linear Matrix Operations for Efficient Inference"

Title
I-BERT: Integer-only BERT Quantization | Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer | MQ | 86 | 332 | 0 | 05 Jan 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 294 | 6,927 | 0 | 20 Apr 2018