Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models. International Conference on Computational Linguistics (COLING), 2022.
Transformers with Learnable Activation Functions. Findings, 2022.
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling? Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
Long Range Language Modeling via Gated State Spaces. International Conference on Learning Representations (ICLR), 2022.
On the Parameterization and Initialization of Diagonal State Space Models. Neural Information Processing Systems (NeurIPS), 2022.
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge. Neural Information Processing Systems (NeurIPS), 2022.
Rank Diminishing in Deep Neural Networks. Neural Information Processing Systems (NeurIPS), 2022.
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022. James Lee-Thorp, Joshua Ainslie.
Life after BERT: What do Other Muppets Understand about Language? Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
UL2: Unifying Language Learning Paradigms. International Conference on Learning Representations (ICLR), 2022.
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? International Conference on Machine Learning (ICML), 2022.
Simple Baselines for Image Restoration. European Conference on Computer Vision (ECCV), 2022.
ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation. Computer Vision and Pattern Recognition (CVPR), 2022.
PaLM: Scaling Language Modeling with Pathways. Journal of Machine Learning Research (JMLR), 2022.
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
Error Correction Code Transformer. Neural Information Processing Systems (NeurIPS), 2022.
IT5: Text-to-text Pretraining for Italian Language Understanding and Generation. International Conference on Language Resources and Evaluation (LREC), 2022.
Transformer Quality in Linear Time. International Conference on Machine Learning (ICML), 2022.
VRT: A Video Restoration Transformer. IEEE Transactions on Image Processing (IEEE TIP), 2022.
A Multi-attribute Controllable Generative Model for Histopathology Image Synthesis. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2021.
Sequence-to-Sequence Piano Transcription with Transformers. International Society for Music Information Retrieval Conference (ISMIR), 2021.
Revisiting Deep Learning Models for Tabular Data. Neural Information Processing Systems (NeurIPS), 2021.
Distributed Deep Learning in Open Collaborations. Neural Information Processing Systems (NeurIPS), 2021.
A Survey of Transformers. AI Open (AO), 2021.
Pay Attention to MLPs. Neural Information Processing Systems (NeurIPS), 2021.
The Power of Scale for Parameter-Efficient Prompt Tuning. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
Do Transformer Modifications Transfer Across Implementations and Applications? Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.