ResearchTrend.AI

Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability

4 May 2023
Ziming Liu
Eric Gan
Max Tegmark

Papers citing "Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability"

31 / 31 papers shown
Title
Machine Learning meets Algebraic Combinatorics: A Suite of Datasets Capturing Research-level Conjecturing Ability in Pure Mathematics
Herman Chau
Helen Jenne
Davis Brown
Jesse He
Mark Raugas
Sara Billey
Henry Kvinge
38
0
0
09 Mar 2025
Mixture of Experts Made Intrinsically Interpretable
Xingyi Yang
Constantin Venhoff
Ashkan Khakzar
Christian Schroeder de Witt
P. Dokania
Adel Bibi
Philip H. S. Torr
MoE
49
0
0
05 Mar 2025
Modular Training of Neural Networks aids Interpretability
Satvik Golechha
Maheep Chaudhary
Joan Velja
Alessandro Abate
Nandi Schoots
74
0
0
04 Feb 2025
Harmonic Loss Trains Interpretable AI Models
David D. Baek
Ziming Liu
Riya Tyagi
Max Tegmark
97
2
0
03 Feb 2025
Machines and Mathematical Mutations: Using GNNs to Characterize Quiver Mutation Classes
Jesse He
Helen Jenne
Herman Chau
Davis Brown
Mark Raugas
Sara Billey
Henry Kvinge
23
3
0
12 Nov 2024
A mechanistically interpretable neural network for regulatory genomics
Alex Tseng
Gökçen Eraslan
Tommaso Biancalani
Gabriele Scalia
19
0
0
08 Oct 2024
Training Neural Networks for Modularity aids Interpretability
Satvik Golechha
Dylan R. Cope
Nandi Schoots
25
0
0
24 Sep 2024
KAN 2.0: Kolmogorov-Arnold Networks Meet Science
Ziming Liu
Pingchuan Ma
Yixuan Wang
Wojciech Matusik
Max Tegmark
45
61
0
19 Aug 2024
Quantum Algorithms for Compositional Text Processing
Tuomas Laakkonen
K. Meichanetzidis
Bob Coecke
CoGe
36
1
0
12 Aug 2024
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong-jia Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
49
28
0
22 Jul 2024
Using Degeneracy in the Loss Landscape for Mechanistic Interpretability
Lucius Bushnaq
Jake Mendel
Stefan Heimersheim
Dan Braun
Nicholas Goldowsky-Dill
Kaarel Hänni
Cindy Wu
Marius Hobbhahn
29
7
0
17 May 2024
KAN: Kolmogorov-Arnold Networks
Ziming Liu
Yixuan Wang
Sachin Vaidya
Fabian Ruehle
James Halverson
Marin Soljacic
Thomas Y. Hou
Max Tegmark
78
473
0
30 Apr 2024
Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska
E. Gavves
AI4CE
40
111
0
22 Apr 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models
Carlo Nicolini
Jacopo Staiano
Bruno Lepri
Raffaele Marino
MoE
26
1
0
13 Mar 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
22
3
0
28 Feb 2024
Hyperdimensional computing: a fast, robust and interpretable paradigm for biological data
Michiel Stock
Dimitri Boeckaerts
Pieter Dewulf
S. Taelman
Maxime Van Haeverbeke
W. Criekinge
B. De Baets
29
2
0
27 Feb 2024
Generating Interpretable Networks using Hypernetworks
Isaac Liao
Ziming Liu
Max Tegmark
27
2
0
05 Dec 2023
Neural Network Pruning by Gradient Descent
Zhang Zhang
Ruyi Tao
Jiang Zhang
19
4
0
21 Nov 2023
Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions
Luca Longo
Mario Brcic
Federico Cabitza
Jaesik Choi
Roberto Confalonieri
...
Andrés Páez
Wojciech Samek
Johannes Schneider
Timo Speith
Simone Stumpf
29
189
0
30 Oct 2023
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Alex Tamkin
Mohammad Taufeeque
Noah D. Goodman
30
27
0
26 Oct 2023
Growing Brains: Co-emergence of Anatomical and Functional Modularity in Recurrent Neural Networks
Ziming Liu
Mikail Khona
Ila R. Fiete
Max Tegmark
31
12
0
11 Oct 2023
SPADE: Sparsity-Guided Debugging for Deep Neural Networks
Arshia Soltani Moakhar
Eugenia Iofinova
Elias Frantar
Dan Alistarh
32
1
0
06 Oct 2023
Extreme sparsification of physics-augmented neural networks for interpretable model discovery in mechanics
J. Fuhg
Reese E. Jones
N. Bouklas
AI4CE
21
22
0
05 Oct 2023
A Neural Scaling Law from Lottery Ticket Ensembling
Ziming Liu
Max Tegmark
11
4
0
03 Oct 2023
The semantic landscape paradigm for neural networks
Shreyas Gokhale
21
2
0
18 Jul 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
212
494
0
01 Nov 2022
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu
Eric J. Michaud
Max Tegmark
56
76
0
03 Oct 2022
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
250
458
0
24 Sep 2022
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
122
317
0
21 Sep 2022
Dynamics of specialization in neural modules under resource constraints
Gabriel Béna
Dan F. M. Goodman
26
0
0
04 Jun 2021
What is the State of Neural Network Pruning?
Davis W. Blalock
Jose Javier Gonzalez Ortiz
Jonathan Frankle
John Guttag
185
1,027
0
06 Mar 2020