Can interpretability of machine learning models be separated by model architecture? The intuition for why it might be is that mechanistic interpretability is developed on, and demonstrated with, transformers specifically. To find out whether those techniques carry over to neural network architectures in general, I will first read the mechanistic interpretability work from Anthropic, and then read other interpretability works.
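To make the question concrete, here is a toy sketch of my own (not taken from the Anthropic papers, and assuming PyTorch): one basic interpretability primitive, capturing intermediate activations, is architecture-agnostic, while the transformer-specific parts of mech interp, such as attention-head analysis, have no direct analogue in a plain MLP.

```python
import torch
import torch.nn as nn

def capture_activations(model: nn.Module, x: torch.Tensor) -> dict[str, torch.Tensor]:
    """Run a forward pass and record the tensor output of every submodule."""
    activations: dict[str, torch.Tensor] = {}
    hooks = []
    for name, module in model.named_modules():
        if name == "":  # skip the root module itself
            continue
        def hook(mod, inputs, output, name=name):
            # Some submodules (e.g. attention) return tuples; keep only plain tensors.
            if isinstance(output, torch.Tensor):
                activations[name] = output.detach()
        hooks.append(module.register_forward_hook(hook))
    try:
        model(x)
    finally:
        for h in hooks:
            h.remove()
    return activations

# The same capture works on a plain MLP...
mlp = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
mlp_acts = capture_activations(mlp, torch.randn(8, 16))

# ...and on a transformer block, but transformer-specific analyses
# (attention patterns, per-head decompositions) only exist for the latter.
block = nn.TransformerEncoderLayer(d_model=16, nhead=4, batch_first=True)
block_acts = capture_activations(block, torch.randn(8, 10, 16))

print(sorted(mlp_acts), sorted(block_acts))
```

So at least the activation-capture step transfers across architectures; whether the rest of the transformer circuit-analysis toolkit does is the part I still need to read up on.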