Poly-semanticity

Definition

Poly-semanticity (also written polysemanticity) is the phenomenon in neural networks where an individual neuron or feature responds to several distinct, often unrelated, concepts rather than a single one. It is observed particularly in large language models and other deep neural networks, where a single component (a neuron, attention head, or feature dimension) can represent multiple semantic concepts. Because such a component cannot be given one clear label, poly-semanticity complicates the interpretation and analysis of these networks, and understanding it is important for model interpretability and optimization.
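As a rough illustration of the definition above, the following toy sketch checks which concepts drive a single ReLU neuron above a threshold. Everything in it is an assumption made for illustration: the concept names, the 2-D concept vectors, the weights, the threshold, and the helper names neuron_activation and concepts_activating are hypothetical and do not come from any real model or library.

```python
def relu(x):
    """Standard rectified linear unit."""
    return max(0.0, x)


def neuron_activation(weights, inputs, bias=0.0):
    """Activation of one ReLU neuron: relu(w . x + b)."""
    return relu(sum(w * x for w, x in zip(weights, inputs)) + bias)


# Hypothetical concept directions in a 2-D activation space.
# Three concepts share two dimensions, so they cannot all have
# their own dedicated axis.
concepts = {
    "cat": (1.0, 0.0),
    "car": (0.0, 1.0),
    "tree": (-1.0, -1.0),
}


def concepts_activating(weights, concepts, threshold=0.5):
    """Names of the concepts that push the neuron above the threshold."""
    return sorted(
        name
        for name, vector in concepts.items()
        if neuron_activation(weights, vector) > threshold
    )


# A neuron whose weights overlap with both "cat" and "car".
poly_weights = (1.0, 1.0)
# A neuron aligned with "cat" only.
mono_weights = (1.0, 0.0)

poly_active = concepts_activating(poly_weights, concepts)
mono_active = concepts_activating(mono_weights, concepts)

print("poly neuron fires for:", poly_active)
print("mono neuron fires for:", mono_active)
print("poly neuron is polysemantic:", len(poly_active) > 1)
```

In this sketch the first neuron fires for two unrelated concepts, so its activation alone does not reveal which concept is present. That ambiguity is what makes poly-semantic components harder to interpret.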

Tags

Interpretability, Neural Networks, Semantics, Model Analysis, Representation Learning

References

  • Elhage, N., Nanda, N., Olsson, C., Henighan, T., Joseph, N., Mann, B., Askell, A., Ndousse, K., Jones, C., DasSarma, N., Hernandez, D., Drain, D., Ganguli, D., Chen, Z., Hatfield-Dodds, Z., Kernion, J., Nova, T., Lovitt, L., Sellitto, M., Kundu, S., ... Kaplan, J. (2022). Transformer Circuits Thread. https://transformer-circuits.pub/
  • Cammarata, N., Goh, G., Carter, S., Petrov, M., Schubert, L., Gao, C., ... & Olah, C. (2020). Thread: Circuits. Distill, 5(3). doi:10.23915/distill.00024