Multi-Head Attention
Entity Type: Glossary
ID: multi-head-attention
Definition: An extension of self-attention that runs multiple attention heads in parallel, each with its own learned projections of the queries, keys, and values, so each head can focus on different aspects of the input. The head outputs are concatenated and linearly projected to produce richer representations in transformer models.
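A minimal sketch of the concatenate-and-project pattern described above, written in plain NumPy. The head count, dimensions, and random weight initialization are illustrative assumptions, not a reference implementation; production libraries (e.g. PyTorch's `nn.MultiheadAttention`) additionally handle batching, masking, and dropout.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, num_heads, rng):
    """x: (seq_len, d_model). Returns (seq_len, d_model)."""
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads

    # Per-head learned projections for queries, keys, and values
    # (randomly initialized here purely for illustration).
    W_q = rng.standard_normal((num_heads, d_model, d_head)) / np.sqrt(d_model)
    W_k = rng.standard_normal((num_heads, d_model, d_head)) / np.sqrt(d_model)
    W_v = rng.standard_normal((num_heads, d_model, d_head)) / np.sqrt(d_model)
    W_o = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)

    head_outputs = []
    for h in range(num_heads):
        # Each head runs scaled dot-product self-attention independently.
        Q, K, V = x @ W_q[h], x @ W_k[h], x @ W_v[h]   # (seq_len, d_head)
        scores = Q @ K.T / np.sqrt(d_head)
        weights = softmax(scores, axis=-1)
        head_outputs.append(weights @ V)               # (seq_len, d_head)

    # Concatenate all heads, then apply the final output projection.
    concat = np.concatenate(head_outputs, axis=-1)     # (seq_len, d_model)
    return concat @ W_o

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 16))  # 5 tokens, model dimension 16
out = multi_head_attention(x, num_heads=4, rng=rng)
print(out.shape)                  # (5, 16)
```

Because each head works in a lower-dimensional subspace (d_model / num_heads), the total cost is comparable to a single full-width attention layer while allowing the heads to specialize.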
Related Terms:
- self-attention
- attention-mechanism
- transformer
- parallel-processing

Source Urls:
- https://en.wikipedia.org/wiki/Attention_(machine_learning)#Multi-head_attention

Tags:
- attention
- transformers
- parallel-processing
Status: active
Version: 1.0.0
Created At: 2025-08-31
Last Updated: 2025-08-31