We dive deep into Self Attention in Transformers! Self attention is the key mechanism that lets models like BERT and GPT capture long-range dependencies in text, making them powerful for NLP tasks. We’ll break down how self attention works, walking through the math of how it builds a new, context-aware representation of each word from its embedding. Whether you're new to Transformers or looking to strengthen your understanding, this video offers a clear and accessible explanation of self attention with visuals and the complete mathematics.
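For a quick feel of the idea before watching, here is a minimal sketch of scaled dot-product self-attention in NumPy. The function name, toy shapes, and random projection matrices are illustrative assumptions, not the video's exact notation: each word embedding is projected to a query, key, and value, and its new representation is a softmax-weighted sum of the values.

```python
# Minimal self-attention sketch (illustrative, not the video's exact notation).
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model) word embeddings; W_q/W_k/W_v: projection matrices."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # similarity of every word pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over each row
    return weights @ V                                # new contextual representations

# Toy example: 3 words with 4-dimensional embeddings (values are made up).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
W_q, W_k, W_v = (rng.normal(size=(4, 4)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (3, 4)
```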
➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖
Timestamps:
0:00 Intro
1:13 The Problem
4:00 Self Attention Overview
6:04 Self Attention Mathematics - Part 1
19:20 Self Attention as Gravity
20:07 Problems with the equation
26:51 Self Attention Complete
31:18 Benefits of Self Attention
34:30 Recap of Self Attention
38:53 Self Attention in the form of matrix multiplication
42:39 Outro
➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖
Follow my entire Transformers playlist:
📕 Transformers Playlist: • Transformers in Deep Learning | Introducti...
➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖
✔ RNN Playlist: • What is Recurrent Neural Network in Deep L...
✔ CNN Playlist: • What is CNN in deep learning? Convolutiona...
✔ Complete Neural Network: • How Neural Networks work in Machine Learni...
✔ Complete Logistic Regression Playlist: • Logistic Regression Machine Learning Examp...
✔ Complete Linear Regression Playlist: • What is Linear Regression in Machine Learn...
➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖