A Window Into LLMs | Sparse Autoencoders Explained

234 likes, 2,596 views

This has been my favorite video to make so far! I think interpretability is so important, both for ensuring AI is safe and for making our AI models more useful to humans.

I recommend reading these papers:
Toy Models of Superposition: https://transformer-circuits.pub/2022...
Towards Monosemanticity: https://transformer-circuits.pub/2023...
Scaling Monosemanticity: https://transformer-circuits.pub/2024...
