Get exclusive access to AI resources and project ideas: the-data-entrepreneurs.kit.com/shaw
In this video, I walk through how to fine-tune a text embedding model for domain adaptation using the Sentence Transformers Python library.
Resources:
📰 Blog: shawhin.medium.com/fine-tuning-text-embeddings-f91…
💻 GitHub Repo: github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/fi…
🤗 Model: huggingface.co/shawhin/distilroberta-ai-job-embedd…
💿 Dataset: huggingface.co/datasets/shawhin/ai-job-embedding-f…
References:
[1] • How to Improve LLMs with RAG (Overview + P...
[2] • Text Embeddings, Classification, and Seman...
[3] • Fine-Tuning BERT for Text Classification (...
[4] sbert.net/docs/sentence_transformer/training_overv…
[5] sbert.net/docs/sentence_transformer/training_overv…
[6] sbert.net/docs/sentence_transformer/pretrained_mod…
[7] sbert.net/docs/package_reference/sentence_transfor…
--
Homepage: www.shawhintalebi.com/
Intro - 0:00
RAG - 0:48
Problem with Vector Search - 2:25
Fine-tuning - 3:49
Why fine-tune? - 4:43
5 Steps for Fine-tuning Embeddings - 6:23
Example: Fine-tuning Embeddings on AI Jobs - 6:55
Step 1: Gather Positive (and Negative) Pairs - 7:53
Step 2: Pick a Pre-trained Model - 12:50
Step 3: Pick a Loss Function - 14:18
Step 4: Fine-tune the Model - 15:57
Step 5: Evaluate the Model - 18:00
What's Next? - 19:13