XGen-7B: Long Sequence Modeling with (up to) 8K Tokens. Overview, Dataset & Google Colab Code.

Are Open LLMs any good when it comes to longer texts?

In this video, we'll look at XGen-7B, an open LLM for long sequence modeling with 7B parameters by Salesforce. With an impressive 8K input sequence length and fine-tuning on public-domain instructional data, XGen-7B promises to compete with state-of-the-art LLMs. We'll look at its performance on standard NLP benchmarks, long sequence modeling tasks, and code generation.

I'll take you through the process of loading the instruction-tuned model in a Google Colab notebook and demonstrate its capabilities through various prompts, from answering simple questions to generating code and comprehending documents. How good is this model?
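If you want to follow along before watching, here is a minimal sketch of loading the instruct model with the HuggingFace Transformers library. The exact model ID and the Human/Assistant prompt format are assumptions based on the Salesforce XGen release; check the repository link below:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed model ID for the instruction-tuned checkpoint (verify on the HuggingFace repo linked below)
model_id = "Salesforce/xgen-7b-8k-inst"

# XGen ships a custom tokenizer, so trust_remote_code is required
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# device_map="auto" needs the accelerate package; it places the model on the Colab GPU if one is available
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Assumed chat-style prompt format for the instruct model
prompt = "### Human: What is long sequence modeling?\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))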

Discord:   / discord  
Prepare for the Machine Learning interview: https://mlexpert.io
Subscribe: http://bit.ly/venelin-subscribe

XGen blog post: https://blog.salesforceairesearch.com...
XGen HuggingFace repository: https://huggingface.co/Salesforce/xge...

Join this channel to get access to the perks and support my work:
   / @venelin_valkov  

00:00 - Introduction
00:55 - XGen Model
04:00 - Pre-training Data
06:20 - Training Methods
08:58 - Evaluation Results
11:57 - HuggingFace Repository
12:16 - Google Colab Setup
14:55 - Prompting XGen
19:43 - Writing Jokes
21:20 - Investing Advice
22:18 - Coding
23:40 - QA over Text
26:07 - Conclusion

Image by pch-vector

#chatgpt #gpt4 #llms #artificialintelligence #promptengineering #chatbot #transformers #python #pytorch
