Fine-tuning Falcon 7B tutorial (requires MLExpert Pro): https://www.mlexpert.io/prompt-engineering/fine-tuning-llm-on-custom-dataset-with-qlora
Full text tutorial: https://www.mlexpert.io/prompt-engineering/faster-llm-inference
Thanks for taking the time to put this video together. Really informative, and it's helping me grasp these concepts.
Bro solved my brain in a single video
You rule! Thanks so much for doing these videos on LLMs
Thank you for your great videos! Very informative and straight to the point!
Thank you for your effort, your videos are great and go straight into practical use. Great work!
Very informative. Now please make tutorials on LlamaIndex, as there is a lot of buzz around it.
Thank you for another great video!
Very informative
Thank you for the great video. Is there a place for subscribers to ask questions if they join?
Thank you so much for this great video. I have some doubts, and it would be great if you could help me understand: 1. The temperature parameter does not change the response at all. Did I do something terribly wrong? 2. The method presented in lit-parrot works with the base Falcon model, but how do I load the model that we trained using QLoRA? Thanks again for such amazing content.
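On the temperature question: in Hugging Face `generate`, temperature only has an effect when `do_sample=True`; with greedy decoding it is silently ignored. For loading the QLoRA-trained model, one possible approach is to load the quantized base model and then attach the saved LoRA adapter with `peft`. This is only a sketch; the adapter directory name here is a placeholder for wherever your training run saved its checkpoint.

```python
# Sketch: loading a QLoRA-trained adapter on top of the Falcon base model.
# ADAPTER_DIR is a hypothetical path; point it at your own saved checkpoint.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "tiiuae/falcon-7b"   # base model used during fine-tuning
ADAPTER_DIR = "trained-model"     # directory containing adapter_config.json

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Attach the LoRA weights produced by QLoRA fine-tuning.
model = PeftModel.from_pretrained(base_model, ADAPTER_DIR)
model.eval()
```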
Hello, great video so far. Let me ask some questions here: 1. What should I do if my training loss does not decrease consistently (sometimes up, sometimes down)? 2. How do I use multiple GPUs with QLoRA? I always get OOM if I use Falcon-40B, so I rented 2 GPUs from a cloud provider. Unfortunately, it only ran on 1 GPU.
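For the multi-GPU case, a common approach is to let Accelerate shard the model's layers across all visible GPUs via `device_map="auto"` (naive model parallelism, not data parallelism). A minimal sketch, with the per-GPU memory caps being example values you would tune to your own hardware:

```python
# Sketch: spreading Falcon-40B over multiple GPUs with a device map.
# max_memory values are illustrative assumptions, not recommendations.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-40b",
    quantization_config=bnb_config,
    device_map="auto",                    # shard layers over all visible GPUs
    max_memory={0: "40GiB", 1: "40GiB"},  # optional per-GPU caps (example values)
    trust_remote_code=True,
)
```

With this setup, layers live on different GPUs and a forward pass moves activations between them, so both cards are used even though only one processes a given layer at a time.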
Great video! Just curious, why do you have the -- NORMAL --, -- VISUAL --, -- INSERT -- indicators when you click on each code block? Is that some functionality from Google?
Hi Venelin, I just subscribed to your website. Can you tell me if there is a way to limit answers to the adapter data only, and not the base model?
Can you cover an open-source language model, for example an open-source implementation of LLaMA, to understand the actual implementation? For a beginner, Stanford Alpaca and the others are all tuned on top of existing models, but is there anything like LLaMA to really understand it from the ground up? I see GPT-Neo or nanoGPT, but they are not actual LLaMA implementations.
Thanks for the informative video! One question: why does loading in 8-bit take less time than 4-bit? Shouldn't it be the other way around, since the 8-bit format has higher precision?
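One plausible explanation: load time is dominated by reading the fp16 checkpoint shards and quantizing the weights on the fly, not by the final precision, so the extra per-weight work in the 4-bit (NF4) path can make it slower to load even though the resulting model is smaller in memory. A rough sketch for measuring this yourself:

```python
# Sketch: timing 8-bit vs 4-bit loading of the same checkpoint.
import gc
import time

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_NAME = "tiiuae/falcon-7b"

def timed_load(config):
    start = time.perf_counter()
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME,
        quantization_config=config,
        device_map="auto",
        trust_remote_code=True,
    )
    elapsed = time.perf_counter() - start
    del model                    # free GPU memory before the next load
    gc.collect()
    torch.cuda.empty_cache()
    return elapsed

t8 = timed_load(BitsAndBytesConfig(load_in_8bit=True))
t4 = timed_load(BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4"))
print(f"8-bit load: {t8:.1f}s  |  4-bit load: {t4:.1f}s")
```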
How can we use Falcon 7B for summarization tasks?
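Since Falcon is a causal LM, summarization is just a matter of prompting; the instruct variant tends to follow this kind of instruction better than the base model. A minimal sketch, assuming the `tiiuae/falcon-7b-instruct` checkpoint and a simple zero-shot prompt:

```python
# Sketch: zero-shot summarization with Falcon-7B-Instruct via prompting.
import torch
from transformers import AutoTokenizer, pipeline

model_name = "tiiuae/falcon-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
generator = pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

text = "..."  # the document to summarize
prompt = f"Summarize the following text in 2-3 sentences:\n\n{text}\n\nSummary:"

output = generator(
    prompt,
    max_new_tokens=128,
    do_sample=False,
    return_full_text=False,  # return only the generated summary
)
print(output[0]["generated_text"])
```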
The content is really informative; it saved me almost a week :-). One quick question: I was trying to load Falcon 40B from Google Drive, but it showed me this error: HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/content/drive/MyDrive/falcon/model/'. Use `repo_type` argument if needed. Any possible suggestions? It would be a great help.
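One common cause of that error: when `from_pretrained` cannot find the local directory (e.g., Drive is not mounted in the session), it falls back to treating the string as a Hub repo id, and a filesystem path fails repo-id validation. A sketch of how you might check this, assuming a Colab session:

```python
# Sketch: loading a checkpoint saved to Google Drive in Colab.
import os

from google.colab import drive
from transformers import AutoModelForCausalLM

drive.mount("/content/drive")
model_dir = "/content/drive/MyDrive/falcon/model"  # no trailing slash

# If this fails, from_pretrained would treat the string as a Hub repo id.
assert os.path.isdir(model_dir), f"{model_dir} not found - is Drive mounted?"
print(os.listdir(model_dir))  # should show config.json and the weight files

model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    local_files_only=True,  # never fall back to the Hub
    trust_remote_code=True,
)
```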
Hi Venelin! Great work, but a quick question. I'm confused: how is load_in_8bit better than load_in_4bit and QLoRA bitsandbytes in terms of execution time? Shouldn't loading in 4-bit be faster than loading in 8-bit? Please clarify!