@dreamsofcode

This is really awesome. I'm excited to give this a try.

@AhmedFalih-kj3tt

We want more content like this!!! Keep going, man! 😊😊💖💖

@thedevlebowski

This is something I have been interested in for quite a while, but I haven't found any great resources to really sink my teeth into. Thanks for the great and informative content.

@flwi

Great overview video! Thanks for putting in the time to create such a nice gem!

@johnshelbyjenkins

I was so hyped when you showed my llm_client crate! Kalosm is great though!

@letsgetrusty

Peak Rust content 🔥🦀

@StephenBlum

Nice! Great to learn more Rust libs with LLM support. Kalosm 🎉

@virusblitz

Wow, there really are so many applications! Amazing video❤

@amr3162

This video feels shinier than usual.

@AhmedFalih-kj3tt

Very good crate!! Nice video.

@northicewind

Awesome content! Excited to try this crate. Thank you for sharing

@marcof1430

Python is the standard for data science and ML (though technically all the big frameworks, Torch and Keras + TF, are written in other languages), but in data preparation and in production, Rust has something to say. For in-memory data Polars is really great, and I've also used tokenization in Rust in a project, and it's impressive.
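
As a rough illustration of the in-memory data point, here's a minimal Polars sketch in Rust. The column names are invented for the example, the version number is a guess, and the `lazy` feature flag is assumed:

```rust
// Cargo.toml: polars = { version = "0.41", features = ["lazy"] }
use polars::prelude::*;

fn main() -> PolarsResult<()> {
    // A small in-memory DataFrame (columns invented for the example).
    let df = df!(
        "tokens"     => [12, 87, 340, 55],
        "latency_ms" => [1.2, 3.4, 9.8, 2.1],
    )?;

    // Lazy query: filter and aggregate without materializing intermediates.
    let out = df
        .lazy()
        .filter(col("tokens").gt(lit(50)))
        .select([col("latency_ms").mean().alias("mean_latency_ms")])
        .collect()?;

    println!("{out}");
    Ok(())
}
```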

@ian.mubangizi

This is brilliant 🎉😮

@DavidChoiProgrammer

Such a good video!

@first-thoughtgiver-of-will2456

If you are an engineer and serious about LLMs in Rust, learn Candle. I've been able to research and develop optimizers and architectures with Candle. It's an involved framework, but it's feature-rich and complete in more places than any other framework. It makes design decisions reminiscent of Torch and TensorFlow, and it's very low-level and not polyglot compared to the alternatives.
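
For a feel of that Torch-like API, here's a minimal Candle sketch: a single linear-layer forward pass on the CPU. The crate version is a guess; check candle-core's docs for the current signatures:

```rust
// Cargo.toml: candle-core = "0.6"  (version is a guess)
use candle_core::{DType, Device, Tensor};

fn main() -> candle_core::Result<()> {
    let device = Device::Cpu; // or Device::new_cuda(0)? with the "cuda" feature

    // A single linear layer, torch-style: y = x * W + b
    let w = Tensor::randn(0f32, 1.0, (4, 3), &device)?;
    let b = Tensor::zeros((1, 3), DType::F32, &device)?;
    let x = Tensor::randn(0f32, 1.0, (1, 4), &device)?;

    let y = x.matmul(&w)?.broadcast_add(&b)?;
    println!("output shape: {:?}", y.dims()); // [1, 3]
    Ok(())
}
```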

@DaM_Cdn

I think the sweet spot for using a library like Kalosm would be integrating it into a Tauri app and downloading + executing the ML model on the user's local machine. Pair Kalosm & Tauri with a Leptos frontend, and you have a really compelling native desktop (or mobile) app!

An installable PWA using a similar stack (minus Tauri, of course) would also be a great use case for Kalosm + Leptos.
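
For context, local inference with Kalosm is roughly this short. This is a sketch modeled on the examples in Kalosm's README; exact method names may differ between versions:

```rust
use kalosm::language::*;

#[tokio::main]
async fn main() {
    // First run downloads the model weights; after that, inference is fully local.
    let model = Llama::new().await.unwrap();

    let prompt = "Why might you run an LLM on-device? ";
    print!("{prompt}");

    // Stream tokens to stdout as they are generated.
    let stream = model.stream_text(prompt).with_max_length(300).await.unwrap();
    stream.to_std_out().await.unwrap();
}
```

Running something like this inside a Tauri command handler is what would give you the local-first desktop app described above.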

@__sassan__

Technically, if you want to deploy an LLM-powered application, you would not just rely on a library like Transformers. You also need a performant server that can (a) do adaptive batching and (b) implement efficient “multiplications” that work optimally across multiple batches. So while Kalosm looks dope, you would not want to use Rust if you actually need a Python library like vLLM or NVIDIA’s Triton (it’s OSS). Folks at Kyutai (the company founded by the same guy who contributes to Candle) said they use Rust for inference, so they must have implemented all of that in Rust :)

But if you don’t need to handle that much load, of course use Kalosm. I’m pretty sure you can find a crate that implements batching on top of Tokio.
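
For that light-load case, an adaptive-batching loop is also easy to hand-roll on top of Tokio: queue requests on a channel and flush when the batch is full or a small wait budget expires. The sketch below is hypothetical scaffolding; the `Request` type, the constants, and the echo "model" stand in for a real inference call:

```rust
// Cargo.toml: tokio = { version = "1", features = ["full"] }
use std::time::Duration;
use tokio::sync::{mpsc, oneshot};

// One queued request: a prompt plus a channel to send the result back on.
struct Request {
    prompt: String,
    respond: oneshot::Sender<String>,
}

const MAX_BATCH: usize = 8;
const MAX_WAIT: Duration = Duration::from_millis(20);

async fn batcher(mut rx: mpsc::Receiver<Request>) {
    while let Some(first) = rx.recv().await {
        let mut batch = vec![first];
        let deadline = tokio::time::sleep(MAX_WAIT);
        tokio::pin!(deadline);

        // Fill the batch until it is full or the wait budget expires.
        while batch.len() < MAX_BATCH {
            tokio::select! {
                maybe = rx.recv() => match maybe {
                    Some(req) => batch.push(req),
                    None => break,
                },
                _ = &mut deadline => break,
            }
        }

        // One batched "forward pass" (an echo stub), then fan results back out.
        let outputs: Vec<String> =
            batch.iter().map(|r| format!("echo: {}", r.prompt)).collect();
        for (req, out) in batch.into_iter().zip(outputs) {
            let _ = req.respond.send(out);
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel::<Request>(128);
    tokio::spawn(batcher(rx));

    let (resp_tx, resp_rx) = oneshot::channel();
    tx.send(Request { prompt: "hello".into(), respond: resp_tx })
        .await
        .unwrap();
    println!("{}", resp_rx.await.unwrap());
}
```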

@oof-software

Here's an interesting paper on imposing format restrictions on LLMs: "Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models"

@liamwoodleigh

Great video! A shame there’s no structured output for the LLM-as-a-Service companies like OpenAI. Thanks for distilling all this!

@mahor1221

Thank you for the great video!