@jeffwads

Due to the DeepSeek model, I had to pay Tekboost a visit to grab one of their monster 1.5TB Z8 G4 dual Xeon 18-core rigs for around $4K. I need the full 128K context and don't care about the slower inference, as long as I get great results with the output.

@UCs6ktlulE5BEeb3vBBOu6DQ

Qwen2.5 72B Instruct has been my daily driver ever since it launched. Bang for the buck in code quality is unmatched. My favorite part is when ChatGPT throws errors like "Too many concurrent connections" while I'm fully functional offline at home!

@NetmainRiadh

I used it to copy handwritten text, and it was really amazing, mind-blowing. All the other models either refused to work or failed to correctly transcribe the text and gave me unreadable output.

@DaveEtchells

Mind-blowing that a 72B model can code this well - and has vision too!

I haven’t looked, but how big is the model file if you were to download it (if you even can)? 

72B seems like it could run on something like a 128GB M4 Pro machine. 
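For rough sizing (my own back-of-the-envelope math, not official figures for any particular quantized release): weight footprint is roughly parameter count times bits per parameter, so a 72B model at 4-bit quantization is around 36 GB of weights, before KV cache for long context.

```python
# Back-of-the-envelope weight footprint for a 72B-parameter model
# at common quantization levels (weights only; KV cache is extra).

def weight_size_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * (bits_per_param / 8) / 1e9

for label, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4)]:
    print(f"{label}: ~{weight_size_gb(72, bits):.0f} GB")
# FP16: ~144 GB, Q8: ~72 GB, Q4: ~36 GB
```

So at 4-bit, 72B weights would fit comfortably in 128 GB of unified memory with room left for the KV cache.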

The model companies are all moving slowly on the agent front, but this and the R1 revelation are telling me that quite small local "thinking" models will be plenty capable of taking on the personal assistant role, checking email, managing appointments, taking meeting notes, etc, etc. The hangup is the potential screw-ups if they give the assistant full access to your computer and accounts, but pure capability-wise, we're getting really, really close to small, efficient, and practical assistant agents running locally.

🤯

@pizzyfpv

I'm working on getting Qwen2.5-VL 7B running locally for use with WebUI, browser use, etc. Hopefully it'll get running on Python 3.12. Building the wheel takes a long time; still waiting.

@OccultDemonCassette

I want a model that I can run locally that is really good at vision-based OCR. I'd assume they should be able to achieve a wonderful visual OCR model under 72B. 8B or 14B would be in my wheelhouse.

@UCs6ktlulE5BEeb3vBBOu6DQ

Also, I do not use the newer 2.5 VL because I believe they had to prune the 2.5 Instruct model to make room for the vision stuff, which I do not care about.

@JaysThoughts-q5e

12:55 Yeah, I gotta have API access.  I'm not a great programmer but I know how to loop things via the API.
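The kind of API loop I mean can be sketched like this (a minimal illustration assuming an OpenAI-compatible chat API; the model name is a placeholder, and in practice you'd POST each body to the provider's endpoint):

```python
import json

def build_requests(prompts, model="qwen2.5-72b-instruct"):
    """Build one chat-completion request body per prompt,
    ready to send to an OpenAI-compatible endpoint in a loop."""
    return [
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }
        for prompt in prompts
    ]

requests = build_requests(["Summarize this file", "Explain this error"])
print(json.dumps(requests[0], indent=2))
```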

@vncstudio

The VL is excellent for handwritten transcription and much better than Llama 3.2 Vision 90B. Just a tad below Google Pro Vision and Claude Sonnet. Handwritten transcription used to be very difficult before 2024.

@ChristineYe-yz7xb

Try 阿里云百炼 (Alibaba Cloud's Bailian platform).

@maxxflyer

Closed source and no API?

@xryanlee

So you can actually pay to use Qwen's API as well.

@ПотужнийНезламізм

I like the model, but it's another round of Chinese hype 😄. Is it safe to use it and store data on Chinese servers? 🧐