Now that Intel Battlemage is out, I bet they'll be more price-competitive with their dedicated AI cores.
Another helpful comparison video! Thanks for testing the different GPUs.
Love these videos! Keep up the great work! I currently have a gaming PC with a 4090 that I'm using for AI inference, but I'll be building your setups, starting with this one, then moving to the mid-size before the monster quad-GPU one!
I'm super glad I ran into this channel. Just subscribed, too... I was thinking about dipping my toes into self-hosting AI... This is very useful!
I was excited that you went from the K2200 to the M2000 to the P2000. If you had stopped at the K2200, I would have been really disappointed.
20-80 watts? This means live 24/7 classification of people on your Ring camera is not only technically feasible but also financially acceptable.
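Back-of-the-envelope math, assuming ~$0.15/kWh (your rate will differ) and the 20-80 W range from the video:

```python
# Rough 24/7 running-cost estimate for a low-power inference box.
# Assumed numbers: 20-80 W continuous draw and $0.15/kWh electricity.
HOURS_PER_MONTH = 24 * 30
PRICE_PER_KWH = 0.15  # USD, assumption -- plug in your own rate

for watts in (20, 80):
    kwh = watts / 1000 * HOURS_PER_MONTH
    print(f"{watts} W continuous ~ {kwh:.0f} kWh/month ~ ${kwh * PRICE_PER_KWH:.2f}/month")
```

That lands in the $2-9/month range, which is why it pencils out.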
I'd be interested to see how a Tesla P4 (or two) does, especially as they're around $100 each compared to a 3060.
My dual P40s + A2000 use 550 W at idle, lol. Keeps me warm.
I have two Titan Xps languishing. They may have a new purpose now.
My 16GB 4060 Ti clocks in at around 31 tps on this model (single card used). I've seen these for around $400 USD, so the price/performance ratio is on par, but the overall system price is higher. And you get 16GB of VRAM, which is going to be the limiting factor with the cheaper cards even if the performance is OK for you.
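If anyone wants to run the dollars-per-token/sec comparison themselves, something like this works. The 4060 Ti line uses my numbers above; the 3060 line is a placeholder to swap with whatever the video measured:

```python
# Crude $ per token/sec comparison across cards.
# The 4060 Ti entry is from my test above; the 3060 entry is a placeholder guess.
cards = [
    ("RTX 4060 Ti 16GB", 400, 31),   # (name, price USD, tokens/sec) -- measured
    ("RTX 3060 12GB",    200, 25),   # placeholder numbers, substitute real ones
]
for name, price, tps in cards:
    print(f"{name}: ${price / tps:.2f} per token/s")
```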
The wattage does go up once you're running inference loads. My AI server, built out of gaming PC hardware, idles at 18 watts but jumps to 160 watts under load. It's running Ollama in Docker containers using a cheap 1660 Ti GPU. I plan on upgrading the GPU to a 3060 with 12GB of VRAM.
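If you want to watch that idle-to-load jump yourself, a quick polling sketch like this shows it (assumes an NVIDIA card with nvidia-smi on the PATH; Ctrl+C to stop):

```python
# Poll GPU power draw via nvidia-smi to watch the idle -> load jump.
# Assumes an NVIDIA card and nvidia-smi available on the PATH.
import subprocess
import time

while True:
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    watts = [float(w) for w in out.stdout.split()]  # one value per GPU
    print(f"GPU power draw: {watts} W (total {sum(watts):.0f} W)")
    time.sleep(5)
```

Note this only covers the GPUs; a wall meter is still the honest way to get whole-system numbers like the 18 W / 160 W above.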
What about AMD GPUs? Haven't they made progress on AI and CUDA alternatives?
I tested minicpm-v:8b on a GTX 1070 (~37 t/s) and on an RTX 3090 (~92 t/s), using this prompt: "Help me study vocabulary: write a sentence for me to fill in the blank, and I'll try to pick the correct option." About 5.5 GB of VRAM, default values. I also tested with an image and the prompt "explain the meme" and got ~34 t/s (GTX 1070) and ~97 t/s (RTX 3090); the image was resized to 1344x1344.
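For anyone who wants to try something similar, here's a rough tokens/sec check against Ollama's API; it's a sketch, assuming Ollama on the default port and the model already pulled (e.g. `ollama pull minicpm-v:8b`), not exactly how I ran my tests:

```python
# Rough tokens/sec measurement against a local Ollama server (default port 11434).
import json
from urllib.request import Request, urlopen

payload = {
    "model": "minicpm-v:8b",
    "prompt": "Help me study vocabulary: write a sentence for me to fill in the blank.",
    "stream": False,
}
req = Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
resp = json.load(urlopen(req))
tps = resp["eval_count"] / resp["eval_duration"] * 1e9  # eval_duration is in nanoseconds
print(f"{resp['eval_count']} tokens in {resp['eval_duration'] / 1e9:.1f} s -> {tps:.1f} t/s")
```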
My measly 12GB 3060 is sooo tiny. I still rely on OpenAI for most of my stuff, but local automation is critical for synthetic training set generation. The other day I had an LLM write scripts for adding common errors to sentences, and even slang or common-expression replacements, to help add variety to training sets. I suspect that training set generation using LLMs, especially lower-parameter-count LLMs on local systems, can lead to a sort of repetition that thwarts deep learning to an extent. Care must be taken, as they say. Anyhow, thanks for the vid. It's inspiring me toward a private voice recognition server. It boggles my mind why we don't have standalone inferencing machines by now. Well, we have cell phones, I guess.
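A minimal sketch of the kind of error-injection script I mean, against a local Ollama server; the model name and prompt are hypothetical, and varying the instruction and temperature per sample is one way to fight the repetition problem:

```python
# Sketch of LLM-based error injection for synthetic training data.
# Hypothetical model/prompt; assumes a local Ollama server on the default port.
import json
import random
from urllib.request import Request, urlopen

ERROR_STYLES = ["a common spelling mistake", "a slang substitution", "a dropped word"]

def corrupt(sentence: str) -> str:
    # Randomize the error style and temperature to add variety between samples.
    prompt = (
        f"Rewrite the sentence below, introducing {random.choice(ERROR_STYLES)}. "
        f"Return only the rewritten sentence.\n\n{sentence}"
    )
    payload = {
        "model": "llama3.2:3b",  # placeholder model name
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": random.uniform(0.7, 1.2)},
    }
    req = Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    return json.load(urlopen(req))["response"].strip()

print(corrupt("The quick brown fox jumps over the lazy dog."))
```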
19:53 - so would you recommend 3060s over 1080 Tis, or what kind of price would make 11GB Pascal cards an interesting value?
Stoked I found your channel! I'm considering using Exo to distribute an LLM across my family's fleet of gaming PCs, but I'm not sure about the overall power draw. Thoughts?
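On the power-draw question, the best I can do without measuring is a back-of-the-envelope estimate; every per-machine wattage below is a guess, so substitute your own idle/load numbers from a wall meter:

```python
# Rough fleet power estimate for distributed inference across several PCs.
# All wattages are guesses -- replace them with measured idle/load numbers.
fleet = [
    ("Gaming PC, RTX 4070", 60, 300),   # (label, idle W, inference-load W)
    ("Gaming PC, RTX 3060", 50, 220),
    ("Old laptop, RTX 2060", 20, 110),
]
idle = sum(w for _, w, _ in fleet)
load = sum(w for _, _, w in fleet)
print(f"Fleet idle: ~{idle} W, under inference load: ~{load} W")
print(f"24/7 idle cost at $0.15/kWh: ~${idle / 1000 * 24 * 30 * 0.15:.2f}/month")
```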
Would an A770 be good for this? They're like $200-300 with 16GB of VRAM. But damn, this is a sweet channel! Keep up the good work.
Yeah, I have to say I enjoy running 3x RTX 3060 12GB cards. Pretty OK speed and space.
Thank you for this. I've been looking for ideas for a viable $200-$400 ultra-budget rig to get my feet wet. This is right in that range, lol.