Local AI agent rig
Something intrigued me about running an ElevenLabs clone via Qwen3-TTS on a local Raspberry Pi 5 and a GTX card, as mentioned by Adam Curry on No Agenda episode 1856 at 1:19:00. Having recently watched Silicon Valley and built an Alexa skill, I thought it would be cool to add a Jian-Yang voice that delivers disparaging morning inspiration to my 15-year-old son.
Having a Manning subscription, I thought I would look for books on the topic but didn't find much. There are some books about domain-specific small language models and CUDA for deep learning, but not much about building a rig.
After getting completely confused on Reddit and eBay about what is what, I turned to YouTube. The most informative was this video, with its metaphor of a chef (the GPU) and a countertop (the VRAM), and how it really is all about the VRAM. He did also mention a Raspberry Pi, but recommended it only for small AI at the edge.
Want to Run AI Agents Locally? Here is The Bare Minimum Setup/Build - Daniel Jindoo
The final recommendation here was probably to go with an RTX 4060 Ti with 16 GB of VRAM, but once you add the RAM, CPU, and 2 TB of disk, it all starts adding up. And a Mac M4 can do a little bit of it, so maybe I at least start there.
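The chef/countertop metaphor boils down to simple arithmetic: model weights times bytes per parameter, plus some headroom for the KV cache and activations. A back-of-the-envelope sketch (the 1.2× overhead factor is my own rule-of-thumb assumption, not from the video):

```python
def vram_gb(params_billions, bytes_per_param, overhead=1.2):
    """Rough VRAM estimate in GB: weight size at a given quantization
    width, padded ~20% for KV cache and activations (a rule of thumb)."""
    return params_billions * bytes_per_param * overhead

# 4-bit quantization is ~0.5 bytes/param; fp16 is 2 bytes/param.
print(f"7B  @ 4-bit: {vram_gb(7, 0.5):.1f} GB")   # ~4.2 GB, easy fit on 16 GB
print(f"7B  @ fp16 : {vram_gb(7, 2.0):.1f} GB")   # ~16.8 GB, already over
print(f"14B @ 4-bit: {vram_gb(14, 0.5):.1f} GB")  # ~8.4 GB
```

Which is why the 4060 Ti's 16 GB countertop comfortably holds a quantized 7B-14B chef, but full-precision weights blow past it fast.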
Next up was
DONT Buy these GPU’s for Local AI! (learn from my mistake) - Ai Flux
This was again informative, as it surveyed a lot of the rubbish out there, but it also pointed me to a few resources:
- See what people are building at LlamaBuilds
- For my conference-talk idea: Wispr Flow ("Don't type, just speak"), the voice-to-text AI that turns speech into clear, polished writing in every app
- DO THIS: at roughly under $2/hour, probably the most logical place to start is renting in the cloud with https://cloud.vast.ai/
- There was also a mention of a multi-card build on Reddit: r/LocalLLaMA
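That ~$2/hour rental figure makes for an easy break-even calculation before buying anything. A quick sketch, assuming a hypothetical $1,600 all-in build price for the 4060 Ti box (my guess, not a quote):

```python
# Rent vs. buy break-even. The $2/hr comes from the video's estimate;
# the $1,600 build cost is an assumption, not a real parts list.
RENT_PER_HOUR = 2.00
BUILD_COST = 1600.00

breakeven_hours = BUILD_COST / RENT_PER_HOUR
print(f"Break-even at {breakeven_hours:.0f} rented hours")  # 800 hours

# At a hobbyist pace of, say, 10 hours/week:
weeks = breakeven_hours / 10
print(f"About {weeks:.0f} weeks of use before buying pays off")  # 80 weeks
```

A year and a half of steady tinkering before the hardware wins, which is why renting first is the logical starting point.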
I do still have a dream of building a Raspberry Pi 5-powered AI, even if it's not the right thing:
GPU-Powered Private AI on Raspberry Pi 5 – Vulkan Acceleration with RX 6700XT! - Jeffs Pi in the Sky
- https://www.reddit.com/r/LocalLLaMA/comments/1jr0oy2/howto_building_a_gpu_server_with_8xrtx_4090s_for/
- https://a16z.com/building-an-efficient-gpu-server-with-nvidia-geforce-rtx-4090s-5090s/
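For the Pi 5 + RX 6700 XT route, the usual path is llama.cpp's Vulkan backend, since AMD cards have no CUDA. A sketch of the build and run steps; the model filename and paths are placeholders I chose, and the eGPU wiring itself is covered in Jeff's video:

```shell
# llama.cpp ships a Vulkan backend that works on AMD GPUs.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON        # enable the Vulkan backend
cmake --build build --config Release -j

# -ngl 99 offloads as many layers as fit onto the GPU;
# the GGUF filename below is a placeholder, not a specific release.
./build/bin/llama-cli -m models/some-model-q4_k_m.gguf -ngl 99 -p "Hello"
```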
There are more Qwen3 articles and videos to review:
- https://medium.com/data-science-collective/high-quality-long-form-tts-with-qwen3-open-weight-models-cdd6e3d00df0
- How to Clone Voice LOCALLY with Qwen3-TTS with ONE-CLICK Install - AsapGuide
- Qwen3-TTS Tutorial: Open-Source Voice Design & Cloning - Thorsten-Voice
- Qwen 3 TTS - How to Finetune and Install Locally - Jarods Journey
- https://tinycomputers.io/posts/the-real-cost-of-running-qwen-tts-locally-three-machines-compared.html
- Running Deepseek-R1 671B without a GPU - ServeTheHome
- Amazon instances
- https://io.net/blog/gpu-cluster
- https://greennode.ai/blog/what-is-a-gpu-cluster
And in the end, I need a no-beeps 🔊 version of some training data:
Jian-Yang’s Best Moments Silicon Valley Max - HBO Max
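Since the clips are censored, something like ffmpeg's bandreject (notch) filter might strip the beeps without wrecking the voice. The 1 kHz center frequency is a common value for censor tones but an assumption here, as are the filenames; I'd have to check the actual audio in a spectrogram first:

```shell
# Extract audio only (-vn) and notch out a ~1 kHz censor tone.
# frequency and width values are guesses to tune against the real clips.
ffmpeg -i jian_yang_clips.mp4 -vn \
  -af "bandreject=frequency=1000:width_type=h:width=100" \
  jian_yang_nobeeps.wav
```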







