The Daily Vee: Building a Personal AI Podcast
Vitor Pontual · 4 min read
I wanted a morning news briefing that covered exactly the topics I care about — tech, science, crypto, AI, and the occasional sports story — without the ads, the hot takes, or the 45-minute runtime. So I built one.
The Daily Vee is a fully automated podcast that runs every day on my homelab. It pulls stories from my AI news aggregator (The VP Journal), writes a two-host script using a local LLM, voices it with neural text-to-speech, stitches the audio together, and publishes the episode to Telegram, Audiobookshelf, and right here on this site. No paid APIs. No subscriptions. Everything except the voice synthesis, which calls Microsoft's free online service, runs on my own hardware.
How It Works
The pipeline has five stages:
1. Story selection. The Daily Vee pulls from The VP Journal, which clusters and scores news articles using embedding similarity and a custom “Gravity Engine.” Higher-gravity stories get more airtime. On a typical day it picks the top 10 stories across all my feeds.
2. Script generation. A local LLM (currently Qwen 3 8B running on my Ollama fleet) writes the podcast script. Two hosts: Ava handles the facts — calm, measured, informative — and Rex adds short, witty commentary after each segment. The LLM gets the full article content and gravity scores, so it knows what to lead with and how deep to go.
3. Text-to-speech. This is where I got surprised. I expected robotic output. What I got was two distinct, natural-sounding voices having a conversation. The secret is edge-tts, a Python library that taps into Microsoft's neural TTS through the Edge browser's online speech service, the same engine behind Cortana and Microsoft Read Aloud. Microsoft offers over 400 neural voices across 100+ languages, and the quality is genuinely impressive. Ava uses en-US-AvaNeural and Rex uses en-US-AndrewNeural. The pacing, intonation, and emphasis are remarkably good for voices that cost exactly zero dollars.
4. Audio stitching. FFmpeg combines the segments with a jingle intro, speaker pauses between segments, and a sign-off (“We’ll Vee you tomorrow!”). The result is a 3-7 minute episode depending on the news day.
5. Publishing. The finished episode goes three places simultaneously:
- Telegram via Llama Rider, my AI agent — it sends the mp3 as an audio message
- Audiobookshelf on my homelab — SCP’d to the podcast library, then an API call triggers a library scan so it shows up instantly
- This website — copied to the Hugo static folder, committed, and pushed to trigger a Cloudflare Pages rebuild
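The stitching step can be sketched with FFmpeg's concat demuxer. This is a minimal sketch, not the actual Daily Vee code: the function names, the 128k bitrate, and the idea that the jingle, pauses, and sign-off already exist as audio files are all assumptions for illustration.

```python
import subprocess
from pathlib import Path


def concat_list_text(parts: list[Path]) -> str:
    """Return the contents of an FFmpeg concat-demuxer list file."""
    lines = []
    for part in parts:
        # The concat demuxer expects one "file '<path>'" entry per line.
        # A single quote inside a path is written as '\'' (close, escape, reopen).
        escaped = str(part).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def build_ffmpeg_command(list_path: Path, output: Path) -> list[str]:
    """Build the FFmpeg command that joins the listed files into one MP3."""
    return [
        "ffmpeg", "-y",                # overwrite the output if it exists
        "-f", "concat", "-safe", "0",  # read a list file; allow absolute paths
        "-i", str(list_path),
        "-c:a", "libmp3lame", "-b:a", "128k",  # re-encode so mixed inputs join cleanly
        str(output),
    ]


def stitch(parts: list[Path], list_path: Path, output: Path) -> None:
    """Write the list file and run FFmpeg (requires ffmpeg on PATH)."""
    list_path.write_text(concat_list_text(parts), encoding="utf-8")
    subprocess.run(build_ffmpeg_command(list_path, output), check=True)
```

Passing absolute paths avoids surprises, since FFmpeg resolves relative entries against the list file's location. Re-encoding instead of stream-copying costs a little time but tolerates segments with different sample rates or bitrates.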
The Stack
Almost everything runs on my home network:
- Python — the glue: feed fetching, LLM calls, TTS, audio stitching, publishing
- Ollama on a Jetson Orin AGX — runs the script-writing LLM
- edge-tts — Microsoft’s neural voices, free and unlimited
- FFmpeg — audio processing
- Audiobookshelf — self-hosted podcast player (with a nice mobile app)
- Hugo + Cloudflare Pages — the player you see on this homepage
- Llama Rider — my Telegram AI agent handles the daily trigger and delivery
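For the Audiobookshelf piece, the library scan after upload can be triggered with one authenticated HTTP request. The sketch below assumes the scan endpoint is `POST /api/libraries/<id>/scan` with a bearer token, as I understand the Audiobookshelf API; the host, library ID, and token in the usage note are placeholders, and the function names are made up for illustration.

```python
import urllib.request


def build_scan_request(base_url: str, library_id: str, api_token: str) -> urllib.request.Request:
    """Build the POST request that asks Audiobookshelf to rescan one library."""
    url = f"{base_url.rstrip('/')}/api/libraries/{library_id}/scan"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the endpoint path carries all the information
        headers={"Authorization": f"Bearer {api_token}"},
        method="POST",
    )


def trigger_scan(base_url: str, library_id: str, api_token: str) -> int:
    """Send the scan request and return the HTTP status code."""
    request = build_scan_request(base_url, library_id, api_token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

A call would look like `trigger_scan("http://abs.local:13378", "lib_123", "my-token")`, with all three values replaced by your own.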
The whole thing was built with some help from Claude (;-)). Everything from the Audiobookshelf integration to the website player and episode metadata was built collaboratively.
What I Learned
Microsoft’s TTS is shockingly good. I went in expecting to spend days tweaking prosody and SSML tags. Instead, the neural voices just… work. They handle emphasis, questions, lists, and even mild sarcasm naturally. The 400+ voice library means you can find voices with different accents, ages, and speaking styles. For a side project, the quality-to-effort ratio is unbeatable.
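To give a sense of how little code the voices need, here is a minimal sketch of voicing a two-host script with edge-tts. The `AVA:` / `REX:` line format, the function names, and the output file naming are assumptions for illustration; only the two voice names come from the setup described above. edge-tts is a third-party package (`pip install edge-tts`) and needs network access to reach Microsoft's service.

```python
import asyncio
from pathlib import Path

# Voice names from the post; the speaker labels are a hypothetical script format.
VOICES = {"AVA": "en-US-AvaNeural", "REX": "en-US-AndrewNeural"}


def parse_script(script: str) -> list[tuple[str, str]]:
    """Split 'SPEAKER: text' lines into (voice, text) pairs, skipping other lines."""
    segments = []
    for line in script.splitlines():
        speaker, sep, text = line.partition(":")
        voice = VOICES.get(speaker.strip().upper())
        if sep and voice and text.strip():
            segments.append((voice, text.strip()))
    return segments


async def synthesize(segments: list[tuple[str, str]], out_dir: Path) -> list[Path]:
    """Render each segment to its own MP3 file and return the paths in order."""
    import edge_tts  # imported here so parse_script works without the package

    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, (voice, text) in enumerate(segments):
        path = out_dir / f"{index:03d}.mp3"
        await edge_tts.Communicate(text, voice).save(str(path))
        paths.append(path)
    return paths


if __name__ == "__main__":
    demo = "AVA: Good morning, here is the news.\nREX: Brace yourselves."
    asyncio.run(synthesize(parse_script(demo), Path("segments")))
```

Keeping one file per line of dialogue makes it easy to drop pauses between speakers when the segments are joined later.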
Local LLMs are good enough for structured writing. An 8B parameter model running on consumer hardware can reliably produce a podcast script with consistent formatting, two distinct character voices, and reasonable editorial judgment about which stories matter most. It’s not perfect — sometimes Rex tries too hard to be funny — but it’s surprisingly competent.
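Much of that reliability comes from a tightly structured prompt. The sketch below shows one way to build such a prompt and send it to Ollama's `/api/generate` endpoint on its default port. The prompt wording, the story dictionary keys (`title`, `content`, `gravity`), the function names, and the `qwen3:8b` model tag are assumptions, not the actual Daily Vee prompt.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default address


def build_prompt(stories: list[dict]) -> str:
    """Rank stories by gravity and wrap them in formatting instructions."""
    ranked = sorted(stories, key=lambda s: s["gravity"], reverse=True)
    lines = [
        "Write a podcast script for two hosts.",
        "AVA states the facts calmly; REX adds one short, witty remark per story.",
        "Start every line with 'AVA:' or 'REX:'.",
        "",
    ]
    for rank, story in enumerate(ranked, start=1):
        header = f"{rank}. [gravity {story['gravity']:.2f}] {story['title']}"
        lines.append(f"{header}\n{story['content']}")
    return "\n".join(lines)


def build_request(prompt: str, model: str = "qwen3:8b") -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_script(stories: list[dict]) -> str:
    """Call a running Ollama server and return the generated script text."""
    request = build_request(build_prompt(stories))
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.loads(response.read())["response"]
```

Putting the gravity score next to each title gives the model an explicit signal about what to lead with, and the fixed line prefixes make the output easy to split by speaker afterwards.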
The best podcast is the one that’s exactly right for you. No algorithm is trying to maximize my engagement. No ads. No filler. Just the 10 stories I’d want to know about, in a few minutes, waiting for me every morning. That’s the real value of self-hosting: building tools shaped to your life instead of shaping your life around someone else’s tools.
You can listen to today’s episode right on the homepage — look for the player on the right side.
Subscribe
Listen in any podcast app — paste this feed URL:
Or use the subscribe button on the homepage player.