How I Automated My Podcast Discovery Using LLM (And Saved Hours Weekly!)

Bartek Witczak
Bartek Witczak

Every week I prepare summaries of most interesting podcasts’ episodes. I have a list of favourite podcasts. But scanning through each podcast manually? Tedious. I needed a smarter solution. I wrote a simple script leveraging an LLM to automate fetching recent episodes.

Here's why this is perfect for an LLM-assisted programming:

  • The task is clearly defined.
  • No complex integrations are needed.
  • It’s small and self-contained.

Let me walk you through my thought process & how I solve it. (For the context, I’m using Cursor, so my prompts are already “augmented”).

Prompt determins the outcome

Good prompts lead to better results. My prompts usually follow a clear structure - Master the Perfect ChatGPT Prompt Formula (in just 8 minutes)!:

  • Persona
  • Task
  • Context
  • Example
  • Format

Let’s collect pieces for recent episodes script:

  • Taks: Create a script for fetching recent episodes
  • Format: JSON list + which attributes do i want to use
  • Context: API endpoint + list of podcast ids
  • Example: API Docs provide example response
  • (Persona is added automatically by Cursor)

That’s the prompt I used:

Prepare function to fetch 5 recent episodes for each podcast. That's a podcast list ids. Url for fetching last episode is something like @https://api.podcastindex.org/api/1.0/episodes/byfeedid?id=75075&max=5

Response example is
{
"status": "true",
"items": [
{
"id": 16795088,
"title": "Batman University",
"link": "https://www.theincomparable.com/batmanuniversity/",
"description": "Batman University is back in session! James Thomson and Nathan Alderman join Tony for a discussion of Fox’s “Gotham.” Tune in to hear our thoughts on how a half-baked comic book show was reborn into an unmissable train wreck.",
"guid": "incomparable/batman/19",
"datePublished": 1546399813,
"datePublishedPretty": "January 01, 2019 9:30pm",
"dateCrawled": 1598369047,
"enclosureUrl": "https://www.theincomparable.com/podcast/batmanuniversity302.mp3",
"enclosureType": "audio/mp3",
"enclosureLength": 26385472,
"duration": 54,
"explicit": 0,
"episode": 19,
"episodeType": "full",
"season": 3,
"image": "https://www.theincomparable.com/imgs/logos/logo-batmanuniversity-3x.jpg?cache-buster=2019-06-11",
"feedItunesId": 1441923632,
"feedUrl": "https://feeds.theincomparable.com/batmanuniversity",
"feedImage": "https://www.theincomparable.com/imgs/logos/logo-batmanuniversity-3x.jpg?cache-buster=2019-06-11",
"feedId": 75075,
"podcastGuid": "9b024349-ccf0-5f69-a609-6b82873eab3c",
"feedLanguage": "en-us",
"feedDead": 0,
"feedDuplicateOf": 75075,
"chaptersUrl": "https://studio.hypercatcher.com/chapters/podcast/http:feed.nashownotes.comrss.xml/episode/http:1322.noagendanotes.com",
"transcriptUrl": "https://mp3s.nashownotes.com/NA-1322-Captions.srt",
}
...
],
"count": 1,
"query": "75075",
"description": "Found matching feed"
}

I need guid, datePublished, enclosureUrl, duration, name, link, description.

That’s the result.

type Episode = {
guid: string
datePublished: number
enclosureUrl: string
duration: number
title: string
link: string
}

type EpisodesResponse = {
status: string
items: Episode[]
count: number
}

const EPISODES_API_URL =
'https://api.podcastindex.org/api/1.0/episodes/byfeedid'

async function getRecentEpisodes(
id: number,
max: number = 5
): Promise<Episode[]> {
const apiHeaderTime = Math.floor(Date.now() / 1000)
const hash = crypto
.createHash('sha1')
.update(`${API_KEY!}${API_SECRET!}${apiHeaderTime}`)
.digest('hex')

const response = await fetch(`${EPISODES_API_URL}?id=${id}&max=${max}`, {
headers: {
'X-Auth-Date': apiHeaderTime.toString(),
'X-Auth-Key': API_KEY!,
Authorization: hash,
'User-Agent': 'PodcastWebsite/1.0',
},
})

const data = (await response.json()) as EpisodesResponse

if (data.status !== 'true' || !data.items) {
console.error(`Error fetching episodes for podcast ${id}: Invalid response`)
return []
}

return data.items.map(item => ({
guid: item.guid,
datePublished: item.datePublished,
enclosureUrl: item.enclosureUrl,
duration: item.duration,
title: item.title,
link: item.link,
}))
}

You can ask. How did it get authentication right?

Since I’m using Cursor, few more things are added automatically. I’ve defined both my global cursor rules (Cursor -> Settings => Cursor Settings => Rules => User Rules) & .cursorrules file defined for the project itself.

I used Cursor’s Manual mode targeting existing script. I had already defined (or LLM-generated to be honest) authentication details for previous task, so LLM could reuse that piece.

It took me several minutes to obtain JSON list of recent episodes. And I didn’t stop there…

Looking through JSON with recent episodes isn’t convenient. I quickly thought about friendly way to browse through recent episodes. Maybe I should build admin tool? Nah… That’s too much. There’s a simpler way. I’m already using Notion to manage OwlCast project & schedule upcoming summaries. There needs to be a simple way to ingest data into Notion database & get episodes browser for “free” (that’s another strong argument for using Notion).

4 hours later...

I’ve built recent episode viewer/tools/integration/automation in less than 4 hours.

tmrw-fix

Was everything perfect? No.

Was everything running without manual changes? No.

Is it state of the art? No.

Is code perfect, decoupled, reusable? No.

Does it solve the problem? YES.

And that’s the most important thing. We, developers, often forget about why we write code in the first place. It’s to solve the problem.

How much time would it take to code everything by myself? That’s a good question. I estimate 3x more.