How to Summarize YouTube Videos with OpenClaw
Someone shares a link to a 45-minute YouTube video with the message “you should watch this.” You want the gist without the time investment. OpenClaw with AgentPatch can fetch the transcript and give you a summary on the spot, right in your Telegram or Discord chat.
Why This Matters
Getting a useful summary of a YouTube video normally involves watching it, using a browser extension, or pasting the URL into a separate tool. With OpenClaw and AgentPatch, it’s a single message to your agent.
The transcript comes back with timestamps, so the summary is grounded in what was actually said rather than being a hallucinated overview. You can also ask follow-up questions about specific points or ask OpenClaw to pull a particular section.
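To make the grounding idea concrete, here is a minimal sketch of how timestamped transcript segments could be turned into a summarization prompt. The segment shape (a `start` time in seconds plus a `text` string) and the helper names are assumptions for illustration; the actual output format of the AgentPatch transcript tool may differ.

```python
from typing import TypedDict


class Segment(TypedDict):
    start: float  # seconds from the start of the video (assumed field name)
    text: str     # caption text for this segment (assumed field name)


def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS for positions past one hour."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def build_summary_prompt(segments: list[Segment]) -> str:
    """Prefix every transcript line with its timestamp so claims can be cited."""
    lines = [f"[{format_timestamp(s['start'])}] {s['text']}" for s in segments]
    return (
        "Summarize the video using only the transcript below. "
        "Cite the timestamp for each key claim.\n\n" + "\n".join(lines)
    )
```

Keeping the timestamp on every line is what lets a summary point back to a specific moment in the video instead of paraphrasing from memory.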
Setup
The AgentPatch CLI is designed for AI agents to use via shell access. Install it, and your agent can discover and invoke any tool on the marketplace.
Install (zero dependencies, Python 3.10+):
pip install agentpatch
Set your API key:
export AGENTPATCH_API_KEY=your_api_key
Example commands your agent will use (a web search tool is shown here; other tools follow the same search-then-run pattern):
ap search "web search"
ap run google-search --input '{"query": "test"}'
Get your API key from the AgentPatch dashboard.
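If you want to script the same call yourself, the `ap run` command shown above can be wrapped from Python roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the CLI's output format is not documented here so stdout is returned as raw text, and `AGENTPATCH_API_KEY` is expected to be set in the environment that the subprocess inherits.

```python
import json
import subprocess


def build_ap_run_command(tool: str, payload: dict) -> list[str]:
    """Return the argv for `ap run <tool> --input '<json>'`."""
    # json.dumps handles quoting, so no manual shell escaping is needed.
    return ["ap", "run", tool, "--input", json.dumps(payload)]


def run_tool(tool: str, payload: dict) -> str:
    """Invoke the CLI and return its raw stdout (output format not assumed)."""
    result = subprocess.run(
        build_ap_run_command(tool, payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids going through a shell, which sidesteps quoting problems when the JSON input contains spaces or quotes.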
Skill (Recommended)
Install the AgentPatch skill from ClawHub — it teaches OpenClaw when to use AgentPatch and how to use the CLI:
clawhub install agentpatch
MCP Server (Alternative)
If you prefer raw MCP tool access instead of the skill, add AgentPatch to ~/.openclaw/openclaw.json:
{
  "mcp": {
    "servers": {
      "agentpatch": {
        "transport": "streamable-http",
        "url": "https://agentpatch.ai/mcp",
        "headers": {
          "Authorization": "Bearer YOUR_API_KEY"
        }
      }
    }
  }
}
Replace YOUR_API_KEY with your actual key from the AgentPatch dashboard. Restart OpenClaw and it discovers all AgentPatch tools automatically.
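If you would rather not edit the file by hand, a small script can merge the entry into an existing config. This is only a sketch: the function names are hypothetical, and it assumes openclaw.json is plain JSON with no comments, which may not hold for every install.

```python
import json
from pathlib import Path


def add_agentpatch_server(config: dict, api_key: str) -> dict:
    """Add the agentpatch entry under mcp.servers, keeping other settings."""
    servers = config.setdefault("mcp", {}).setdefault("servers", {})
    servers["agentpatch"] = {
        "transport": "streamable-http",
        "url": "https://agentpatch.ai/mcp",
        "headers": {"Authorization": f"Bearer {api_key}"},
    }
    return config


def update_config_file(path: Path, api_key: str) -> None:
    """Read the config if it exists, merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_agentpatch_server(existing, api_key)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

A call such as `update_config_file(Path.home() / ".openclaw" / "openclaw.json", os.environ["AGENTPATCH_API_KEY"])` (after `import os`) would apply it to the path mentioned above. Back up the file first, since the script rewrites it in full.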
Example
Someone shares a 20-minute gardening video in your Telegram group. You ask your OpenClaw bot:
“Summarize this video for me: https://www.youtube.com/watch?v=EKBIni-eB1A”
OpenClaw calls the YouTube Transcript tool through AgentPatch, gets the full transcript, and sends back a summary like this:
“Lazy Potatoes” — Growing Potatoes With No Hilling or Top Dressing
The grower planted potatoes in a 10ft x 4ft raised bed using a “set it and forget it” approach — no hilling, no mid-season fertilizer. He dug 3 rows 16 inches apart, planted seed potatoes 4–5 inches deep, added a light dusting of organic fertilizer at planting, and walked away.
The plants grew aggressively — by day 42 they were so dense they acted as living mulch. But without hilling, plants flopped over and leaves turned pale from lack of additional fertilizer during tuber development.
Final harvest: ~60 lbs from 40 sq ft (1.5 lbs per square foot). He estimates hilling and top dressing could have pushed the yield to ~80 lbs (2 lbs/sq ft) — a 33% increase with relatively little extra effort.
You can follow up:
“What did he say about solanine and green potatoes?”
OpenClaw goes back to the transcript and gives you the relevant passage with the timestamp so you can verify it in the video if needed.
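The follow-up lookup can be pictured as a keyword search over the timestamped transcript. The sketch below is an illustration, not OpenClaw's actual retrieval logic: the segment shape (`start` in seconds, `text`) and the function names are assumptions. The `t=<seconds>s` query parameter is the standard way to link to a specific moment in a YouTube video.

```python
from urllib.parse import urlencode


def find_passages(
    segments: list[dict], keywords: list[str], context: int = 1
) -> list[tuple[int, str]]:
    """Return (start_seconds, text) for each keyword hit plus neighbouring segments."""
    hits = []
    covered_until = -1  # index of the last segment already included in a hit
    lowered = [k.lower() for k in keywords]
    for i, seg in enumerate(segments):
        if i <= covered_until:
            continue  # already part of the previous passage
        if any(k in seg["text"].lower() for k in lowered):
            lo = max(0, i - context)
            hi = min(len(segments) - 1, i + context)
            text = " ".join(s["text"] for s in segments[lo : hi + 1])
            hits.append((int(segments[lo]["start"]), text))
            covered_until = hi
    return hits


def timestamp_url(video_id: str, start_seconds: int) -> str:
    """Build a YouTube link that starts playback at the given second."""
    query = urlencode({"v": video_id, "t": f"{start_seconds}s"})
    return f"https://www.youtube.com/watch?{query}"
```

Including one segment of context on each side keeps the quoted passage readable, since caption segments often break mid-sentence.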
This works for any kind of video that has a transcript available: product announcements, technical tutorials, podcast episodes uploaded to YouTube, keynote talks, course material.
Wrapping Up
Video summarization is one small part of what AgentPatch adds to OpenClaw. The same configuration gives your agent access to web search, email, image generation, and more — with no additional setup. Head to agentpatch.ai to grab an API key and see the full tool list.