creators · concept · sha:76331c8943f91388 · manual

shorts-from-long-video

Use when slicing a long video into 3–5 highlight short clips with hook detection, safe-area subtitles, and 9:16 reframing — transcript-driven cuts via Whisper plus FFmpeg encoding.

Tutorials · creator-attached
One-line install
curl --create-dirs -fsSL https://skillmake.xyz/i/shorts-from-long-video -o ~/.claude/skills/shorts-from-long-video/SKILL.md

The hash above pins this exact content. The file we serve at /api/marketplace/shorts-from-long-video-76331c89/raw always matches sha:76331c8943f91388.

3,270 chars · ~818 tokens
---
name: shorts-from-long-video
description: "Use when slicing a long video into 3–5 highlight short clips with hook detection, safe-area subtitles, and 9:16 reframing — transcript-driven cuts via Whisper plus FFmpeg encoding."
source: https://ffmpeg.org/documentation.html
generated: 2026-05-07T21:42:56.064Z
category: concept
audience: creators
---

## Tutorials

- https://skillmake.xyz/v/shorts-from-long-video.mp4

## When to use

- Multiplying a single long-form video into Shorts/Reels/TikToks
- Finding the best 30–90s moments without watching the whole video
- Producing 9:16 reframes from a 16:9 master with safe-area captions
- Batch-generating clip variants for A/B testing across platforms

## Key concepts

### hook detection

Run an LLM over the timestamped transcript to score each 30–90s window for hook-worthiness: clear payoff, an opinion, a number, a story, surprise. Take the top N segments — typically 3–5 per hour of source. Avoid windows that depend on prior context the viewer won't have.
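
The "take the top N segments" step can be sketched as a greedy pick over scored windows. This is a minimal illustration, not the skill's actual implementation: the `Candidate` type, the `pick_top_segments` name, and the no-overlap rule are assumptions added here.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical container for one scored transcript window."""
    start_sec: float
    end_sec: float
    score: float


def pick_top_segments(candidates, n=5, min_len=30.0, max_len=90.0):
    """Greedily keep the n highest-scoring windows that do not overlap."""
    # Drop windows outside the 30-90 s range the text recommends.
    usable = [c for c in candidates
              if min_len <= c.end_sec - c.start_sec <= max_len]
    chosen = []
    # Visit the best-scoring windows first.
    for cand in sorted(usable, key=lambda c: c.score, reverse=True):
        overlaps = any(cand.start_sec < kept.end_sec and kept.start_sec < cand.end_sec
                       for kept in chosen)
        if not overlaps:
            chosen.append(cand)
        if len(chosen) == n:
            break
    # Return in timeline order so clips can be cut front to back.
    return sorted(chosen, key=lambda c: c.start_sec)
```

Rejecting overlaps is a design choice made for this sketch: without it, the top few picks tend to be near-duplicates of the same moment shifted by a few seconds.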

### 9:16 reframe

Turn a 1920×1080 master into a 1080×1920 short using one of three strategies: center-crop (works for talking heads), follow the speaker (face detection via OpenCV or MediaPipe), or letterbox the video in the upper part of the frame with captions below it (a safe default when there is no clear subject).
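
For the center-crop strategy, the geometry can be worked out ahead of time and handed to FFmpeg as a `scale` plus `crop` filter pair. The sketch below is illustrative only: the `center_crop_filter` helper is a hypothetical name, and it assumes even target dimensions and a source that should fully cover the target frame.

```python
def center_crop_filter(src_w, src_h, out_w=1080, out_h=1920):
    """Build an FFmpeg -vf string: scale so the source covers the target, then center-crop."""
    # Compare aspect ratios with integer math to avoid float rounding surprises.
    if out_h * src_w >= out_w * src_h:
        # Source is relatively wider: match the height, let the width overflow.
        scaled_h = out_h
        scaled_w = -(-src_w * out_h // src_h)  # ceiling division
        scaled_w += scaled_w % 2               # keep dimensions even for yuv420p encoders
    else:
        # Source is relatively taller: match the width, let the height overflow.
        scaled_w = out_w
        scaled_h = -(-src_h * out_w // src_w)
        scaled_h += scaled_h % 2
    x = (scaled_w - out_w) // 2
    y = (scaled_h - out_h) // 2
    return f"scale={scaled_w}:{scaled_h},crop={out_w}:{out_h}:{x}:{y}"


print(center_crop_filter(1920, 1080))
```

For a 1920×1080 source this keeps only the middle 1080 of roughly 3414 scaled columns, which is why center-crop suits a single centered speaker and little else.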

### safe-area captions

Mobile UI overlays the top ~250 px and bottom ~300 px of a 9:16 frame, so on a 1080×1920 canvas captions belong in the remaining ~1370 px-tall middle band (roughly y = 250 to y = 1620). Burn them in with FFmpeg's drawtext, or export an SRT and rely on the platform's caption rendering; burned-in is safer for reach.
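
The band arithmetic is easy to check in code. The following sketch uses the approximate overlay heights quoted above as defaults; `caption_band` and `fits_in_band` are hypothetical helper names, and real overlay sizes differ by platform and device.

```python
def caption_band(frame_h=1920, top_px=250, bottom_px=300):
    """Return (top_y, bottom_y, height) of the area left clear by UI overlays."""
    top_y = top_px
    bottom_y = frame_h - bottom_px
    return top_y, bottom_y, bottom_y - top_y


def fits_in_band(y, text_h, frame_h=1920, top_px=250, bottom_px=300):
    """True if a caption box starting at y with height text_h stays inside the band."""
    top_y, bottom_y, _ = caption_band(frame_h, top_px, bottom_px)
    return top_y <= y and y + text_h <= bottom_y


print(caption_band())           # band for a 1080x1920 frame
print(fits_in_band(1056, 120))  # caption at y = h*0.55 with a 120 px box
```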

## API reference

```
ffmpeg crop + scale to 9:16
```

Center-crop a 1920×1080 source to 1080×1920 by scaling first then cropping the wide axis.

```
ffmpeg -i in.mp4 -vf "scale=-2:1920,crop=1080:1920:(in_w-1080)/2:0" -ss 00:01:23 -t 60 -c:v libx264 -crf 18 -c:a aac out.mp4
```

```
ffmpeg drawtext for burned-in captions
```

Burn each caption line at a specific timestamp range, in the safe middle band of a 9:16 frame.

```
ffmpeg -i in.mp4 -vf "drawtext=fontfile=/System/Library/Fonts/Supplemental/Arial.ttf:text='caption':fontcolor=white:fontsize=56:box=1:boxcolor=black@0.5:boxborderw=20:x=(w-text_w)/2:y=h*0.55:enable='between(t,1.5,4.0)'" out.mp4
```
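
To burn several caption lines, one `drawtext` instance per cue can be chained with commas. The sketch below generates such a chain from `(start, end, text)` tuples using the same options as the command above. The `drawtext_chain` helper is hypothetical, and it deliberately rejects characters that need FFmpeg filter escaping rather than attempting to escape them, which is a simplifying assumption.

```python
# Characters that have special meaning in FFmpeg filtergraphs or drawtext text.
UNSAFE_CHARS = set("'\\:%,;[]")


def drawtext_chain(cues, fontfile, y_expr="h*0.55"):
    """Build a comma-joined drawtext filter chain, one filter per caption cue."""
    filters = []
    for start, end, text in cues:
        if end <= start:
            raise ValueError(f"cue ends before it starts: {start}-{end}")
        bad = UNSAFE_CHARS & set(text)
        if bad:
            # Escaping rules are multi-level; this sketch refuses instead of guessing.
            raise ValueError(f"caption needs escaping for {sorted(bad)}: {text!r}")
        filters.append(
            f"drawtext=fontfile={fontfile}:text='{text}':fontcolor=white:fontsize=56:"
            f"box=1:boxcolor=black@0.5:boxborderw=20:x=(w-text_w)/2:y={y_expr}:"
            f"enable='between(t,{start},{end})'"
        )
    return ",".join(filters)
```

The returned string is meant to be passed as the `-vf` argument. Captions containing apostrophes or colons would need proper escaping (or a different caption route such as an SRT file) before this approach is usable on real transcripts.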

```
hook-detection prompt template
```

Single-shot LLM prompt that takes the timestamped transcript and returns a JSON list of {startSec, endSec, score, hook, reason} candidates ranked by score.

```
Score every 30–90s window of this transcript on these axes: clear payoff (0–3), opinion strength (0–3), surprise (0–3), self-contained (0–3). Return JSON [{startSec, endSec, score, hook, reason}]. Top 5 only.
```
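
Model output should be validated before it drives any cuts. A minimal sketch of that check follows, assuming the reply is bare JSON with the field names from the prompt above. The `parse_candidates` name is hypothetical, and replies wrapped in prose or code fences would need stripping first.

```python
import json

REQUIRED_KEYS = {"startSec", "endSec", "score", "hook", "reason"}


def parse_candidates(reply_text, min_len=30, max_len=90, top_n=5):
    """Parse the model reply, drop malformed or out-of-range items, sort by score."""
    data = json.loads(reply_text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of candidates")
    valid = []
    for item in data:
        # Skip anything that is not an object carrying every expected field.
        if not isinstance(item, dict) or not REQUIRED_KEYS.issubset(item):
            continue
        length = item["endSec"] - item["startSec"]
        if min_len <= length <= max_len:
            valid.append(item)
    valid.sort(key=lambda c: c["score"], reverse=True)
    return valid[:top_n]
```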

## Gotchas

- Mid-sentence cuts kill watch time — extend the window to the next sentence boundary even if it pushes past 90s.
- Center-crop fails for two-person interviews; fall back to face tracking or letterbox-with-caption.
- Burned-in captions need a font file that actually exists on the rendering machine, not just a font name; with name-based lookup FFmpeg can silently substitute a fallback such as Arial.
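- TikTok/Shorts API uploads can reject or truncate files over the platform's duration cap (60s here; verify the current limit per platform, since these caps change); respect the platform max even if your hook is 75s.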
- Audio normalisation matters for Reels (target -14 LUFS); FFmpeg's loudnorm filter is most accurate in two passes: measure first, then apply the measured values in a second run.
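
The first gotcha (no mid-sentence cuts) can be handled by snapping a clip's end to the next transcript segment that finishes a sentence. This is a rough sketch under stated assumptions: segments are `(start, end, text)` tuples in time order, as a Whisper-style transcript would provide. The `snap_end_to_sentence` name and the `max_extend` cap are hypothetical additions, and terminal punctuation is only a heuristic for sentence boundaries.

```python
def snap_end_to_sentence(end_sec, segments, max_extend=15.0):
    """Move end_sec forward to the end of the first segment that closes a sentence."""
    for _seg_start, seg_end, text in segments:
        # Only consider segments that finish at or after the proposed cut.
        if seg_end < end_sec:
            continue
        if text.rstrip().endswith((".", "?", "!")):
            if seg_end - end_sec <= max_extend:
                return seg_end
            break  # the nearest boundary is too far away; keep the original cut
    return end_sec


segments = [
    (0.0, 5.0, "Hello there."),
    (5.0, 12.0, "This is a long"),
    (12.0, 18.0, "sentence that ends here."),
    (18.0, 25.0, "Next one."),
]
print(snap_end_to_sentence(10.0, segments))
```

The cap exists so that one very long sentence cannot stretch a clip indefinitely; set it generously if sentence integrity matters more than clip length.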

---
Generated by SkillMake from https://ffmpeg.org/documentation.html on 2026-05-07T21:42:56.064Z.
Verify against source before relying on details.

File: ~/.claude/skills/shorts-from-long-video/SKILL.md