Quick Answer
Edit a 10-minute video in 30 minutes with Descript for text-based editing, Runway for VFX, Pictory for b-roll, and CapCut for final assembly.
- Edit the transcript, not the timeline
- B-roll auto-matched by AI saves 2+ hours
- Auto-captions have replaced manual subtitling
What You'll Need
- Raw footage (MP4)
- Descript ($15/mo) for text-based editing
- Runway (free tier) for VFX
- Pictory or Supercreator for b-roll
- CapCut (free) for final polish
- YouTube/Instagram/TikTok account for publishing
Step 1: Transcribe Everything With Descript
Open Descript → drag in your MP4 → it auto-transcribes in about 2 minutes. Your video now edits like a text document: delete words and the matching footage is deleted too.
Step 2: Cut Filler Words in One Click
Descript → Edit → Remove Filler Words → select "um, uh, like, you know." Auto-removes. Saves 30+ min of manual cuts.
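To make the idea behind Steps 1 and 2 concrete, here is a minimal sketch of how a text-based editor could turn "delete these words" into "keep these time ranges". This is not Descript's implementation or API. The `Word` type, the `keep_segments` helper, and the sample timings are all hypothetical, and the sketch only handles the single-word fillers "um" and "uh" (phrases such as "you know" and context-dependent words such as "like" need more care).

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its position in the footage (hypothetical shape)."""
    text: str
    start: float  # seconds
    end: float    # seconds


# Assumption: only unambiguous single-word fillers are removed.
FILLERS = {"um", "uh"}


def keep_segments(words, fillers=FILLERS):
    """Return (start, end) ranges of footage that remain after dropping fillers.

    Consecutive kept words are merged into one range, so each removed
    filler produces exactly one cut.
    """
    segments = []
    previous_kept = False
    for word in words:
        token = word.text.lower().strip(".,!?")
        if token in fillers:
            previous_kept = False
            continue
        if previous_kept:
            # Extend the current range to cover this word as well.
            start, _ = segments[-1]
            segments[-1] = (start, word.end)
        else:
            segments.append((word.start, word.end))
        previous_kept = True
    return segments


transcript = [
    Word("So", 0.0, 0.3),
    Word("um", 0.3, 0.8),
    Word("today", 0.8, 1.2),
    Word("we", 1.2, 1.4),
    Word("uh,", 1.4, 1.9),
    Word("edit", 1.9, 2.4),
]
print(keep_segments(transcript))
```

Each range in the result is a stretch of footage to keep, and every gap between ranges is a cut. That is why deleting text and deleting footage are the same operation in this style of editor.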
Step 3: Remove Bad Takes by Highlighting Text
Highlight bad sentences in transcript → Delete. The video jump-cuts smoothly. No timeline scrubbing.
Step 4: Add B-Roll With AI Matching
Export transcript → Pictory → paste → AI suggests stock b-roll for every sentence. Review, approve, auto-inserts on timeline.
Step 5: Apply AI Effects With Runway
For VFX shots: open Runway → upload clip → use Gen-4 (text-to-video transform) or Motion Brush. Export → drop back into Descript.
Step 6: Fix Eye Contact With AI
Descript has an "Eye Contact" feature → apply it to any clip where you looked away from the camera. AI redirects your gaze to the lens. Used carefully, it's magic.
Step 7: Generate Captions
Descript auto-generates → style in brand colors → export as .SRT or burned-in. YouTube/TikTok algorithms favor captioned video.
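If you ever need to fix or build a caption file by hand, the .SRT format is simple: a running index, a `start --> end` timestamp line in `HH:MM:SS,mmm` form, the caption text, and a blank line. The sketch below writes that layout from a list of timed captions. The function names and sample captions are illustrative assumptions, not anything exported by Descript.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(captions):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(captions, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # A blank line separates consecutive caption blocks.
    return "\n".join(blocks)


captions = [
    (0.0, 2.5, "Edit the transcript, not the timeline."),
    (2.5, 5.0, "Delete words to delete footage."),
]
print(to_srt(captions))
```

Save the output with a `.srt` extension to get a sidecar caption file. Burned-in captions are rendered into the video itself, so they need the editor's export step instead.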
Step 8: Final Polish in CapCut
Export from Descript → import into CapCut → add trending audio (kept low under the voiceover) → adjust color → export at the highest quality for YouTube (4K, 60fps).
Common Mistakes to Avoid
- Keeping every "um" and breath (viewers click away)
- Using AI voice dubbing without disclosure (loses trust)
- Over-using b-roll (distracts from talking head)
- Ignoring pacing (aim for a cut every 5-7 seconds)
- Publishing without captions (by one widely cited estimate, 85% of social video is watched muted)
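The pacing rule above is easy to check mechanically if you can list the times where your edit cuts. The sketch below flags any shot longer than 7 seconds. The `long_shots` helper, the 7-second default, and the sample cut times are assumptions for illustration, and no tool in this guide exports cut lists in this exact form.

```python
def long_shots(cut_times, duration, max_shot=7.0):
    """Return (start, end) for every shot longer than max_shot seconds.

    cut_times: timestamps in seconds where the edit cuts.
    duration:  total length of the video in seconds.
    """
    boundaries = [0.0] + sorted(cut_times) + [duration]
    return [
        (start, end)
        for start, end in zip(boundaries, boundaries[1:])
        if end - start > max_shot
    ]


# Hypothetical 30-second clip with four cuts.
print(long_shots([5.0, 11.0, 20.0, 26.0], 30.0))
```

For scale, a 10-minute video is 600 seconds, so a cut every 5-7 seconds works out to roughly 86 to 120 cuts. That volume is the reason transcript-based cutting saves so much time.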
Top Tools
| Tool | Use Case | Free Tier | Best For |
|---|---|---|---|
| Descript | Text-based editing | Yes (1 hr/mo) | Dialogue videos |
| Runway | AI VFX | Yes | Gen-4 motion |
| Pictory | Auto b-roll | Trial | Content videos |
| CapCut | Mobile + desktop editing | Yes | Social video |
| Opus Clip | Long-to-short | Yes (60 min) | Repurposing |
Conclusion
Video editing in 2026 is text editing. The creators shipping 5x more content aren't better editors — they're better at prompting AI tools. Learn Descript first.
