
Video Trim

Deterministic transcript-aware video trim: slice an SRT to your explicit [start, end] window and ffmpeg-cut the source to match. Snaps cuts to sentence boundaries and returns the windowed transcript rebased to the clip for downstream captioning. No LLM — the editorial pick belongs upstream (typically the highlights pipeline).

Cost: 0.1 credits per trim

Best for

  • Cutting long videos to an explicit window already chosen upstream
  • Chaining after highlights: highlights picks the moment, video-trim slices it
  • Producing a sentence-boundary-snapped clip plus its rebased windowed SRT in one call

When to use

  • After transcription and highlights, or after any upstream step that has already chosen the window
  • Before captions — the rebased windowed SRT in the output feeds the captions step directly

Tips

  • Pair with highlights (count=N, style=...) to get authoritative editorial picks
  • Cuts are snapped to SRT segment boundaries — pass a slightly tight window and let snapping widen it to a full sentence
  • The output transcript URL is already rebased to t=0; feed it straight into captions

Video Trim — Deterministic Transcript-Aware Cut

Pass a source video, its SRT transcript, and an explicit [start_sec, end_sec] window. The pipeline snaps the window to real segment boundaries so cuts never land mid-sentence, ffmpeg-trims the source, and returns the windowed transcript rebased to t=0 — ready for the captions step to consume without re-transcribing.

How It Works

  1. Snap to segments: start and end are widened to the outer boundaries of the SRT segments that overlap the window
  2. Slice the SRT: kept segments are rebased to t=0 and uploaded as a windowed transcript
  3. FFmpeg trim: the source is cut to the snapped window with clean, frame-accurate edges
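
Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline's actual code: the helper names (`format_srt_time`, `rebase_to_srt`, `ffmpeg_trim_args`), the tuple layout of a segment, and the choice to re-encode with libx264/aac are all hypothetical. It takes an already-snapped window and produces the rebased SRT text plus an ffmpeg argument list.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_sec, end_sec, text)
Segment = Tuple[float, float, str]


def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def rebase_to_srt(kept: List[Segment], clip_start: float) -> str:
    """Shift kept segments so the clip starts at t=0 and renumber them from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(kept, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(start - clip_start)} --> "
            f"{format_srt_time(end - clip_start)}\n"
            f"{text}\n"
        )
    # Each block ends with a newline, so joining on "\n" leaves one blank line
    # between cues, as the SRT format expects.
    return "\n".join(blocks)


def ffmpeg_trim_args(src: str, dst: str, clip_start: float, clip_end: float) -> List[str]:
    """Build an ffmpeg command that cuts [clip_start, clip_end] from src.

    Assumption: -ss is placed after -i and the streams are re-encoded, which
    trades speed for cuts that do not have to land on a keyframe.
    """
    duration = clip_end - clip_start
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-ss", f"{clip_start:.3f}",
        "-t", f"{duration:.3f}",
        "-c:v", "libx264",
        "-c:a", "aac",
        dst,
    ]


# Example: two kept segments from a window snapped to 4.2 s .. 15.1 s.
kept_segments = [(4.2, 9.8, "First point."), (9.8, 15.1, "Second point.")]
windowed_srt = rebase_to_srt(kept_segments, clip_start=4.2)
trim_command = ffmpeg_trim_args("source.mp4", "clip.mp4", 4.2, 15.1)
```

The argument list can be handed to `subprocess.run(trim_command, check=True)` on a machine with ffmpeg installed. Passing a duration with `-t` rather than an end position avoids any ambiguity about which timeline the end time refers to.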

Why No LLM Here

  • The editorial decision (which moment is worth clipping) belongs upstream — typically the highlights pipeline
  • Calling an LLM a second time after highlights only adds drift between the picked window and the rendered one
  • One-call billing: a flat 0.1 credits per trim, no per-token reconciliation needed
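
Because the window arrives from upstream and pricing is flat, a caller can prepare trim requests and estimate cost without any model call. The sketch below is illustrative only: `start_sec` and `end_sec` are the field names used on this page, while `video_url`, `transcript_url`, and the function names are hypothetical and may not match the real request schema.

```python
from decimal import Decimal
from typing import Dict, List

# Flat price stated on this page; Decimal avoids float rounding in totals.
CREDITS_PER_TRIM = Decimal("0.1")


def build_trim_requests(
    video_url: str,
    transcript_url: str,
    picks: List[Dict[str, float]],
) -> List[Dict[str, object]]:
    """Turn upstream picks into one trim request per window.

    The payload keys other than start_sec/end_sec are assumptions made for
    this example, not a documented schema.
    """
    requests = []
    for pick in picks:
        requests.append(
            {
                "video_url": video_url,            # hypothetical field name
                "transcript_url": transcript_url,  # hypothetical field name
                "start_sec": float(pick["start_sec"]),
                "end_sec": float(pick["end_sec"]),
            }
        )
    return requests


def estimated_credits(requests: List[Dict[str, object]]) -> Decimal:
    """Flat billing: one fixed charge per trim, independent of clip length."""
    return CREDITS_PER_TRIM * len(requests)


picks = [
    {"start_sec": 5.0, "end_sec": 14.0},
    {"start_sec": 30.0, "end_sec": 42.5},
]
trim_requests = build_trim_requests("source.mp4", "source.srt", picks)
total_credits = estimated_credits(trim_requests)
```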

Frequently Asked Questions

How is the window chosen?
You supply it. The highlights pipeline (or any other upstream picker) emits start_sec and end_sec; this trim consumes them deterministically. There is no LLM in the trim itself — the editorial decision lives in the picker.
Why does it require a transcript?
The transcript is what lets the cut snap to a full sentence instead of landing mid-word. The pipeline keeps every segment that overlaps your window, takes the extent of the kept set, and trims the video to match. The same SRT is then sliced, rebased to t=0, and returned for the captions step to reuse.
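
The keep-overlapping-segments rule described in this answer can be illustrated with a short Python sketch. The function name `snap_window` and the segment tuple layout are hypothetical, and treating overlap as strict (a segment that only touches the window edge is not kept) is an assumption; the pipeline may handle edge cases differently.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_sec, end_sec, text)
Segment = Tuple[float, float, str]


def snap_window(
    segments: List[Segment], start_sec: float, end_sec: float
) -> Tuple[float, float, List[Segment]]:
    """Keep every segment overlapping the window and return the kept extent."""
    # Assumption: strict overlap, so a segment ending exactly at start_sec
    # (or starting exactly at end_sec) is left out.
    kept = [seg for seg in segments if seg[0] < end_sec and seg[1] > start_sec]
    if not kept:
        raise ValueError("window does not overlap any transcript segment")
    snapped_start = min(seg[0] for seg in kept)
    snapped_end = max(seg[1] for seg in kept)
    return snapped_start, snapped_end, kept


segments = [
    (0.0, 4.2, "Intro."),
    (4.2, 9.8, "First point."),
    (9.8, 15.1, "Second point."),
    (15.1, 20.0, "Outro."),
]

# A slightly tight window of 5.0 s .. 14.0 s cuts into the second and third
# segments, so snapping widens it to their outer boundaries: 4.2 s .. 15.1 s.
snapped_start, snapped_end, kept = snap_window(segments, 5.0, 14.0)
```

Snapping only ever widens the window under this rule, which is why a slightly tight request is a safe input.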
Can it run without a transcript?
Not anymore. The previous keyframe-vision fallback was removed when highlights took over the editorial pick — it consistently drifted from the chosen window. Transcribe the source first (the transcription pipeline caches by source hash, so re-runs are free).
Does this cost credits?
0.1 credits per trim, flat.
What if start_sec/end_sec land outside the transcript?
The pipeline errors out rather than silently producing a degenerate cut. Pass a window that lies within the source duration and overlaps the transcript.
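
A caller can catch a bad window before submitting the job. The guard below is a client-side sketch, not the pipeline's own validation: `WindowError`, `check_window`, and the exact conditions (including rejecting a window that extends past the source duration) are assumptions chosen for illustration.

```python
from typing import List, Tuple


class WindowError(ValueError):
    """Raised when a trim window cannot produce a meaningful cut."""


def check_window(
    start_sec: float,
    end_sec: float,
    source_duration: float,
    segment_spans: List[Tuple[float, float]],
) -> None:
    """Fail loudly instead of letting a degenerate window through."""
    if not start_sec < end_sec:
        raise WindowError("start_sec must be smaller than end_sec")
    if start_sec < 0 or end_sec > source_duration:
        raise WindowError("window must lie inside the source duration")
    # Assumption: at least one transcript segment has to overlap the window.
    if not any(s < end_sec and e > start_sec for s, e in segment_spans):
        raise WindowError("window does not overlap any transcript segment")


spans = [(0.0, 4.2), (4.2, 9.8), (9.8, 15.1)]
check_window(5.0, 14.0, source_duration=20.0, segment_spans=spans)  # passes silently
```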
