Video Trim
Deterministic transcript-aware video trim: slice an SRT to your explicit [start, end] window and ffmpeg-cut the source to match. Snaps cuts to sentence boundaries and returns the windowed transcript rebased to the clip for downstream captioning. No LLM — the editorial pick belongs upstream (typically the highlights pipeline).
Cost: 0.1 credits per trim
Best for
- Cutting long videos to an explicit window already chosen upstream
- Chaining after highlights: highlights picks the moment, video-trim slices it
- Producing a sentence-boundary-snapped clip plus its rebased windowed SRT in one call
When to use
- After transcription and highlights, or after any other caller that already knows the window
- Before captions: the rebased windowed SRT in the output feeds the captions step directly
Tips
- ✓ Pair with highlights (count=N, style=...) to get authoritative editorial picks
- ✓ Cuts are snapped to SRT segment boundaries, so pass a slightly tight window and let snapping widen it to a full sentence
- ✓ The output transcript URL is already rebased to t=0; feed it straight into captions
Video Trim — Deterministic Transcript-Aware Cut
Pass a source video, its SRT transcript, and an explicit [start_sec, end_sec] window. The pipeline snaps the window to real segment boundaries so cuts never land mid-sentence, ffmpeg-trims the source, and returns the windowed transcript rebased to t=0 — ready for the captions step to consume without re-transcribing.
How It Works
- Snap to segments: every SRT segment that overlaps [start_sec, end_sec] is kept, and the window is widened to the extent of the kept set
- Slice the SRT: kept segments are rebased to t=0 and uploaded as a windowed transcript
- FFmpeg trim: the source is cut to the snapped window with frame-accurate edges
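The snap and trim steps can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the `Segment` type, `snap_window`, and `ffmpeg_trim_args` are hypothetical names, the overlap rule follows the FAQ's description (keep every segment overlapping the window, then take the extent of the kept set), and the ffmpeg flags are one plausible way to get a frame-accurate cut by re-encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One SRT cue (hypothetical helper type)."""
    start: float  # seconds from the start of the source
    end: float    # seconds from the start of the source
    text: str


def snap_window(segments, start_sec, end_sec):
    """Widen [start_sec, end_sec] to the extent of every overlapping segment."""
    kept = [s for s in segments if s.end > start_sec and s.start < end_sec]
    if not kept:
        raise ValueError("window does not overlap any transcript segment")
    snapped_start = min(s.start for s in kept)
    snapped_end = max(s.end for s in kept)
    return snapped_start, snapped_end, kept


def ffmpeg_trim_args(src, dst, start, end):
    """Build an ffmpeg command line that re-encodes the snapped window.

    Assumption: re-encoding with libx264/aac is used so the cut does not
    have to land on a keyframe. The real pipeline may use other settings.
    """
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}",       # seek to the snapped start
        "-i", src,
        "-t", f"{end - start:.3f}",  # duration of the snapped window
        "-c:v", "libx264",
        "-c:a", "aac",
        dst,
    ]


segments = [
    Segment(0.0, 4.0, "Intro."),
    Segment(4.0, 9.5, "First point."),
    Segment(9.5, 15.0, "Second point."),
]
start, end, kept = snap_window(segments, 5.0, 11.0)
args = ffmpeg_trim_args("source.mp4", "clip.mp4", start, end)
```

In this example the requested window of 5.0 to 11.0 seconds overlaps the second and third segments, so the snapped window widens to 4.0 to 15.0 seconds. The argument list can be passed to `subprocess.run` on a machine where ffmpeg is installed.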
Why No LLM Here
- The editorial decision (which moment is worth clipping) belongs upstream — typically the highlights pipeline
- Calling an LLM a second time after highlights only adds drift between the picked window and the rendered one
- One-call billing: a flat 0.1 credits per trim, no per-token reconciliation needed
Frequently Asked Questions
How is the window chosen?
You supply it. The highlights pipeline (or any other upstream picker) emits start_sec and end_sec; this trim consumes them deterministically. There is no LLM in the trim itself — the editorial decision lives in the picker.
Why does it require a transcript?
The transcript is what makes the cut snap to a full sentence instead of landing mid-word. The pipeline keeps every segment overlapping your window, takes the extent of the kept set, and trims the video to match. The same SRT is then sliced, rebased to t=0, and returned for the captions step to reuse.
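To make the slice-and-rebase step concrete, here is a small sketch that takes cues as `(start, end, text)` tuples and emits a windowed SRT whose first kept cue starts at t=0. The function names are hypothetical and the tuple representation is an assumption made for illustration; only the `HH:MM:SS,mmm` timestamp layout comes from the SRT format itself.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def rebase_srt(cues, start_sec, end_sec):
    """Keep cues overlapping the window and shift them so the clip starts at 0."""
    kept = sorted(c for c in cues if c[1] > start_sec and c[0] < end_sec)
    if not kept:
        raise ValueError("window does not overlap any transcript cue")
    offset = kept[0][0]  # snapped start of the clip in source time
    blocks = []
    for index, (cue_start, cue_end, text) in enumerate(kept, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(cue_start - offset)} --> "
            f"{srt_timestamp(cue_end - offset)}\n"
            f"{text}"
        )
    return "\n\n".join(blocks) + "\n"


cues = [
    (0.0, 4.0, "Intro."),
    (4.0, 9.5, "First point."),
    (9.5, 15.0, "Second point."),
]
windowed = rebase_srt(cues, 5.0, 11.0)
```

Because the offset is the start of the first kept cue rather than the requested `start_sec`, the rebased timestamps line up with a video trimmed to the snapped window, which is why the captions step can reuse the output without re-transcribing.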
Can it run without a transcript?
Not anymore. The previous keyframe-vision fallback was removed when highlights took over the editorial pick — it consistently drifted from the chosen window. Transcribe the source first (the transcription pipeline caches by source hash, so re-runs are free).
Does this cost credits?
0.1 credits per trim, flat.
What if start_sec/end_sec land outside the transcript?
The pipeline errors out rather than silently producing a degenerate cut. Pass a window that lies inside the source duration and overlaps the transcript.
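A caller can catch a bad window before spending a request with a pre-flight check along these lines. This is a sketch under assumptions: `check_window` is a hypothetical client-side helper, and it takes the strict reading that both endpoints must fall inside the transcript's time range. How the pipeline treats a window that only partly overlaps the transcript is not stated here.

```python
def check_window(start_sec, end_sec, transcript_start, transcript_end):
    """Raise ValueError if the window is empty or leaves the transcript range."""
    if start_sec >= end_sec:
        raise ValueError("start_sec must be smaller than end_sec")
    if start_sec < transcript_start or end_sec > transcript_end:
        raise ValueError(
            f"window [{start_sec}, {end_sec}] falls outside the transcript "
            f"range [{transcript_start}, {transcript_end}]"
        )


# Transcript covering 0 to 15 seconds (illustrative values).
check_window(5.0, 11.0, 0.0, 15.0)  # passes silently

try:
    check_window(12.0, 20.0, 0.0, 15.0)
    outcome = "accepted"
except ValueError:
    outcome = "rejected"
```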