Workflow guide
Forced alignment is how TimedSubs converts a finished script and voiceover into a timed subtitle file. Instead of transcribing audio to guess the words, it takes the words you already approved and finds where each one appears in the audio.
Input example
approved script + final voiceover audio for this publishing workflow
Output asset example
SRT/VTT subtitle assets plus quality notes for downstream upload or editor handoff
Common review point
Late narration edits shift subtitle timing against the approved script.
Decision points
Forced alignment takes a text input and an audio file, then locates each word in the audio stream to assign a precise timestamp. The output is a timed subtitle file where every line comes from your script, not from a transcription guess.
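To make the text-to-timing direction concrete, here is a minimal sketch of turning already-aligned words into SRT cues. It assumes the alignment step returns a start and end time in seconds for each script word. The `AlignedWord` type and both function names are hypothetical illustrations, not the TimedSubs API or its actual output format.

```python
from dataclasses import dataclass


@dataclass
class AlignedWord:
    """One script word with the timing found for it (hypothetical shape)."""
    text: str
    start: float  # seconds from the start of the audio
    end: float


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def script_lines_to_srt(lines: list[list[AlignedWord]]) -> str:
    """Build one SRT cue per script line.

    The cue text is the script wording itself; the audio only
    contributes the first word's start and the last word's end.
    """
    cues = []
    for index, words in enumerate(lines, start=1):
        text = " ".join(word.text for word in words)
        timing = f"{srt_timestamp(words[0].start)} --> {srt_timestamp(words[-1].end)}"
        cues.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(cues)
```

The cue text is built only from the script words, so nothing in the audio can alter the wording. The audio only determines where each cue starts and ends.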
Generic auto captions start from the audio and work backwards to text — which means speech recognition errors, name misspellings, and changed product terms end up in your subtitle file. Forced alignment starts from your text and works forward to timing, so the wording is locked from the start.
Forced alignment is the right approach when you already have an approved script, TTS-generated voiceover, product demo narration, or course content where the exact wording has been signed off. If you are still editing the script, finalize it first, then run the Script + Audio workflow.
Practical workflow
Finalize your approved script text (TXT, MD, or plain text).
Upload the script and matching voiceover audio to TimedSubs.
Review alignment results, resolve any quality issues, and export SRT, VTT, or other supported formats.
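For the export step, it helps to know how little separates the two main formats. SRT and WebVTT carry the same cues, but WebVTT starts with a `WEBVTT` header line and uses a dot instead of a comma before the milliseconds. The sketch below converts simple SRT text on that basis. The `srt_to_vtt` name is a hypothetical helper, not a TimedSubs function, and it ignores styling and positioning.

```python
import re

# Matches an SRT timestamp such as 00:01:02,345 and captures both halves.
SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT text to WebVTT.

    Only timing lines (those containing "-->") are rewritten, so commas
    inside the subtitle text itself are left untouched.
    """
    out = ["WEBVTT", ""]
    for line in srt_text.strip().splitlines():
        if "-->" in line:
            line = SRT_TIME.sub(r"\1.\2", line)
        out.append(line)
    return "\n".join(out) + "\n"
```

Restricting the substitution to timing lines is a deliberate choice: the subtitle text is the approved script and should pass through unchanged.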
Product boundary
Forced alignment requires both a script and matching audio. If you only have audio, TimedSubs is not the right tool — use a transcription service first.
FAQ
Forced alignment is different from transcription. Transcription starts from audio and generates text using speech recognition, which can change wording silently. Forced alignment starts from your approved text and uses audio only for timing. The words in your subtitle file are the words you submitted, not what a model guessed from the recording.
When the voiceover deviates from the script, TimedSubs flags the mismatch as a review note rather than silently correcting the script. You can see which lines have timing confidence issues, check the audio at that point, and decide whether to re-record, adjust the script, or accept the deviation. The original script text stays intact unless you explicitly change it.
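The flag-instead-of-fix behavior can be pictured with a small sketch. It assumes each aligned line carries a timing confidence score between 0 and 1 and that lines under a chosen threshold become review notes. The `AlignedLine` type, the `confidence` field, and the 0.6 threshold are all hypothetical and do not describe how TimedSubs scores or reports alignment internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the script text cannot be modified in place
class AlignedLine:
    index: int
    text: str
    start: float       # seconds
    end: float         # seconds
    confidence: float  # hypothetical timing confidence, 0.0 to 1.0


def review_notes(lines: list[AlignedLine], threshold: float = 0.6) -> list[str]:
    """Return one note per low-confidence line; never edits the script text."""
    notes = []
    for line in lines:
        if line.confidence < threshold:
            notes.append(
                f"Line {line.index} at {line.start:.2f}s: "
                f"low timing confidence ({line.confidence:.2f}); "
                f"check audio against script text {line.text!r}"
            )
    return notes
```

A usage example under the same assumptions:

```python
lines = [
    AlignedLine(1, "Welcome to the demo.", 0.0, 1.8, 0.95),
    AlignedLine(2, "Open the Export panel.", 1.9, 3.4, 0.41),
]
for note in review_notes(lines):
    print(note)
```

Only the second line is reported, and both lines keep their original text.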