Live Transcription in Live Streams (HLS)
This guide explains how to enable live transcription for your live streams. This feature is currently in Beta.
Getting Started
100ms is excited to offer auto-generated live transcription in English. The following flowchart illustrates the entire workflow of 100ms live transcription.
Note that live transcription is only supported for the live stream's audience (consuming via HLS). It does not appear for WebRTC viewers.
Enabling and Configuring Live Transcription
Creating a new template
- On the 100ms Dashboard, click on ‘Create Template’.
- Select ‘Live Streaming’ as your use-case and click on ‘Configure’.
- Configure the template by answering the set of questions on the first screen, then click 'Select Add-ons'.
- Select 'Yes' to the question 'Do you require live transcription for your live streams?'.
- Deploy the template and proceed further. On the final screen, use the different roles available to experience the live stream.
Read up more on the live streaming experience here.
Editing an existing template
- Go to the template configuration.
- Click on the 'Transcription (Beta)' tab.
- Enable the 'Live Transcription' toggle.
A live stream can be started using the Live Streaming REST API, the 100ms SDK, or through 100ms Prebuilt.
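As an example, here is a minimal sketch of starting a live stream over the REST API. It assumes the v2 live-streams endpoint; the management token and room ID are placeholders you must supply, and any stream options are left to the template's defaults:

```typescript
// Start an HLS live stream for a room via the 100ms REST API.
// MANAGEMENT_TOKEN and ROOM_ID are placeholder values.
const MANAGEMENT_TOKEN = "<management-token>";
const ROOM_ID = "<room-id>";

async function startLiveStream(): Promise<void> {
  const res = await fetch(
    `https://api.100ms.live/v2/live-streams/room/${ROOM_ID}/start`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${MANAGEMENT_TOKEN}`,
        "Content-Type": "application/json",
      },
      // An empty body relies on the template's defaults, including
      // live transcription if it is enabled on the template.
      body: JSON.stringify({}),
    }
  );
  if (!res.ok) {
    throw new Error(`Failed to start live stream: ${res.status}`);
  }
  console.log(await res.json());
}

startLiveStream();
```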
Advanced Configuration
Two advanced configuration options are currently offered.
- Custom Vocabulary: Add non-dictionary words like names, abbreviations, slang, and technical jargon which may not be recognised by the AI model for better transcription accuracy.
- Language: Configure the primary spoken language to be transcribed. This hints the AI model, improving transcription accuracy. Currently, only English is supported, which is the default. Support for more languages will follow soon.
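Put together, a live transcription configuration might look like the following sketch. The field names here are hypothetical and chosen only to illustrate the two options above; refer to the template settings on the 100ms Dashboard for the authoritative configuration:

```typescript
// Illustrative shape of a live transcription configuration.
// Field names are assumptions, not the authoritative template schema.
const liveTranscriptionConfig = {
  enabled: true,
  // Primary spoken language; only English is supported today.
  language: "en",
  // Non-dictionary words to improve recognition accuracy.
  customVocabulary: ["100ms", "HLS", "WebVTT", "Prebuilt"],
};
```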
Consumption
Prebuilt
Our Prebuilt player on the web supports HLS with live transcription out of the box. Captions can be enabled or disabled using the CC button on the player.
Learn more about live streaming here and get started with integrating Prebuilt using this.
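If you are embedding Prebuilt in a React app, a minimal sketch looks like the following. It assumes the `HMSPrebuilt` component from the `@100mslive/roomkit-react` package, and the room code shown is a placeholder for an HLS viewer role:

```tsx
import { HMSPrebuilt } from "@100mslive/roomkit-react";

// Renders the full Prebuilt UI, including the HLS player with the
// CC button for live transcription. "abc-defg-hij" is a placeholder room code.
export default function App() {
  return <HMSPrebuilt roomCode="abc-defg-hij" />;
}
```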
Within your player
The transcript .vtt files are delivered as part of the HLS manifest itself. If an HLS-compatible player is used, it should support closed captioning by default; all that is required is to input the URL of the master manifest into the player. The functionality of turning live transcription on or off for the user should be supported by the player itself.
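For instance, here is a minimal sketch using hls.js in the browser, assuming `MASTER_MANIFEST_URL` is a placeholder for your stream's master manifest URL:

```typescript
import Hls from "hls.js";

// Placeholder: replace with your live stream's master manifest URL.
const MASTER_MANIFEST_URL = "https://example.com/master.m3u8";

const video = document.querySelector<HTMLVideoElement>("video")!;

if (Hls.isSupported()) {
  const hls = new Hls();
  hls.loadSource(MASTER_MANIFEST_URL);
  hls.attachMedia(video);

  hls.on(Hls.Events.MANIFEST_PARSED, () => {
    // Subtitle renditions advertised via EXT-X-MEDIA, e.g. "English CC (auto)".
    console.log(hls.subtitleTracks.map((track) => track.name));
    // Select the first subtitle track and render it (i.e. turn captions on).
    hls.subtitleTrack = 0;
    hls.subtitleDisplay = true;
  });
} else if (video.canPlayType("application/vnd.apple.mpegurl")) {
  // Safari plays HLS natively and exposes captions through its built-in UI.
  video.src = MASTER_MANIFEST_URL;
}
```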
Example Output
The captions are generated in the WebVTT format and are plugged into the HLS manifest itself.
Master manifest (master.m3u8)
Following is the format of the master manifest file for HLS.
```
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",LANGUAGE="en-US",NAME="English CC (auto)",AUTOSELECT=YES,DEFAULT=YES,URI="subtitles/subtitles.m3u8",FORCED=NO,CHARACTERISTICS="public.accessibility.transcribes-spoken-dialog"
#EXT-X-STREAM-INF:RESOLUTION=1280x720,BANDWIDTH=1720400,FRAME-RATE=20.000,SUBTITLES="subs"
stream_0/stream.m3u8
#EXT-X-STREAM-INF:RESOLUTION=854x480,BANDWIDTH=1005400,FRAME-RATE=20.000,SUBTITLES="subs"
stream_1/stream.m3u8
#EXT-X-STREAM-INF:RESOLUTION=640x360,BANDWIDTH=620400,FRAME-RATE=20.000,SUBTITLES="subs"
stream_2/stream.m3u8
```
Subtitles manifest
Following is the format of the manifest file for subtitles.
```
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-MEDIA-SEQUENCE:428
#EXT-X-ALLOW-CACHE:NO
#EXT-X-TARGETDURATION:2
#EXTINF:2,
output00001.vtt
#EXTINF:2,
output00002.vtt
#EXTINF:2,
output00003.vtt
#EXTINF:2,
output00004.vtt
#EXTINF:2,
output00005.vtt
```
WebVTT file format
```
WEBVTT
X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:0

00:00:00.000 --> 00:00:01.022 position:50.00% size:80.00% align:middle
If you if you're

00:00:01.022 --> 00:00:02.000 position:50.00% size:80.00% align:middle
If you if you are at some event here and you shoot the light ray,

00:00:02.000 --> 00:00:03.065 position:50.00% size:80.00% align:middle
If you if you are at some event here and you shoot the light ray,
```
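Because these cues flow through the standard HLS subtitle pipeline, you can also read them programmatically in the browser via the TextTrack API, for example to render captions in a custom UI. A minimal sketch:

```typescript
// Read live caption cues from the video element's text tracks.
const video = document.querySelector<HTMLVideoElement>("video")!;

const track = Array.from(video.textTracks).find(
  (t) => t.kind === "subtitles" || t.kind === "captions"
);

if (track) {
  track.mode = "hidden"; // receive cues without the browser rendering them
  track.addEventListener("cuechange", () => {
    for (const cue of Array.from(track.activeCues ?? [])) {
      console.log((cue as VTTCue).text);
    }
  });
}
```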
Frequently Asked Questions (FAQ)
- Do live captions work within 100ms conferencing rooms (WebRTC based)?
  Currently, live transcription is only supported within our HLS streams.
- How many languages are supported?
  Presently, only English is supported. Support for other popular languages like French, Portuguese, Spanish and more is coming soon.
- Is live translation also supported?
  Live translation is not supported right now.
- What happens if multiple languages are being spoken in the live stream?
  The portions spoken in the selected language will be transcribed, though a few hallucinations may occur.
- When can I edit my live transcription configuration?
  The live transcription configuration can be edited either on the room template, or at the start of a live stream.
- Which players are supported?
  Almost all HLS players support WebVTT-based transcripts. We have internally tested with hls.js, AVPlayer, Avo player, and ExoPlayer.
- Do you support speaker labels with the live transcription?
  No, we don't support speaker labels in live transcription. We do offer speaker-labelled transcription when the recording is transcribed post call. Refer to this documentation for more details on post-call transcription (with speaker labels) and AI-generated summaries.
- Does RTMP-ingested media support live transcription?
  Yes, live streams where the media source is RTMP-in support live transcription.
- What is the expected latency or delay in generation of the transcript?
  The delay is between 500 ms and 1 s.
- Is this a paid feature?
  Yes, live transcription is a paid feature. We offer 300 free minutes of live transcription in live streams per month. You can check your monthly usage in the 'Usage Overview' section on the dashboard. Additionally, check out the pricing here.