By Jesse Lerman, President and CEO, TelVue Corporation
With accessibility gaining momentum both as a core mission and a compliance requirement, Closed Captioning is top of mind for Community Media broadcasters, and modern technology has made captioning affordable. As you move towards making your channels and programming accessible, what are the technology and workflow tradeoffs to consider?
Before reviewing some of the workflow considerations, let’s look at the two main types of captioning technology available: human captioning, where humans are scheduled to listen to the program audio and type out the captions, and Artificial Intelligence (AI) captioning, where audio is processed by computer programs and speech is automatically transcribed – also known as Automatic Speech Recognition (ASR) or Speech-to-Text (STT). Human captioning is highly accurate, but resource-intensive and comes with a hefty price tag. AI captioning is automatic, highly scalable, with accuracy typically in the 80-90% range, and orders of magnitude more affordable.
For AI-based captioning solutions, the algorithms may run on dedicated on-premises hardware or in the cloud. Cloud Speech-to-Text services have gone mainstream, with major cloud computing providers including Google, IBM, Amazon, and Microsoft all offering services, as well as specialty providers. For both human and cloud-based AI captioning, it is common to stream audio and text over the Internet between the station and the cloud for processing. Audio and text are both low-bandwidth, so this can easily be done reliably and with minimal latency.
Hardware captioning solutions can be quite expensive, with very high start-up capital expense in the $50k range. They also typically require caption encoders, which can easily cost an additional $4k to $6k per channel.
There is a growing trend in broadcast towards all-in-one systems that simplify workflow, reduce complex integration points between varied systems, save money, rack space, and power, and make support easier with one point of contact. Integrating captioning directly with playout and automation helps save money by eliminating the need for dedicated captioning hardware and traditional caption encoders. Such integrated systems can provide hooks for both AI captioning engines and human captioning. While AI captioning is far more affordable and continues to improve, there may be times when specific events require the accuracy of human captioning.
For both human and AI captioning, quality audio with minimal background noise and clear dialog without too many simultaneous speakers are important for accuracy. AI captioning supports advanced features such as Custom Language Models, which improve accuracy by letting you configure a list of commonly spoken names, words, and phrases so the automatic captioning knows what to expect – for example, a list of Council Member names for meetings.
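To illustrate the idea behind a custom vocabulary, here is a minimal sketch that post-corrects an ASR transcript against a known name list. The names, the matching threshold, and the whole approach are illustrative assumptions, not TelVue's actual Custom Language Model mechanism: real engines bias recognition before transcription rather than fixing it afterwards.

```python
import difflib

# Hypothetical custom vocabulary: Council Member names the AI engine
# might otherwise mis-transcribe.
CUSTOM_VOCAB = ["Kowalski", "Nguyen", "DiMarco"]

def apply_custom_vocab(transcript: str, vocab=CUSTOM_VOCAB, cutoff=0.8) -> str:
    """Snap near-miss words in an ASR transcript back to known names."""
    corrected = []
    for word in transcript.split():
        # Strip trailing punctuation so matching works on the bare word.
        core = word.strip(".,;:")
        match = difflib.get_close_matches(core, vocab, n=1, cutoff=cutoff)
        corrected.append(word.replace(core, match[0]) if match else word)
    return " ".join(corrected)

print(apply_custom_vocab("Council Member Kowalsky moved to adjourn."))
# → Council Member Kowalski moved to adjourn.
```

The `cutoff` controls how aggressively near-misses are snapped to the vocabulary; too low a value would start rewriting ordinary words.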
Tightly integrating captioning directly with playout & automation streamlines the captioning workflow.
Tight integration also allows for cost savings when combined with AI captioning engines and usage-based pricing by captioning unique programming only. Offline files and live events are processed once, saving the captions for replays. If you take replays and CBB out of the equation, you will likely find that your yearly usage-based caption cost will be extremely affordable. Grants and Sponsors for captioning can further offset costs.
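To make the replay savings concrete, here is a toy calculation with made-up program durations and a made-up per-hour rate (not actual TelVue pricing): only unique programs incur captioning cost, while replays reuse the saved captions for free.

```python
# Hypothetical weekly schedule: (program_id, duration_hours); replays repeat ids.
schedule = [
    ("council-2024-05", 2.0),
    ("council-2024-05", 2.0),   # replay: captions already saved, no extra cost
    ("library-talk", 1.0),
    ("council-2024-05", 2.0),   # another replay
    ("library-talk", 1.0),
]

RATE_PER_HOUR = 1.50  # illustrative usage-based rate, not a real quote

# With usage-based pricing, only unique programming is billed.
unique_hours = sum({pid: hrs for pid, hrs in schedule}.values())
total_hours = sum(hrs for _, hrs in schedule)

print(f"Aired {total_hours} h, billed {unique_hours} h: "
      f"${unique_hours * RATE_PER_HOUR:.2f} vs ${total_hours * RATE_PER_HOUR:.2f}")
```

In this toy schedule, 8 hours of airtime bill as only 3 hours of captioning, which is why replay-heavy PEG schedules see such low usage-based costs.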
AI captioning can also solve more nuanced caption workflow challenges, such as Secondary Audio Programming (SAP), where two separate audio tracks may each need captioning in a different language, or using AI translation services to present captions in multiple languages. For live events on cable, that could mean CC1 in English and CC2 in Spanish, even if there is only one English-only audio track. For streaming & OTT, that could be any of 100+ languages you select that are important to your community.
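For streaming & OTT, each language typically ships as its own selectable caption track. A minimal sketch of that idea, with hard-coded Spanish text standing in for an AI translation service's output and WebVTT as the (assumed) track format:

```python
# Each cue: (start, end, text). In practice the Spanish text would come
# from an AI translation service; here it is a hard-coded placeholder.
cues_en = [("00:00:01.000", "00:00:04.000", "The meeting will come to order.")]
cues_es = [("00:00:01.000", "00:00:04.000", "La sesión queda abierta.")]

def to_webvtt(cues):
    """Serialize a list of cues as a WebVTT caption track."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines += [f"{start} --> {end}", text, ""]
    return "\n".join(lines)

# One track per language; a player lets the viewer pick which to display.
tracks = {"en": to_webvtt(cues_en), "es": to_webvtt(cues_es)}
print(tracks["es"])
```

Broadcast CC1/CC2 works differently (CEA-608/708 embedded in the signal), but the per-language-track model above is how multi-language captions usually reach web and OTT players.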
Captioning your programming opens up additional benefits beyond accessibility. Once you have captions available, the caption transcript text can be useful as searchable metadata. Viewers of your on-demand programming can search the captions to find specific topics of interest in your meetings and programming, and drill right to that part of the video. Some AI captioning also supports Speaker Diarization that automatically detects and tags each individual speaker, potentially useful for meetings and transcripts.
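The transcript-search idea can be sketched in a few lines. The cue data and the function below are hypothetical illustrations, not CloudCast's actual API: each caption cue keeps its timestamp, so a keyword hit can jump the player straight to that moment.

```python
# Hypothetical caption transcript: (start_seconds, text) per cue.
transcript = [
    (12.0, "Call to order and roll call."),
    (95.5, "Next item: the library budget proposal."),
    (412.0, "Public comment on the budget begins now."),
]

def search_captions(cues, query):
    """Return (timestamp, text) for every cue mentioning the query,
    so a player can seek directly to that part of the video."""
    q = query.lower()
    return [(t, text) for t, text in cues if q in text.lower()]

hits = search_captions(transcript, "budget")
for t, text in hits:
    print(f"{int(t // 60)}:{int(t % 60):02d}  {text}")
```

Here a search for "budget" surfaces two moments in the meeting, at 1:35 and 6:52, each of which a player could seek to directly.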
TelVue recently launched SmartCaption™ LIVE to make captioning easy and affordable for Community Media broadcasters. SmartCaption integrates directly with the TelVue HyperCaster with no additional equipment required, providing the all-in-one benefits described above. SmartCaption also supports a standalone server for simple integration with other playout systems, offering many of the same workflow and automation advantages. In both cases, usage-based pricing built on modern AI captioning helps you control costs and makes multi-language captioning ultra-affordable for live and offline captioning, supporting both broadcast and streaming/OTT workflows.
Captions generated with SmartCaption are also compatible with TelVue CloudCast, which displays captions in the web player, mobile (iOS & Android), and OTT apps (Roku, Apple TV, and Fire TV). The CloudCast web player and mobile apps support Caption Transcript Search to make your meetings searchable, and captions can be translated into over 100 languages so viewers can select their language of choice. Captions can be embedded in broadcast files as standard CEA-608/708 in Connect for JAG Media Exchange sharing. SmartCaption also offers a powerful, cloud-based caption editor for quickly touching up existing captions when needed.
Now is a great time to start captioning your programming for accessibility and to better serve everyone in your community. The pricing for captioning is finally PEG-friendly, there are numerous options available, and the technology is ready for prime-time.