Subtitle insertion

Inserting subtitles in the SD (Teletext) and HD domains with the Synapse HSI20 module

An application product note

COPYRIGHT 2011-2012 AXON DIGITAL DESIGN B.V. ALL RIGHTS RESERVED. NO PART OF THIS DOCUMENT MAY BE REPRODUCED IN ANY FORM WITHOUT THE PERMISSION OF AXON DIGITAL DESIGN B.V.
Introduction

Sub·ti·tle [súb tit l]: noun, verb, -tled, -tling
1. captions for a foreign-language film: a printed translation of the dialogue in a foreign-language film, usually appearing at the bottom of the screen
2. printed words for the hearing-impaired: the printed text of what is being said in a television program, provided for the hearing-impaired and usually shown at the bottom of the screen.

Subtitling of program content is increasingly expected by viewers: those with hearing impairment want to be able to appreciate the audio content, and media sourced from a different country may not be produced in the language of the audience and therefore requires translation. Not all viewers want to see the subtitles, so these are commonly sent as data and are converted into text and inserted into the picture by the viewer's receiving equipment.

Subtitle Preparation

TV programming falls into two basic categories: pre-recorded (taped) and live. The creation of subtitles for pre-recorded shows tends to occur before the time of transmission, whereas live programs require the creation of subtitles in real time as the show is transmitted. Live programs often have an element of scripted content, for instance the anchored sections of newscasts, and these scripts can be used to generate some of the subtitles.

Pre-Recorded Preparation

All subtitle preparation involves transcribing the spoken word to text, possibly in a different language to that spoken, and adding a description of other sounds, such as music and door slams, at the appropriate moment. For pre-recorded programming this process can be undertaken in an off-line environment: the Subtitler is provided with a frame-accurate copy of the program, often as a low bit-rate proxy file made at the time of the final edit or when the material is ingested at the playout operation. They watch and listen to this file, translate the speech if required and either type the words directly into a subtitle workstation, e.g.
Screen WinCAPS Qu4ntum, or use a speech recognition program and re-speak the words of the actors as the program is played. Because this process is not live, any errors can be corrected by stopping playback and editing the captured text. Many subtitle workstations have built-in checks for reading speed, spelling, text formatting etc. The file produced by this process is forwarded to the playout operation to be transmitted alongside the associated program material.

VBI and VANC

HD and SD video streams are formed of a series of pictures, transmitted one after another. Each picture is formed of a series of lines, some of which are not used to carry picture information. In SD signals this region is called the Vertical Blanking Interval (VBI); it was originally used to provide a period of black corresponding to the time taken for the electron beam to be deflected back to the top of the CRT screen after the previous picture had been displayed. HD signals reserve a similar period in the video stream to carry data, called VANC (Vertical Ancillary Data). Both can be used to convey subtitle data from the broadcaster to the viewer.

Live Preparation

Live subtitle preparation requires a different approach. It might be possible to obtain scripts for sections of the program, but the words spoken by the presenters may vary, and there will be sections, if not all, of the program for which there is no prior knowledge of the spoken word. In these cases the Subtitler listens to the program audio and types the words and descriptions, or translation, into a subtitling workstation, e.g. a Screen WinCAPS Q-Live; this requires a high level of keyboard skill. An alternative approach is to re-speak the words and descriptions; a speech recognition program then converts the speech to text for use by the subtitling application.
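The reading-speed check mentioned above can be illustrated with a short sketch. The 15 characters-per-second limit, the function name and the timing model are assumptions for illustration only, not the internals of any workstation product:

```python
# Illustrative reading-speed check, similar in spirit to the automated
# checks built into subtitle workstations. The 15 cps limit and all
# names here are assumptions for this sketch, not a vendor's code.

def reading_speed_ok(text: str, in_tc: float, out_tc: float,
                     max_cps: float = 15.0) -> bool:
    """Return True if the subtitle can be read comfortably.

    in_tc/out_tc give the display window in seconds; a common rule of
    thumb limits subtitles to roughly 12-17 characters per second.
    """
    duration = out_tc - in_tc
    if duration <= 0:
        return False  # zero or negative display time is always a fail
    return len(text) / duration <= max_cps

# A short phrase shown for two seconds passes; a 40-character
# subtitle in the same window reads at 20 cps and fails.
print(reading_speed_ok("Hello there", 0.0, 2.0))
print(reading_speed_ok("x" * 40, 0.0, 2.0))
```

A real workstation would apply such checks per subtitle as it is typed, flagging failures for the Subtitler to retime or shorten.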
The use of speech-to-text applications means that a Subtitler is not required to have fast and accurate typing skills; however, the application will need to be trained to understand their speech patterns.
This approach is very useful for the simultaneous translation of live programming, allowing skilled translators to focus on providing an accurate translation without being limited by their typing speed. Live preparation workstation systems often allow multiple Subtitlers to work together on a single program and thereby share the workload. Although speech-to-text applications are constantly improving, TV programs tend to have background noise, music and often more than one voice speaking simultaneously, making fully automatic transcription impossible with today's technology.

The data from the subtitle workstation is sent directly to the playout operation for inclusion in the channel's output video stream. This approach to subtitle creation does not automatically produce a file, although the subtitle data can be captured in a file and used if the program is later rebroadcast.

Subtitle Transmission

Transmission of Pre-Prepared Subtitles

Pre-prepared subtitle files have to be transmitted at the correct time relative to the content of the program they relate to, so that subtitles are displayed at the same time as the words are spoken. Subtitles do not tend to be a continuous stream of text; instead they are discrete packets of words, normally with breaks between phrases or sentences. Within the overall subtitle file each subtitle has a time-stamp which is referenced to the timecode of the original program. At the time the program is transmitted, the playout automation system instructs the subtitling system, such as the Screen Polistream, to retrieve the file from its store and load it into the playout application. The automation also loads and cues the program on a video server; when the program is scheduled to be transmitted the automation plays the clip from the server, with the clip's timecode sent to the subtitle playout system.
As the program is transmitted, the subtitle control system matches program timecodes to the timestamps in the subtitle file; when a match is made, subtitle data is sent to be embedded into the video stream by the HSI20 module. A similar process can take place at the time an original program is ingested into the video server: the ingest system controls a subtitle playout system to insert the subtitle data into the VBI (SD) or VANC (HD), and this is stored along with the program's video and audio in the file created at the time of ingest. When the video clip is played out the subtitles are already striped, or embedded, into the video and are therefore automatically transmitted.

Transmission of Live Subtitles

Live subtitles, for simultaneous translation or for the hearing impaired, are added to the video stream as they are created. The automation's playlist, or manual intervention by an operator, signals to the subtitle control system that live subtitles are required for the next event. The subtitle control system then routes the output of a subtitle workstation directly to the HSI20 insertion device. As subtitles are created, either by typing or by automatic conversion of speech, the workstation forms these into subtitle data packets, with each group being sent when triggered by the operator pressing Return or a dedicated Send key. A variation of this process is for the system to send each word as it is completed, resulting in subtitles being sent (and displayed) sooner than if the system waited for the speaker to complete a sentence. With the introduction of speech recognition technology the operator may not be using a keyboard; in this scenario the system waits for words to be recognized before sending the subtitle. The example below shows Softel Swift Create workstations used to produce subtitle files and live feeds, and a Softel Swift TX handling these files and live feeds under automation control.
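The timecode-matching step described above, where the subtitle control system compares the clip's timecode against the timestamps in the subtitle file, can be sketched as follows. The file layout and field names are invented for illustration and do not reflect any vendor's actual format:

```python
# Sketch of matching incoming programme timecode against subtitle
# timestamps, as a subtitle playout system does during transmission.
# Data shapes and names are illustrative assumptions only.

FPS = 25  # PAL frame rate assumed for this sketch

def tc_to_frames(tc: str) -> int:
    """Convert an 'HH:MM:SS:FF' timecode string to a frame count."""
    h, m, s, f = (int(part) for part in tc.split(":"))
    return ((h * 60 + m) * 60 + s) * FPS + f

# Pre-prepared file: each subtitle carries a timestamp referenced to
# the timecode of the original programme.
subtitles = [
    {"tc": "10:00:05:00", "text": "Good evening."},
    {"tc": "10:00:08:12", "text": "Here is the news."},
]

def due_subtitles(current_tc: str) -> list:
    """Return the text of subtitles whose timestamp matches the
    clip timecode reported by the video server."""
    now = tc_to_frames(current_tc)
    return [s["text"] for s in subtitles if tc_to_frames(s["tc"]) == now]

print(due_subtitles("10:00:08:12"))  # -> ['Here is the news.']
```

A production system would of course tolerate timecode jumps and pre-roll, but the principle is this frame-accurate comparison.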
In both cases the subtitle data is forwarded to a Synapse HSI20 module for inclusion in the video stream as either Teletext (SD) or OP47 (HD).
Example 1: subtitle preparation and insertion, overall block diagram

The HSI20's inserter can also be remotely controlled using Cue information sent as Packet-31 (X31) data, also known as commercial insertion cue tones. These Cues are inserted into either the VBI (SD) or VANC (HD) using a Synapse HSI21 module, or a Softel inserter operating in Plain-View mode, and are detected by the HSI20; when the cue is off (or not present) the card starts to insert subtitle data. The HSI20 also supports the insertion of filler packets when no subtitles are present, creating a continuous stream of subtitle data which may aid the decoding of subtitles, especially in older equipment.
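The cue-controlled behaviour described above (insert when the cue is off or absent, send filler when no subtitle is pending) can be modelled as a small decision function. The logic follows the text, but the function and value names are assumptions for this sketch:

```python
# Model of the HSI20's cue-controlled insertion as described in the
# text: X31 cue present -> insertion suppressed; cue off or absent ->
# subtitle data inserted, with filler packets keeping the data stream
# continuous when nothing is pending. Names are illustrative.

from typing import Optional

FILLER = "<filler packet>"

def packet_for_frame(cue_present: bool,
                     pending_subtitle: Optional[str]) -> Optional[str]:
    """Decide what, if anything, to insert into the VBI/VANC."""
    if cue_present:
        return None  # cue on: no subtitle insertion
    # Cue off or not present: insert subtitle data, or a filler
    # packet to maintain a continuous stream for older decoders.
    return pending_subtitle if pending_subtitle else FILLER

print(packet_for_frame(False, "Hello"))  # subtitle data inserted
print(packet_for_frame(False, None))     # filler keeps stream alive
print(packet_for_frame(True, "Hello"))   # cue on: nothing inserted
```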
Synapse HSI20 Module

[Block diagram: Synapse HSI20 module - HD/SD-SDI input (with EQ); cue tone decoder to trigger subtitles; VBI/VANC data inserter; WSS/VI/S2016 insertion; active video, embedded audio and ANC bypass; HD/SD-SDI outputs; CVBS preview encoder and CVBS output; Ethernet processor; rack controller and internal Synapse bus]

The HSI20 provides an ideal interface between many manufacturers' subtitle preparation and playout applications and live video streams. The HSI20 uses the proven Axon VBI and VANC data insertion capabilities found in many other modules in the Synapse range. The module follows the standard form factor, fitting into any of the frames in the Synapse range alongside any of the other 300+ different processing modules, and utilizing dual power supplies in 2RU and 4RU frames. External subtitles from a subtitle playout system or live subtitling workstation are sent to the module in the industry-standard Newfor protocol via a dedicated Ethernet connection. The module encodes and inserts these subtitles as either WST in SD or OP47 in HD. The user has full control of VBI/VANC line assignment, insertion of filler packets and other transmission parameters.

Example 2: Cortex control screen for the HSI20, showing the HD and SD inserter configuration elements and controls for video standard, WSS insertion and Bypass (callouts: Ethernet connection, HD inserter, WST SD inserter, subtitle page and control bits)
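Subtitle delivery to the module is over its dedicated Ethernet connection using the Newfor protocol, as noted above. The sketch below shows only a generic TCP client pushing a payload to the module; the actual Newfor framing (page number, row addressing, display/clear commands) is not detailed in this note and is deliberately left as an opaque byte string. Host address and port number are placeholders, not documented values:

```python
# Hedged sketch: delivering one subtitle payload to an inserter over
# TCP. Newfor framing itself is NOT reproduced here; the payload is a
# placeholder, and HSI20_HOST / NEWFOR_PORT are assumed values.

import socket

HSI20_HOST = "192.0.2.10"  # placeholder module address (TEST-NET)
NEWFOR_PORT = 5000         # placeholder port, not from this note

def send_subtitle(payload: bytes,
                  host: str = HSI20_HOST,
                  port: int = NEWFOR_PORT) -> None:
    """Open a TCP connection and push one subtitle payload."""
    with socket.create_connection((host, port), timeout=2.0) as sock:
        sock.sendall(payload)

# Example call (requires a listening module or test server):
# send_subtitle(b"...newfor-framed subtitle bytes...")
```

In a live chain this connection would stay open for the duration of the event, with the workstation or playout system sending a packet per subtitle (or per word, in the word-at-a-time variant described earlier).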
Example 3: Cortex control screen for the HSI20, showing the X31 (Cue) decoder settings; these should be configured to match the related encoder.

Control: in 'Bypass' the HSI20 functions as a subtitle inserter. If set to 'X31 Control', insertion is turned on when a Cue is not received.