Audio Engineering Society. Convention Paper. Presented at the 128th Convention 2010 May London, UK
|
|
|
- Dinah Poole
- 10 years ago
- Views:
Transcription
1 Audio Engineering Society Convention Paper Presented at the 128th Convention 21 May London, UK The papers at this Convention have been selected on the basis of a submitted abstract and extended precis that have been peer reviewed by at least two qualified anonymous reviewers. This convention paper has been reproduced from the author's advance manuscript, without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 6 East 42 nd Street, New York, New York , USA; also see All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society. Loudness Normalization In The Age Of Portable Media Players Martin Wolters 1, Harald Mundt 1, and Jeffrey Riedmiller 2 1 Dolby Germany GmbH, Nuremberg, Germany [email protected], [email protected] 2 Dolby Laboratories Inc., San Francisco, CA, USA [email protected] ABSTRACT In recent years, the increasing popularity of portable media devices among consumers has created new and unique audio challenges for content creators, distributors as well as device manufacturers. Many of the latest devices are capable of supporting a broad range of content types and media formats including those often associated with high quality (wider dynamic-range) experiences such as HDTV, Blu-ray or DVD. However, portable media devices are generally challenged in terms of maintaining consistent loudness and intelligibility across varying media and content types on either their internal speaker(s) and/or headphone outputs. This paper proposes a nondestructive method to control playback loudness and dynamic range on portable devices based on a worldwide standard for loudness measurement as defined by the ITU. In addition the proposed method is compatible to existing playback software and audio content following the Replay Gain ( proposal. In the course of the paper the current landscape of loudness levels across varying media and content types is described and new and nondestructive concepts targeted at addressing consistent loudness and intelligibility for portable media players are introduced. playback level across theaters and across different content; a true win-win situation for consumers, movie 1 INTRODUCTION theaters and content owners alike. The movie industry was probably the first to identify and solve the problem of varying mixing & playback levels of content. Both, content producers as well as movie theaters had a genuine interest in finding a solution that would re-produce audio content at a level pleasant to consumers. Since then the corresponding SMPTE recommendations guarantee a rather consistent The situation in broadcast was already more challenging, given that the individual playback systems in the homes are not controlled by technicians and due to the more complex distribution channels and networks for broadcast. With the introduction of digital broadcast the industry established the concept of time-varyingmetadata which enables to control gain-values at the
2 receiving end to tailor content to a specific listening environment. An example is the metadata included in AC-3 / E AC-3 / HE AAC 1 which includes general loudness normalization information (called dialnorm in AC-3 and E AC-3, and program_reference_level in HE AAC) as well as gain-words to reduce the dynamic range of a program (called dynrng and compr in AC-3 and E AC-3).[1] Such systems are specifically powerful for situations where the operating modes in a receiver are also specified, like the so called line mode and RF mode for AC-3 and E AC-3. These technologies are also part of today s AV discs like DVD and BluRay. The most important distribution channel for audio content is still the CD which contains 16-bit PCM data unfortunately without any loudness or dynamic range metadata. The peak-normalization typically used for CD s is said to be the main reason for the so called loudness war which has led to extremely reduced dynamic range of audio content with high average levels [2]. Many authors, mixing engineers and audio enthusiasts alike, have complained about this situation and several proposals have been made to improve the situation. Solutions range from voluntary metering and leveling practices on the production side similar to the mentioned SMPTE recommendations (for example [3]) up to branding of content on the consumer side (for example [4]). At the same time, consumer behavior changed over recent years with coded content (e.g. content in datareduced formats such as mp3) becoming more popular and important for content distribution and storage. Such formats allow for virtually unlimited dynamic range which content owners and audio enthusiasts would like to take advantage of. In addition, the increasing popularity of mobile phones as personal media players has created new challenges in designing high quality playback devices that meet customer expectations of consistent leveling and best audio quality under various listening conditions. The large number of content in personal music collections (often exceeding thousands of files) as well as the broad range of audio formats such as mp3, HE AAC, OGG, WMA, AC-3, or E AC-3 further complicate the matter. 1 AC-3 is also known as Dolby Digital, E AC-3 as Dolby Digital Plus and HE AAC as Dolby Pulse. 2 GENERAL CONSIDERATIONS FOR LOUDNESS NORMALIZATION IN PORTABLE MEDIA PLAYERS Today there are already many proposals, components, ideas, and approaches available for loudness normalization in consumer devices. However, none of these proposals is widely supported in portable media players and none specifically addresses the challenges in such devices. Before explaining the proposed solution, the key stakeholders and their expectations with regard to loudness normalization are identified and the basic building blocks of a loudness normalization system are explained. 2.1 Identifying stakeholders and their expectations Consumers In general consumers would like to be presented with audio content in a form that matches their listening environment, whether it is being played back on a highend multi-channel setup or on the mono speaker of their mobile phone. The listening environment influences the desired output level as well as the maximum dynamic range tolerance [5]. For portable devices this most often means: loud to address the physical limitations of the playback device and little dynamic variations to adapt to the environmental noise. The average consumer also does not want to understand technical details such as the specific codec used for data compression. Instead, the playback experience should be equally well for all of the content. It might be acceptable to use different versions of the same content tailored to the different listening environments. However, this requires careful trade-offs with other requirements such as file size (e.g. number of hours of content on a limited storage) and download time. With content moving freely between devices, a single version of the same content is certainly the preferred solution Content owners and distributors There are probably still content owners out there that would like their content to playback the loudest. However, in general the desire matches more that of consumers: Playback all content at the same desired output level and with desired dynamic range not being limited by the distribution format. Of course, variations in level must be possible. For complete albums some Page 2 of 17
3 tracks might specifically be recorded above or below the album average, and certain parts of a song or movement might be louder or quieter than the average for the complete song. An automatic gain control (AGC) in the distribution channel is usually less preferred as it also overrides such intended level changes. Certainly, content owners are aware of the varying dynamic range tolerances of consumers and are willing to take these limitations into account. However, they would like to be able to control the final output to the consumer. Thus, they prefer a predictable signal chain for their content so that they can mix, setup parameters. and monitor accordingly. Both, content owners as well as distributors do favor a limited number of versions of the same content. A single version is preferred but several exceptions are common: High vs. low-bitrate versions, sometimes specifically downmixed versions vs. multi-channel originals, radio edit vs. album version etc Device manufacturers Obviously, the device manufacturer s heart is closest to the consumer who buys the devices. Therefore, meeting the customer s desire for playback at the appropriate level and within the tolerable dynamic range has highest priority. Single-sided solutions are an interesting option as they also provide the opportunity to differentiate own products from competitive offerings. Compatibility to a large share of available content is high on the priority list. Device manufacturers are also concerned about computational complexity. Hence, off-loading some of the signal processing to the encoding stage is preferred. Overall system design becomes a critical part of product design as well. Ensuring that the user experience and user interface is independent of the content format is a challenge. Simple, codec-agnostic solutions support the device manufacturer in addressing the consumer s desire for systems that work seamlessly across all their content. The complexity of the system design also increases with the versatility of today s devices: A mobile phone with built-in speakers, headphone output, and the ability to connect to a high-end AV system by means of a cradle demands a sophisticated control of the audio output stage that needs to be aware of the current listening environment. Last but not least, appropriate test materials to ensure interoperability between encoding and decoding systems are an essential requirement for device manufacturers. Recent investigations (not yet published) have shown that many open-standard-codecs and features suffer from the lack of appropriate test vectors and procedures, resulting in numerous incompatibilities between devices. 2.2 Identifying technical components These requirements lead to the following components and design restrictions for a loudness normalization system on portable media players: The preference for a non-destructive solution The authors strongly believe that the desire to limit the number of different versions of content together with the various listening environments that need to be supported demand a non-destructive solution for the problem at hand. This seems to be in the best interest of consumers, content owners, distributors, and device manufacturers alike. Non-destructive means no changes to the actual PCM signal (or payload) prior to the decoding stage and hence requires the use of metadata, potentially combined with single-ended solutions, instead Applying gains to match the desired target output The basic solution is obvious: Define a target output level, determine the actual level of the content, and apply a matching gain value during playback. It also seems obvious that a solution needs to take into account potential clipping in cases where the content needs to be boosted to match the target. In order to address the stakeholder s expectations, the definition of a target output level becomes a critical piece in defining a solution. The average consumer would like to be presented with the best solution, not worrying about selecting from too many options. The content owners prefer a predictable behavior too many options require monitoring content with too many different setups. Device manufacturers also favor fewer options (which in turn means a simpler system design). However, the specific output target reference level might be highly dependent on the device s capabilities which can in turn lead to a large variation across devices. Page 3 of 17
4 Given the fact that the proposed solution will rely on metadata to be present in the content one also needs to take into account, how the system behaves for content without the required metadata present, e.g. legacy content Conveying the loudness level Systems for streaming and broadcasting like AC-3, E AC-3, or HE AAC typically rely on transmitting a timevarying 2 value representing the loudness level of the current program to the decoding device (e.g. the dialnorm parameter). This allows content owners to easily control the complete signal chain and reduces the computational complexity on the decoding device. For file-based systems such a value typically does not change for a given file. While it is still possible to utilize loudness levels encoded into the payload, systems have been designed that rely on a single value per file. itunes is using such a value, called Sound Check. Little is known about the specific algorithms used by Apple. However, it is known that only certain file types are supported: Sound Check works with.mp3,.aac,.wav, and.aiff file types. It does not work with other file types that itunes can play. [6] Another proposal from 21 which got some traction across formats, implementations, and devices is called Replay Gain. [7] It does specify an algorithm to compute a gain value to normalize loudness across tracks and albums and proposes a storage format. However, the actual storage format used was not formally standardized so far and no test vectors are available to date. Replay Gain is implemented for example in the freely available software aacgain [8] (version 1.8. was used throughout our investigations) or in the WINAMP plug-in ml_rg.dll. Matlab code is given on Determining loudness Loudness itself is a highly subjective quantity (and as such, cannot be measured directly) that involves psychoacoustic, physiological, and other factors which have been and continue to be thoroughly studied to this 2 While this time-varying loudness value is carried in every bitstream syncframe, it is recommended to set this value on a program-by-program basis whereby the value itself is static for the duration of each (single) program and/or file. day. Hence, this highly subjective quantity (loudness) often results in substantial differences in loudness perception between listeners, making a single measurement method that considers all of the above factors for all individuals incredibly complex. This is borne out by real-world experience, as there is often no single loudness level that will satisfy all listeners (or even a single listener) all of the time. At best, we are only able to approximate the loudness of sounds by artificial means. One study performed by Dolby Laboratories concluded that even when audio programming has been normalized by a group of people by ear the normalized programs do not completely satisfy a different group of listeners 1 percent of the time. In fact, the different groups only agreed approximately 86 percent of the time. Given this level of uncertainty among groups of listeners, any loudness measure we utilize will not guarantee that we satisfy 1% of the listening audience. However, what we can achieve is to agree on a common loudness measurement such that different encoding tools at least behave consistently. This in turn will maximize consumer satisfaction since content from different sources will behave similarly. 3 THE PROPOSED SOLUTION Given the considerations above, the authors recommend to implement loudness normalization in portable media players based on a worldwide accepted standard for loudness measurement as defined by the ITU by conveying such loudness values as Replay Gain equivalents specifically for formats that do not support loudness metadata in the actual payload, and by setting a new target output reference level for portable devices while controlling the dynamic range either with single-sided algorithms or combining a single-sided limiter with dynamic range control derived from already available metadata. Following some more details and explanations about aspects of the the proposal: Page 4 of 17
5 3.1 Loudness Measurement & Normalization according to ITU-R BS.177 As stated earlier, over the past few years consumers have continued to demand support for a wider range of media source types on their portable devices (i.e. mobile phones) including music as well as multimedia files such as TV, movie content and their own user generated content. Moreover, the convergence of several media sources/types creates new challenges for portable device manufacturers, service operators and content creators alike as media sources like those listed above are typically created with different production and/or normalization philosophies (since they are destined for very different playback environments/devices, etc). In July 26, after several years of work, the ITU-R approved and published a new recommendation for estimating the loudness of broadcast programs entitled, ITU-R Rec BS.177 Algorithms to measure audio programme loudness and true-peak audio level. [9] ITU-R BS.177 defines a royalty-free and lowcomplexity method to estimate the overall loudness of an audio program by computing the frequency weighted energy average over time. Most importantly, it yields a single loudness value that represents the overall loudness of an entire program (i.e. AV or Music track file) thereby making automated loudness estimation and normalization of media files much simpler and (in many cases) non-destructive without (or with a much lower) impact to artistic intent when compared to real-time loudness controllers and dynamic range processing. The worldwide acceptance of BS.177 has grown considerably since its introduction and is currently referenced in documentation published by well known groups such as the Advanced Television Systems Committee (ATSC) [1] and the European Broadcasting Union (EBU). Furthermore, the use of BS.177 is a mandatory provision within a significant (and growing) number of program delivery specifications for many of the world s most influential broadcasters and content providers (for example [11]). One of the most significant benefits of the BS.177 method is its ability to work very well with estimating the loudness across a wide variety of content types. Including, common channel configurations such as mono, stereo and multichannel programming as well as content such as music, television, movies, sport and sound effects. Thus, making it a perfect candidate for use in estimating & normalizing the loudness of a wide variety of media files destined for portable media players. During the ITU-R study, subjective testing showed that the BS.177 algorithm yielded the best performance among several other methods including algorithms based on psychoacoustic models. Prior to the release of BS.177 the ITU-R study group conducted formal subjective tests in order to create a subjective database (containing perceived loudness ratings for each test item) that would be utilized to evaluate candidate loudness measurement algorithms and their ability to predict the results from the subjective tests. The subjective database itself included a wide range of audio material including television & movie dramas, sporting events, news broadcasts, sound effects and music. Several candidate algorithms were submitted for consideration and in the end, the results demonstrated that a reasonably simple weighted meansquare measure 3 worked extremely well on mono, dual mono, stereo, as well as multichannel audio programming and yielded an overall correlation of r =.977 across 336 test sequences from the subjective database. A bit more on the mathematical machinery under the hood of ITU-R Rec. BS.177, the algorithm was derived from, and is based on the Leq(RLB) algorithm described by Soulodre in [12] to support mono, stereo and multichannel audio signals while retaining a very low computational complexity. Thus, allowing it to be easily implemented and/or adopted by many equipment manufacturers at a very low cost. Figure 1 (page 6) shows a high-level functional block diagram of the ITU-R algorithm taken from ITU-R Rec BS.177. The BS.177 algorithm can be scaled to accommodate any number of audio channels contained within programs, however for multichannel 5.1 audio programs, it is assumed that the multichannel signals conform to the ITU-R BS channel configuration. As a side note, the LFE channel is NOT included in the measurement. 3 Ultimately became ITU-R Rec. BS.177 Loudness Algorithm. Page 5 of 17
6 Figure 1 - ITU-R Loudness Meter/Algorithm Functional Block Diagram Within the BS.177 algorithm, each of the individual audio channels being measured is first passed through two filters in cascade. The pre-filter, which has a highfrequency shelving characteristic, is to account for the acoustical effects of the human head (modeled as a rigid-sphere) while the RLB filter (Revised Lowfrequency B) is a modified version of the standard IEC B-weighting curve and has a characteristic where the low frequency response falls between the IEC C and B weighting curves. Figure 2 shows the combined response compared to A-weighting. Once the input signal for each channel has been filtered, the mean-square energy for each channel is computed for the entire measurement interval (time). The individual channel powers are then weighted in accordance to the angle of arrival [Refer to Table 3 in Relative Level (db) Frequency (Hz) Figure 2 - ITU-R BS.177 Frequency Weighting (solid) vs. A- weighting (dashed) [9]] and then linearly summed to provide the overall (final) loudness value as follows: ITU Loudness = log1 G i z i, in db i (1) The weighting blocks (G i ) in Figure 1 above for the Left, Center and Right channels are always 1. (db) where the weightings for the Left Surround and Right Surround channels are always 1.41 (~ + 1.5dB) each. The emphasis on the surround channels acknowledges the fact that sounds arriving from behind (the listener) could conceivably be perceived as being louder relative to those arriving from the frontal direction (see [9] for details). The result is the loudness given in LKFS (LKFS indicates a K-weighted level relative full scale). Note, the constant value (-.691) in the equation above is a calibration constant that addresses the combined effects of the Pre-filter and RLB filter at 1kHz. (Note the small filter gain (positive) at 1kHz in ) This constant ensures that a mono full scale sine wave is assigned a loudness of -3.1 LKFS. 3.2 Conveying loudness values as Replay Gain equivalents In general, the basic Replay Gain proposal as well as the way it is supported in various products today are very much in line with the requirements identified during our investigations. Instead of coming up with yet another syntax to transmit loudness values for filebased content, the authors suggest to embrace the current Replay Gain format in order to speed up adoption in the market. However, given that the authors are in favor of the ITU-R BS.177 algorithm which N Page 6 of 17
7 was not available at the time of the original Replay Gain proposal and which probably would have been chosen otherwise the authors suggest matching the Replay Gain semantics to ITU-R BS.177 results based on a statistically derived linear equation Matching the Replay Gain syntax Replay Gain consists of two values: peak signal amplitude and gain adjustment. It can be calculated on a track-by-track basis, or album-by-album basis. Track-based values are more suited for use cases and playlists where tracks from different albums are mixed. Album-based values are more suited for use cases where all tracks of an album are played consecutively. The originator of the Replay Gain values can be specified as the engineer/artist/producer, user, or automatically determined. Since the authors were not able to identify a formal specification of the actual syntax used to store Replay Gain values, several files with Replay Gain support have been analyzed. Based on these results the following syntax is proposed: Content stored in files compliant to the MPEG-4 file standard shall use itunesstyle metadata. Other formats shall store Replay Gain values in ID3v2 tags. The Replay Gain adjustments shall be between -18 db and +13 db (corresponding to a range of loudness values from to -31 LKFS). Values outside this range shall be clamped to 18 db and +13 db Replay Gain in itunes-style metadata Replay Gain metadata shall be added as an extension box of type ----, conforming to standard itunesstyle metadata. A mean box shall be present within the ---- box and contain the meaning org.hydrogenaudio.replaygain A name box shall be present within the ---- box and contain the name of the value : - replaygain_track_gain - replaygain_track_peak - replaygain_album_gain - replaygain_album_peak A data box shall be present within the ---- box and contain the value in the following formats: - The gain adjustments shall be written as a db floating-point value with 2 decimal places and a leading -/+. (e.g db"). - The peak signal amplitudes shall be written as a floating-point value (e.g ). The peak signal amplitude may (and often is) over 1. Players should match only on the value in the 'name' box and ignore the value in 'mean' box for compatibility. This specification defines additional itunes-style metadata for the Replay Gain originator code: Replay Gain may include originator code information. A mean box shall be present within the ---- box and contain the meaning org.hydrogenaudio.replaygain A name box shall be present within the ---- box and contain the name replaygain_originator_code. The following originator codes shall be used: - = Replay Gain unspecified - 1 = Replay Gain pre-set by artist / producer / mastering engineer - 1 = Replay Gain set by user - 11 = Replay Gain determined automatically The data box shall contain a text string representing the concatenation of the 3-bit originator codes for the Replay Gain values in the following order: - replaygain_track_gain - replaygain_track_peak - replaygain_album_gain Page 7 of 17
8 - replaygain_album_peak For example, 1111 maps to automatically generated values for track gain and peak and unspecified values for album gain and peak) At a minimum a file with Replay Gain metadata shall include a track gain value. Appendix B provides an example for a complete Replay Gain data set in itunes style metadata Replay Gain in ID3v2 tags Replay Gain values are stored in TXXX fields which follow the following syntax: <Header for 'User defined text information frame', ID: "TXXX"> Text encoding $xx Description <string according to encoding> $ () Value <string according to encoding> Each Replay Gain parameter is contained in its own specific TXXX element. To distinguish parameters, the Description string takes the same values as written in the itunes name box (see above) - replaygain_track_gain - replaygain_track_peak - replaygain_album_gain - replaygain_album_peak - replaygain_originator_code The parameter value is stored in the Value field. It uses the same format as described in the itunes section above Matching the Replay Gain semantics In the Replay Gain proposal [7] a gain based on a loudness measure is derived. This gain is designed to adjust the music loudness to the loudness of pink noise at -2 db relative full scale played back over stereo loudspeakers. The associated sound pressure level is 83dB SPL. No formal study on Replay Gain performance for loudness alignment is available. The Replay Gain procedure comprises the following steps: - Frequency weight all channels according to an average inverse equal loudness curve -Compute power for non-overlapping blocks of 5 ms lengths and average over channels for stereo content -Compute the block power which is exceeded by 5% of all blocks per music track and derive the loudness in db relative full scale The gain is computed in db as [ReferenceLoudnessloudness] so that the loudness of the pink noise reference is matched. In order to support existing Replay Gain metadata in addition to ITU-based loudness for both the proposed playback solution as well as for existing Replay Gain compatible audio players, a reversible conversion is required. The goal is to achieve consistent loudness across files potentially with both types of loudness information and to benefit from ITU-based loudness where available. 4 To achieve this goal both leveling approaches have been investigated statistically by means of a music data base which is described in section Comparing ITU-R BS.177 and Replay Gain Both leveling approaches measure a frequency weighted power. The main differences are the filter characteristics and the statistical power analyses from which the loudness is derived. Based on the loudness measure and a fixed target loudness a gain value is derived in the Replay Gain procedure. No target loudness is defined in the ITU recommendation. While ITU-R BS.177 applies a high-pass filter characteristic the Replay Gain filter looks more like a band-pass 5 as shown in Figure 3. 4 ITU-R BS.177 is described in more detail in section Care has to be taken with ITU-R BS.177 if signal energy is present above the frequency hearing range Page 8 of 17
9 Relative Level (db) On average.6% of the music track lengths are identified as silent. Files at low loudness seem to have longer silent periods than louder files (.3% at -5LKFS, 1.2% at -3LKFS). Still 95% of all files around -3 LKFS have less than 5.3% silence. The data base genre focus is on Pop, Rock and Alternative music with some Jazz and Electronic and very little Classical music. Figure 4 shows the distribution of genres according to metadata associated with the files Frequency (Hz) Figure 3 - ITU-R BS.177 Frequency Weighting (solid) vs. Replay Gain Frequency Weighting (dashed) In ITU-R BS.177 signal energy is averaged over the complete music track potentially including silence which typically does not contribute to the subjective loudness. In our investigations silence is therefore excluded from the measurement. However since the music data base does not exhibit a significant amount of silence the impact is limited. Replay Gain on the other hand measures the frame power that is exceeded by only 5% of all frame powers which is near the maximum frame power. As will be shown in section the relation between this near maximum power and the long term power as applied in ITU-R BS.177 is non-linear especially for higher loudness ranges where the dynamic range of the music tends to be reduced The Music Data Base The music data base consists of 2122 stereo files originating from different private music collections. The data base is considered as statistically relevant for today s typical stereo music content for playback on portable devices. Compression formats are mp3 and AAC at various bitrates and sample rates between 32 and 48 khz. Replay Gain has been calculated for all files using the freely available software aacgain (version 1.8.) [8]. Loudness according to ITU-R BS.177 has been computed using an internally developed software implementation (excluding silence 6 ). 6 Silence is identified when the maximum peak level relative to full scale remained below -6 dbfs for more than one second and less than ten seconds Number of Files Pop&Rock Alternative Jazz&Funk Electronic Folk Metal Rythm&Blues World Country Soundtrack Hip-Hop Classical Vocal Figure 4 - Distribution of Genres in the Music Data Base Loudness varies depending on genre for up to 1 LKFS on average. Classical music has the lowest average loudness level at about -2 LKFS as can be seen in Figure 5. 7 The metadata was either part of files acquired through online distribution channels such as Amazon, or was added when ripping CDs by means of databases like CDDB. Reggae Disco Unknown Page 9 of 17
10 Loudness (LKFS) The overall loudness histogram and distribution are shown in Figures 8 and 9. The data base also includes the Top 75 Bestseller mp3 downloads from as of December 29. The mean loudness for this content is -8.5 LKFS Pop&Rock Alternative Jazz&Funk Electronic Folk Metal Rythm&Blues World Country Soundtrack Hip-Hop Classical Vocal Reggae Disco Loudness (LKFS) Figure 5 - Mean Loudness (ITU-R BS.177) and Standard Deviation for different Music Genres. According to the available metadata about half of all music files are no older than year 21 as can be seen in Figure 6. There is a tendency for increasing loudness from the early 9 s on until today as can be seen in Figure Figure 7 - Mean Loudness (ITU-R BS.177) and Standard Deviation vs. Year Number of Files Unknown Loudness (LKFS) Figure 6 - Distribution of Year Information Figure 8 - Loudness Histogram for the Music Data Base 8 As with all the metadata used during this study, it can not be guaranteed that it always reflects reality. In case of the Year metadata, this typically reflects the year of the original recording. However, some CD s might have been re-mastered at a later point in time with actual changes to the overall loudness level. Page 1 of 17
11 1 the general least mean squares fit with respect to the observed error variance. 9 8 Distribution (%) Loudness (LKFS) Figure 9 - Loudness Distribution for the Music Data Base Conversion between Replay Gain and ITUbased loudness Although Replay Gain and ITU-based loudness share common types of processing such as frequency weighting and power measurement their relation is highly complex and content dependent. Therefore a relation shall be derived statistically from the investigated music data base. Since Replay Gain describes a gain in db and ITU-based loudness describes a level relative to full scale the basic relation is inverse. In order to maintain the nature of the Replay Gain proposal the relation shall ideally be modeled as an inverse linear relation. This way a loudness change of for example -1 LKFS will translate into an estimated Replay Gain change of +1 db. A Replay Gain compatible player would then directly benefit from the ITU-based Replay Gains by providing a loudness matching solution based on an industry wide accepted standard. Therefore the task is to find an ideal offset for a straight line fit between both types of loudness data with a slope of -1 which is shown in Figure 1 as a red solid line. For comparison the general least mean squares fit is also shown as a red dashed line with an estimated slope of The standard deviation of the Replay Gain estimation error for this fit is 1. db. Since more than 75% of all music files have loudness above -15 LKFS (see Figure 9), the fit with the slope of -1 is considered a reasonable approximation to 9 Linear regression has been applied to the most relevant data between -31 and LKFS. Figure 1 - Replay Gain vs. ITU-R BS.177 Loudness for 2212 music files. Therefore the following reversible conversion between Replay Gain and ITU-based loudness is proposed: RG = 18 db L (2), where RG is the estimated Replay Gain and L is the Loudness according to ITU-R BS.177 in LKFS (db relative full scale). Number of Files True Replay Gain - Estimated Replay Gain (db) Figure 11 - Replay Gain Estimation Error Histogram for the Conversion according to (2) and Loudness >= -31 LKFS The optimum slope of -.81 indicates a dependency on loudness. This can be explained by the loudness Page 11 of 17
12 dependent level difference between the near maximum frame power (Replay Gain) and the long term power average (ITU-R BS.177). Due to dynamic range processing and limiting this difference tends to get smaller for louder music. This observation is supported by an experiment where in the Replay Gain procedure the near maximum frame power analysis was replaced by a long term power average. Then the least squares fit between this modified Replay Gain and the ITU-based loudness has indeed a slope of -1 as can be seen in Figure full-length (audio/video) television programs and movies for analysis and comparison against the music database. The majority of the AV content was acquired directly from the programmers/studios (prior to transmission encoding) and included both stereo and multichannel content. The red distribution in Figure 13 displays the loudness distribution of the 116 programs in terms of ITU-R BS.177 Loudness (LKFS). % Loudness (LKFS) Figure 13 - Loudness Distributions from 116 Full-length Television Program & Movie Files (red, left) & 2212 Music Files (blue, right) Figure 12 - Modified Replay Gain (use long term power) vs. Loudness (ITU-R BS.177) for all music files. The red line is the general least mean squares straight line fit and has a slope of Recommended playback system In order to implement a playback system supporting the proposed method for loudness normalization, three principal design decisions need to be made: 1. What is the desired target level? 2. How is the dynamic range controlled? 3. How are files treated that do not include any loudness metadata? Loudness Distributions for Music & AV Content As stated earlier, consumers are continuing to diversify the types of content they carry on their portable media players. Considering this fact, the authors also acquired The loudness range of the AV content does span a very wide range of 2dB with the majority of items spanning a 7dB range from ~ -22 LKFS to ~ - 29 LKFS. In contrast, the blue music distribution in Figure 13 displays the loudness distribution for the 2212 music files used in our research. Given that there is very little overlap between the loudness distributions of the two data sets it is quite clear that consumers would be forced to adjust the volume (significantly) when switching between AV and music content. One option to reduce the need for frequent listener adjustment would be to apply a static attenuation of ~ 14dB to the music items (in the figure above) only. Using this approach, the two data sets would no longer have a bimodal characteristic and thus more likely to be pleasing to the listeners. However, utilizing this approach for portable devices would likely yield a significant reduction in output level drive to the internal speakers and/or headphones. Moreover, considering that these devices will likely be used in noisy environments limits (and likely eliminates) the practical usability of this approach. Hence, it is the AV loudness (content) distribution that must be adjusted (in the positive direction) to overlap the music distribution to address this issue. Considering that the production philosophies of television and movie Page 12 of 17
13 content typically produce a lower average loudness (leaving headroom for music and effects to make a dramatic impact) a new common reference level must be established for devices with severe level and dynamic range limitations that does not require portable device specific content preparation and simulcast for these devices. Hence, playback devices of this type can easily take advantage of either coded audio bitstream and/or container metadata to normalize any/all content to this newly defined and common reference loudness level. Furthermore, if the content can be played back via a higher quality reproduction system (directly from the portable device) a different reference loudness level can be selected by the user and/or automatically (default) by the decoder in the higher quality reproduction system. This approach would allow wider and/or the original dynamic range to be presented to the listener and more importantly a loudness and dynamic range that is typical for this type of system/device, etc Choosing an appropriate target level Portable media players (like mobile phones, dedicated personal music players, or laptops) often need to support up to three listening environments: built-in speakers headphone output line output used in combination with a cradle (either analog or digital; newer devices might even support multi-channel output) For the later use case which most often means that the device is connected to Hifi-equipment a lower target level of -31 LKFS as specified for example in line mode for AC-3 and E AC-3 is most appropriate, enabling full dynamic range capabilities. For noisy environments a higher target playback level in combination with high quality limiting and suitable dynamic range compression will provide a better listening experience. At the same time level jumps from content with varying loudness are avoided. The dynamic range compression profile may be adapted to the specific use-case. Built-in speakers or open headphones in noisy environments may require moderate dynamic range compression. Spoken content such as in audio books or podcasts under noisy environmental conditions may require a more aggressive compression profile for optimum speech intelligibility. In order to find a useful target playback level, the loudness for all music files (see section for details) has been analyzed statistically (see Figures 8, 9, and 13). The goal is to find a target playback level which a) does not require excessive analog gain for sufficient loudness, b) needs only moderate limiting for typical content. A natural choice is the average level of typical content without any additional leveling or dynamic range compression. It turns out that this choice also fulfills both requirements mentioned above. As a side effect if the leveling system is switched off severe loudness jumps are unlikely since many files already are at or near the target playback level. In order to be adaptable to the actual environment and playback device, the authors propose 3 different target levels around the measured median loudness namely -8, -11 (the recommended default), and -14 LKFS Combining loudness normalization and dynamic range control Given that the lowest supported loudness value is -31 LKFS, all target levels above -31 LKFS need to support clipping prevention through dynamic range control. Hence, at a minimum portable media players supporting this proposal need to provide a limiter. Formats that support metadata for dynamic range control such as AC-3, E AC-3, or HE AAC can also apply such metadata prior to the signal being fed into the limiter. For example, running an AC-3 or E AC-3 decoder in RF-mode with a target level of -2 LKFS will then require an additional boost of 9 db followed by a limiter. This combination results in a new reference decoder level of -11 LKFS and thus defines a new decoder operating mode for metadata based coding systems. This new mode is referred to as Portable Mode and is an addition to the two legacy decoder operating modes within the AC-3 and E AC-3 systems. (i.e. Line & RF Modes) and could easily accommodate the use of a 3 rd set of dynamic range control gains (computed in the encoder) and carried in the bitstream for clip protection while in Portable Mode. Obviously, the quality of the single-sided limiter is crucial. Dolby Laboratories for example developed a look-ahead limiter with signal-dependent attack and release times, which is able to prevent clipping even for critical (e.g. dynamic) content without any audible artifacts. Page 13 of 17
14 3.3.4 Integrating files without any loudness metadata When preparing for playback of a file, a device first needs to check, whether or not a Replay Gain value is available. In cases where a complete album is being played back, the authors recommend to prefer the album-gain over track-gain. Otherwise, using the trackgain should be the default. In cases where no Replay Gain value is available the system should check for the presence of a formatdependent loudness value such as the dialnorm parameter in AC-3 and E AC-3 or the program reference level in MPEG HE AAC and then use that value instead. If neither Replay Gain nor format-dependent loudness values are available, the authors recommend using a default loudness value of -11 LKFS for stereo music content (the median of all content measured during this study) and -23 LKFS for AV and multi-channel content Overview of the complete playback system The following page provides a graphical overview of the complete system (Figure 14) as well as a pseudocode implementation of the playback control to further illustrate our proposal. 4 SUMMARY AND CONCLUSIONS In this paper the authors motivated a new system for loudness normalization in portable media players which satisfies the expectations of key stakeholders. The building blocks of the system have been described in detail and design decisions were explained. First products supporting the new method both on encode as well as decode side have been released by Dolby Laboratories already 1. The authors hope to convince more players on the market to support this approach. In the end this will increase satisfaction of consumers, content owners, and device manufacturers alike and will solve one of today s industry wide 1 Dolby Media Generator is a file-based content creation tool supporting the proposed Replay Gain syntax and semantic. Dolby Mobile 3 includes support for the so called portable mode which implements loudness normalization in line with this proposal. problems: Loudness normalization in portable media players. 5 ACKNOWLEDGEMENTS The authors would like to acknowledge David Robinson, the author of the original Replay Gain proposal. We applaud his efforts towards resolving a very common problem with consistent music playback among large music libraries on personal computers. 6 REFERENCES [1] _Shared_Assets/English_PDFs/Professional/18_Me tadata.guide.pdf, last visited [2] Katz, B., Mastering Audio, The Art and The Science, 2nd edition. Focal Press, 27 [3] Katz, B., Integrated Approach to Metering, Monitoring, and Leveling Practices, AES Journal, Vol 48, No. 9, 2 September [4] last visited [5] Lund, T., Control of Loudness in Digital TV, NAB Proceedings 26 [6] last visited [7] last visited [8] last visited [9] ITU-R Rec BS.177, Algorithms to measure audio programme loudness and true-peak audio level. [1] Advanced Televisions Systems Committee, Inc.: ATSC Recommend Practice: Techniques for Establishing and Maintaining Audio Loudness for Digital Television. Document A/85: 29, 4 November 29. [11] VOD-CEP2.-I pdf (l.ast visited ) [12] Soulodre, G. A., Evaluation of Objective Loudness Meters, 116th AES Convention, May 24 Page 14 of 17
15 PCM Metadata (itunes Style/ID3v2 Tag) Encoded Audio Payload Loudness Measurement based on ITU-R BS.177 Target Loudness Adjustement Music Mode: -8, -11, -14 LKFS Line/RF-Mode: -31/ -2 LKFS Content Creation Playback Device PCM Input Loudness Measurement based on ITU-R BS.177 Conversion from Loudness to Replay Gain File Conversion from Replay Gain to Loudness Leveling & Limiting PCM Output Audio Encoder Audio Decoder PCM (optionally with DRC metadata applied) Figure 14 - Overview of the complete system proposal /////////////////////////////////////// // Leveling and Limiting pseudo control // code in the playback device /////////////////////////////////////// // Determine target level (see 3.3.2): Switch ( currentlisteningenvironment() ) { Case builtinspeaker: TargetLevel = -8; // LKFS break; Case lineout: TargetLevel = -31; // LKFS break; default: TargetLevel = -11; // LKFS; // -14 for high-end phones with // good amplifiers break; // Determine reference level (see 3.3.4): If ( replaygainpresent() ) { Switch ( currentplaybackmode() ) { Case albumplayback: If ( albumgainpresent() ) { ReplayGain = getalbumgain(); else { ReplayGain = gettrackgain(); Break; default: If ( trackgainpresent() ) { ReplayGain = gettrackgain(); else { ReplayGain = getalbumgain(); Break; ReferenceLevel = -ReplayGain-18; // see else { If ( levelinfoinpayload() ) ReferenceLevel = getlevelinfofrompayload(); else { // Default levels Switch ( currentmusicformat ) { Case stereomusic: ReferenceLevel = -11; //LKFS Break; default: ReferenceLevel = -23; //LKFS Break; // The following signal processing // functions may be combined in a single // gain-application-stage. The pseudo-code // only provides a conceptual view. // Apply dynamic range compression // if available(see 3.3.3) If ( metadataavailable() && TargetLevel >= -2) { applymetadata(rf_mode); // e.g. AC-3 or E AC-3 ReferenceLevel = -2; // LKFS // data is leveled to -2 LKFS in RF-mode // Apply final gain/limiter Gain = TargetLevel ReferenceLevel; applygain(); If ( Gain > ) { applylimiterandgain(); Page 15 of 17
16 7 APPENDIX A DECODER OPERATING MODES AC-3 & E AC-3 Decoder - Line Mode Operation: Line Mode operation generally applies to the baseband line level outputs from two-channel set-top decoders, two-channel digital televisions and multichannel Home Theatre decoders. It is important to note that Line Mode operation is a requirement for all digital set top boxes that have analog baseband (line level) outputs. With respect to consumer type applications, a decoder s outputs operating in this mode will typically be connected to a much higher quality sound reproduction system than that found in a typical television set. In this mode, dialogue normalization is enabled and applied in the decoder at all times. Furthermore, in this mode the normalized level of dialogue is reproduced at a level of -31 LKFS, but ONLY when the transmitted dialnorm value has been correctly adjusted/provisioned for the particular program. In general, with the reproduced dialogue level at 31 LKFS, this mode allows wide dynamic range programming to be reproduced without any peak limiting and/or compression applied as may be intended by the original program producers. And since the AC-3 and E AC-3 systems can provide more than 1dB of dynamic range, there is no technical reason to encode the dialogue peaks at or near 1% as is commonly practiced in analog television broadcasts. This allows the AC-3 and E AC-3 systems to meet one of its goals of being able to deliver high impact cinema type sound to the digital subscriber s living room. AC-3 and E AC-3 Decoder - RF Mode Operation: RF Mode is intended for products such as terrestrial, cable and satellite set-top boxes that generate a monophonic and/or downmixed signal for transmission via the channel 3/4 remodulated output that feeds the RF (antenna) input of a television set. This mode was specifically designed to match the average reproduced dialogue levels and dynamic range of digital sources to those of existing analog sources such as NTSC and analog cable TV broadcasts. In this operating mode dialogue normalization is enabled and applied in the decoder at all times. However, the dialogue level in this mode is reproduced at a level of -2 LKFS ONLY when the transmitted dialnorm value is valid for a particular program. In this mode, AC-3 and E AC-3 decoders introduce an +11 db gain shift and thus the maximum possible peak to dialog level ratio is reduced by 11 db. This is achieved by compression and limiting internal to the AC-3 and E AC-3 decoders (but is calculated in the encoder). It is important to note that digital set top boxes which include an RF modulator are required to provide and default to RF Mode operation. Figure 15 compares the reference loudness level relationships in a decoder operating in Line, RF & and the newly defined Portable operating modes. Notice the reproduced loudness level and dynamic range available each mode. Figure 15 - Decoder Operating Mode(s) & Target Reference Levels Page 16 of 17
17 8 APPENDIX B ITUNES-STYLE REPLAY GAIN METADATA Below is an example for the Replay Gain compatible itunes metadata to further illustrate the proposed syntax: Page 17 of 17
FREE TV AUSTRALIA OPERATIONAL PRACTICE OP- 59 Measurement and Management of Loudness in Soundtracks for Television Broadcasting
Page 1 of 9 1. SCOPE This Operational Practice is recommended by Free TV Australia and refers to the measurement of audio loudness as distinct from audio level. It sets out guidelines for measuring and
All About Audio Metadata. The three Ds: dialogue level, dynamic range control, and downmixing
Information All About Audio Metadata Metadata, the data about the audio data that travels along with the multichannel audio bitstream in Dolby Digital, makes life easier for broadcasters while also increasing
FREE TV AUSTRALIA OPERATIONAL PRACTICE OP 60 Multi-Channel Sound Track Down-Mix and Up-Mix Draft Issue 1 April 2012 Page 1 of 6
Page 1 of 6 1. Scope. This operational practice sets out the requirements for downmixing 5.1 and 5.0 channel surround sound audio mixes to 2 channel stereo. This operational practice recommends a number
Loudness and Dynamic Range
Loudness and Dynamic Range in broadcast audio the Dolby solution Tony Spath Dolby Laboratories, Inc. Digital delivery media offer a wider dynamic range for audio than their analogue predecessors. This
Convention Paper 7896
Audio Engineering Society Convention Paper 7896 Presented at the 127th Convention 2009 October 9 12 New York, NY, USA The papers at this Convention have been selected on the basis of a submitted abstract
RECOMMENDATION ITU-R BR.1384-1*, ** Parameters for international exchange of multi-channel sound recordings with or without accompanying picture ***
Rec. ITU-R BR.1384-1 1 RECOMMENDATION ITU-R BR.1384-1*, ** Parameters for international exchange of multi-channel sound recordings with or without accompanying picture *** (Question ITU-R 58/6) (1998-2005)
Multichannel stereophonic sound system with and without accompanying picture
Recommendation ITU-R BS.775-2 (07/2006) Multichannel stereophonic sound system with and without accompanying picture BS Series Broadcasting service (sound) ii Rec. ITU-R BS.775-2 Foreword The role of the
5.1 audio. How to get on-air with. Broadcasting in stereo. the Dolby "5.1 Cookbook" for broadcasters. Tony Spath Dolby Laboratories, Inc.
5.1 audio How to get on-air with the Dolby "5.1 Cookbook" for broadcasters Tony Spath Dolby Laboratories, Inc. This article is aimed at television broadcasters who want to go on-air with multichannel audio
TECHNICAL OPERATING SPECIFICATIONS
TECHNICAL OPERATING SPECIFICATIONS For Local Independent Program Submission September 2011 1. SCOPE AND PURPOSE This TOS provides standards for producing programs of a consistently high technical quality
Engineering Bulletin
Level Magic Process Control Level ITU-BS.1770-1 (A/85:2011) The Junger Audio proprietary, level based process. The aim is to maintain a desired operating level. Curves and algorithms are the intellectual
Technical Paper. Dolby Digital Plus Audio Coding
Technical Paper Dolby Digital Plus Audio Coding Dolby Digital Plus is an advanced, more capable digital audio codec based on the Dolby Digital (AC-3) system that was introduced first for use on 35 mm theatrical
LOUDNESS MONITORING AND CALM COMPLIANCE (UNFOLDING EVENTS) Jim Welch. IneoQuest Technologies
LOUDNESS MONITORING AND CALM COMPLIANCE (UNFOLDING EVENTS) Jim Welch IneoQuest Technologies Topics What s the issue? What s the history? What are the new rules? What are the key concepts? What are the
How To Control Loudness On A Tv Or Radio
ATSC Recommended Practice: Techniques for Establishing and Maintaining Audio Loudness for Digital Television (A/85:2013) Doc. A/85:2013 12 March 2013 Advanced Television Systems Committee 1776 K Street,
ATSC Recommended Practice: Techniques for Establishing and Maintaining Audio Loudness for Digital Television
ATSC Recommended Practice: Techniques for Establishing and Maintaining Audio Loudness for Digital Television Document A/85:2009, 4 November 2009 Advanced Television Systems Committee, Inc. 1776 K Street,
Multichannel Audio Line-up Tones
EBU Tech 3304 Multichannel Audio Line-up Tones Status: Technical Document Geneva May 2009 Tech 3304 Multichannel audio line-up tones Contents 1. Introduction... 5 2. Existing line-up tone arrangements
Loudness Normalization: The Future of File-Based Playback
Page 1 Loudness Normalization: The Future of File-Based Playback Music Loudness Alliance 1 http://music-loudness.com Version 1.0, July 2012 Today, playback on portable devices dominates how music is enjoyed.
R 128 LOUDNESS NORMALISATION AND PERMITTED MAXIMUM LEVEL OF AUDIO SIGNALS
R 128 LOUDNESS NORMALISATION AND PERMITTED MAXIMUM LEVEL OF AUDIO SIGNALS Status: EBU Recommendation Geneva June 2014 Page intentionally left blank EBU R 128-2014 Audio loudness normalisation & permitted
Dolby DP600 and DP600-C Program Optimizer Overview for Postproduction Facilities
Dolby DP600 and DP600-C Program Optimizer Overview for Postproduction Facilities The Dolby DP600 Program Optimizer is an innovative audio platform that provides a file-based work-flow solution for loudness
MPEG-H Audio System for Broadcasting
MPEG-H Audio System for Broadcasting ITU-R Workshop Topics on the Future of Audio in Broadcasting Jan Plogsties Challenges of a Changing Landscape Immersion Compelling sound experience through sound that
Direct Digital Amplification (DDX ) The Evolution of Digital Amplification
Direct Digital Amplification (DDX ) The Evolution of Digital Amplification Tempo Semiconductor 2013 Table of Contents Table of Contents... 2 Table of Figures... 2 1. DDX Technology Overview... 3 2. Comparison
Dolby DP600 and DP600-C Program Optimizer Overview for Cable and IPTV Operators
Dolby DP600 and DP600-C Program Optimizer Overview for Cable and IPTV Operators The Dolby DP600 Program Optimizer is an innovative audio platform that provides a file-based work-flow solution for loudness
BASS MANAGEMENT SYSTEM
E BASS MANAGEMENT SYSTEM 1 What is Bass Management? In stereo reproduction, signals from 20Hz to 20kHz need to be replayed. Multi-way speaker systems will reproduce that bandwidth evenly. With multichannel
Current Status and Problems in Mastering of Sound Volume in TV News and Commercials
Current Status and Problems in Mastering of Sound Volume in TV News and Commercials Doo-Heon Kyon, Myung-Sook Kim and Myung-Jin Bae Electronics Engineering Department, Soongsil University, Korea [email protected],
high-quality surround sound at stereo bit-rates
FRAUNHOFER Institute For integrated circuits IIS MPEG Surround high-quality surround sound at stereo bit-rates Benefits exciting new next generation services MPEG Surround enables new services such as
DAB + The additional audio codec in DAB
DAB + The additional audio codec in DAB 2007 Contents Why DAB + Features of DAB + Possible scenarios with DAB + Comparison of DAB + and DMB for radio services Performance of DAB + Status of standardisation
Introduction and Comparison of Common Videoconferencing Audio Protocols I. Digital Audio Principles
Introduction and Comparison of Common Videoconferencing Audio Protocols I. Digital Audio Principles Sound is an energy wave with frequency and amplitude. Frequency maps the axis of time, and amplitude
FREQUENTLY ASKED QUESTIONS ABOUT DOLBY ATMOS FOR THE HOME
FREQUENTLY ASKED QUESTIONS ABOUT DOLBY ATMOS FOR THE HOME August 2014 Q: What is Dolby Atmos? Dolby Atmos is a revolutionary new audio technology that transports you into extraordinary entertainment experiences.
Creating Content for ipod + itunes
apple Apple Education Creating Content for ipod + itunes This guide provides information about the file formats you can use when creating content compatible with itunes and ipod. This guide also covers
Convention Paper Presented at the 112th Convention 2002 May 10 13 Munich, Germany
Audio Engineering Society Convention Paper Presented at the 112th Convention 2002 May 10 13 Munich, Germany This convention paper has been reproduced from the author's advance manuscript, without editing,
Overview. Dolby PC Entertainment Experience v4
Dolby PC Entertainment Experience v4 Overview From humble beginnings as a productivity device, the personal computer has evolved into an entertainment hub for the digital lifestyle. People are increasingly
TV 2 AS HD DELIVERY FOR TV 2 AS
TV 2 AS HD DELIVERY FOR TV 2 AS Version 1.0 Contents 1. Delivery Format... 4 2. Standard Definition... 4 2.1. Standard definition... 4 3. Video Standards... 4 4. Action and Caption safe areas... 5 5. Audio...
Loudness. The second line of defence. don t forget the distribution chain! Richard van Everdingen Delta Sigma Consultancy
Loudness don t forget the distribution chain! Richard van Everdingen Delta Sigma Consultancy EBU Recommendation 128 is a milestone in the history of audio broadcasting. It started a loudness revolution
Multichannel stereophonic sound system with and without accompanying picture
Recommendation ITU-R BS.775-3 (08/2012) Multichannel stereophonic sound system with and without accompanying picture BS Series Broadcasting service (sound) ii Rec. ITU-R BS.775-3 Foreword The role of the
Audio Engineering Society. Convention Paper. Presented at the 127th Convention 2009 October 9 12 New York, NY, USA
Audio Engineering Society Convention Paper Presented at the 127th Convention 2009 October 9 12 New York, NY, USA The papers at this Convention have been selected on the basis of a submitted abstract and
Operation Manual. LM-Correct. Loudness Measurement & Correction. 2015 NUGEN Audio
Operation Manual LM-Correct Loudness Measurement & Correction 2015 NUGEN Audio Contents Page Introduction 3 Interface 4 General Layout 4 Parameter description 6 Practical operation (quick start guide)
RECORDING AND CAPTURING SOUND
12 RECORDING AND CAPTURING SOUND 12.1 INTRODUCTION Recording and capturing sound is a complex process with a lot of considerations to be made prior to the recording itself. For example, there is the need
DTS-HD Audio. Consumer White Paper for Blu-ray Disc and HD DVD Applications
DTS-HD Audio Consumer White Paper for Blu-ray Disc and HD DVD Applications Summary and overview of DTS-HD Audio: data rates, sampling frequencies, speaker configurations, low bit rate and lossless extensions
Audio Loudness Normalization and Measurement of Broadcast Programs
Audio Loudness Normalization and Measurement of Broadcast Programs Martin Simaliak, Miroslav Uhrina, Jan Hlubik, Martin Vaculik, Martin Cap Department of Telecommunication and Multimedia Faculty of Electrical
How To Test Video Quality With Real Time Monitor
White Paper Real Time Monitoring Explained Video Clarity, Inc. 1566 La Pradera Dr Campbell, CA 95008 www.videoclarity.com 408-379-6952 Version 1.0 A Video Clarity White Paper page 1 of 7 Real Time Monitor
Receiver Customization
9242_13_Ch11_eng 6/11/07 9:36 AM Page 1 Receiver Customization PERSONALIZING YOUR SATELLITE RECEIVER Take a look through this chapter and you ll find out how to change settings on the receiver to make
Digital terrestrial television broadcasting Audio coding
Digital terrestrial television broadcasting Audio coding Televisão digital terrestre Codificação de vídeo, áudio e multiplexação Parte 2: Codificação de áudio Televisión digital terrestre Codificación
COMMON SURROUND SOUND FORMATS
COMMON SURROUND SOUND FORMATS The range of decoding formats used to deliver the theatre surround sound experience include: Dolby Pro Logic: The Longstanding stereo matrix encoded format. Extracting four
Audio and Video Synchronization:
White Paper Audio and Video Synchronization: Defining the Problem and Implementing Solutions Linear Acoustic Inc. www.linearacaoustic.com 2004 Linear Acoustic Inc Rev. 1. Introduction With the introduction
UNIVERSITY OF CALICUT
UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION BMMC (2011 Admission) V SEMESTER CORE COURSE AUDIO RECORDING & EDITING QUESTION BANK 1. Sound measurement a) Decibel b) frequency c) Wave 2. Acoustics
DeNoiser Plug-In. for USER S MANUAL
DeNoiser Plug-In for USER S MANUAL 2001 Algorithmix All rights reserved Algorithmix DeNoiser User s Manual MT Version 1.1 7/2001 De-NOISER MANUAL CONTENTS INTRODUCTION TO NOISE REMOVAL...2 Encode/Decode
Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.
Broadband Networks Prof. Dr. Abhay Karandikar Electrical Engineering Department Indian Institute of Technology, Bombay Lecture - 29 Voice over IP So, today we will discuss about voice over IP and internet
A New Multichannel Sound Production System for Digital Television
A New Multichannel Sound Production System for Digital Television Regis R. A. FARIA and Paulo GONÇALVES Abstract As digital television incorporates 3D technologies, the demand for real high-resolution
Frequently asked QUESTIONS. about DOLBY DIGITAL
Frequently asked QUESTIONS about DOLBY DIGITAL Table of Contents 1. What is Dolby Digital?... 1 2. What program sources deliver Dolby Digital audio?... 1 3. Can I hear Dolby Digital programs over a regular
High Quality Podcast Recording
High Quality Podcast Recording Version 1.2, August 25, 2013 Markus Völter ([email protected]) Abstract A good podcast comes with great content as well as great audio quality. Audio quality is as important
RECOMMENDATION ITU-R BS.644-1 *,** Audio quality parameters for the performance of a high-quality sound-programme transmission chain
Rec. ITU-R BS.644-1 1 RECOMMENDATION ITU-R BS.644-1 *,** Audio quality parameters for the performance of a high-quality sound-programme transmission chain (1986-1990) The ITU Radiocommunication Assembly,
The recording angle based on localisation curves
Audio Engineering Society Convention Paper Presented at the 112th Convention 2002 May 10 13 Munich, Germany This convention paper has been reproduced from the author's advance manuscript, without editing,
Receiver Customization
11 Receiver Customization PERSONALIZING YOUR SATELLITE RECEIVER Use the information in this chapter to customize your receiver. USING SHARED VIEW USING CALLER ID CHANGING LANGUAGES USING CLOSED CAPTIONING
WAVES WLM LOUDNESS METER USER GUIDE
WAVES WLM LOUDNESS METER USER GUIDE TABLE OF CONTENTS CHAPTER 1 INTRODUCTION... 3 1.1 WELCOME... 3 1.2 PRODUCT OVERVIEW... 3 1.3 CONCEPTS AND TERMINOLOGY... 3 1.4 COMPONENTS... 5 CHAPTER 2 USING THE WLM...
High Dynamic Range Video The Future of TV Viewing Experience
The Future of TV Viewing Experience - White Paper - www.keepixo.com Introduction The video industry has always worked to improve the TV viewing experience. More than a decade ago, the transition from SD
Hooking Up Home Theater
Home Support Technical Articles Hooking Up Home Theatre What is Home Theater? Hooking Up Home Theater Since the mid- to late-1990s, home theater systems have rapidly grown in popularity, as consumers have
Convention Paper Presented at the 118th Convention 2005 May 28 31 Barcelona, Spain
Audio Engineering Society Convention Paper Presented at the 118th Convention 25 May 28 31 Barcelona, Spain 6431 This convention paper has been reproduced from the author s advance manuscript, without editing,
Datasheet EdgeVision
Datasheet Multichannel Quality of Experience Monitoring Stay in control with customizable monitoring and interfaces. offers richly featured, Quality of Experience (QoE) monitoring across an entire network
Tutorial about the VQR (Voice Quality Restoration) technology
Tutorial about the VQR (Voice Quality Restoration) technology Ing Oscar Bonello, Solidyne Fellow Audio Engineering Society, USA INTRODUCTION Telephone communications are the most widespread form of transport
Red Bee Media. Technical file and tape delivery specification for Commercials. Applies to all channels transmitted by Red Bee Media
Red Bee Media Technical file and tape delivery specification for Commercials Applies to all channels transmitted by Red Bee Media Red Bee Media Rev. November 2010 Introduction Commercial copy is delivered
DTS Enhance : Smart EQ and Bandwidth Extension Brings Audio to Life
DTS Enhance : Smart EQ and Bandwidth Extension Brings Audio to Life White Paper Document No. 9302K05100 Revision A Effective Date: May 2011 DTS, Inc. 5220 Las Virgenes Road Calabasas, CA 91302 USA www.dts.com
Preservation Handbook
Preservation Handbook Digital Audio Author Gareth Knight & John McHugh Version 1 Date 25 July 2005 Change History Page 1 of 8 Definition Sound in its original state is a series of air vibrations (compressions
User Manual. Please read this manual carefully before using the Phoenix Octopus
User Manual Please read this manual carefully before using the Phoenix Octopus For additional help and updates, refer to our website To contact Phoenix Audio for support, please send a detailed e-mail
Doppler Effect Plug-in in Music Production and Engineering
, pp.287-292 http://dx.doi.org/10.14257/ijmue.2014.9.8.26 Doppler Effect Plug-in in Music Production and Engineering Yoemun Yun Department of Applied Music, Chungwoon University San 29, Namjang-ri, Hongseong,
Audio Coding Algorithm for One-Segment Broadcasting
Audio Coding Algorithm for One-Segment Broadcasting V Masanao Suzuki V Yasuji Ota V Takashi Itoh (Manuscript received November 29, 2007) With the recent progress in coding technologies, a more efficient
Home Theater. Diagram 1:
Home Theater What is Home Theater? Since the mid- to late-1990s, home theater systems have rapidly grown in popularity, as consumers have looked for ways to enjoy movies at home the same way they do in
Applications that Benefit from IPv6
Applications that Benefit from IPv6 Lawrence E. Hughes Chairman and CTO InfoWeapons, Inc. Relevant Characteristics of IPv6 Larger address space, flat address space restored Integrated support for Multicast,
Audio/Video Synchronization Standards and Solutions A Status Report. Patrick Waddell/Graham Jones/Adam Goldberg
/Video Synchronization Standards and Solutions A Status Report Patrick Waddell/Graham Jones/Adam Goldberg ITU-R BT.1359-1 (1998) Only International Standard on A/V Sync Subjective study with EXPERT viewers
MXF for Program Contribution, AS-11 AMWA White Paper
for Program Contribution, AS-11 AMWA White Paper By Ian Wimsett, Red Bee Media for Program Contribution, AS-11, is a specification that describes a file format for the delivery of finished programming
Digital Audio Compression: Why, What, and How
Digital Audio Compression: Why, What, and How An Absurdly Short Course Jeff Bier Berkeley Design Technology, Inc. 2000 BDTI 1 Outline Why Compress? What is Audio Compression? How Does it Work? Conclusions
GUIDELINES FOR THE CREATION OF DIGITAL COLLECTIONS
GUIDELINES FOR THE CREATION OF DIGITAL COLLECTIONS Digitization Best Practices for Audio This document sets forth guidelines for digitizing audio materials for CARLI Digital Collections. The issues described
Specification of the Broadcast Wave Format (BWF)
EBU TECH 3285 Specification of the Broadcast Wave Format (BWF) A format for audio data files in broadcasting Version 2.0 Geneva May 2011 1 * Page intentionally left blank. This document is paginated for
White paper. H.264 video compression standard. New possibilities within video surveillance.
White paper H.264 video compression standard. New possibilities within video surveillance. Table of contents 1. Introduction 3 2. Development of H.264 3 3. How video compression works 4 4. H.264 profiles
English Manual. LM2n / LM6n
English Manual LM2n / LM6n Introduction 1 About this manual 2 Getting support 2 System requirements and installation 3 System requirements 4 Hosts 4 Installation 4 Using LM2n / LM6n 8 LM2n / LM6n features
RECOMMENDATION ITU-R BO.786 *
Rec. ITU-R BO.786 RECOMMENDATION ITU-R BO.786 * MUSE ** system for HDTV broadcasting-satellite services (Question ITU-R /) (992) The ITU Radiocommunication Assembly, considering a) that the MUSE system
Important HP Media Center PC Updates
Important HP Media Center PC Updates Your system uses Microsoft Windows XP Media Center Edition 2005. Before starting the system and using the Media Center setup wizard, please read this updated information
Sound System Buying Guide
Sound System Buying Guide The Guide format is like an FAQ (frequently asked questions), with the answers taken together forming a great basis for picking your favorite components. With this information,
Understanding the DriveRack PA. The diagram below shows the DriveRack PA in the signal chain of a typical sound system.
Understanding the DriveRack PA The diagram below shows the DriveRack PA in the signal chain of a typical sound system. The diagram below shows all the pieces within a sound system that the DriveRack PA
bel canto The Computer as High Quality Audio Source A Primer
The Computer as High Quality Audio Source A Primer These are exciting times of change in the music world. With CD sales dropping and downloaded MP3 music, streamed music and digital music players taking
Open Source Software for loudness measurement
RMLL 2013 July 10, 2013 Welcome! What is loudness measurement? Projects Manager at France Télévisions, the french public television broadcaster. Working on loudness since 2011 (E.B.U PLOUD, French Working
Polycom Video Communications
Polycom Video Communications Advanced Audio Technology for Video Conferencing Author: Peter L. Chu September 2004 Connect. Any Way You Want. If you ask frequent users of video conferencing technology what
Dirac Live & the RS20i
Dirac Live & the RS20i Dirac Research has worked for many years fine-tuning digital sound optimization and room correction. Today, the technology is available to the high-end consumer audio market with
Figure1. Acoustic feedback in packet based video conferencing system
Real-Time Howling Detection for Hands-Free Video Conferencing System Mi Suk Lee and Do Young Kim Future Internet Research Department ETRI, Daejeon, Korea {lms, dyk}@etri.re.kr Abstract: This paper presents
Case Study: Real-Time Video Quality Monitoring Explored
1566 La Pradera Dr Campbell, CA 95008 www.videoclarity.com 408-379-6952 Case Study: Real-Time Video Quality Monitoring Explored Bill Reckwerdt, CTO Video Clarity, Inc. Version 1.0 A Video Clarity Case
WHITE PAPER. Overview...2. From the Studio to the Palm of Your Hand...4. Improving Content Quality from Any Source...4
WHITE PAPER Dolby Digital Plus for Mobile Devices Overview...2 From the Studio to the Palm of Your Hand...4 Improving Content Quality from Any Source...4 Volume Leveler... 4 Intelligent Equalizer... 5
MAPS - Bass Manager. TMH Corporation 2500 Wilshire Blvd., Suite 750 Los Angeles, CA 90057 ph: 213.201.0030 fx: 213.201.0031 www.tmhlabs.
MAPS - Bass Manager TMH Corporation 2500 Wilshire Blvd., Suite 750 Los Angeles, CA 90057 ph: 213.201.0030 fx: 213.201.0031 www.tmhlabs.com Thankyou for purchasing the TMH Bass Manager. By this purchase
Conference interpreting with information and communication technologies experiences from the European Commission DG Interpretation
Jose Esteban Causo, European Commission Conference interpreting with information and communication technologies experiences from the European Commission DG Interpretation 1 Introduction In the European
Mid-Range Complete Radio Station Package
Equipment, Training and Technical Services for Community Radio www.radioactive.org.uk [email protected] Tel: +44 (0) 207 3723815 Fax: +44 (0) 20 89927115 Mid-Range Complete Radio Station Package
RECOMMENDATION ITU-R BS.704 *, ** Characteristics of FM sound broadcasting reference receivers for planning purposes
Rec. ITU-R BS.704 1 RECOMMENDATION ITU-R BS.704 *, ** Characteristics of FM sound broadcasting reference receivers for planning purposes (1990) The ITU Radiocommunication Assembly, considering a) that
Preserving music from LPs and tapes
Preserving music from LPs and tapes Roxio Edit Audio Tags Assistant Roxio LP and Tape Assistant makes it easy to connect your LP or tape player to your computer, record music from these analog sources
Target Loudness Level and Dynamic Range
Gerhard Spikofski GmbH Research and Development Institute of ARD, ZDF, Dradio, ORF and SRG/SSR Target Loudness Level and Dynamic Range 3rd International VDT Symposium An Eye on Hearing Schloss Hohenkammer
DELIVERING CAPTIONS IN DTV An NCAM DTV Access Brief
NOTICE: The following information is offered by NCAM solely as a general overview of the current status of closed captioning support in digital television. Features and capabilities of related systems
ATSC Standard: 3D-TV Terrestrial Broadcasting, Part 2 Service Compatible Hybrid Coding Using Real-Time Delivery
ATSC Standard: 3D-TV Terrestrial Broadcasting, Part 2 Service Compatible Hybrid Coding Using Real-Time Delivery Doc. A/104 Part 2 26 December 2012 Advanced Television Systems Committee 1776 K Street, N.W.
A Low-Cost, Single Coupling Capacitor Configuration for Stereo Headphone Amplifiers
Application Report SLOA043 - December 1999 A Low-Cost, Single Coupling Capacitor Configuration for Stereo Headphone Amplifiers Shawn Workman AAP Precision Analog ABSTRACT This application report compares
AIR DRM. DRM+ Showcase
AIR DRM DRM+ Showcase New Delhi, 23 rd 27 th May 2011 The Report was prepared by Digital Radio Mondiale Consortium (DRM) & All India Radio (AIR) Version: 2011-07-26 Index: 1. Introduction 2. DRM+ System
Monitoring Surround-Sound Audio
Monitoring Surround-Sound Audio Introduction A picture may say a thousand words, but without audio to accompany the picture the impact of the material is subdued. In quality test it has been found that
QUALITY AV PRODUCTS INMATE/INMATE USB PROFESSIONAL 19" MIXER. User Guide and Reference Manual
INMATE/INMATE USB PROFESSIONAL " MIXER User Guide and Reference Manual INTRODUCTION Welcome to the NEWHANK INMATE and INMATE USB professional " mixers series user manual. INMATE and INMATE USB both offer
AUDIO/VIDEO MULTI-CHANNEL RECEIVER VSX-D411 VSX-D511
AUDIO/VIDEO MULTI-CHANNEL RECEIVER VSX-D411 VSX-D511 Operating Instructions Thank you for buying this Pioneer product. Please read through these operating instructions so you will know how to operate your
2015 Online Mastering, written by Holger Lagerfeldt. No part of this text may be reproduced without permission.
Mixdown for Mastering Tips 2015 Online Mastering, written by Holger Lagerfeldt. No part of this text may be reproduced without permission. www.onlinemastering.dk/mastering-faq.html Plug-ins on the Master
