Loudness in mastering - part 2

Continued from Part 1: https://www.alessandrofois.com/loudness-nel-mastering-parte-1-dinamica/


To keep background noise and other artefacts inherent in recording media (e.g. the hiss of analogue tape) from intruding on the music, engineers sought:

  • to keep the maximum peak of the recording at as high a level as possible, but below the point of distortion
  • to compress the 'useful dynamic space' into a relatively narrow range, functional for various types of use but still wide enough to reproduce the dynamic expressiveness of music decently

In the following years, particularly in pop, the production industry gradually reduced the dynamic space, compressing it more and more in order to raise the volume of the quietest moments of the performance, until the dynamic space used was reduced to a few dB.

As we shall see, the phenomenon has seriously accelerated with the advent of digital media.

Over the course of some 20 years (from the 1990s to the 2010s), the need to compress music to make listening more convenient gradually turned into an unbridled race for perceived volume.

The aim, encouraged by producers, was to outdo the sonic impact of competing music productions in loudness, which triggered a real volume war, known as the 'Loudness War'.

Loudness war

The term loudness war therefore refers to the tendency of the music industry, fuelled by artists and producers, to produce and release music at high levels of loudness, which became higher and higher year after year, in a continuous attempt to outdo in volume the productions released by 'competing' artists and record labels.

The introduction of digital signal processors and of better-quality, extremely precise limiters allowed sound engineers to increase the perceived volume of a recording significantly; and since 'louder' was generally perceived by users as 'better', sound engineers, under 'pressure' from producers, tried to 'push' the volume as far as possible, which led the record industry into the volume war.

Many people working in music, especially sound engineers and artists, believe that this trend has sacrificed sound quality and dynamic expression in order to obtain high volume levels on the audio medium.

Above are the waveforms of a song released in 1980 and remastered several times in later editions, with loudness increasing in 2001, peaking at the height of the loudness war in 2005, and returning to a high but more moderate level in 2011.

The analogue era

This procedure was not used before the advent of digital, partly because of the physical limitations inherent in the mechanical cutting of vinyl.

To tell the truth, even with vinyl some records sounded louder than others, owing to the natural dynamics of the various musical genres and to the different mastering techniques then in use, but each disc was 'an island unto itself', and the only perceived need was to:

  • ensure that all tracks on the same album were proportionate in volume to each other, staying within a dynamic space above the background noise level, never distorting on the highest peaks, and keeping that dynamic space wide enough for correct dynamic and musical expression.

In the case of 'compilation' discs (with already released tracks from different albums and sometimes by different artists), a remastering was carried out in order to level the perceived volume and tonal balance of the various tracks, giving the contents of the disc greater 'homogeneity'.

In the analogue era, those who wished to listen 'louder' could simply turn up the amplifier of their playback system, adjusting the listening volume each time a record was changed on the turntable to suit their needs at that moment.

The only limitation was determined by the power of the amplification and the mechanical resistance of the reproduction system's loudspeakers.

With the advent of audio cassettes, usage did not change substantially: the volume knob still left the individual end user the task of setting the sound intensity according to preference.

The digital age

For a time, CD listening followed substantially the same routine, and this continued for much of the 1980s, a decade nevertheless characterised by a significant, though progressive and moderate, increase in loudness.

The 'Volume War' proper would seem to have begun in the 1990s, with the spread of multi-disc compact disc players mounted in cars, which made it possible to switch from one track to another by 'jumping' from one CD to another; this mode of use highlighted the differences in volume between one disc and another.

Specifically, the race for volume became more pronounced when producers realised that users of multi-disc CD players often listened in a 'free' order, moving from a track on one CD to a track on another, and were therefore forced to 'adjust' the listening volume continually, which was particularly unpleasant while driving.

If the user did not change the listening volume, tracks with a larger dynamic range (which were perceived at a lower average volume) were penalised, appearing more 'sparse' in comparison with others that sounded louder.

This realisation was the decisive trigger that drove various producers and artists to demand a solution from sound engineers, namely exaggerated compression of the master, pushed to ever higher levels.

The phenomenon continued with the advent of portable players and USB sticks, becoming unsustainable within a few years and provoking the complaints of many sound engineers and artists, who urged the identification of a reference standard, capable of respecting sound quality, music and its dynamics a little more.

Audio Consequences

Since the sound level of an audio file cannot exceed a certain limit (the digital 0 dB, i.e. 0 dBFS), the overall volume can only be increased by reducing the dynamic range and then 'normalising the level of the track' (bringing the maximum peak to the upper limit of the digital scale, i.e. close to 0 dB).

This was achieved by compressing the dynamics in an increasingly extreme manner, with the result of progressively compromising the peaks, causing audible distortions of various kinds and the almost total loss of expressive dynamic modulation.
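
The trade-off described above can be illustrated with a minimal Python sketch. It is not a real compressor or limiter: `hard_limit` is a crude, hypothetical stand-in that simply flattens peaks, and the test signal is invented for the example. It shows that normalising alone leaves the average (RMS) level far below the peak, while shaving the peaks first lets the same 0 dB ceiling carry a much higher average level.

```python
import math

def peak_db(samples):
    """Sample peak in dBFS (0 dBFS = full scale, |x| = 1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """RMS level in dBFS."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def normalise(samples, target_peak_db=0.0):
    """Scale the whole signal so that its highest peak sits at target_peak_db."""
    gain = 10 ** ((target_peak_db - peak_db(samples)) / 20)
    return [s * gain for s in samples]

def hard_limit(samples, ceiling):
    """Crude stand-in for a limiter: flatten everything above the ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

# Invented test signal: one second of a quiet 100 Hz tone with a single loud peak.
rate = 8000
signal = [0.1 * math.sin(2 * math.pi * 100 * n / rate) for n in range(rate)]
signal[4000] = 0.9  # isolated transient

plain = normalise(signal)                      # peak to 0 dBFS, dynamics untouched
squashed = normalise(hard_limit(signal, 0.1))  # peak shaved first, then normalised

print(f"plain:    peak {peak_db(plain):6.2f} dBFS, RMS {rms_db(plain):6.2f} dBFS")
print(f"squashed: peak {peak_db(squashed):6.2f} dBFS, RMS {rms_db(squashed):6.2f} dBFS")
```

With this invented signal both versions peak at 0 dBFS, but the version whose peak was shaved before normalising has an RMS level about 19 dB higher, at the price of a flattened transient.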

Negative effects

  • Music with a reduced dynamic range was stressful and not very expressive
  • The excessive shaving of the peaks produced many 'noise points', which became denser and more audible the higher the compression; in the worst cases, it was as if a continuous metallic background noise, similar to 'white noise', had been created

Concrete positive effects

  • Increased usability of sound content when listening in noisy places

N.B.

For years, sound engineers have been forced to 'grasp at straws' in order to comply with the demands of clients. 

In order to minimise the damage induced by exaggerated compression, they have thus learnt to optimise processes as much as possible, including by means of:

  • multi-band compression
  • 'step-by-step' automation of compression values
  • analogue and valve compression techniques (or digital ones with analogue emulation) to create more harmonic 'saturation walls'

But even so, 'sound monsters' have been produced that in the opinion of many are intolerable.

The solutions

Eager to end the volume war, in the late 1990s the sound engineer Bob Katz developed a criterion called K-System.

K-System

The K-System (named after Bob Katz) is a protocol for metering mixes and calibrating monitor levels in an audio studio.

Although loudness standards such as EBU R128, which as we shall see use a scale in LUFS, are more widely used nowadays, the K-System, which uses an RMS scale in dB, is still a good way to set audio levels.

This system uses three differentiated standards, known as K-20, K-14 and K-12.

These numbers express in dB the headroom left above the meter's 0 dB reference (an RMS level), so that at each step (from K-20 to K-12) the available dynamic range decreases and the loudness (understood as the perceived average volume) increases.

The label at the top of the meter scale should indicate the maximum expected level for the chosen target (20 dB, 14 dB or 12 dB) and, just as with normal metering, corresponds to the digital full-scale signal.

In order to work properly, the system requires the monitor listening level to be carefully calibrated, so that a signal reading 0 dB on the meter is reproduced at 85 dB SPL.

This is the ideal reference condition for mixing and mastering at K-20, K-14 and K-12.

K-System meters show the peak level and the RMS level simultaneously.

The upper red part of the meters is the zone of maximum intensity.

In music recording, the RMS level should only reach the red zone in the most intense passages, at occasional peaks. 

In fact, according to the averaged results of Katz's own tests with a sample of users, if you find yourself in the red zone all the time, you may feel the need to turn down the monitor gain.

Here are some details of the 3 measurements:

K-12

This level was designed exclusively for radio broadcasting.

This results in -12 dBFS = 0 VU = 85 dB SPL

The headroom limited to only 12 dB explains its exclusive use for compressed audio material intended only for transmission over the air (although it was later also used to finalise the most demanding music genres, such as dance, especially electronic, and a certain type of pop music).

K-14

This was to be the standard for most commercial pop recordings, created for home and private listening in general

Pop music mixes are examples of material suitable for the K-14, where -14 dBFS = 0 VU = 85 dB SPL.

The headroom margin is 14 dB

The K-14 scale was probably the most widely used of the three standards

K-20

This scale offers the widest dynamic range available among the three systems.

It was primarily designed for large theatre mixes, dynamic music mixes, cinema, television broadcasting, classical and traditional style mixes

Any audio programme with a wide dynamic range should have been aligned to the K-20 standard

This results in -20 dBFS = 0 VU = 85 dB SPL, with headroom of 20 dB 

Schematic representation of the loudness scale devised by Bob Katz. A good alternative reference to counter the loudness war, later supplanted by LUFS measurement following broadcasting regulations and the advent of streaming platforms. The 3 highest points of the red zones (shown here in dark grey) are aligned with the 0 dB level of the digital scale.

To put it briefly, the aim was to establish a reference dynamic range for the various types of listening. For years, a few sound engineers (very few, in truth) aligned themselves with the criteria proposed by Katz, while the majority, pressured by producers, continued to operate with loudness 'at full throttle'.
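
As a rough illustration of how the three scales relate to the digital scale, the following Python sketch converts an RMS level in dBFS into the reading on each K-System meter. The function names are hypothetical, and the +4 dB threshold used for the red zone is an assumption made for the example, not a value given in this article.

```python
# The three K-System scales and their headroom in dB, as listed above.
K_HEADROOM_DB = {"K-12": 12, "K-14": 14, "K-20": 20}

def k_meter_reading(rms_dbfs, scale):
    """Convert an RMS level in dBFS to the reading on a K-System meter.

    The meter's 0 dB mark sits `headroom` dB below digital full scale,
    so a signal at -14 dBFS RMS reads 0 dB on a K-14 meter.
    """
    return rms_dbfs + K_HEADROOM_DB[scale]

def in_red_zone(rms_dbfs, scale, red_from_db=4.0):
    """True if the RMS reading falls in the upper (red) zone of the meter.

    red_from_db is an assumed threshold, chosen only for illustration.
    """
    return k_meter_reading(rms_dbfs, scale) >= red_from_db

# The same master, averaging -14 dBFS RMS, read on the three scales.
for scale in ("K-12", "K-14", "K-20"):
    reading = k_meter_reading(-14.0, scale)
    print(f"{scale}: {reading:+.1f} dB, red zone: {in_red_zone(-14.0, scale)}")
```

The same master that sits exactly on the 0 dB mark of a K-14 meter reads 6 dB over the mark on a K-20 meter, which is why material finalised for one scale looks 'too hot' on a scale with more headroom.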

LUFS

Meanwhile, from 2006 onwards, the ITU and the EBU progressively developed a protocol aimed at curbing the loudness war, finally defining a measurement standard with its own units of measurement, which would allow the audio signal to be analysed in the best possible way, interpreting it mainly in the perceptual domain, in order to produce masters with standardised characteristics.

The unit of measurement in question was called LKFS, and was then renamed LUFS by the European Broadcasting Union (EBU) in Recommendation EBU R128 (first published in 2010 and revised several times since, including in 2014).

This current measurement system analyses an audio file no longer on the basis of the RMS scale but with a different protocol, whose measurement criterion is at its base very similar to RMS, with added variables that take into account the psycho-acoustic perception of the average listener.

The acronym LUFS stands for 'Loudness Units relative to Full Scale'.

This was originally a volume standard designed to allow normalisation of audio levels for television broadcasting.

LUFS is standardised in a set of algorithms aimed at measuring the loudness of the audio programme and its 'true peak' level (for more details, read ITU-R BS.1770 and the subsequent revisions introduced between 2011 and 2015).

LUFS values are measured on an absolute scale, and a difference of one loudness unit (LU) corresponds to one decibel (dB).

The fledgling system was perfected in the following years, and the standard of:

-23 LUFS (EBU)

imposed itself in broadcasting contexts, also involving (to some extent) the film industry.

The level of:

-1 dBTP

instead became the standard for the maximum peak of the audio programme, thus providing ample margins to prevent any risk of clipping.
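
Because one LU equals one dB, checking a programme against these two values is simple arithmetic. The sketch below is only an illustration with hypothetical function and parameter names; it assumes that the integrated loudness and true peak have already been measured by a compliant meter, and that a plain gain change shifts both by the same number of dB.

```python
def normalise_for_broadcast(measured_lufs, true_peak_dbtp,
                            target_lufs=-23.0, max_true_peak_dbtp=-1.0):
    """Return (gain in dB, true peak after gain, whether the peak limit is met)."""
    gain_db = target_lufs - measured_lufs   # 1 LU of loudness = 1 dB of gain
    new_peak = true_peak_dbtp + gain_db     # a plain gain moves the peak equally
    return gain_db, new_peak, new_peak <= max_true_peak_dbtp

# A loud master only needs to be turned down.
print(normalise_for_broadcast(-18.0, -0.5))   # gain -5 dB, peak ends at -5.5 dBTP

# A quiet, very dynamic programme cannot simply be turned up:
# its peak would end above -1 dBTP, so limiting would be needed first.
print(normalise_for_broadcast(-27.0, -3.0))   # gain +4 dB, peak ends at +1 dBTP
```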

Soon the standard became legally binding, but only for broadcasting operators.

The record industry, instead, responded by turning 'a deaf ear', as no producer wanted to release records that sounded 'quieter' than those of competitors.

The streaming revolution

A new element was therefore needed, decisive enough to dissuade the record industry from continuing with the loudness war.

The opportunity came with the spread of streaming platforms, a phenomenon that by 2019 had already reached widespread distribution worldwide, and thus enormous power in imposing a 'de facto' standard.

In order to ensure extreme homogeneity in audio reproduction, streaming platforms must be able to reproduce every genre of music correctly enough, from the most rarefied and delicate classical music to the densest and most intense heavy metal.

Such platforms must offer:

  • a sufficiently constant perceived average listening volume for all the tracks in their 'catalogue', although the loudness of the source masters is extremely heterogeneous
  • an acceptable dynamic range in order to sufficiently respect musical expressiveness
  • distortion-free sound

This has favoured the 'de facto' imposition of certain loudness standards with very similar values, but currently not identical for all platforms.

Whatever their original loudness, pieces of music on streaming platforms always undergo automatic control processes and, if they do not meet the criteria set by the specific platform, are automatically processed to bring them to the required loudness values.

To this end, the platform automatically lowers the overall volume of audio files with excessive loudness, in order to obtain satisfactory levelling across all the tracks in its catalogue.

It is evident that, in view of the above, submitting excessively compressed audio files to such platforms only serves to flatten the dynamics and impair the purity of the sound, without having any real effect on the loudness perceived by listeners.

This is progressively discouraging producers from continuing with the senseless Volume War, steering them towards masters with wider and more relaxed dynamics.

N.B.

While reduction of excessive volume is ensured, the boosting of audio files with below-standard loudness is not guaranteed.

Consequently, it will generally be preferable to give tracks a slight excess of loudness rather than the opposite (e.g. if -14.0 LUFS is the standard for a specific platform, a loudness between -13.5 and -14.0 LUFS is advisable rather than one between -14.0 and -14.5 LUFS).
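
The behaviour described above (loud masters are turned down, quiet ones are not necessarily turned up) can be sketched as follows. This is a simplified model under stated assumptions, not the actual algorithm of any platform: the function name, the `boost_quiet_tracks` switch and the default -14 LUFS target are illustrative.

```python
def playback_gain_db(track_lufs, platform_target_lufs=-14.0, boost_quiet_tracks=False):
    """Gain a hypothetical platform applies to a track of given integrated loudness."""
    gain = platform_target_lufs - track_lufs
    if gain > 0 and not boost_quiet_tracks:
        return 0.0   # a quiet track is left as it is
    return gain      # a loud track is turned down to the target

for lufs in (-7.0, -9.0, -13.5, -14.0, -16.0):
    gain = playback_gain_db(lufs)
    print(f"master at {lufs:6.1f} LUFS -> gain {gain:+5.1f} dB "
          f"-> heard at {lufs + gain:6.1f} LUFS")
```

In this model a master pushed to -7 LUFS is heard at the same -14 LUFS as a master finalised at -14 LUFS, only with flatter dynamics, while a master at -16 LUFS stays 2 dB quieter than the rest of the catalogue.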

Streaming platforms are currently not perfectly aligned on one common standard, but range between -13 LUFS (e.g. YouTube, the one with the most compressed dynamics) and -16.5 LUFS (e.g. Apple Music, the one with the most extended dynamics).

The trend seems to be settling around a possible single standard of -14 LUFS, the one proposed by Spotify, currently the most important music streaming platform in the world.

For this reason, other 'smaller' streaming companies tend to align themselves with it, further favouring the establishment of this value, which is likely to become the sole and definitive standard.

For the same reason, the major manufacturers of plug-ins for the dynamic finalisation of the master provide default settings of -23 LUFS for broadcasting and -14 LUFS for music streaming, often together with a dedicated level indicator, which also contributes to the establishment of this standard.

This does not exclude the possibility of finalising multiple specific masters, with different loudness levels, to better suit each of the streaming platforms.

Reference loudness

Before working on finalisation, it is first necessary to clarify the 3 types of measurement in LUFS that are useful for the purposes of our analysis:

Momentary loudness meter

Similar to the traditional analogue VU meter, it shows volume fluctuations 'in real time', with a moderate reaction inertia (approx. 400 ms), ideal for reading levels comfortably.

Very useful to visualise the level of peaks in order to assess the appropriateness of more or less pronounced interventions in preliminary mix limiting

Short-term loudness meter

It expresses the average sound level, calculated over a short time window of about 3 seconds.

Very useful to smoothly follow the general trend of audio levels

It is characterised by a slowed response, somewhat similar to the 'temporary memory' of many LED meters
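
To give an idea of how the two time windows behave, here is a simplified Python sketch. It computes a plain mean-square level over sliding windows of 0.4 s and 3 s; it deliberately omits the frequency weighting that real LUFS meters apply, so the values are not true LUFS, and the signal and the 0.1 s hop are invented for the example.

```python
import math

def windowed_levels_db(samples, rate, window_s, hop_s=0.1):
    """Mean-square level in dB over a sliding window (no frequency weighting)."""
    size = int(round(window_s * rate))
    hop = int(round(hop_s * rate))
    levels = []
    for start in range(0, len(samples) - size + 1, hop):
        block = samples[start:start + size]
        mean_square = sum(s * s for s in block) / size
        levels.append(10 * math.log10(max(mean_square, 1e-12)))
    return levels

# Invented signal: 6 s of a 50 Hz tone that is 20 dB louder between 3 s and 4 s.
rate = 1000
signal = []
for n in range(6 * rate):
    amplitude = 1.0 if 3 * rate <= n < 4 * rate else 0.1
    signal.append(amplitude * math.sin(2 * math.pi * 50 * n / rate))

momentary = windowed_levels_db(signal, rate, window_s=0.4)
short_term = windowed_levels_db(signal, rate, window_s=3.0)

print(f"momentary  (0.4 s): max {max(momentary):6.2f} dB")
print(f"short-term (3.0 s): max {max(short_term):6.2f} dB")
```

The 0.4 s window reaches the full level of the loud second, while the 3 s window averages it with the quieter material around it and therefore peaks several dB lower, which is the smoother reading described above.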

Integrated loudness meter

Expresses the average loudness of the whole programme; it is the value compared against the targets set by the EBU and ITU standards

N.B.

In the revision of the ITU-R BS.1770 standard, the concept of 'gated' loudness measurement was added, which 'intelligently' reduces the weight in the measurement of pauses in the performance and of musical passages with a particularly low level.

A good integrated loudness meter will take this parameter into account in order not to obtain results that are out of the norm.
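
The effect of gating can be sketched in a few lines of Python. The thresholds used here (an absolute gate at -70 LUFS and a relative gate 10 LU below the first-pass result) reflect my understanding of ITU-R BS.1770; the function names are hypothetical, and the sketch assumes that the loudness of each measurement block has already been computed by a meter.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0   # blocks quieter than this are always ignored
RELATIVE_GATE_LU = -10.0     # second gate, relative to the first-pass result

def energy_mean_lufs(block_lufs):
    """Average block loudness values in the energy domain, not in dB."""
    mean_energy = sum(10 ** (value / 10) for value in block_lufs) / len(block_lufs)
    return 10 * math.log10(mean_energy)

def integrated_loudness(block_lufs):
    """Gated integrated loudness from per-block loudness values in LUFS."""
    above_absolute = [v for v in block_lufs if v > ABSOLUTE_GATE_LUFS]
    if not above_absolute:
        return float("-inf")   # nothing but silence
    relative_threshold = energy_mean_lufs(above_absolute) + RELATIVE_GATE_LU
    gated = [v for v in above_absolute if v > relative_threshold]
    return energy_mean_lufs(gated)

# Invented programme: music at -14 LUFS, a long quiet pause, some digital silence.
blocks = [-14.0] * 60 + [-40.0] * 40 + [-90.0] * 10

audible = [v for v in blocks if v > ABSOLUTE_GATE_LUFS]
print(f"without relative gate: {energy_mean_lufs(audible):6.2f} LUFS")
print(f"with both gates:       {integrated_loudness(blocks):6.2f} LUFS")
```

Without the relative gate the pause drags the result down to about -16.2 LUFS; with gating the reading stays at -14 LUFS, the level of the music itself.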

True Peak level

Digital audio processing, partly as a result of ultra-fast limiting and problematic clipping, can produce inter-sample peaks.

N.B.

The analogue equivalent of such a signal, after D/A conversion, would reveal a level higher than the actual sample values, as the figure below makes clear.

A peak like this is also called a true peak, and its level the True Peak Level.

Depending on the quality of the D/A converter used in playback, these peaks may cause audible distortion.

Of course, it will always be better to prevent or minimise inter-sample peaks and at the same time ensure that each audio peak, ordinary or inter-sample, 'really' remains within the maximum undistorted limit of the digital 0 dB.

A good dynamic finalisation plugin for mastering should be equipped with a function to contain the True Peaks (True Peak Limiting), conforming to EBU R128 and ITU-R BS 1770 standards.

It should be noted that true peak measurement (TP) is not an exact science: there are many different ways to implement ITU-R BS 1770-compliant measurement, which may offer slightly different results. 

Indeed, it will not be unusual to detect differences of a few tenths of dB TP between different true peak meters. 

High oversampling values, the ability to interact in look-ahead mode and high overall plugin quality could provide assurance of greater accuracy and  reliability in containing true peaks.
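
The following Python sketch shows why oversampling matters. It builds a sine wave whose samples never exceed about -3 dBFS although the underlying waveform reaches 0 dBFS, then estimates the true peak by reconstructing points between the samples with a windowed-sinc interpolator at 4x oversampling. It is a didactic approximation with assumed parameters (`oversample`, `half_width`), not a BS.1770-compliant meter.

```python
import math

def sample_peak_db(samples):
    """Peak of the stored sample values, in dBFS."""
    return 20 * math.log10(max(abs(s) for s in samples))

def estimated_true_peak_db(samples, oversample=4, half_width=16):
    """Estimate the inter-sample peak by Hann-windowed sinc interpolation."""
    peak = 0.0
    # Skip the edges, where the interpolator would lack neighbouring samples.
    for n in range(half_width, len(samples) - half_width):
        for k in range(oversample):
            frac = k / oversample              # position between sample n and n + 1
            value = 0.0
            for m in range(-half_width + 1, half_width + 1):
                u = frac - m                   # distance from tap m to the new point
                sinc = 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)
                window = 0.5 * (1 + math.cos(math.pi * u / half_width))
                value += samples[n + m] * sinc * window
            peak = max(peak, abs(value))
    return 20 * math.log10(peak)

# A full-scale sine at a quarter of the sample rate, sampled 45 degrees off its
# crests: every stored sample is +/-0.707, yet the waveform itself reaches 1.0.
tone = [math.sin(2 * math.pi * n / 4 + math.pi / 4) for n in range(256)]

print(f"sample peak:         {sample_peak_db(tone):6.2f} dBFS")
print(f"estimated true peak: {estimated_true_peak_db(tone):6.2f} dBTP")
```

For this signal the sample peak reads about -3 dBFS while the reconstructed waveform reaches approximately 0 dB, so a ceiling set by looking at sample values alone would underestimate the real level by about 3 dB.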

Recall that regulations and conventions impose or suggest a True Peak ceiling of -1 dBTP or lower; some streaming platforms even require a ceiling of -2 dB; in commercial mastering for audio CDs, on the other hand, the TP setting is usually less prudent, with values of -0.5, -0.3 or -0.2 dB, which exposes the master to the risk of transient distortion.

The meters of a complete measuring system in LUFS. On the left the classic reference meter with the digital 0 dB and beside it the measurement in dB of the level reduction applied by the limiter. On the right the 3 meters of the LUFS system: short-term (S), momentary (M) and integrated (I). On the bottom left the True Peak enable button and on the right its level adjustment (not yet set to the standard value of -1 dB).

The new standards

To sum up, we can say that nowadays the trend is to use the following reference standards in LUFS:

Audio CD

  • -9 LUFS, with True Peak at -0.3 dB - is the most widespread standard for rock-pop and derivative music, although unfortunately many producers still force the standard, pushing to values of -8 and -7 LUFS

N.B.

Personally, even in these cases it is my habit to keep the TP at -1 dB

Other more 'expressive' music genres, even for CDs, tend instead to choose solutions that allow greater dynamic space:

  • -10 / -12 LUFS, with True Peak at -1 dB - for the most expressive modern music genres, such as fusion, modern jazz, 'cultured' and alternative pop music, and ethno-pop (this loudness range is gradually winning over more 'alternative' producers and I personally hope that it will become a definitive standard in rock-pop as well)
  • -15 / -23 LUFS, with True Peak at -1.0 dB - for traditional folk music, traditional jazz and classical music (with a not strictly purist approach)
  • uncompressed dynamics, with True Peak at -1.0 dB - for traditional folk music, traditional jazz and classical music (with a strictly purist approach)

Schematic representation of the useful dynamics and relative loudness level used as standard in the most common uses. It is evident that, normalising to levels close to 0 dB, we will have low useful dynamics and high loudness. Without any compression (right), natural dynamics, with excursions from 20 to 50 dB and more (depending on the case), will be fully respected. Note also that the normalisation level for pop music CDs is generally set at peak levels a few tenths of a dB below 0, with no True Peak control.

Streaming

  • the most widespread current standard, to be considered a provisional reference, is -14 LUFS, but it may settle in the future at -15 LUFS or -13 LUFS
  • other streaming services currently fluctuate between -13 LUFS (YouTube) and -16.5 LUFS (Apple Music)
  • however, for the sake of the sound, masters are in many cases finalised at levels similar to those used for CD, although the platforms will automatically turn these levels down to bring them into line with their publishing standards

Broadcasting and Cinema

  • the standard is -23 LUFS with True Peak at -1.0 dB, which for broadcasting is also a legally binding standard
  • for cinema, significant fluctuations between -27 and -21 LUFS (with a short-term loudness of up to -6 LUFS) have been observed

