TRUST BUT VERIFY.

Every recording and mixing space has a characteristic sound, unless it’s a rare and very expensive anechoic chamber, and, believe me, recording and mixing in those spaces would drive you bananas.

If you like to see graphs and charts, just scroll down and have fun.

Why this matters

Every room has trust issues.

As rooms get smaller, the sounds generated in the room are increasingly affected by reflections and “resonances”. All of this adds information to, or subtracts it from, what you hear from your speakers (in studio parlance, these are called “monitors”, not to be confused with the screens used on computers).

And if you want to (1) record well, (2) mix with speed and confidence, and (3) master — or finish — a mix to industry standards, your room is your first obstacle.

The trick

The trick is to choose your battles in smaller rooms. Nothing will be absolutely perfect. A smaller room is a bit like one of those meditation bowls that you tap with a mallet and then rub around the edge: it rings and rings and rings based on the physics of the bowl. The physics of a room is not far off.

The first thing to recognize is that the ringing of a room can be stronger in some places than others. The first trick is to find the place where your low-end sounds close to right. You do that with a combination of monitor placement and room treatment. Then you can work on other frequencies with additional room treatment. The goal is to create a reasonably large zone around where you listen to music that doesn’t significantly alter what you’re listening to.

You should look both at the frequencies (for example, the low frequencies where the bass player plays, the mid-range where the singer sings, and the high end where strings, synths, and horn harmonics live), and at the decay time of sounds at different frequencies.

Frequencies are one issue, but time is critical

If your frequency response curve where you mix is some unknown combination of the actual signal from your monitors and the room’s “tone”, you’ll make bad decisions when mixing. You’ll make things brighter that shouldn’t be. Or you’ll adjust them to be duller than they deserve. And because “equalizing” a sound’s frequency includes a change in volume, your mix and the instruments in it can wind up too loud or too soft.

You can make some adjustments using an equalizer, but ideally you’d treat your room with a combination of overall absorption (“broadband”) and targeted absorption that focuses on problem frequencies and excessive reflections. This way, the room’s unique sound personality isn’t too much of a problem where you mix.

But you could get your frequency response really great from low to high frequencies, and still have your room ringing (like that meditation bowl). The very resonance of your room is by definition the result of some sounds simply hanging out too long. You want to absorb those resonances so that the overall decay is “quick enough” and “even enough” so sonic problems in your mix cannot hide from you.

STUDIOBLUE’S TREATMENT APPROACH

Every room must be understood and treated well enough that you can capture, mix, and master with confidence.

Our studio has two rooms for recording, one of which is the control room where mixing occurs. Both rooms have some level of treatment. Close mic’ing plus treatment, along with placing the microphones in the right place in the room, make for strong recordings with a minimum of “room tone”. The control room, in particular, has several locations where musicians can set up to get good, quality tracks. Our mic locker is filled with good choices for different instruments as well, with the pickup patterns on these mics being helpful in capturing the tone and varying the level of “room tone” in the tracks we capture.

The isolation room is separated from the control room by a heavy, dense wall made of 1940s-era thick wood. The entire studio is built on a concrete slab. This combination “isolates” sounds extremely well, minimizing bleed. Any remaining bleed can usually be further reduced in the mixing process through noise reduction tools such as gates, artificial intelligence tools, and other processing (e.g., frequency slicing). We have some premium tools for this, but rarely need to use them because of the isolation we can achieve.

The isolation room is also a great place for clients to listen to mixes, or listen back to tracks. We have a pair of monitors in the space, positioned to permit a client, or the talent, to review recorded tracks at a “sweet spot” that has been calibrated with IK Multimedia’s room correction software, giving an essentially flat response with minimum resonances from 50Hz to 18kHz.

The control room has gone through several iterations. In the immediately previous iteration, using IK Multimedia’s room correction filters, the room had a frequency response curve that was within 1.7dB (plus or minus) from 28Hz to 20kHz. The time domain remained a problem around 50Hz, however. For example, in the prior version of our treatment, the room’s decay was consistently below 270 milliseconds (ms) from 70Hz to 20kHz. This is not “dead” but it is quite well-controlled. Below 70Hz, our bass treatment was doing an OK job except at 50Hz, 40Hz, and 22Hz, where decay times were 400ms or greater (but less than 650ms in most cases). Not bad — but not good enough for our standards.

We added two “super-chunks” on acoustically transparent shelves, scientifically placed a certain distance from the corner and walls near the mix point, which serves to extend downward the frequencies we can absorb.

The results in the low end are very good now at the mix point.

Our target was to get the problem frequencies to drop 40dB within 350ms. Right now, we have:

51Hz is -40dB at 300ms
39Hz is -40dB at 350ms
22Hz is -30dB at 350ms.
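As a rough illustration of how the three readings above compare with the 40dB-in-350ms target, here is a minimal sketch. It assumes the level falls at a constant rate in dB per millisecond, which real room modes only approximate; the function names are hypothetical and only the three data points come from the measurements above.

```python
# Decay points quoted in the text: (frequency in Hz, level drop in dB, time in ms).
MEASURED = [(51, 40.0, 300.0), (39, 40.0, 350.0), (22, 30.0, 350.0)]

TARGET_DROP_DB = 40.0   # target drop stated in the text
TARGET_TIME_MS = 350.0  # target time stated in the text


def time_to_drop_ms(drop_db, time_ms, wanted_drop_db):
    """Extrapolate how long a decay takes to fall by wanted_drop_db.

    ASSUMPTION: the level falls at a constant dB-per-ms rate, which real
    room modes only approximate.
    """
    return wanted_drop_db * time_ms / drop_db


def meets_target(drop_db, time_ms):
    """True if the extrapolated 40dB drop happens within the 350ms target."""
    return time_to_drop_ms(drop_db, time_ms, TARGET_DROP_DB) <= TARGET_TIME_MS


if __name__ == "__main__":
    for freq_hz, drop_db, time_ms in MEASURED:
        t40 = time_to_drop_ms(drop_db, time_ms, TARGET_DROP_DB)
        status = "meets target" if meets_target(drop_db, time_ms) else "misses target"
        print(f"{freq_hz}Hz: about {t40:.0f}ms to fall 40dB ({status})")
```

Under that constant-rate assumption, 51Hz and 39Hz meet the target, while 22Hz would need noticeably longer than 350ms to fall the full 40dB.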

Notice that 22Hz is not as far down in volume as would be ideal in a big, expensive control room. However, for the music most people listen to (and most of the music we record and mix), the amount of musical information at 22Hz is virtually none. In fact, many engineers “roll off” (lower in volume) most sounds below 35Hz or even below 50Hz, with excellent results. For almost all purposes, except for super-critical electronic dance music or similar genres, the 22Hz performance of the control room is perfectly acceptable. Indeed, most of the control rooms we’ve worked in nationally are no better, and often worse.

Our newest control room treatment: Great results

With the new super-chunks in place, then, the low end is under control at the mix point.

One side effect of the super-chunks is that they have reduced the size of the mixing sweet spot for other frequencies. Previously, we had a very even, time-aligned sweet spot at the mix point that was about the size of a four-foot sphere, with a smooth frequency response within 1.7dB (plus or minus) from 28Hz to 20kHz, as mentioned. We achieved this with IK Multimedia’s room correction software and our good-quality room treatment.

With the smaller sweet spot, we’re using a pink-noise system that uses a calibrated microphone to measure and adjust the sounds coming from the monitors to create a flat response (it’s slightly more complicated than this, but this is not too far off a description of the approach.) In the second quarter of 2022, we’ll expand our method for tweaking the sweet spot, too.

But for now, with the time domain in the low end handled really well, our current measurements for frequencies from 22Hz to 20kHz show variations from flat of plus or minus 4dB, with most frequencies being 1.5dB to 3dB away from a flat response. The “psychoacoustic” method of describing this (which accommodates how people actually perceive sound) has the frequency response within 2.5dB (plus or minus), except at 5.6kHz and 9.5kHz, where slight dips bring the response down 4dB from flat.

These results are excellent, and will only get better.

MIX TRANSLATION

The number one goal of having a great sounding control room is to be able to trust what you’re hearing so you can make quick, reliable decisions. But even the best engineers in the world’s most expensive rooms still need to check their mixes. We do that as follows:

Multiple headphone checks: We have all the main industry-standard closed-back headphones, along with the semi-open Beyerdynamic DT770 and the open Avantone Pro MP-1 magnetic planar headphones.
Modeled room checks: IK Multimedia software on hand can adjust monitor output to model a dozen different monitor types. In headphones, Waves and Acustica Audio give me a dozen rooms and forty different high-end monitor models in high-quality emulations, from Abbey Road Studio 1 and Chris Lord-Alge’s MIX LA rooms, to Nashville’s Ocean Way and two dozen high-end studios in Italy and the US. I don’t use all these rooms, because mixing into these emulations requires that you get used to them, just as you would with any real control room. I choose contrasting rooms and monitor models to get (normally) three to four different sonic environments, in two to three different headphone models. This works extremely well. (Some world-class engineers ONLY mix on headphones, by the way. And most engineers who travel need options on the road, including headphone mixing systems, to help them speed through their work.)
Checking in our isolation room: As mentioned, I have an excellent midrange monitor pair in the isolation room that has been “room corrected” to be trustworthy. It’s a great way to double-check how your mix is sounding.

MEASUREMENTS (January 2022)

Waterfall graph. Frequency response is vertical; how long a frequency "rings" in time comes towards you. The measurement signal was 89.6dB SPL. You can see essentially tight absorption to about 250ms from 150Hz to 20kHz. I left a marker at 296Hz to show a slightly longer ringing, but at a frequency that is slightly less loud than surrounding frequencies. Also note that the low end does indeed ring more from about 57Hz down. The loudness at 22Hz is notable, but most music will not contain information here, so it's not an issue. From 30Hz to 55Hz, the loudness is 42-50dB quieter -- that's really hard to detect when mixing.

Clarity measures how well voice and music maintain their intelligibility in a space. The D50 measurement is really flat, especially in the first 50ms (that's where the "50" comes from in the name D50) -- that means that early energy from a source (say, a voice in a voice over) is strong and clear. Music clarity is often measured by C80. Voice clarity is often measured by C50. The energy for C50 is somewhat low in the 120Hz range, meaning that a person with a very low voice (think James Earl Jones) might experience slightly less clarity than other people. His voice is probably centered closer to 200Hz. C80, for music, has the same energy dip, but it's all within normal specifications.
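For readers who want to see what sits behind these numbers, here is a minimal sketch of the commonly used definitions: C50 and C80 compare the energy arriving in the first 50ms or 80ms with the energy arriving afterwards (in dB), and D50 is the fraction of total energy arriving in the first 50ms. The impulse response below is a synthetic, idealized decay invented for illustration, not a measurement of our room, and the function names are hypothetical.

```python
import math


def synthetic_decay(rt60_s, sample_rate, length_s=1.0):
    """Hypothetical test signal: amplitude falls 60dB every rt60_s seconds."""
    n = int(length_s * sample_rate)
    return [10.0 ** (-3.0 * i / (rt60_s * sample_rate)) for i in range(n)]


def clarity_metrics(impulse, sample_rate):
    """Return (C50 in dB, C80 in dB, D50 as a 0..1 fraction).

    ASSUMPTION: the impulse response starts at the arrival of the direct sound.
    """
    energy = [s * s for s in impulse]
    total = sum(energy)
    n50 = int(round(0.050 * sample_rate))  # samples in the first 50ms
    n80 = int(round(0.080 * sample_rate))  # samples in the first 80ms
    early50 = sum(energy[:n50])
    early80 = sum(energy[:n80])
    c50 = 10.0 * math.log10(early50 / (total - early50))
    c80 = 10.0 * math.log10(early80 / (total - early80))
    d50 = early50 / total
    return c50, c80, d50


if __name__ == "__main__":
    ir = synthetic_decay(rt60_s=0.3, sample_rate=8000)
    c50, c80, d50 = clarity_metrics(ir, 8000)
    print(f"C50 = {c50:.1f}dB, C80 = {c80:.1f}dB, D50 = {d50:.0%}")
```

A real measurement tool applies these formulas per frequency band, which is why the clarity graph has a curve rather than a single number.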

This is a graph that shows you the time domain in a waterfall graph below, and allows you to "slice" that waterfall at any given frequency so you can carefully examine how any given frequency decays over time (shown in the upper graph). In this case, I chose 296Hz as an example (it rings just a bit, and it's in the normal singing/speaking vocal range). You'll see in the upper graph that 296Hz, when sounded, starts at a "baseline" value of 0dB, and drops 50dB in volume by 275ms. (This is about a quarter of a second, and is a very good result.)

This is another graph that shows you the time domain in a waterfall graph below, and allows you to "slice" that waterfall at any given frequency so you can carefully examine how any given frequency decays over time (shown in the upper graph). In this case, I chose 40Hz as an example (it rings just a bit, and it sits in the lowest range you might normally want to have in your mix). You'll see in the upper graph that 40Hz, when sounded, starts at a "baseline" value of 0dB, and drops 40dB in volume by 425ms. But even at 350ms (about a third of a second), this frequency is 41dB down, so the mixer will experience this low-end coloration essentially for only 1/3 second. This is also an excellent result. The 22Hz ringing we've touched on before: The chances of musical content actually being present and maintained in a mix at that frequency are very low, if not negligible. In short, this RT60 Decay graph shows good low end control for most mixing scenarios.

This is the RT60 graph. RT60 is a measure of the time it takes for a sound to drop in power (loudness) by 60dB, and for small rooms, measuring it directly is not very useful. Industry experts usually use T30 or TOPT instead (the latter being an "optimal" estimate of reverb decay time). Here the graph shows that T30 and TOPT overlap almost completely. There is a jump at about 115Hz. This is common in studios, and can be caused by a combination of the presence of a desk and the distance between floor and ceiling. It is normally not an issue when mixing. This doesn't mean we're going to settle on this result, but it is well within industry norms. You will see that, overall, most frequencies are well-absorbed by 300ms, an excellent result -- indeed, the room performs even better, with strong absorption down to roughly 200ms from low-midrange to very high sounds.
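To show how a T30 figure is typically derived, here is a minimal sketch: integrate the squared impulse response backwards (the Schroeder method), fit a straight line to the part of the decay curve between -5dB and -35dB, and extrapolate that slope to a 60dB drop. The input is a synthetic, idealized decay invented for illustration, the function names are hypothetical, and this is not necessarily how any particular measurement program implements it.

```python
import math


def synthetic_decay(rt60_s, sample_rate, length_s=1.0):
    """Hypothetical test signal: amplitude falls 60dB every rt60_s seconds."""
    n = int(length_s * sample_rate)
    return [10.0 ** (-3.0 * i / (rt60_s * sample_rate)) for i in range(n)]


def schroeder_curve_db(impulse):
    """Backward-integrated energy decay curve, normalized to 0dB at the start."""
    remaining = 0.0
    curve = [0.0] * len(impulse)
    for i in range(len(impulse) - 1, -1, -1):
        remaining += impulse[i] * impulse[i]
        curve[i] = remaining
    total = curve[0]
    # Zero energy can only occur at the tail, so filtering keeps indices aligned.
    return [10.0 * math.log10(c / total) for c in curve if c > 0.0]


def t30_seconds(impulse, sample_rate):
    """Fit the -5dB..-35dB span of the decay curve and extrapolate to -60dB."""
    curve = schroeder_curve_db(impulse)
    pts = [(i / sample_rate, level) for i, level in enumerate(curve)
           if -35.0 <= level <= -5.0]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_l = sum(l for _, l in pts) / len(pts)
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))  # dB per second (negative)
    return -60.0 / slope


if __name__ == "__main__":
    ir = synthetic_decay(rt60_s=0.3, sample_rate=8000)
    print(f"T30 = {t30_seconds(ir, 8000) * 1000:.0f}ms")
```

Because only a 30dB span has to sit above the noise floor, this kind of estimate is more practical in a small room than waiting for a full 60dB of measurable decay.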

THD is total harmonic distortion. The THD in low frequencies is within industry norms. The boost at 35Hz is likely a by-product of the subwoofer placement or crossover. This is something we can work with over time, but it is not an obstacle. See the noise floor chart.
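For reference, THD is usually expressed as the combined level of the harmonics relative to the fundamental. Here is a minimal sketch of that ratio; the amplitudes in the example are made up for illustration and are not taken from our measurement.

```python
import math


def thd_percent(fundamental_amplitude, harmonic_amplitudes):
    """THD as a percentage: RMS sum of the harmonics divided by the fundamental."""
    harmonic_rms_sum = math.sqrt(sum(a * a for a in harmonic_amplitudes))
    return 100.0 * harmonic_rms_sum / fundamental_amplitude


if __name__ == "__main__":
    # Hypothetical example: a fundamental of 1.0 with 2nd and 3rd harmonics
    # at 0.03 and 0.04 of that amplitude.
    print(f"THD = {thd_percent(1.0, [0.03, 0.04]):.1f}%")
```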

This is the noise floor in the room. Except for the 40Hz bump (generated probably by some vibrating materials in the room, TBD), it is consistently around -80dB, which is superior for control rooms.

No hiding allowed: This is the frequency response of the control room at a resolution of 1/24th octave. You'll see a dip at about 120Hz (possibly a desk reflection, but it also coincides with a number of "resonances" in the room). It's sharp, and about 9dB. The sharpness indicates it will not generally be an issue because the lost energy is very narrow, but if a song is in a key that features that frequency, it's wise to double-check on headphones or other monitors -- as we do. There's a fair amount of very high-end variability from 9kHz upward. We believe this is caused by reflections from the materials used to contain the "super chunks" (big volumes of rockwool insulation) we use to tame the low end. Looking at frequency response at this level of detail helps a studio engineer and acoustic designer find and improve issues. Whether there's a short-term issue is actually another story; this resolution does NOT always indicate how an engineer will actually perceive sound. You can use lower-resolution images for that, or even the psychoacoustic version of this data, to find out what may be concerning -- or not!

This is the same frequency response measurement shown in the 1/24th-octave and psychoacoustic displays, here at a coarser resolution. You'll see that most variations are less than 5dB, usually around 4dB or less. The high end variability is still at 7dB (peak to trough). See the psychoacoustic measurement.
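To illustrate why a sharp dip looks so different at different display resolutions, here is a minimal sketch of fractional-octave smoothing applied to a synthetic response. The response (flat, with a narrow 9dB dip near 120Hz) is invented for illustration, and the simple rectangular power average used here is an assumption; measurement software may use a different smoothing window.

```python
import math


def synthetic_response(points_per_octave=48):
    """Hypothetical flat response with a narrow 9dB dip centered near 120Hz."""
    freqs, levels = [], []
    k = 0
    while 20.0 * 2.0 ** (k / points_per_octave) <= 20000.0:
        f = 20.0 * 2.0 ** (k / points_per_octave)
        octaves_from_dip = math.log2(f / 120.0)
        levels.append(-9.0 * math.exp(-octaves_from_dip ** 2 / (2 * 0.03 ** 2)))
        freqs.append(f)
        k += 1
    return freqs, levels


def octave_smooth(freqs_hz, levels_db, fraction):
    """Average power inside a 1/fraction-octave window around each frequency."""
    half_width = 2.0 ** (1.0 / (2.0 * fraction))
    smoothed = []
    for fc in freqs_hz:
        lo, hi = fc / half_width, fc * half_width
        powers = [10.0 ** (l / 10.0)
                  for f, l in zip(freqs_hz, levels_db) if lo <= f <= hi]
        smoothed.append(10.0 * math.log10(sum(powers) / len(powers)))
    return smoothed


if __name__ == "__main__":
    freqs, raw = synthetic_response()
    for fraction in (24, 3):
        depth = min(octave_smooth(freqs, raw, fraction))
        print(f"1/{fraction} octave smoothing: deepest point {depth:.1f}dB")
```

The same dip stays deep at 1/24th octave and becomes shallow at 1/3 octave, which is why a narrow notch that looks alarming on a high-resolution plot may matter much less to the ear.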

The psychoacoustic lens on the frequency response data uses industry research on sound perception to indicate how an engineer may interpret the sonic profile of the music s/he is mixing. We see that deviation from flat response is generally within 2.5dB, except at 700Hz and 6kHz, where the variation from flat is 3.5dB to 4dB. Excellent results.