DESTON BENNETT: Thanks for coming out today. Again, my name's Deston Bennett. I'm with the Grammy producers and engineers wing. The Grammy Awards, as many of you may know, are the only music awards that are peer determined, meaning it's not the public that votes. Those who vote are members of the Recording Academy, and who are hands-on music creators-- artists, songwriters, musicians, producers, and engineers. From the very beginning at our founding in 1956, the basis for the Grammy Awards process has been a commitment to excellence. The Recording Academy's original credo clearly states that the awards are not about sales, and they're not about popularity. Musical excellence in all areas is the only criteria Grammy voters are charged with to determine who gets nominated, and what will win. Knowing that little bit of information should help you whenever there's some controversy about the Grammys, as there sometimes can be. For example, the year jazz bassist Esperanza Spalding won the Best New Artist category against fellow nominees Drake, Florence and the Machine, Mumford and Sons, and Justin Bieber. She won because the majority of the Grammy voters were familiar with Esperanza and her work, and they saw her as a stand-out that year. It's pretty cool when you think about it. Many of you who are musicians or audio producers or engineers may be eligible to be members of the Academy. And I'm happy to speak to you about that after this is over if you like. You can also get some more information or join by visiting Grammy365.com. Today specifically, we're here to talk about excellence in sound, something that's key to great recordings. The P&E wing has partnered with the Consumer Electronics Association and others on an initiative we call Quality Sound Matters. We represent people who truly understand the difference good sound makes, and we want to share their enthusiasm and excitement about quality with everybody.
Today, we have a very cool presentation from a Grammy-winning engineer that we think you'll enjoy. And I want to give a big thank you to Neil Annala and Joe Rosenberg for bringing us here today. We'd also like to thank JBL and Prism Sound for this amazing sound system you're going to hear today, too. The speakers in particular, they encompass some very new, exciting technology that you're amongst the first to hear. And to top it off, I really want to introduce an amazing engineer producer who's worked with artists including Metallica, Linkin Park, Green Day, and U2, along with last week's-- well, not last week, but a recent number one album, the Black Sabbath project. He's a two-time Grammy winner for his work on the Red Hot Chili Peppers' "Stadium Arcadium" project, as well as Adele's "21" album. He has another interesting honor. In 2012, he was named the International Engineer of the Year by England's Music Producers Guild. Please welcome Andrew Scheps. [APPLAUSE] ANDREW SCHEPS: First of all, thanks for coming. This is as full a house as we could have in here, I think. So thank you so much, and thanks again to Neil and Joe for putting this together. This is awesome. So later on, we're going to listen to a bunch of stuff, which is the point of what I do. And the Recording Academy has been really great about sponsoring me to do this talk all over the country. The idea of the talk, it was originally put together for what the Recording Academy called their Grammy Future Now conference, which was sort of a mini, one-day TED conference for producers and engineers, for people who make music in Los Angeles. And since then, I've gone around the country, and most of the time I give this presentation to producers and engineers. And it's because there's a lot of information in the presentation that, as people who make records, we sort of kind of know, but we don't actually know.
And so I'm trying to put numbers and facts behind the things we think we know so that when we listen, you can actually compare things and you know what it is you're listening to, and why there might be differences and things like that. So I'll start the way I usually start by asking how many people in this room are artists and make records, or have ever released a record. So still a good number of you. So if you've released a record, how many of you have then gone and bought your record to make sure that what comes off the services sounds like what you sent them? So that's about normal. About a third of the hands, maybe less. And that's exactly the same with people who do that as their day job-- or night job, depending on the hours you keep. It's not something people really think about. They finish their record, they master it, like oh, it's done. Send it off, and you're done. And of course, now with all the digital services-- and we'll get into lots of them specifically-- and there are a few that happen to be housed in this building or down in San Bruno-- there are lots and lots of different ways that music gets out into the world. And so, the idea is to give some context to know, what are all these possibilities? How do they compare? And do they actually impact the consumer's experience when they listen to your music? So that's the idea. Now along the way, I can usually get away with a lot of sort of vagaries, because I'm talking to producers and engineers. All right, this is the question just for you guys. How many people in this room know more about digital audio theory than me? There's going to be-- come on. It's everybody in the room. But seriously, how many people work directly with digital audio in the room? OK, I'm going to be vague. I'm going to be slightly inaccurate, and I would welcome corrections along the way. So I've done the presentation, I think, 12 times now.
11 of those times were for producers and engineers, and once was a few weeks ago, which is when Neil came and saw me at Fantasy Studios in Berkeley. And that was for a room full of people from the tech community, including people from Google and YouTube and Google Play Music, as well as SoundCloud and Apple and Rhapsody and Rdio, and a few other companies, and Fraunhofer, who developed the MP3 and AAC codecs. And I got my butt kicked. And I'm fine with that. I would love to get my butt kicked, because every time I give this presentation, I know more, and I can kick butt back a little bit, which is the point. And I think what happens is people get into their little rabbit holes on what they work on. So I make records, and I want to make great sounding records, but I don't want to follow it through the food chain down to the consumer, because that's not what I do. Now two years ago, I started my own record label, so now that became part of what I do. And I used to think, I'm going to start a label because the labels suck. They don't know what they're doing. It turns out I don't know what I'm doing, and it's really, really difficult, and there's a lot to it. So every part of this process of getting music into the hands of people who listen to it is unbelievably difficult, incredibly technical, and fraught with peril for the audio along the way. So we'll talk about some of the specifics. So what I want to do first, though, is put recording into perspective, OK? So for thousands and thousands of years-- and now we start my very fine PowerPoint-- there has been music. OK, for who knows how many? Let's say 10,000? Is that a good number? This is where I get vague, and everybody in the room backs me up. So we're going to say 10,000 years, there's been music in the form of songs that have been written by somebody. And then, they would perform their song for somebody in their village or something like that.
And the only way music could propagate would be either they would go to the next village, or they would teach their song to somebody and then they would go to the next village, or people from the next village would come hear them and go back. Right? So there's your music industry for the first 9,900 years. Fair enough? OK, about 100 years ago-- a little bit more-- but basically, about 100 years ago, there started to be consumer recordings of audio. And there were a few things before this, but let's say the wax cylinder was the first viable format. So you have the Edison cylinder where people would come into a room. They would make lots of noise. That noise gets collected by a horn. It would get scratched onto this disc that's spinning around. And then, you could take that disc, and go play it back elsewhere. So all of a sudden, you have created what is called recording. Recording, especially back then, was technically just a delay process, right? So you perform the music, and then you capture it for a second, and then you can carry it around. And then later on at any point, you can play it back. So now, you can get rid of some of the space and time constraints of everybody come to your concert. Now you can record your concert and send it out. Now this caused a huge uproar. And in researching this for this presentation, and also I teach a recording class where I try to give a little bit of a history, there are some amazing quotes from John Philip Sousa and people like that about how recording was going to destroy not music, but society. Destroy it. You have to be in the room with the musician. So I think we've all kind of gotten over that. I mean, I would hope everybody here enjoys going to concerts and things like that. But we've gotten over the fact that we're going to completely destroy society. Music isn't the only thing that's destroyed. And it's just one of many things. OK, so that's 100 years. That's it.
Then about 50 years ago, mainly with technology out of Germany in the '40s, and then also some techniques developed bouncing from one tape machine to another tape machine, you started to be able to not just capture a live performance, but you had what we call overdubs, which is basically, you make a recording, and then you record some more stuff to go with it. So now, you can record at different times, and add all of those things up to make a recording. A lot of the early Beatles recordings were examples of bouncing. They would record the band, then they would play the band back while recording something else, combine those together. So that was a technique. The German tape machines allowed you to actually have multiple tracks that sit side by side. So you record on a couple of tracks, then you record on another track. So things we're sort of familiar with. But that basically '40s, into the '50s-- but even in the '50s, most commercial recordings were live recordings, to mono or possibly starting to get into three-track tape, but eventually going to be mono going out into the world. But once you start having these multi-track tapes, then you have to mix those things together. So this created something in the music industry that didn't exist. It used to be there were only recording engineers who captured things. Now all of a sudden, you needed people who could take all the stuff that was captured, combine it together, and make it something that could go off into the world to be heard. So that's the mix of the recording of the song with overdubs. And then, once you actually had consumer formats-- whether it was the cylinders or onto LPs or 45's or 78's or cassettes or eight-track tapes, up into CDs-- you needed to have some sort of standard as to how the music would have to be put onto these media to then be distributed. So you would get your mastered mix of the recording of a song with overdubs. Now in a room full of engineers, that kills.
That's really funny, because it's a font joke. Mastering makes things loud. That's the idea so-- all right. I'm sorry. It's the wrong crowd. OK, so this is now what the artist sends off into the world, OK? This is what a record is. But it's much more than that, right? Music, pre-recording, was nothing more than art. There was some commerce involved, but it was basically art. It was musicians and composers who would have a piece of themselves that they would want to capture, and then let other people hear it, and recreate the emotion they were trying to create when they performed the music live. So let's say that really, the recording is more like this. And you don't have to read it. The point is I needed to get a lot of text on the screen for later on for one of my very clever, inaccurate analogies. The idea being that we need to keep in mind that this is art, and this is the difference between looking at an art book and going to a museum, OK? There are differences. And the idea of live performance versus recording is one stage of this difference. But there's also a huge difference depending on how that recording gets to you at the end of the day. And when we actually get to the listening portion, I think someone once said it's stuff that you can't unhear. You'll hear the difference between some of these file formats and bit rates and things like that, and you'll decide for yourself whether it makes a difference. My theory is I think it does. OK, so now we're going to go through part of the presentation, which is a little more technical, which means it's a little dumbed down for most of the people in this room. But there are a couple important things. So the first thing is, the difference between sound and audio. And I'm sure most people in this room know this, but the idea that's important is that all sound is analog, period. And analog meaning infinitely variable, OK?
Until you get down to the molecular quantum level, any sound in the air is infinitely variable acoustic pressure waves that travel around the room, right? Everybody cool with that? Now, you can buy a digital microphone or a digital pair of headphones, and that isn't actually what they are. They are analog microphones and analog headphones that happen to have converters built into them. So they are two things in one. But they are an analog device. There's no such thing as a digital microphone. The only way you can record something is to put something in the air in the way of the pressure wave so it moves because of the pressure wave, and then, using lots of different technologies for how you design your microphone, you turn that into a voltage, which is the most common way to do it. Then, you can digitize the voltage, OK? So this would be the simplest sound. It's a sine wave. It's information at only one frequency. But the idea is while it's a sound wave, you zoom in, you zoom in, you zoom in. It never pixelates, right? It's smooth all the way down. So the idea of digitizing-- and this is where, feel free to take a nap or something real quick. So obviously with digital systems, you don't have the luxury of looking at something infinitely many times a second, right? You have to have a clock. You have to decide how many times you're going to look. So for the producers and engineers I talk to, this is actually really helpful. I know it's very simplistic, but it's just the easiest visual representation of what sampling is. So the idea is, time across the bottom, voltage up and down. And every time there's a vertical line, that's a sample. So how many times a second-- let's say that's a second, and then we count the number of lines, and that's how many times a second we're looking at it. And each time we look, we say, how big's the voltage? And we write it down using a number. And how many bits we get to write down that number are our horizontal lines.
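[Editor's sketch] The grid described here, vertical lines for sample times and horizontal lines for the values a given bit depth can store, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 1 kHz test tone, and the plain round-to-nearest step are all hypothetical choices, not anything shown on the speaker's slides, and real converters do considerably more than this.

```python
import math


def sine_at(index, freq_hz, sample_rate_hz):
    """Exact value of a full-scale sine wave at the given sample index."""
    return math.sin(2 * math.pi * freq_hz * index / sample_rate_hz)


def quantize_sine(freq_hz, sample_rate_hz, bit_depth, num_samples):
    """Look at the sine once per sample and write down the nearest integer code."""
    max_code = 2 ** (bit_depth - 1) - 1  # largest positive signed code, e.g. 32767 at 16-bit
    return [
        round(sine_at(i, freq_hz, sample_rate_hz) * max_code)  # snap to nearest horizontal line
        for i in range(num_samples)
    ]


def worst_rounding_error(freq_hz, sample_rate_hz, bit_depth, num_samples):
    """Largest gap between the true value and the stored value (full scale = 1.0)."""
    max_code = 2 ** (bit_depth - 1) - 1
    codes = quantize_sine(freq_hz, sample_rate_hz, bit_depth, num_samples)
    return max(
        abs(sine_at(i, freq_hz, sample_rate_hz) - code / max_code)
        for i, code in enumerate(codes)
    )


if __name__ == "__main__":
    # Hypothetical test tone: 1 kHz sampled at 44.1 kHz for 441 samples.
    for bits in (8, 16, 24):
        print(bits, "bit worst error:", worst_rounding_error(1000, 44100, bits, 441))
```

Running it prints a worst-case rounding error that shrinks as the bit depth grows, which is the "more horizontal lines" idea in numeric form.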
Everybody's good with that, right? So the idea being that if you look at this particular grid superimposed on the sine wave, we almost never go directly through an intersection. So we are always wrong. We are always rounding. And obviously, anyone in the room who really knows digital theory knows that that's OK. There's quantization error, but you make up for it, and you can reconstruct things quite well. You'll also know this sample rate is way higher than we actually need to capture this sine wave. You only need just over two samples per cycle, and you're good. So that's fine. And I'm not saying that this is not a good sample for this particular sine wave. But as a visual representation, it's important. The idea being, though, if we want to be more accurate, we can do two things-- we can up the sample rate, and we can up the bit depth. So now, this is sort of the aha moment for a lot of engineers who've got little pop-up menus for sample rates and bit depths, and they don't actually know what they do other than bigger is better, so I'll record more stuff. Now there are diminishing returns. In terms of actually building audio hardware, it's very hard to build something that will work equally well at every single sample rate. And I do lots of listening tests for just my studio for making records, and I found that there's a lot of gear that works great at 96 kilohertz, and up at 192, it doesn't really work so well, because some things are getting stressed, and it's just not optimized for it. So it's not always that higher sample rate is better. But in a perfect system, a higher sample rate will be more accurate more of the time. Right? I mean, I think that's fair enough to say. And the same thing with bit depth. And in some ways, bit depth is more important than sample rate. Now the other thing is you could very easily make the theoretical argument that 44.1 kilohertz is fine, because human hearing goes up to around 20 kilohertz.
And I know everyone probably already knows this, but basically, take your sample rate, divide it by 2. That's the highest frequency you can capture at that sample rate. Fair enough? So 44.1, you get down to 22.05. Wow. 22.05. There you go. Sorry. My math just went out the window. But the problem is, to make that work, you need a perfect filter that cuts off everything above that frequency, but doesn't touch anything below it, right? That filter cannot be built. It doesn't exist, especially as an analog filter. So this is part of why higher sample rates are really important for capturing things-- to get an accurate picture at 20k, you kind of need to leave it alone out to 40 or 48, something like that. So if you start working at 96, and you can either use very gentle analog filters or you can start getting into over-sampling and digital filters, but you can do things way past where we hear that are brutal, and they don't affect what goes on down where we do hear. Now there are also people who argue that we respond to frequencies above 20k. We're not getting into that. We're not getting into, we should be tuning everything to 436 instead of 440. There are lots of holistic arguments about lots of things, and I try and keep things more real and in numbers, because then I don't have to argue about them for 12 hours, and not get anywhere. So I try and keep it that way. So anyway, this is basically what I try and impart about sampling, even though you guys know most of this. So then we start talking about the actual consumer formats, OK? Now there are two types of digital audio files. Again, I'm sure you guys know this, but there's lossless audio and there is lossy audio. OK, lossless audio is take a PCM-encoded wave file at some sample rate and some bit depth, and you keep all the numbers, period. That's it. That's all lossless is. It's AIFF, WAV, used to be Sound Designer II. OK, so those are lossless files.
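[Editor's sketch] Two points from this passage lend themselves to a quick check: the divide-by-two rule for the highest capturable frequency, and the claim that a lossless PCM file "keeps all the numbers." The sketch below uses Python's standard `wave` and `struct` modules to write a handful of 16-bit samples into an in-memory WAV file and read them back. The helper names and the sample values are hypothetical, chosen only for illustration.

```python
import io
import struct
import wave


def highest_capturable_hz(sample_rate_hz):
    """The divide-by-two rule: half the sample rate."""
    return sample_rate_hz / 2


def write_wav(samples, sample_rate_hz):
    """Store signed 16-bit mono samples as an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate_hz)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def read_wav(data):
    """Read the samples back out of WAV bytes produced by write_wav."""
    with wave.open(io.BytesIO(data), "rb") as r:
        count = r.getnframes()
        frames = r.readframes(count)
    return list(struct.unpack("<%dh" % count, frames))


if __name__ == "__main__":
    original = [0, 1200, -1200, 32767, -32768]  # arbitrary example values
    restored = read_wav(write_wav(original, 44100))
    print("highest frequency at 44.1k:", highest_capturable_hz(44100))
    print("every number kept:", restored == original)
```

The round trip returning the identical list is what "lossless" means here; nothing about it says the capture itself was perfect, only that no numbers were discarded after capture.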
Now if you want to get into the analog versus digital debate, they're all lossy, right? We've thrown away some information. But we're not there. Let's say that our capture is awesome. Let's say we're working at 96k, 24-bit. We've got lots of information. If we keep all that information, it's a lossless file. Lossy is-- and again, I'm just going to go through the presentation. You guys know all of this already, which is why it's so great. So lossy is the difference between zipping a file, and using something where when you unzip your 25-page paper you've just written, it's missing a bunch of letters and there's stuff spelled wrong. And again, for a lot of producers and engineers, they don't actually understand this concept. They assume that lossy compression is still OK, because you end up with a PCM audio stream at the other end. But it's reconstructed, and stuff is thrown away to actually make those files. And the reason being that if you zip an audio file, you save maybe 20%. If you use FLAC, which is optimized for audio, you can maybe save 50% of the space. But that's it. So if you do some quick math, and you're looking at a CD, let's say, which is at 44.1, 16-bit, you're talking about 10 megabytes for every minute of stereo music. Those are big files. You guys spend a lot of time trying to get files from one place to another quickly and efficiently. Those files are too big, especially up to a few years ago with the data pipes going to phones, all the mobile devices. There's no way you're going to send that much audio. So this is why the lossy codecs actually exist. So very briefly, Fraunhofer, which is based up here, developed first the MP3 lossy codec, and then more recently, the AAC codec. These are based upon the way you hear. If you know anything about the way your brain processes the information from your ears, your ears have just got lots of hairs in them. And Julie will probably talk more about this than I need to.
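[Editor's sketch] The "quick math" behind the 10-megabytes-per-minute figure for CD audio can be written out directly. The function names below are hypothetical, and the 20% and 50% savings are the speaker's rough estimates for zip and FLAC, not measured results.

```python
def pcm_bytes_per_minute(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM size for one minute of audio, in bytes."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


def size_after_saving(size_bytes, fraction_saved):
    """Size left after a lossless compressor saves the given fraction."""
    return size_bytes * (1 - fraction_saved)


if __name__ == "__main__":
    cd_minute = pcm_bytes_per_minute(44100, 16, 2)
    print("CD, one minute:", cd_minute, "bytes")
    # Rough figures quoted in the talk, treated here as assumptions.
    print("zipped (about 20% saved):", size_after_saving(cd_minute, 0.20))
    print("FLAC (about 50% saved):", size_after_saving(cd_minute, 0.50))
```

One minute of stereo CD audio works out to 10,584,000 bytes, which matches the round figure of about 10 megabytes given in the talk.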
But you basically are splitting things into different frequencies. All of that information comes up into your brain. Your brain then processes it, and decides, I don't need to listen to that, not going to pay attention to that, I hate that, screw that-- oh, that's important. And then that's what you hear. So there's lots and lots of information that's thrown away, which is why in a crowded room, you can concentrate on a conversation with somebody, because you start to mask things out. And the same is true when you're listening to music. There are lots of things that can be masked out. So through a lot of research, they decided, what can we throw away, right? The idea being that if we take care of getting rid of some of this information, then all of a sudden, we're dealing with a much smaller file. And if you compare file sizes, a decent bit rate MP3 is maybe 10% of the size of the uncompressed audio file. Yet in some listening tests, you might be able to actually do pretty well against the file it was encoded from. OK, so this is where I get very inaccurate, and people actually got mad at me about this. But that's OK, because I'm up here and you're back there, and you'd have to jump over the screens. So this is the way I explain lossy encoding to people. So if we go back to our paragraph of lots and lots of text, if I take out some of the vowels, everybody can still read this just as fast as they used to, right? The idea is your brain is predicting what should be there as much as it's taking the input of what actually is there. So if we look at the word "mastered" in that first line, as soon as you get to the M and you see the "stered" after it, your brain has decided there's probably an A there. There's room for an A there. There's an A there. It fills in the blank. If you have a tiny little smudge on the page, your brain is all about it. That is an A. Absolutely. Whereas on its own, that smudge is nothing. It's a smudge. 
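[Editor's sketch] The vowel analogy can be tried on the talk's own phrase. The sketch below drops either some or all vowels from a string and reports how many characters are saved. It is only a toy for the text analogy, which the speaker himself calls inaccurate; it says nothing about how MP3 or AAC actually decide what to discard. The function name and the `every` parameter are hypothetical.

```python
VOWELS = set("aeiouAEIOU")


def drop_vowels(text, every=1):
    """Drop every `every`-th vowel; every=1 drops all of them."""
    kept = []
    seen = 0
    for ch in text:
        if ch in VOWELS:
            seen += 1
            if seen % every == 0:
                continue  # this vowel is thrown away
        kept.append(ch)
    return "".join(kept)


if __name__ == "__main__":
    phrase = "mastered mix of the recording of a song with overdubs"
    for every in (3, 2, 1):
        shorter = drop_vowels(phrase, every)
        print(every, len(phrase) - len(shorter), "chars saved:", shorter)
```

Dropping only a few vowels saves little space and stays readable; dropping all of them saves the most and is the hardest to read, which is the threshold trade-off the analogy is pointing at.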
So that's the basic idea, is finding what can we throw away, and still be able to read as fast as we can? Or, listen and enjoy the music without having to figure out what it was supposed to sound like? So the idea being that if I only take out those vowels, we don't save a whole lot of space. If I take out all the vowels, now we're really starting to save some space, and we can compact it down, but I can no longer read this, OK? So somewhere is a threshold. The problem is when you're reading, you have very discrete chunks of data. You either know what that word is, or you don't. Maybe you can fill in a word from the context around it, but that's kind of as far as you can go. When you're listening to music, at some point it just sounds bad, and you don't really want to listen to it anymore. Sometimes it sounds so bad that it's kind of crazy and it sounds like it's under water, and more like whales than music. But until you get to that point, it's very hard to say, yeah, OK, we compressed too much, because you could put someone in the room, and especially if they know the song, they'll fill in some blanks on their own, and they're like, yeah, I like this song. It's all good. So the problem with audio is you go from this analog sine wave-- which no matter how far we zoom in, is still infinitely varying. We capture it, we compress it, we send it off, we reconstruct, but we're starting to reconstruct something that's a little more stepped. Now again, this will get smoothed out by things in both the circuitry and also by your ears, so there are lots of things working to help you out in reconstructing this waveform along the way. But you compress too much, and then you start getting to things that start to not really sound like sine waves, or they've got so many harmonics on them that you don't hear them as a sine wave anymore. And at that point, you're listening to something different than what you started with.
And I think that is more akin to someone who kind of sucks at art, copying paintings, and selling it to you, and like, yeah, I'll put that on my wall. Now for $10 and I can download it? Maybe that's a trade-off you're willing to make. But in terms of taking the art that this artist has made, and saying, this is my record, and I love it, and it makes my mom cry, at some point you're going to send them such a low bit rate file that their mom's not going to cry anymore. And that's a drag, because at that point you've lost the point of the music, right? It's art coming through speakers. It's emotion coming through speakers. So what can we do as record makers, and then what can we do as people who get that music out into the world to help people listen to it? And the great part is, I would assume that everybody in this room listens to music recreationally. Let's start with the hands of people who don't listen to music ever. OK, so not only are we in charge of making this music and getting it out there, but we also consume it, so we want to make products that we actually like, which with a lot of things, people don't actually buy their own products, whereas this is sort of the ultimate consumer product, because everybody's into it one way or another. So going back to the actual consumer formats. Within the lossless category, you've really only got two choices. You have CDs, which are dying a very quick death, which are set at 44.1 16-bit audio, right? Then you've got what is called high res. And this is a term that people can argue about. All it means is anything better than 44.1 16-bit, OK? So when the Beatles re-released their catalog, I dunno, six years ago, something like that, there was a version you could buy on a USB stick which was 44.1 24-bit. That is high res audio, because it's higher than a CD. So that's what the term means out in the audio world. Now for me, I like to think of high res being up at 96k or something like that.
But in terms of consumer audio, that's what you get. Now in terms of buying high res audio, there are very, very few options. There's HDtracks, who will sell you things to download, and there's this crazy Java file. OK, has anyone bought anything from HDtracks in this room? So a few people. Is there anybody who thinks that it's so easy to download and play back this stuff that everybody should be doing it? OK. Got a couple. So there are a lot of things involved, and I'll talk a little more about what I have set up here to play this stuff back. It's hard to get the high res music, and it's hard to play it back properly. It's easy to play it back wrong. Anybody can do that. Just throw it in iTunes or any other music player, it'll play back wrong, and you're all good. But you're getting into transcoding, and things that you don't really want to get into. But anyway, that's what you've got for the two viable sort of ways you can get lossless audio. There are a couple others that, once we start listening-- excuse me-- once you start listening that I'll actually show you, which are kind of cool. There's high res streaming starting to happen, adaptive streaming. It's really awesome. OK, then we get into the lossy formats, and those files are basically MP3 and AAC, which are the two Fraunhofer codecs-- AAC having not necessarily superseded MP3, but just coming after. I think Robert from Fraunhofer would argue that it supersedes it. But obviously, there's tons of stuff still coming out on MP3 as you go. Depends how you encode things like that. Then there's Ogg Vorbis, which other than Wikipedia, I don't know much about it. Is it that it's open source? OK, so it's the open source encoder. There you go. But of course, there are open source MP3 decoders, which skirt Fraunhofer's license. Because if you get the LAME encoder, you're not paying them, either. So I don't know. That's vague. Yes? AUDIENCE: It's totally patent free, as well, but that's debatable.
ANDREW SCHEPS: The Ogg Vorbis? OK, so Ogg Vorbis is patent-free, which I guess would be the main difference. Because if you can build yourself an MP4 encoder that's open source, you're getting around-- anyway. Robert and I had a very long conversation about this, and he was awesome. He was very, very good about this. I thought he was going to kill me, but he was great. OK, so if we actually start looking at the services themselves, this is where for the producers and engineers it's a big, big deal, because this is the stuff where they don't necessarily understand things. I mean, they understand, but it's the stuff you know but you don't know. So the CD and high res are both, I'm going to say, WAV. You can buy it as FLAC, but that's just compressed WAV. There are AIFFs and things floating out there. But WAV is the most robust and the most prolific form of uncompressed audio. Everything else is not WAV. OK, so it's either-- all right, first of all, who here is from the Play Music? OK, I need an answer, because I have scoured your website, and it says it plays up to 320 kbps files. So what format, and what does the up to mean? I can't-- is that an NDA thing? AUDIENCE: [INAUDIBLE] ANDREW SCHEPS: So it's MP3s. AUDIENCE: [INAUDIBLE] ANDREW SCHEPS: And I'm assuming it's scaled, so as you test bandwidth, do you go-- I'm going to guess, 128, 256, 320? AUDIENCE: 192, 256. ANDREW SCHEPS: OK. So three tiers topping out at 320. OK, I couldn't-- and this is part of the problem of looking for this stuff. And I don't think-- and you can correct me if I'm wrong-- I don't think anyone is intentionally being obscure about this. Maybe you are. Are you being intentionally? Are you obfuscating? I love that word. Maybe you are a little bit. OK. So-- yes, sir? AUDIENCE: If people in front can move maybe towards the back of the room, we're going to be playing stuff out of those speakers. ANDREW SCHEPS: Yeah, that's going to hurt.
I think what we can do actually is, what's going to happen is at about 10 to 5:00, Julie's going to speak, because she's got a presentation about what she's doing. And we're technically sort of 4:00 to 5:00, but we also have the room to 6:30. So what I'd love to do is we've got 15 minutes, I'll finish going blah blah blah. We can maybe do some questions where you guys kick my ass. Can I say ass on this? It's internal, right? You can kick my ass. And then, we'll break for Julie to speak. And then, we'll do the listening, and people who have to go can go, but then we can also shove people into the middle of the room, because you guys are going to get killed. I mean, I'm not going to have it crazy loud, but still, you're going to get killed a little bit. OK, so finding all of this information. OK, how many people from YouTube? Do we have anyone? OK, so we got a couple. Finding out the information on what happens with the audio on YouTube was not that difficult, but it was also a little odd in that-- so does everybody in the room know why there are two bit rates, and everybody in the room know when you get which one? OK, there you go. So here's the problem. It's tied to the video rate. There's no setting that says, give me good audio. There's only the setting that says, give me good video. So basically-- and you can correct me if I'm wrong-- 720 and 1080 give you 384. Everything else gives you 128, OK? Here's the problem-- a lot of people can't afford to make videos for every song on their record, and a lot of people who buy records and then really like a song and want to upload it to YouTube don't make a video that's HD for that song. So you upload static art work, or you upload lyrics, or you upload a picture of your dog-- or cats. Cats are the internet, right? So it's kittens. But unless it's awesome footage of a kitten, nobody is going to switch to HD. Nobody. 
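[Editor's sketch] The rule as the speaker states it, with the audio bit rate following the chosen video quality, amounts to a two-way lookup. The function below simply encodes the numbers quoted in the talk (384 kbps for 720 and 1080, 128 kbps otherwise). It is a hypothetical illustration of the speaker's description at the time, not an API or documented behavior of YouTube, and the function name is invented.

```python
def audio_kbps_for_video_height(video_height):
    """Audio bit rate the talk attributes to a given video quality setting."""
    hd_heights = (720, 1080)  # the two HD settings named in the talk
    return 384 if video_height in hd_heights else 128


if __name__ == "__main__":
    # A static-artwork upload watched at a non-HD setting gets the lower rate.
    for height in (240, 360, 480, 720, 1080):
        print(height, "->", audio_kbps_for_video_height(height), "kbps")
```

The point of the lookup is that there is no input for "audio quality" at all: the only way to reach the higher audio rate, in this description, is to pick an HD video setting.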
And it doesn't default to HD, so nobody hears your music at 384, which is, in terms of pure bit rate, the highest of the lossy formats available, period, and nobody hears it. Yet from numbers I've seen, and I'm sure my NDA won't cover this because I haven't even signed one, but from numbers I've seen, 80% of music discovery happens on YouTube. Somebody says, hey, have you heard vrr, and I go, I don't know, let me search for it. And you put it in, and you listen to it on YouTube. So 80% of the time, people are being introduced to music with one of the lowest bit rates on the board, when the highest rate on the board is actually there, though not available for most of the videos, because people aren't bothering to upload HD video. And should we just finish up the YouTube thing right now? OK, and this is something I'm hoping-- I know I'm speaking with some of you tomorrow, but I would love to get-- my email address is my name, andrew@scheps.com. Hunt me down, find me, because I'd love to discuss this stuff. Because another thing is going through all of the YouTube documentation, there's nothing that I could find about audio upload guidelines. OK, so there are no audio upload guidelines on the YouTube site. Zero. The problem is, of course, what you're ending up with are 128 and 384 AACs, but most of the time, people are uploading lossily compressed audio. So you're transcoding. Is there anybody in the room who disagrees that transcoding between two lossy formats is the worst sounding thing you could ever do to a piece of audio? Because we'll fight later. OK, there are amazing sounding lossy encoded files. 384 AAC, I would defy most people to sit in a room, do a double-blind test between 384 AACs properly encoded and CDs, and tell the difference. I would defy anybody to not tell the difference with a 384 transcoded AAC that came from any other lossy format. It sounds terrible. This is one of the things we're hoping to move forward with.
So anyway, this is one of the problems with comparing the services. But the big problem that a lot of the people I speak to normally have is they don't know how to compare the 44.1 and the 256, and zero consumers know how. 256 is way more than 44, right? I rest my case. But when you're trying to actually educate people about just what this is, you need to come and sit in a room, and have me go blah blah blah, and show you a chart. So the idea is that, again, as with any scientific thing, you've got to look at the units. And the kilohertz and bit depth is totally different from kilobits per second. Now the cool thing is that all of the lossy formats are actually very transparent with their bit rate. OK, this is, again, where I make records. I don't work with computers all the time. I'm rounding. There's no 1024. The numbers are very round, because it's easy for us people to understand. All right, so basically, I take your bit rate, I put three zeroes on the end, and that's how many bits per second I get to represent my stereo piece of art that makes my mom cry. Then actually do the math-- 44,100 times 16 times 2, and we're at 1.4 million on a CD. Now obviously, the codecs, the lossy encoders, are very smart. So it's not like just take a percentage, and that's how much worse it sounds. I absolutely get that. But we're talking about a very big difference, and then you look at the 192 32, which is the highest I've seen coming off of HDtracks. And you're up to 12.2 million. OK, the problem being in the grand scheme of things that that's really not a whole lot compared to the analog we started with. So again, we're not going to go into the analog versus digital debate, but how many people here like vinyl? How many people actually look to see if the vinyl's done from the analog masters instead of digital remasters? Get some old Blue Note. Even just compare it to some of the reissued Blue Note. And it's kind of astonishing. It's like you're there.
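As a rough sketch of the arithmetic described here, the comparison can be worked through in a few lines of Python. The function names are hypothetical and chosen only for illustration, and the round-number convention (1 kbps = 1,000 bits per second, no 1024) follows the speaker's own simplification.

```python
def pcm_bits_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Raw bit rate of uncompressed PCM: samples per second x bits per sample x channels."""
    return sample_rate_hz * bit_depth * channels


def lossy_bits_per_second(kbps: int) -> int:
    """Nominal bit rate of a lossy stream: 'put three zeroes on the end'."""
    return kbps * 1000


cd = pcm_bits_per_second(44_100, 16)        # CD: 44.1 kHz, 16-bit, stereo
high_res = pcm_bits_per_second(192_000, 32)  # 192 kHz, 32-bit, stereo
lossy_256 = lossy_bits_per_second(256)       # a 256 kbps lossy file

print(f"CD:            {cd:,} bits/s")
print(f"192/32:        {high_res:,} bits/s")
print(f"256 kbps file: {lossy_256:,} bits/s")
print(f"CD vs 256 kbps: {cd / lossy_256:.1f}x the bits")
```

Putting everything in bits per second is only a way to compare like with like. As the speaker says, the ratio is not a measure of how much worse a lossy file sounds, because the encoders are smart about what they discard.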
OK, so this is where we stop talking about numbers. And now, I want to go through this study very quickly. This is sort of an older study. Because of course, the thing is, does anybody care? If nobody cares, then we don't need to care, right? If this doesn't make a difference, and it's all just a bunch of numbers, I don't care. The idea is I want people to spend enough money on the music that I work on that the artists I work with can not take a day job so they can keep making records. And I want to be able to afford to keep making records, and not necessarily take a day job, but if you've got something for me, we'll talk. OK? That's the idea. OK, we're not all looking to be on MTV Cribs, because we're not. OK, but if people don't care, then by all means, make the files tiny, because then everything else about the consumer experience is awesome. Instant on, very fast, move it from one place to another, fit 25 bazillion songs on anything that fits in your pocket. That's all good. OK, now Harman, who were actually nice enough to send up this pair of speakers we're going to listen to later, this study is from a little while ago to be fair. But they decided, we need to actually know if people care. Because they don't care what the outcome is, but they need to know the answer to that question because they make equipment for people to listen to music. That's what they do. So they need to know, do we need to be really concentrating on stuff that plays back lossless audio, or even high res audio? Or should we be building better MP3 hardware decoders in, and just deal with that? Should we actually limit the bandwidth? When we're starting to talk about wireless technology-- I mean, if you look at Sonos and RedNet and a lot of the really cool networked audio and wireless audio technologies-- where do we need to cap our bandwidth?
These people need to know what people like, but they don't actually have a horse in the race, because they're just going to build the gear to play it back. So Dr. Sean Olive, who works there, who's a pretty amazing guy, and he's got labs that have all kinds of stuff in them. They've got stuff that looks like it's out of an amusement park, so when you're A-B-ing speakers, they hydraulically move into the same place. You don't have the differences in placement when people change speakers and things like that. So what he decided to do was get young people, because there's a lot of sort of anecdotal evidence that young people not only don't care-- but this is the crazy one to me, and if you know anything about neurology and cognitive listening, it's even crazier-- but that kids these days have only heard MP3s, so they actually prefer them. Again, if anyone wants to discuss that later, I will talk about that for hours, because that's the rabbit hole I've been down for the last two years. But I'll just say that that is pretty much categorically not true. So this study from a little while ago was meant to prove this. So they got a bunch of young kids these days, or in those days, both high school and college age students. The only thing that's really important here-- well, there are two things. One is that, for whatever reason, they were mostly male students, as opposed to female students, studying audio, which is kind of a drag at all times. So that's just the way it works. The other thing is you see this last column, this level of training-- all this is is that these students were involved in a recording program, or they had taken a comparative listening class or a critical listening class, or something like that. So they were aware of audio quality as a thing, as opposed to just being someone off the street who really has never, ever thought about it, OK? So that's the breakdown. Here is what they did.
And I-- all it means is they knew what they were doing, and it's scientific. OK, so it's true double-blind listening. These kids don't know what they're listening to. They come back multiple times, and they listen. OK, now this is between 128k MP3, which was what everybody was selling when they did this study. And you think, my god, that's the Dark Ages, but it's really, what, four years ago? Maybe five? Maybe five. That's what you could buy. So between that and CD. So we're not talking high res HDtracks downloads. 70% of the time, those stupid kids liked the CD. And this isn't even a what sounds better. This is a what do you like listening to? Which one do you want to hear? All right, the important part of this is going back to this sort of threshold of where does my mom cry, is what happens emotionally? So part of one of my theories is, if you go back to that huge block of text, and you take out a bunch of vowels, at some point it's harder work to read. So while you will still understand the words, and enjoy the story maybe, you will be less emotionally invested because you're doing stuff. The same thing is true, I believe, when listening to lossy audio, because while your brain might throw stuff away, it's expecting it, and your brain gets pissed when the stuff doesn't show up. So you can create anxiety, you can create depression at very low levels, but at the same time, it's also filling in the blanks for you, right? You're taking away lots of acoustic things from the music. Some of the first things to go are reverb tails and acoustic cues. So your brain is recreating. Therefore, it becomes more of an active process to listen. Now while that may not be that much of an issue, one of the anecdotal things that really sent me down this road is that my daughter had a friend in high school who was interning with me in my studio. And great drummer, really musical kid, listens to music all the time.
And he showed up at the studio in the afternoon to work on something, and he came in, and he said, man, been listening to music all day and I'm exhausted. And I don't know how many people that sounds absolutely crazy to, but that to me is crazy, because I would wake up in the morning and put on records or cassettes-- even that I had recorded from a microphone in front of a speaker, so not the highest quality audio in the world-- but I would listen for 15 hours, and my parents would yell at me, and then I would listen to headphones in bed for a while. Even recently, I've gone to friends' houses who have these amazing set-ups, and we listen to vinyl all day. And as my wife can attest, I was down at this guy's house for 15 hours, and I got home at 1:30 in the morning and put on a record. I was not exhausted. When I listen to some of the streaming audio services, though, I get tired. I get a headache. I grind my teeth. And it's not an instantaneous thing. It is not an, oh my god, that's killing me and making my ears bleed. But it is, in terms of a long-term commitment, and I would also argue in terms of a long-term connection between people who hear the music and the artist. And one of the most important things with artists is that people actually connect with them on an artistic level. And that happens by them experiencing some of the emotion that went into the song. And it could be as simple as a lyric, which means you're in pretty good shape no matter what. But it could be because of the chord changes and the instrumentation and the subtleties of the performance. And when we start listening, you will, I believe, start to hear some kind of not subtle differences. We put the B back in subtle with some of the things that change when you listen back to back between some of the lossily encoded music and the lossless music. In terms of when you get to the second verse of the song, do you feel like, musically, I've already heard this, let's move on? 
Or do you feel like, god, what's next in the story? And man, there's a new guitar part. And these are subtle things. So if you love an artist, then it doesn't really matter. You will love them even if it sounds terrible. But what if it's somewhere in the middle? What if you're kind of on the fence? What if the audio quality actually determines where your threshold moves as you're listening as to whether you're going to listen to the next song on that record, or even make it to the end of the first song? And I know that part of people not listening all the way through to songs and skipping around all the time is just due to changes in consumer habits, and we're all multitasking more, and things like that. But for the people here who listen to vinyl, I think you may not always flip it to Side B, but how often do you lift the needle in the middle of Side A-- unless you're DJing a party-- because you're just kind of tired of it, and now I want to move on? You'll generally have the experience of Side A. So you're getting 20 minutes straight of something. When you're just listening online, that doesn't happen so much. There's a lot of skipping around, and a lot of moving. But what I've got here-- I went to a few of the different labels. I've got 18 songs and a bunch of different genres, and I'll put up just a list of them. And you guys will DJ. And also, we can talk about anything. If anyone has questions or wants to point out stuff I've got wrong, I absolutely want that to happen, as well. And we can do that while we listen, things like that. And I have them in as many formats as I could possibly have them in, including-- oh, we didn't make it to this slide. Sorry. Google Play Music, I've got my playlist from you guys. So hopefully because I'm on your ridiculously fast, free Wi-Fi, we'll be getting 320 the whole time I'm sure. But also, then I want to show you something called OraStream, which is adaptive based on bandwidth, which is awesome.
And we'll talk about other stuff. Roundabout. OK, really quickly. The way I'm playing the stuff back is I'm using my Mac. I am playing out of a program called Decibel, which is just a very, very simple music player. And the only thing that it does is it switches the sample rate of the hardware to match the files. So that way, we're not doing any sample rate conversion in software on the way out of the computer. We get it out to the converter at its native sample rate. It also crashes a lot. It's a $30 program. But it generally works. I'm using this Prism Orpheus, which is a one-rack-space, eight-channel audio interface. So it's amazing for recording, but I'm using it because it gives me a volume knob on the front. I'm just using it stereo going out. The reason I'm using it, as opposed to something a little more simple, is because some of my source material is at 192, so I need a box that'll go up to 192 without putting something else in the middle. I've tried as hard as I can to make sure that all of these different files are from the exact same master. So the same-- remember my font joke from earlier? Sometimes, that happens multiple times to a release. Roundabout is one of the examples. Sorry, we will listen really quickly. But I needed to say that Roundabout is one of the examples of something where it is actually from a different master, the high res version, because it was from a DVD audio release from, I don't know, eight years ago-- way back in the stone ages when that was a format for about eight minutes. So that is actually a different master. But still, it's a pretty astounding difference. Now I will also say-- and we can stop this, but anyone who was at the talk I gave at Berkeley knows that at some point, Robert from Fraunhofer made me stop playing things off YouTube because he said it's unfair because it's all transcoded, and it made him look bad.
And I said, OK, that's fine, but I wasn't sure if everybody in the room kind of understood what had just happened, that we just took the biggest player in music discovery out of the discussion completely because it wasn't fair to the people who developed the codec that encodes the music that's on this service. So I will play-- and that said, I play official videos if I can find them. But there aren't always official videos. So let's listen to some Yes, and would you like to pick a format? Do you want to go low to high, high to low? AUDIENCE: High to low. ANDREW SCHEPS: High to low, OK. So we'll actually go down through CDs, because you'll hear a little bit of the difference between the masters. So this is the 96 24 taken off the DVD-A, or whatever it was. AUDIENCE: Quick question for you. Are you relying on the digital-to-analog in your MacBook? ANDREW SCHEPS: No, I'm going FireWire to the Orpheus, and the Orpheus is the D to A converter. And it's a great sounding converter. The Prism converters are-- some people say that they're the best converters out there for music recording. In the UK, it's almost exclusively what's used for all the orchestral scoring guys. They'll have 80 channels of the Prism converters. And then, we're just going straight into an amplifier to these speakers. And that's it. Yeah? AUDIENCE: What are you doing to match levels? ANDREW SCHEPS: I'm fudging it. OK, so this is not a scientific test. This is an anecdotal test. Unless I unplug the monitor, which we can do as well, you're going to know what you're listening to. So I'll try and match levels as best I can from up here, but it does vary a little bit. So I'll always make the high res stuff louder, because then you'll like it better. AUDIENCE: How much power are you using to drive the amplifiers? ANDREW SCHEPS: It says it's 4 by 350. So each speaker is bi-amped, so we get 700 watts a side. So I'm barely cranking it. You let me know how loud to go. And I apologize again. Yeah?
AUDIENCE: [INAUDIBLE] volume [INAUDIBLE] digital in this thing? ANDREW SCHEPS: In this? No, it's actually an analog control on the output, which is bizarre. That's what they tell me. You can hook it up in lots of different ways. There's an audio path within it. The way it is supposedly hooked up is as analog. But if it is digital, I have to be able to turn it up and down. I don't have a choice. There have been times when I actually had an analog control room section instead, but it was a lot of gear to bring up here. So we're going to use that. Again, everything is going through that. Everything is constant except the files themselves. AUDIENCE: Is it worth turning off the air conditioning, or will that not matter because of the volume? ANDREW SCHEPS: I think we'll get over the top of it, yeah. I mean, again, this is not the most-- here's the crux of this. And I do want to get to music for those of you who have to leave. But the crux of this is that you could set up audiophile double-blind A-B tests-- A-B-X tests-- and be really precise about this, and see what you can tell the difference of. But I think especially as we jump from ends of the spectrum, it's not subtle. It's huge differences, and then it's a question about whether it matters to you. I mean, who cares if you can hear the difference? If you like them both, then fine. Then you're good with the small files. I'm not trying to evangelize one particular type of file, or to convince anybody that you have to listen this way, or you're missing out on the music. My theory is that once you get to a certain point, you're no longer kind of interfering in the emotional response. But in terms of an audiophile, short burst listening test, this is more fun than anything else, because it takes a lot of work to actually find all these stupid files and put them in one place. So that's the fun of it is I wasted days of my life so that we can sit here and DJ. OK, so that said, let me know how loud-- OK.
[MUSIC PLAYING] ANDREW SCHEPS: All right, and here's CD, which-- again, a different master, but it's more to set for when we listen to the other formats. So this will be the same master as all the other formats. OK, so that's pretty different. But it's also a different master. So let's for fun, because it is fun. This is when I'm glad I'm behind the speakers. Sorry. Let's just listen to some more stuff, and then we can talk more, because-- AUDIENCE: What resolution were you playing that at? ANDREW SCHEPS: Well, that would have been-- is it 128 AAC? Because there was no high def video. AUDIENCE: OK, so it was an old upload [INAUDIBLE]? ANDREW SCHEPS: I guess, yeah. I mean, or it's a static artwork upload, so they didn't bother uploading it in HD. AUDIENCE: [INAUDIBLE]. ANDREW SCHEPS: Yeah. OK, so let's do Coltrane. So this is the same master, OK? There have been reissues and things like that of this, but I know for a fact because I got this from Blue Note that this is the same master in all formats. OK, so where do we want to start? You guys tell me. So that's A. We'll do A,B,C. What do you think about that? Or do you just want A, B? AUDIENCE: A, B, A, B ANDREW SCHEPS: Just A B? Well, hold on. A, B or A, B, C? A, B. OK. AUDIENCE: Are you sure [INAUDIBLE]. ANDREW SCHEPS: Yeah, that happens a lot. And this is why, again, we had to stop going to YouTube as any of them, because a lot of them are either swapped, or depending on the transcoding start to collapse into mono. Like the Beatles stuff is mono, but it's not the mono mixes. So yeah, that happens, but that's-- AUDIENCE: [INAUDIBLE] resolution. You can't have very high good placement [INAUDIBLE]. ANDREW SCHEPS: Oh yeah. Yeah, I mean, with the CD. OK, so that was A and B. That was YouTube versus 192. And so again, it was the low resolution possibly transcoded, even though that was an official Blue Note upload. 
But the problem is-- I mean, I'm sure you guys know, working at kind of a big company, that at some point, someone told the people at Blue Note, OK, now we're going to start doing our official YouTube uploads. And here are all the assets, and go ahead and do it. And that definitely filtered down to an intern who had to sit in front of a computer uploading for three weeks, because nobody who really knows what they're doing is going to spend longer than it takes to just point them to the assets. So their official uploads could've been completely destroyed. I mean, it's easier sometimes-- and this happens at HDtracks a lot, where they're sent something that they're told is 96/24 so that they can sell it, but the person who actually sent them the files didn't know how to get over the 2 gig file size limit, and the album was too big, so they just ripped a CD and sent it. And it happens. And then HDtracks gets in a lot of trouble, because there are a bunch of crazy audiophiles at home doing FFTs of this stuff. And also, depending on how it was recorded, there isn't necessarily anything above 20k. But if they don't see stuff at 40k, they're like, that's not high res. So there are lots of problems in the supply chain, as well as just the file formats, which is, again, why this is not meant to be a scientific test, and more of just an anecdote. Now if you want, we can stay away from YouTube, because it is, unfortunately, the most problematic. But-- AUDIENCE: Which one [INAUDIBLE] ANDREW SCHEPS: A was YouTube, and B was 192. Now an interesting thing to me-- with these speakers, I added some low end to tune this room very quickly before I came in. There's some thumping on that side that I'm hearing on the 192 which I don't really hear in the MP3. So you don't always-- like, oh my god, it's just so much better. Sometimes you uncover other things along the way. AUDIENCE: Do you have a non-YouTube [INAUDIBLE] ANDREW SCHEPS: Yeah.
We can do-- well, your stuff would be 320. So we could do 320, or we could do Amazon if you want to do that. Let's do Amazon. Well, I just told you, it's Amazon. I'll leave it up. OK, but here's Amazon. AUDIENCE: Can you do it, and then we vote which is which? ANDREW SCHEPS: Yeah. Yeah. Let me just play you some of the Amazon of the Coltrane, and then we'll go to a different song, and I won't say a word. Now I'm crazy, so I did some FFTs of some of this stuff. And one of the things that Amazon does-- because they're only selling 256 MP3s, that's what they sell-- and presumably to help their encoder-- because they're not getting 24-bit files, either-- they actually pretty much cut off everything above 15k. So that's band-limited to 15k on the way into the encoder, because if you don't have to bother encoding from 15 to 20, you've got that much more room to encode below. So that's their decision. Again, now I'm right between the speakers, so for me the imaging is a pretty obvious thing. The 192 is the only one where things are either on the left or on the right. And Rudy Van Gelder, who recorded this album, did not have a pan pot. It was a patch cord. It was either in the left or the right, and that's it. So as soon as you get anything that isn't discretely on one side or the other, you know it's part of the process of the encoding that has made things shift. And that's another way that a lot of the encoders work. And I don't know specifically the ones you use because you're writing your own encoders-- if you mono up stuff, it makes it much easier to encode. It's one audio stream, and it's identical in both channels. So you can save a lot of space doing it that way. And I'm sure that's part of the pre-encoding of a lot of this stuff, especially at the lower bit rates. So that'll happen.
And it's not that big a deal on modern pop stuff because stuff is everywhere, but any of the Beatles stuff, all the old Motown stuff, the Blue Note stuff, that is all discrete stereo, and it will change it completely. AUDIENCE: [INAUDIBLE] are you thinking about [INAUDIBLE] ANDREW SCHEPS: No. No, I refuse to. So here's my theory, is that I need to make my records sound as good as I can make them sound regardless of what happens afterwards. So then, when I realized what was happening afterwards, I asked the Recording Academy to let me come and talk about it. And they said, sure, we've been trying to figure-- because they've had this Quality Sound Matters initiative officially for a little over a year, but unofficially for the last 10 years. And they've had ideas about-- we're going to get buses and put awesome sound systems in them, and we're going to drive them around and play this stuff for people, and trying to come up with ways to let people hear what the difference is so that you can start to understand. So when I came up with this presentation as a way to do it, they were all over it and have allowed me to come and do it. So my idea is to find out what's actually important, and change it. I refuse to live with the crap, and just say, I got to make it work on earbuds, because in five years, it won't be earbuds. And the pipes will be bigger, and you guys will flip a switch, and it's going to be either uncompressed or barely compressed. And so now, I've changed my whole workflow to cater to something that goes away. And it's one of-- not to talk about your neighbors-- but it's one of the biggest problems I have with Apple conceptually, is that they will talk a lot about what they want to get from the labels and from the artists in terms of their ingestion, and they want 24 bit, and they want the high res. But if I master specifically for their encoder right now, in three weeks, if they say, bandwidth is awesome. We're going to start selling 320 AACs.
Well, now it's a new encoder, or they just update their encoder. All of a sudden, I'm making decisions based on things that go away. And I think it's a very big difference between the record making process and the consumer distribution world, and you can't make records for the consumer distribution world other than a lot of the analog limitations we used to have to deal with. Like you can't pan your bass off to one side if there's a lot of low end, and still cut vinyl. OK, like there are physical limitations to things, which I'm fine with. And AM radio-- they shave off the top and the bottom, and it's mono. OK, that's fine, I know what's going to happen. But in terms of taking some sort of encoding algorithm that's constantly being updated-- otherwise, some people in this room would be out of a job-- I can't work for that because it's a moving target. So my idea is if I make it sound great, it will survive the process better. And that, I've actually found is true. Like this Blue Note stuff sounds so amazing and so natural that you can start to hear things get hashy and it's a little more annoying and a little brash, and the panning isn't as wide. But musically, it's still pretty awesome, and it's OK. And it survives better. And strangely, a lot of the urban music survives better, because there's lots of separation between the instruments. Things are very discretely encompassed in terms of their frequencies and things like that. They're not sharing a lot of space. You don't have 15 microphones on a drum kit that are all making noise. So that actually translates better. And strangely, there is zero hip hop or R&B that I was able to get, other than the Esperanza Spalding record, in high res. It doesn't exist. CD is as high as it goes. They turn in masters that are 44.1-16, because they're building it on a laptop. And they're actually building their tracks with MP3s.
AUDIENCE: [INAUDIBLE] compressed not in bytes but making the lowest part of the music-- the softest one-- high. So if people start doing that [INAUDIBLE] what's the point in going to high res? ANDREW SCHEPS: Well, I would argue that even with something that doesn't have a whole lot of dynamic range, you will still absolutely hear the difference when you have a very lossily encoded file. You start to destroy things other than just the dynamic range, right? There's frequency content, there's panning content, there's the mono versus stereo content, there's depth of field. There are all of the cues that are being taken away, all the acoustic cues and reverb tails and things like that. And that will affect it even if it's super loud. I mean, there's this whole thing called the loudness war, which maybe you know about, but they just like-- I won that war, OK? I mixed "Death Magnetic," which was the album that everybody said was the poster child for things being way too loud. OK, so I won. Therefore, the war is over, we don't have to worry about it. [LAUGHTER] ANDREW SCHEPS: I spent weeks reencoding for iTunes and Amazon at that time to make those files work lossily encoded. So what happens is, as you start to get rid of dynamic range and things like that, you start to break the encoders. The encoders need some room to work. So I'm making it very difficult for that to work. And one of the things we found that worked great was turn the mix down 0.7 dB, period. Just let there be headroom that we never even use, because it's brick wall right there. We never get up to that last 0.7, but all of a sudden, all of the encoders sounded about 100 times better. When we got to give them 24-bit files for the last Chili Peppers record-- that was right at the beginning of the Mastered for iTunes project at Apple-- and the big crux of that project is give us 24-bit files instead of 16-bit files. That made a huge difference.
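To make the 0.7 dB figure concrete, here is a minimal sketch of what turning a mix down by that amount means for the sample values. It uses the standard decibel-to-amplitude relation; the function names and the toy sample list are hypothetical, and this is not a description of the actual mastering workflow used on any record.

```python
def db_to_gain(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def apply_headroom(samples: list[float], headroom_db: float = 0.7) -> list[float]:
    """Scale samples down so that full scale leaves `headroom_db` of unused headroom."""
    gain = db_to_gain(-headroom_db)
    return [s * gain for s in samples]


# Toy example: a brick-walled mix whose peaks sit at full scale (+/-1.0).
mix = [1.0, -1.0, 0.5, -0.25]
quieter = apply_headroom(mix)

print(f"gain for -0.7 dB: {db_to_gain(-0.7):.4f}")
print(f"new peak level:   {max(abs(s) for s in quieter):.4f}")
```

The multiplier is a little over 0.92, so the peaks land roughly 8% below full scale. That small margin is the room the speaker describes leaving for the encoder.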
So in terms of what you feed the encoder, it isn't just about the source material in terms of a sonic thing. Because I think there are lots of hardcore and punk albums that, from a sonic audiophile point of view, sound terrible. But they are so super exciting that people love those bands and they want to listen to them. And if you do a 128 MP3 of that album, what used to be hashy and exciting is now just hashy and noisy, and I think there are lots of people who wouldn't get into the band as much as they would even if they buy it on a cassette, which doesn't have anything above 12k on it, or something like that. So there are two very different aesthetic paths you can take when you talk about the music. And the problem is, it's not like with TV. Right, with TV, who is going to argue that a high def set looks worse than an SD set? Because you see it, and it's easy to A-B. Some people like the artifacts and you're used to things like that. And if you have a bad digital set that pixelates, there can be issues. And if you look at bad material on an HD set, it looks terrible. OK, so all those arguments are true. But let's say you have a well-captured still image, and you show it on these two different TVs. One of them has way more information about it and it just looks a hell of a lot better, the other one does not. Whereas with audio, people don't trust what they hear. People think you have to be trained to like something better when you just talk about audio formats. And people believe what they're told, period. I mean, nothing influences your opinion about things more than me telling you how great it is, right? If someone's about to play you something by a certain band and you like them, and they say, I can't stand this band, check it out, you will not like that band. If they say this is my favorite band in the whole world, you're going to try really, really hard to like that band because you like that person.
So there's so much that goes into liking music that has nothing to do with any of this, but it also has everything to do with it, because I really believe that there are just thresholds. And for every person listening to a new piece of music, there's a threshold of, am I going to like it? Am I not going to like it? And the more you can give them something that sounds true to whatever the artist decided was done, the lower that threshold will be, and the easier it is to connect. So regardless, let's listen to some stuff, unless you want to keep talking. AUDIENCE: So when did you do that, the Death Metallic? ANDREW SCHEPS: The Metallica mix? "Death Magnetic"? That was-- I don't know, six years ago, seven years ago? AUDIENCE: What made you [INAUDIBLE] ANDREW SCHEPS: What made me destroy it? AUDIENCE: Yeah. ANDREW SCHEPS: OK. That is a conversation that is not-- I mean, really the only thing I would say about that is I have nothing to say about that. The idea that me as an engineer could mix a record in such a way that it was destroyed, but everybody would be OK with it and let it out into the world is just crazy. There is a band involved, there are producers involved, there are plenty of people involved who said, this is awesome. Now during the process, whether or not I made quieter mixes to A/B, and let them hear differences and whatever, I may have done, but it's irrelevant. It's irrelevant. What happens is at the end of the day, that album sounds the way it does because that's what the band and the producer thought was great. And there's some people who really don't like the way it sounds, but there are a lot of people I've talked to who think it sounds awesome. It's super aggressive. It's not the most hi-fi thing in the world, but a lot of stuff I do is not hi-fi. But I hope that it's emotionally awesome, and makes you love it, and makes you want to either kick a hole in the wall or cry or call your mom or whatever it is that we're trying to get across. 
So this discussion in terms of what you do with that file afterwards is also very different from the audio quality in the audiophile sense. There are lots and lots of records that if you go to one of the big consumer electronics shows where they have a million dollar set-up where a speaker this size will cost you $85,000 each, and has iridium tweeters. And you've got a stand for the turntable that costs more than your house-- that kind of thing. You can only listen to audiophile stuff on there, right? And so what are you going to hear? You're going to hear a few jazz records and Steely Dan, and that's kind of it. And those are great records, and they're also amazing sounding records. But if you put on something like the Metallica record on there, at that point, maybe some of that's wasted. But it's not because you're putting on a low bit rate MP3. OK, another thing just anecdotally-- and we will listen more. I'm sorry. I will talk about this for days. But while I was putting all these files together, I had this massive folder of files, and I'm keeping things organized, and making sure things are named. And I was just listening on my laptop speakers. First of all, I'm letting the OS do sample rate conversion in real time. Right, whatever QuickTime has, that's what happened, so it can play back at whatever sample rate the stuff was set to, which is probably 44.1. And I'm just listening to the first 25 seconds of each song, making sure they're all the right song. I can tell the difference on my laptop speakers. So I bring this setup because it's cool, and we've got a room this big. And if I played stuff on my laptop, no one can hear it. So this helps. But if you have any sort of decent kind of system-ish that has some good DSP on the back end to make it sound pretty good, and it's got a little bit of power so some of the dynamics come through, I think you absolutely will hear the difference. And even more than that, you'll feel the difference. 
One of them is just more fun to listen to. But that's a discussion that could go for weeks, and there's no necessarily right answer. But the good thing is, I won the war, so the war is over. So now we can all make quiet records again. Yeah? AUDIENCE: So there's a new standard from the ITU to set record loudness levels. Are you following that at all? ANDREW SCHEPS: Well, what those are as far as I understand, and correct me if I'm wrong, that's what's used in the Apple Sound Check, as well, where you scan a record to say how loud it is, and then it uses it to even out the level if you take advantage of that in whatever playback system you're using. Is that-- OK. So basically-- AUDIENCE: [INAUDIBLE] a little different from the ITU's standard. So there's different, competing implementations. ANDREW SCHEPS: Again, I don't. I mean, if we got into my mix process-- which I could talk for a different set of hours about that-- my mixes are what sounds good. And sometimes, the level of the mix really doesn't matter. But a lot of times, it does. And I mix on analog equipment which has voltage rails. So as I hit that rail, I don't just cut it off. It smooshes it off, and it takes a while to smoosh it off completely. And different amounts of that smooshing differ. And it's just because I'm in the analog world, so clipping and harmonic distortion are your friend until they're not your friend, and something catches on fire. So when I'm mixing something like the AFI record I just mixed or the Black Sabbath record, those mixes are going to be loud because they don't really sound right until they're loud. But when I make something like [INAUDIBLE] which is on my label or a jazz record-- I mixed a Jeff Babko record last year-- those end up being much quieter mixes, because I want it to be more open, and the dynamic range really helps the music. So for me, it's much more a feel thing. And then I find out later that I've kind of screwed up, and the mastering guy gets angry. 
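The difference between cutting a signal off at the rail and "smooshing" it off can be sketched numerically. This is only an illustration: the tanh curve below is a common textbook stand-in for gradual saturation, assumed here for simplicity, and is not a model of any particular analog console.

```python
import math

def hard_clip(x: float, rail: float = 1.0) -> float:
    """Digital-style clipping: anything past the rail is simply cut off."""
    return max(-rail, min(rail, x))

def soft_clip(x: float, rail: float = 1.0) -> float:
    """Gradual saturation: the output bends toward the rail but never reaches it.

    tanh is an assumed stand-in curve, chosen only because it is smooth.
    """
    return rail * math.tanh(x / rail)

# Push the input progressively harder and compare the two behaviours.
for level in [0.1, 0.5, 1.0, 2.0, 4.0]:
    print(level, hard_clip(level), round(soft_clip(level), 3))
```

At low levels the two curves are nearly identical; as the input is pushed harder, the hard clipper flattens abruptly while the soft curve keeps compressing a little more with each increase.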
And then I will send the quieter mixes and say, if you get it to sound as good as my one that you say is too loud, then we're good. But if it doesn't feel as good, then we have to go with my screwed up mix. So I'm not the best person with that. There are a lot more technical mixers than me who adhere to things more than I do. I'm kind of a disaster with that. Yeah? AUDIENCE: What's your take on [INAUDIBLE] Pandora, [INAUDIBLE]? ANDREW SCHEPS: OK. So streaming, I mean, the file types are the same, right? And on that chart, I had bit rates for the streaming files. So I have no problem with streaming versus download. I mean, there's a whole other conversation which is about making the music business still exist. And that's actually a really important conversation, and encompasses way more than just this. This is the esoteric, I think this makes a difference part. Then there is the recording album credits part, which is a discussion I'm hoping we're having tomorrow a little bit-- implementation of that, getting consumers to interact directly with artists more, because that's what creates the relationships that last so that I don't have to go get another day job. That's my goal in all of that. In terms of just the audio, though, the streaming and not is exactly the same thing. So actually let me plug the monitor back in. And let me show you one other thing, which is a technology. It's called OraStream. Does anyone in here know about OraStream? So we've got one, because you were there last time. Does anyone here know about the MP4 SLS format? It's another Fraunhofer encoding format. So it's meant to be an archival strength format. So what it does is it will wrap audio in its own metadata, and preserve whatever the native bit rate and sample rate is of that audio. But one of the byproducts it has is you can do what they call truncating of the stream to produce in real time any bit rate stream you want. 
So what OraStream have done is they've come up with all of the server side and back end technology to do pinging of your connection in real time, and to granularly scale. So the Google Play Music-- you've got three bit rates. You check out how fast people are able to get the stuff, and you give them the fastest one you think they can get without any buffering, right? Because buffering sucks. No one wants their music to stop. But that's what you do, right? And you will skip between those levels. So if when you start playing a song, you're in a black hole, even though you're listening on your cell phone. You're in a parking garage. You're going to start off at a very low bit rate. Now are you constantly pinging, and you'll up the bit rate as soon as you can? Or do you wait for the next song? AUDIENCE: You wait for it. ANDREW SCHEPS: You wait for the next song. OK. And this is technology-- these guys, I mean, they probably had meetings here, I don't know, with anyone in the room. But originally, it's a few guys from Singapore who developed the technology. And they were hoping someone else would just license it, because they thought it was awesome, and why wouldn't people want to do this? So what they do is they're pinging constantly, and the bandwidth will change. And it plays back in HTML5 using a WebSocket, and it plays back on iOS and Android via an app, because MP4 SLS isn't supported directly in the OS of anybody's computer yet. So let me just quickly go to my account here. And for audiophile people, by the way, this is an awesome service. So as a listener, it's like a Dropbox that can stream your audio to you. So you can get a free 1 gigabyte account, I think it is. Or you can pay $5 a month for 5 gig. You can pay a little bit more for 10 gig or 50 gig, or something like that. You upload your lossless music to the service, and you can immediately stream it anywhere in the world on any platform. So it's their version of a cloud iPod. 
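The contrast drawn above, between skipping among a few fixed bit rates and scaling granularly with the measured connection, can be sketched as two rate-selection functions. Everything here is a hypothetical illustration: the tier values, the 128 kbps floor, the 4608 kbps ceiling, and the 0.9 safety margin are assumptions, and neither function reflects how Google Play Music or OraStream is actually implemented.

```python
# Hypothetical fixed tiers for a stepped service, in kbps.
TIERS_KBPS = [128, 256, 320]

def stepped_rate(bandwidth_kbps, tiers=TIERS_KBPS):
    """Pick the highest fixed tier that fits; fall back to the lowest tier."""
    fitting = [t for t in tiers if t <= bandwidth_kbps]
    return max(fitting) if fitting else min(tiers)

def granular_rate(bandwidth_kbps, floor_kbps=128, ceiling_kbps=4608, safety=0.9):
    """Track the measured bandwidth continuously between a floor and a ceiling.

    The safety factor (assumed) leaves a margin so playback does not stall.
    """
    return max(floor_kbps, min(ceiling_kbps, bandwidth_kbps * safety))

# Simulated bandwidth measurements (kbps): parking garage, cell, wi-fi, a dip, fast wi-fi.
for bw in [100, 300, 1500, 900, 5000]:
    print(bw, stepped_rate(bw), round(granular_rate(bw)))
```

With these made-up numbers the stepped picker is pinned at its top tier for every fast connection, while the granular one dips and recovers in proportion to the available bandwidth.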
But here is what's awesome about it. So let-- where are all of my playlists? Here, let's stream something that's kind of-- oh, here we go. Come on. OK, so I've got some of the same songs here. But here's what's important: see it right up at the top, below the scroll. What do you call it? What's the official name for that, the progress bar with the thing in it? You know, it's the position bar thing. OK, so watch what happens. So everybody heard the song come out from under the water, and start sounding good? Here's one that is not a hi-fi recording. Oh, this is a band from Austin who are the most exciting show I've ever seen. And I signed them to my label. They made a record in two days. I mixed it in one day. It's psychedelic rock stuff. It's not the most hi-fi thing in the world, but this is at 96/24. And again, just watch the bit rate if you can see it. So we're going to start off at 128 because there's a cache. OK, so we're just streaming 96/24. And if you do the math and figure out the bit rate, the number will always be a little lower, because the last part of the decoding happens at the WebSockets, so you don't actually need to give the full bit rate. So the drawback is if you compare a 256 stream from MP4 SLS to a 256 encoded MP3 or AAC, the MP4 SLS will not sound as good, because it's not optimized for that bit rate. But I've never had to listen to 256 with this. Wandering around on the 4G or 3G that I get off AT&T, I'm CD quality all the time. And as you go from the cell network onto your wi-fi, it jumps up. And it's seamless, and it works in real time, and it's awesome. So this is another example of, I think, where stuff can go where you still get the convenience of things having to start playing immediately, which I totally get. You don't want to start streaming CD quality audio to people on crappy cell connections. 
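For reference, "doing the math" on a bit rate means multiplying sample rate, bit depth, and channel count. The sketch below assumes plain uncompressed stereo PCM; the function name is made up for illustration, and a stream that has been losslessly packed would show a lower figure than these raw numbers.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> float:
    """Raw (uncompressed) PCM bit rate in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000

formats = [
    ("CD 44.1/16", 44_100, 16),
    ("96/24", 96_000, 24),
    ("192/24", 192_000, 24),
]
for name, rate, bits in formats:
    print(name, pcm_bit_rate_kbps(rate, bits))
# CD 44.1/16 1411.2
# 96/24 4608.0
# 192/24 9216.0
```

Raw 96/24 stereo is therefore a bit over three times the CD rate, and 192/24 is double that again.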
But if you can hit Play immediately, then realize they're not on a crappy cell connection and be CD quality within the first few bars of a song, and when they jump on a wi-fi network, be up at audiophile quality, that's pretty cool. So hopefully, this is sort of where some things will get headed. And it's one of many possibilities. But if anyone's interested in talking to the OraStream guys, please get in touch with me, because they've set it up where now it's a lockbox service for people who want to just upload their own stuff. I can sell my artists' albums through there, download as individual apps. So they have a business model, but they're also always looking for partners. When Neil Young released his last record, and everyone has heard of the Pono system that he's touting, which is a hardware-based high res audio system? Warner Brothers wanted to stream his record for a week before it came out, because that's what record labels do now is give you a free stream. And he said, yeah, that's fine, as long as it streams at 192/24, which of course, that's not going to happen. So they got the OraStream guys to do it, and they actually did it. And they were streaming about 5 terabytes an hour all over the world of people who wanted to listen. And if they were on their mobile browser, they were probably getting maybe CD quality. But if they were on a computer hooked up to a stereo, they could listen to his album at 192/24. And again, granularly scaling, so if there's any little blip in the traffic, or if your buddy starts streaming a movie down the hall, you granularly dip, so it's not a stepping dip. So in terms of the listening experience, it's a lot less intrusive, because you dip down and come back up. Anyway, so that's OraStream. Yeah? AUDIENCE: There's one format that you haven't mentioned a single time. I was wondering [INAUDIBLE] DSD? ANDREW SCHEPS: OK, so DSD, just really quickly, is basically 1 bit encoding at a megahertz level. 
So instead of taking this grid and putting it over, many, many, many more times a second than you would on a PCM encoding, you say, what's the voltage? Is it higher or lower than last time? And you use your 1 bit-- this is the dumb version-- say, yeah, it's higher, it's higher, it's higher, it's higher, now it's lower, it's lower. So you're basically tracing the waveform very, very quickly as it goes. The only problem is-- the reason I don't mention it is because until about a week ago, there was no viable consumer format. And now there is one site that is actually selling DSD audio files that you can download. And it's even more cumbersome to get a player to work. Now in terms of audio quality, listening to DSD versus high res PCM encoding, I haven't gotten to do an A/B test, but a lot of people love it, think it sounds absolutely amazing. It's a very different way to encode music. It's awesome. I try to only cover established consumer formats during this, because that's what's out there. And there's no way I can distribute anything DSD right now. It's impossible. AUDIENCE: And it would be hard for you to edit it. ANDREW SCHEPS: It's almost impossible. There's one system that allows you to do multi-track editing, and it's really expensive, and their software sucks. So I can edit, but it would not be good. So again, obviously, there's always the ability to work versus what would be best. [APPLAUSE]
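The "dumb version" of 1-bit encoding described above (is it higher or lower than last time?) corresponds to plain delta modulation, which can be sketched in a few lines. This is an assumption-laden illustration only: real DSD uses sigma-delta modulation with noise shaping at megahertz rates, and the step size, test tone, and function names below are invented for the example.

```python
import math

def delta_modulate(samples, step):
    """Emit one bit per sample: 1 if the input is above the running estimate, else 0."""
    bits, estimate = [], 0.0
    for s in samples:
        bit = 1 if s > estimate else 0
        estimate += step if bit else -step  # nudge the estimate toward the input
        bits.append(bit)
    return bits

def delta_demodulate(bits, step):
    """Rebuild the waveform by walking up or down one step per bit."""
    out, estimate = [], 0.0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out

# One cycle of a sine wave, heavily oversampled so a small step can keep up with it.
samples_per_cycle = 2000
signal = [math.sin(2 * math.pi * n / samples_per_cycle) for n in range(samples_per_cycle)]
step = 0.01

bits = delta_modulate(signal, step)
recon = delta_demodulate(bits, step)
max_err = max(abs(a - b) for a, b in zip(signal, recon))
print(len(bits), max_err <= 2 * step)
```

The very high sample rate is what makes a single bit per sample workable: the waveform can never move further than one step between samples, so the up/down trace stays close to it.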
Andrew Scheps, "Lost in Translation: Audio Quality in Streaming Media" | Talks at Google