JUSTIN: I'm Justin, tech lead for the WebRTC project at Google. And today, we're going to talk about using audio in the web platform. The web platform has come a long way on both mobile and desktop. And using technologies like the Media Capture API, Web Audio, and WebRTC, it's now super easy to record, process, and send audio and video in your app. To demonstrate this, we're going to try something we've never done before. We're going to use this session to record a theme song for Google I/O. And you're going to help. So I'm pretty jazzed about this.

I played in some bands back in the day, a little keyboard, a bit of bass. And I remember what it was like to make music back then. To get that perfect sound, we had to tote around a whole bunch of gear, all sorts of effects pedals-- overdrive, chorus, flanger, wah-wah-- amps and EQs to get the right tone, and a four-track to record it all. And that was a big investment for us. But that's what it took to sound like real rock stars. So we'd all have been blown away to know that this whole basement of audio processing gear could soon be replaced by a web browser.

In the modern web platform, we now have the power to do incredible things with media on any device. And one great example of this is a new third-party app called Soundtrap. Built on Web Audio, WebRTC, and Dart, this app is like a set of instruments, a recording studio, and a musical collaboration tool all rolled into one. And we're going to use it to record our song.

So musical inspiration can strike you at any time. Imagine I'm just sitting on the train with my Nexus 7, and I get the idea for a great song. So let me show you how this works. I'll plug in the Nexus 7. Let's make this real. So I'm running the Soundtrap app in Chrome for Android. And the first thing we need is a driving percussion track. So let me go ahead and check out the drum loops we have here and figure out what's the right thing for our song. [BEAT PLAYING] Mm, let's see what else. [BEAT PLAYING] I think I want something that has a bit more rock. [BEAT PLAYING] Yeah, OK, that's the one. So let me just tap that and add that to our song-- perfect.

OK, so we've got our percussion track. But now, to complete our rhythm section, we need to get a bass line going. So fortunately, I've just arrived home, where I've got my bass. And now, using the Media Capture API, also known as getUserMedia, I can record my instrument right inside my laptop. So let me show you how this works. I'm going to plug in here just using a standard cable right into the microphone jack on my laptop-- no special hardware or cables. And there's my drum track I already added from before. All right, I got a sweet bass amp. [BASS PLAYING] Let me just make sure I'm in tune here. [BASS TUNING] All right, perfect. All right, let me just play a couple notes to get started. This one might sound kind of familiar, although our attorneys have told us that it can't sound too familiar. [BASS PLAYING] All right, how does that sound? All right. [APPLAUSE] This is sounding great. And this is simply incredible to have this kind of power in a web application. So let's go ahead and record a bass line for our song. So right here in Soundtrap, I'm just going to start recording. And I'm going to add this bass line right on top of the drum part that we had before. [BASS PLAYING OVER DRUMS] All right, I was a little bit early on that. So let me try that one more time.
[BASS PLAYING OVER DRUMS] All right, so there we've got our great bass line. Let's save that out. And that's what we've got for our bass track.

So onto our next part-- let's talk about the technology behind what we just did. So in the Media Capture API, the key method is called getUserMedia. And this call opens up the microphone-- oops. This call opens up the microphone, asking the user for permission the first time, and gives us a MediaStream object that represents the audio coming from my bass. Now, the same getUserMedia API can also be used to get access to the microphone for speech or a webcam for video. Now normally, when we call getUserMedia to get access to the mic, the mic signal is going to get routed through a bunch of processing stages to make sure we get a good speech signal. We're also going to apply auto gain control to make sure the levels are correct, as well as acoustic echo cancellation to make sure any of the sound coming out from the speakers doesn't also get picked up by the mic. However, when we're recording an instrument, we don't want any of that processing to occur, so we get as pure a tone as possible. So when we call getUserMedia, we're going to turn that off.

So here's a typical call to getUserMedia. We make it on the Navigator object. And we ask, in this case, for a media stream that contains audio from the mic. We'll get our stream back asynchronously. And then, we can actually do something with it. If the syntax looks kind of unfamiliar here, this is making use of Dart's Future concept. The .then at the end of the statement allows us to say easily what should get executed when we get our stream back. So when we call getUserMedia, we're now going to specify the parameters we want for the audio stream. And here, we're going to ask for audio and specify we want the AEC turned off-- simple. So that's what we're doing inside of Soundtrap to get access to the microphone to record bass, guitar, or any other instrument.

And now, let's turn it up a little bit. Because Media Capture is really cool. But it totally comes to life when we connect it with the Web Audio API. And that's where we can do some amazing audio processing. So to demonstrate this, let me welcome Per on stage. [APPLAUSE] Per is one of the co-founders of Soundtrap, and also a bit of a guitar hero.

PER: Hi, Justin.

JUSTIN: Great to have you here on stage, Per.

PER: Thank you very much for having us.

JUSTIN: So do you think you could play a couple guitar riffs for us?

PER: Maybe I can do that. I have a guitar [INAUDIBLE], so let me mic up.

JUSTIN: So Per is going to play some guitar for us. And we're going to use that to demonstrate the effects that are possible using Web Audio.

PER: So we're using Web Audio to build up a whole set of effects. And I'll show you some of the presets that we have if I get some volume, yeah? So this is just an acoustic sound with no effects on at all. [GUITAR PLAYING] Like-- and then we can do something else.

JUSTIN: All right, let's go a little harder. [DISTORTED GUITAR PLAYING] We call that one-- [APPLAUSE] We call that one "Fumes on the Ocean." So let's kick it into overdrive. [OVERDRIVE GUITAR PLAYING] We call that one "Insane Bus Ride." Is that right?

PER: Yeah, something like that.

JUSTIN: OK, all right, now let's take it back to the origins of rock and roll. [SLAP DELAYED GUITAR PLAYING] OK, ladies and gentlemen, Per. [APPLAUSE] So to have this kind of processing power on the web-- simply fantastic.
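For reference, here is a minimal Dart sketch of the kind of getUserMedia call Justin described earlier in this section, using dart:html's Navigator.getUserMedia and a Future-based .then. The constraint key used to disable echo cancellation follows Chrome's goog-prefixed constraints of the time and is an assumption for illustration, not Soundtrap's actual code.

```dart
import 'dart:html';

void captureInstrument() {
  // Ask for raw audio from the mic. The constraint map is passed through to
  // the browser; the echo-cancellation key below is an assumption based on
  // the constraint names Chrome accepted at the time.
  window.navigator.getUserMedia(audio: {
    'optional': [{'googEchoCancellation': false}]
  }).then((MediaStream stream) {
    // The stream now carries the unprocessed signal from the instrument.
    print('Got ${stream.getAudioTracks().length} audio track(s)');
  }).catchError((e) {
    print('getUserMedia failed: $e');
  });
}
```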
Let's talk about the magic behind these effects.

PER: Yeah, so let's talk about this simple one, actually. So this is called a slap-back effect. [SLAP DELAYED GUITAR PLAYING] Exactly, that's right.

JUSTIN: So most of you are probably familiar with the HTML5 audio and video elements that load, decode, and play out audio and video. But Web Audio goes beyond that and gives you super low delay and precise control over each one of these steps, including scheduling sounds down to sample-level accuracy. And this timing is particularly important for games, as well as pro audio apps like Soundtrap. Web Audio also provides powerful synthesis, processing, and routing tools. So this lets us do all sorts of great stuff, everything you see here-- synthesizers, effects, visualizers-- everything you need for pro audio.

So from the API perspective, Web Audio is based on a simple concept-- the audio pipeline. And in the API, we refer to this pipeline as an AudioContext. We can build out our pipeline by taking input either from a file, microphone, or synthesizer, and processing it using what we call nodes. And these nodes can do various different things. They can apply gain. They can apply delay. They can even send the data to JavaScript, where JavaScript can change the sample values. Finally, we send it all out to a destination-- typically the audio output. Now I'm going to hand it over to Per to explain how we get from this to this.

PER: Yeah, so now I'm going to show you the slap back. So that was a very simple effect. So you can hear it here. So there's a direct signal, and then one small delayed signal after that. [SLAP DELAYED GUITAR PLAYING] OK, you can hear that? So that can be represented in a diagram like this. So we have the audio coming from the guitar, and then routed directly to the speaker. That's the upper line. And then, we're adding one delayed signal. And then we gain that. We lower the volume a little bit on that one. So then you will have that very simple effect, yeah?

So the good thing with Web Audio is that it makes it very easy to actually program effects like this. Because you have all the different types of nodes you would like to combine. And you can do that in very simple ways. So I will just show you the code here that you can use to implement [INAUDIBLE] like this. So first, we need to get the context. We talked about that. So you're getting the context. And you will use that object to, for example, create the nodes later on. And then we're also getting the user media. So that was what we just showed before. So once we get the user media back, we'll start by actually creating the guitar node here by using the context object, and then just call createMediaStreamSource on that stream that we got back from getUserMedia. And then, we also get the reference to the speaker. So then, we have our two nodes to connect. Then, we just hook the first two up together. So we say, guitar, connect to speaker. So then we have the direct path here. Then, we're using the context again to create a delay. So that's all in the Web Audio API here. So we create the delay node. And then we hook the guitar up with the delay node. And then again, we use the context object to create the gain node. And then we hook the delay up with the gain node. And then we connect the gain node with the speaker. So it's not many lines of code you need to actually create an effect like that. Then, of course, that's one of the simpler ones. So we have much more complex ones as well.
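For reference, a minimal Dart sketch of the slap-back effect Per just walked through, using dart:web_audio: a direct guitar-to-speaker path plus a delayed, gain-reduced copy mixed in on top. The delay and gain values are illustrative assumptions, not Soundtrap's presets.

```dart
import 'dart:html';
import 'dart:web_audio';

// Slap-back delay: direct path plus one delayed, quieter copy of the signal.
void slapBack(MediaStream guitarStream) {
  var context = new AudioContext();

  // Source node wrapping the getUserMedia stream from the guitar.
  var guitar = context.createMediaStreamSource(guitarStream);
  var speaker = context.destination;

  // Direct path: guitar -> speaker.
  guitar.connectNode(speaker);

  // Delayed path: guitar -> delay -> gain -> speaker.
  var delay = context.createDelay();
  delay.delayTime.value = 0.12; // ~120 ms slap-back (assumed value)

  var gain = context.createGain();
  gain.gain.value = 0.5; // delayed copy at half volume (assumed value)

  guitar.connectNode(delay);
  delay.connectNode(gain);
  gain.connectNode(speaker);
}
```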
So that's the whole snippet you need, OK?

JUSTIN: OK, so now we're going to add a lead guitar part to our song. And to get a great guitar sound, we're going to combine a whole pipeline of the effects that Per just demonstrated. We're going to add delay, chorus, compression, and fuzz to create a truly sweet tone. So let me show you how that works. [GUITAR PLAYING WITH EFFECTS]

PER: Yeah, something like that. But what we're going to do now, we're going to play some melody on top of our song that we're building up. And I would like you to try to remember that. Because you could be asked later on to actually remember it, just to let you know.

JUSTIN: All right, that's a great sound. Per, give us a killer melody.

PER: Yeah.

JUSTIN: Here we go. [TRACK PLAYING] [GUITAR PLAYING WITH EFFECTS]

JUSTIN: All right.

PER: Simple, yeah. [APPLAUSE]

JUSTIN: That was rocking.

PER: Think you can remember that? Try to remember that.

JUSTIN: All right, we've got all our parts right here. OK, so the next thing we need for our song is a groovy keyboard part. And unfortunately, we don't have any keyboard players hiding backstage. But Per, do you know anyone who plays keys?

PER: Actually, I do. As a matter of fact, our founder and CTO Bjorn is rather good at keyboard, actually.

JUSTIN: Huh, do you think he could play for us?

PER: So what if I call him up?

JUSTIN: Well, let's do that. So over in Soundtrap, we have the ability to invite Bjorn to our song. Let's see if he can help us out. Soundtrap is located in Stockholm. So I guess it might take a little while.

PER: So Bjorn just joined the chat.

JUSTIN: OK, great. Well, let's see if he can help us out here.

PER: Hi, Bjorn.

BJORN: Hi, Justin. Hi, Per.

PER: Hi, how are you?

BJORN: I'm fine. How are you?

PER: Very good, thanks.

JUSTIN: So Per tells me you play a little keys. And that's just what we need for a song. We're here at Google I/O recording a song, a theme song for Google I/O. Do you think maybe you could help us out?

BJORN: Well, you know, I have this MIDI keyboard here. So I can give it a try. It's just a regular MIDI keyboard. And it's connected to my laptop using a USB cable. So I'm going to be recording some organ here. And this is all done using the Web MIDI API. So let's see here, I'm going to be adding a track. Let's see if we can get some-- [ORGAN PLAYING]

JUSTIN: All right.

BJORN: So did you hear that?

JUSTIN: Yep.

BJORN: Cool, so here we go then. I'm going to record something. [ORGAN PLAYING OVER TRACK] Now let me just save that.

JUSTIN: All right.

PER: Very good, thanks.

BJORN: So there you go. [APPLAUSE] Thank you.

JUSTIN: So that was really great. Do you think maybe you could help us out a bit more, maybe add a few backing tracks to the song?

BJORN: Yeah sure, why not? Let me give it a try.

JUSTIN: OK. Well, Bjorn is also in our project, which is in the cloud. And we'll be notified when his updates are finished so we can check them out.

All right, so let's talk about what we just saw there. We did collaboration onstage. We did collaboration with WebRTC. WebRTC is all about communication and collaboration, the ability to connect with anyone, any time, anywhere. And to be able to make music together collaboratively, this is a perfect use of WebRTC. Now technically, WebRTC provides apps with an API to establish a real-time peer-to-peer connection. And using that connection, you can send audio, video, or arbitrary data. Now, the API that we used for making this P2P connection is called, naturally, Peer Connection.
And if we take the media streams that we got earlier using getUserMedia, we can now attach these to a Peer Connection. And the Peer Connection will take care of the hard work of setting up a peer-to-peer connection and streaming that audio and video across the network. And this all works great, even on the real-world internet, as you heard from Bjorn. Now Soundtrap, for the musical collaboration you just saw, needed a few extra features from WebRTC. So Per, can you tell us more about that?

PER: Yeah sure, so there are like three things we need to be able to do this in a good way. So the use case for us is that we will need to be able to hear what he is playing, of course. We need to be able to speak with him. And we need to be able to see him, right? So we need to have video chat functionality. We need to have high-quality stereo audio. And we need to have the input that was already processed in Web Audio. So if I were playing with my guitar, I can have a multitude of effects added on. So I need to have that also heard over the network. Bjorn here was playing our synthesizer. And that is also done using Web Audio. So we heard that, the organ.

So here's kind of the setup we need. So we need the input from the upper part here, which is the stuff we already showed you, and the effects or any synthesizers or anything created using Web Audio. We need the input from the actual computer mic. So that needs to be mixed in with the other stuff coming from Web Audio. And then, we need the webcam stream. And we put that into the Peer Connection. And then the magic happens, yeah?

So let's look at the code for that. So we're using the getUserMedia API again here. But instead of fetching the audio, we're fetching the video part. And when that is returning, then the then clause here is executed. So the first thing-- if you see in the bottom here, we have actually received access to our webcam stream. Then, we are taking the musicOut object, which is the output and the mix of everything we've done in Web Audio, including the mic, and we create a MediaStream destination using that musicOut object. And then, we're getting the stream for that. So then, we have our stereo music stream. Then, we are hooking up both the stereo music stream and the webcam stream to the Peer Connection. So this is also Dart syntax, so we can use the dot-dot notation here. And once we have added the stereo music stream and the webcam stream to the Peer Connection, we just start the call on the Peer Connection. And then the two peers start to talk to each other, right?

JUSTIN: Very cool.

PER: Very good.

JUSTIN: So that's all about Peer Connection. So now let's go see if Bjorn is finished adding those backing tracks. It seems like we're waiting for a second on those backing tracks. So while we're waiting, a proton walks into a bar. He sees an electron sitting there and says, I feel there's an attraction between us.

PER: Yeah, here it is.

JUSTIN: All right, so now we've got these backing tracks. Let's go and check out what our song sounds like. Here we go. [MUSIC PLAYING]

JUSTIN: What do you think? [APPLAUSE]

PER: Quite cool.

JUSTIN: OK, but we're missing one thing-- the vocals. And so we've collaborated on stage. We've collaborated across the internet. And now, we're going to do some collaboration with you. So you're all going to sing the vocals for the song. And I'm going to go out in the audience and record it. And to do that, I've got my Nexus 5 right here.
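For reference, a Dart sketch of the Peer Connection setup Per walked through a moment ago: the Web Audio mix and the webcam stream are both attached to an RtcPeerConnection using Dart's cascade ("dot dot") notation. The musicOut node, the ICE server entry, and the sendOffer signaling hook are assumptions for illustration, not Soundtrap's actual code.

```dart
import 'dart:html';
import 'dart:web_audio';

// Attach the Web Audio mix (instruments, effects, mic) and the webcam video
// to one peer connection, then start the call.
void startCollaboration(AudioContext context, AudioNode musicOut) {
  window.navigator.getUserMedia(video: true).then((MediaStream webcamStream) {
    // Route the full Web Audio mix into a stream we can send over the network.
    var musicDestination = context.createMediaStreamDestination();
    musicOut.connectNode(musicDestination);
    var musicStream = musicDestination.stream;

    var pc = new RtcPeerConnection({
      'iceServers': [{'url': 'stun:stun.l.google.com:19302'}] // illustrative
    });

    // Cascade notation keeps the wiring terse.
    pc
      ..addStream(musicStream)
      ..addStream(webcamStream);

    // Kick off the call; how the offer reaches the other peer is up to the
    // app's own signaling channel (sendOffer is hypothetical).
    pc.createOffer().then((RtcSessionDescription offer) {
      pc.setLocalDescription(offer);
      sendOffer(offer.sdp);
    });
  });
}

// Hypothetical signaling hook.
void sendOffer(String sdp) { /* send over the app's signaling channel */ }
```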
And I'm logged into Soundtrap using Chrome for Android. Now, Per's singing voice is a little bit better than mine-- classically trained. And so he's going to lead you on this. Per?

PER: Yeah, yeah, sure, sure. So I told you to remember the melody part. So we're actually going to put some lyrics on top of that. And as you may imagine already, it's Google I/O, Google I/O, Google I/O, Google I/O-- four times, OK? So it's quite easy.

JUSTIN: If you forget, it's like right here.

PER: So what we'll do is-- you've heard the song now a couple of times. So we'll do a rehearsal first. And when we rehearse, after the second time we have sung it, we'll actually turn down the volume. But you should continue singing anyway with me, right? And then we'll do a take after that, promise. Sing from your toes, as we say in Swedish. [SPEAKING SWEDISH], yeah, OK? OK, let's try it. Should we--

JUSTIN: All right.

PER: Play, yeah?

JUSTIN: Here we go.

PER: I'll take the practice round first.

JUSTIN: OK, here we go.

PER: One moment. [MUSIC PLAYING] [SINGING] Google I/O-- help me out-- Google I/O, Google I/O, Google I/O. So that's how it's going to be. That was a practice round. Now you know. We're going to be-- we'll try to record it on the Nexus 5. So you can lean over to that one as well. Because there's the mic. Let's do it, then.

JUSTIN: All right, that sounded pretty good. But I think they can be a little louder.

PER: Yeah.

JUSTIN: That was maybe singing from the knees, not quite the toes.

PER: Yeah, yeah, exactly.

JUSTIN: All right, so one more time. Let's do it. [MUSIC PLAYING]

AUDIENCE: [SINGING] Google I/O, Google I/O, Google I/O, Google I/O.

PER: Great. [APPLAUSE] Very good.

JUSTIN: You guys were fantastic.

PER: Definitely.

JUSTIN: That was definitely all the way down to the soles of the feet. All right, so we're now going to kick off the final mixdown for the song. And so while we wait for that, let's talk about what Soundtrap needed from an application platform. In building a cloud-based musical collaboration app, Soundtrap knew they needed a platform that was familiar to their developers, and yet would scale as the project became more complex. And as a small startup, they also wanted to make sure their developers could be as productive as possible. So I want to talk about these in turn.

This is why Soundtrap chose Dart, a new platform for scalable web engineering. Let's look at what Dart provides that makes it so attractive. First, familiarity-- Dart feels like JavaScript. Semicolons, curly braces for a reason, classes, single inheritance-- everything familiar to people who are using JavaScript or other languages like Java or C++. And because of that, Dart was designed explicitly to be easy to learn. Google's run numerous internal and external hackathons. And everybody picks it up super fast.

Now, the Dart team also took the opportunity to go a bit beyond JavaScript to clean some things up. While Dart is familiar, it also makes things easy and terse. So here, we're declaring the User class with two fields-- username and password. This code sample should look very natural. We can also add things like named constructors. You can create different constructors with different names. And it's very clear what they do. We can also add getters and setters when you need them. And we can do a simple getter in just one line. You can look at this fat-arrow syntax where we check, is this username valid? Super easy to write single-line functions and methods. Now let's address scalability.
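For reference, a small sketch of the Dart features Justin just listed: a class with two fields, a named constructor, and a one-line fat-arrow getter. The validity check is an example, not the code from the slide.

```dart
// A class with two fields and a short generative constructor.
class User {
  String username;
  String password;

  User(this.username, this.password);

  // Named constructor: the call site makes it clear what is being built.
  User.guest() : username = 'guest', password = '';

  // Single-line getter using the fat-arrow syntax.
  bool get isValidUsername => username != null && username.isNotEmpty;
}

void main() {
  var user = new User('justin', 'secret');
  print(user.isValidUsername); // true
}
```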
Typical web app development starts out as small scripts and grows into very large, complicated apps. But we want to make sure that as the app grows from 100 to 100,000 lines of code, the environment is there with you to help you scale up. So Soundtrap started out just like this-- a few files and scripts. And they grew over time. But they made use of stuff like Dart's libraries and packages to keep things organized. As the app grew in size, the Soundtrap engineers didn't have to slow their productivity. Here, Dart can bundle code into libraries. And libraries are like modules of code. And from the dart:async library, we're going to pull in the Future class. We've seen futures before when we used getUserMedia. And futures are these objects that represent things that don't exist yet. But they will in the future, get it? So they're much easier than callbacks. And also, like other structured languages, Dart classes can have static methods. Here, our load method is going to return a future that will resolve when the back-end service completes its query.

Last, let's talk about productivity. Soundtrap is able to be productive because tools like the Dart Editor and the Dart Analyzer are able to statically analyze the code. Because of things like static typing, they can provide real-time feedback when you do something wrong. Here, we're notified immediately by the Dart Editor that we misspelled a field. We don't need to go run our app or fail our unit test to know that we did something wrong. The tools tell us proactively. And as the code grows to many packages and libraries, it's really hard for a developer to keep all of the APIs in their head. And the Dart Editor comes in handy there. The Dart Editor gives this real-time code completion like any sort of modern, mature IDE. Last, because of the static typing and static [INAUDIBLE] analysis, the tools know what types the objects are. The developer can always stay in the flow. So Dart-- familiarity, scalability, productivity.

Now let's talk to Per. You guys worked with this. Can you talk more about what specifically Dart helped you do?

PER: This is one of many examples, of course, of why Dart is so important for us and good for us. So we've seen it now a couple of times. We need to load the project and the tracks in the project. And it's a cloud-based service. So we would like to do that as fast as possible, because we synchronize between the devices all the time. Asynchronous programming is a little bit tricky in other languages, like, for example, JavaScript. But the Future mechanism in Dart makes it very easy.

So let's say that we have a project with three tracks. So we define a method loadProject, which returns a future. The HTTP request here also returns a future, and when that is done, we parse the JSON. So that is what is done in this method. Then, we define a second method to load the audio and do exactly the same thing. But there we decode the audio data, the blob. So then, we hook the two methods up. So first, we load the project. And when loading the project has executed, and we have our information about the project and the tracks, then we return. We will have the project instance here, so we can work on that. And then, we're using something that's called a wait method. That means that you can pass in a list of futures. And when all the futures have executed, the job actually continues. So in the Future.wait method, we pass in a list of futures from the loadAudio method.
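For reference, a sketch of the loading pattern Per describes: loadProject fetches and parses the project JSON, loadAudio fetches one track's audio, and Future.wait (fed by a map over the tracks, as Per explains next) runs the audio loads in parallel. The URLs, the Project and Track classes, and the field names are assumptions for illustration, not Soundtrap's actual code.

```dart
import 'dart:async';
import 'dart:convert' show JSON;
import 'dart:html';

// Hypothetical project model.
class Track {
  final String audioUrl;
  Track(this.audioUrl);
}

class Project {
  final List<Track> tracks;
  Project(this.tracks);
}

// Fetch the project description and parse the JSON into a Project.
Future<Project> loadProject(String id) {
  return HttpRequest.getString('/projects/$id').then((body) {
    var json = JSON.decode(body);
    var tracks = <Track>[];
    for (var t in json['tracks']) {
      tracks.add(new Track(t['audioUrl']));
    }
    return new Project(tracks);
  });
}

// Fetch one track's raw audio bytes; decoding them with Web Audio would
// follow here.
Future loadAudio(Track track) {
  return HttpRequest.request(track.audioUrl, responseType: 'arraybuffer')
      .then((request) => request.response);
}

// Load the project, then load every track in parallel and continue once all
// of them are done.
Future loadAll(String id) {
  return loadProject(id).then((project) {
    return Future.wait(project.tracks.map(loadAudio).toList());
  });
}
```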
And that can also be done very neatly in Dart here. So we have the project instance. We're calling the tracks. And we're using something that's called a map, which actually collects all the futures from the loadAudio method calls into a list and passes that into Future.wait. So it's quite a dense way of doing asynchronous programming. OK.

JUSTIN: So Soundtrap is a great demonstration of the modern web platform, using APIs covering media capture, audio processing, real-time communication, and more. But these APIs aren't just for Chrome. These APIs belong to the open web. And they're available in multiple modern browsers on both mobile and desktop. So for example, Web Audio is available on over 2 billion devices, of which over a billion are mobile, including Android and iOS. WebRTC-- over 1.5 billion browsers and 300 million mobile devices. We're seeing great growth across the WebRTC ecosystem on both desktop and mobile, with multiple major players shipping apps built on WebRTC. Some examples-- Snapchat's video chat, TokBox, and Amazon's Mayday service, all powered by the WebRTC platform. I've also been in touch with all the major WebRTC app developers. And there are going to be several more major announcements later this year. Finally, we're super excited to announce that in Android L, the v36 WebView fully supports Web Audio and WebRTC, allowing those building native apps to bring these technologies to the over 1 billion Android 30-day active users. [APPLAUSE]

So the mixing is finished. Let's come full circle back to my Nexus 7 and listen to our song. Well, it looks like our Nexus 7 has decided it has other plans. That's no problem. Here we go. [MUSIC PLAYING] [APPLAUSE]

JUSTIN: So give yourselves a hand. You guys were fantastic. Thanks for singing. And if you want to listen again, you can download the song from the Soundtrap site. [APPLAUSE]

So now that we've got you all inspired about what the modern web platform can do, you can try it out for yourselves. If you haven't already checked it out, come get your hands dirty at the WebRTC Code Lab, running here at I/O until the end of today, and also available later online. There are also some great resources on WebRTC, Web Audio, and Dart written by experts-- an excellent way to try out these technologies. And that wraps up our session for today. Now we want to hear from you. You can give us feedback on the session at the address listed here. And we also have a few minutes to take audience requests, I mean questions. Thank you. [APPLAUSE]

PER: Thank you very much, thank you.

AUDIENCE: So first of all, thank you for the great talk. It was very inspiring. And the song is really cool. I wanted to ask about audio latency on Android. Did you get it low enough?

JUSTIN: So that's a bit of a work in progress, I'll have to tell you. We've made a lot of improvements to latency through ICS, Jelly Bean, and KitKat. We're still not to the point where we can get the-- what we really want is like 20 millisecond latency on Android. We're doing a lot of work there, including being able to expose all the APIs through OpenSL, through the C++ layer, so that we can have the minimum latency possible. So we're not where we want to be yet. But we're spending a ton of energy on it.

AUDIENCE: For L, are we going to have it?

JUSTIN: Ah, I'm not going to make any promises. We're just working very hard on it.

AUDIENCE: OK, thanks.

AUDIENCE: Hi, I missed the beginning. Maybe you guys touched on this.
But I was curious about MIDI support and being able to hook up a synth and maybe get down with the piano or organ or something like that.

PER: Yeah, it's Web MIDI, which is also part of the standard. So we hook up the MIDI keyboard directly to the USB port.

JUSTIN: Right, so Web MIDI is kind of a very nascent standard. It's supported provisionally within Chrome. We hope to promote it in the next version or two to be a full API and see it picked up by other browsers. But you saw Web MIDI in action here when they played the keys part. And Soundtrap is making full use of this. And it is a great API in the web platform.

PER: Yeah.

AUDIENCE: Sorry, what kind of server side is included in this demo? So I wanted to ask how the tracks are joined together when you publish the track, the project.

PER: How they work together?

AUDIENCE: On the server side or on the client side here in Soundtrap?

PER: Yeah, so the music is stored on the server side, yeah. But then, it's pushed out to the client. So we're using the Play framework-- I don't know if you know that-- on the server side. But it's Dart on the client side.

AUDIENCE: [INAUDIBLE]

PER: Sorry?

AUDIENCE: It's mixed online?

PER: It's mixed on the server. But when you play it in the studio, it's played directly on the client. But the last mix that you heard was mixed on the server, right?

JUSTIN: Right, so in the actual editor, when all the tracks are being played, it plays in the client. But when we do the final mixdown to make the actual output MP3 file, then that's going to go and be done by the cloud service.

PER: I see, this is only the [INAUDIBLE].

JUSTIN: Oh, right. So when we're in here in the studio--

PER: Everything here is played directly in the client, in the studio. So if I-- oh, now we have no audio anymore, I think. [MUSIC PLAYING] So that's played directly on the client. Yeah.

JUSTIN: OK, thank you, everyone, for coming. Have a great rest of I/O. [APPLAUSE]