Creating an AI Musician with JavaScript - Thomas Drach - JSConf US 2019

KATIE: Hello? There we go. That's me. Good morning. Whoo! Is everybody ready? Yeah. We have a really, really cool talk to start the day with. Thomas Drach is here. He gave me some fun facts, but actually we had a really fascinating conversation just now that I'm gonna share with you. So, back in the '90s, the first two CDs I ever bought: the first was DJ Jazzy Jeff and the Fresh Prince with "Parents Just Don't Understand," which was amazing. And I bought a Rick Astley CD with the famous song on it that people use to Rickroll each other. And Thomas said his first tape was NSYNC. Clearly his musical taste is much better than mine. Let's give it up for Thomas Drach. [ Applause ] THOMAS: Good morning. Thanks for being here early for this talk. I want to give a huge thanks to JSConf for having me. They have been awesome. So grateful to be here. I also want to thank all the open source contributors, because I feel like I'm using cheat codes sometimes, just using someone else's code. If you're anything like me, you may have squirmed a little bit at that AI acronym. And that's good. We're going through that today. But just your willingness to be here shows you're open to the ideas and to pushing the boundaries. I think that's commendable. Thanks for being here. Okay. My name's Thomas Drach, like we talked about. I'm thomasdrach on Twitter if you want to bug me there; please do. I consider myself a designer and a bit of a hacker. Not in the sense of this awesome movie, but in the writing-bad-code sense. I'm really good at that. I have a little design studio called Subtract where I make what I hope are useful products, one of which we'll go through today. And I have a product called Cleverstack too, if you want to check that out. So, I want to start with a man called Paul Thomas. He had a little garage in Phoenix called the Thomas Brothers Garage. Today people might call him an entrepreneur or a founder, but he was really an inventor. He filed for over a hundred patents, and I'm still trying to track most of them down. I found about a dozen of them. Most reminded me of Rube Goldberg machines. I don't know if you have seen these before; they're chains of simple machines that create this weird, trap-like mechanism. All the patents I found from him were complex machines like that, nothing you would actually use. It's kind of hard to see here, but the name of this one is "panel manufacturing method," and it's basically a patent for this giant machine that creates these brick and concrete slabs that you're supposed to ship to job sites for people to build houses with. It kind of made me sad a little bit, because it just seemed like a crazy person was documenting and patenting all of these things. But the machines did get built. This is the one we were just talking about, the manufacturing machine that makes the giant slabs. And they actually had to design and patent these, like, semi-trucks to ship them to job sites. They had these weird little trucks that would transport these pallets of bricks everywhere. Clearly they didn't invent the screw or the lever or the rack and pinion or any of these things, but they combined them to make something new and useful. And it's especially interesting to me because this man was my great-grandfather and my namesake. Some of you might be familiar with this Henry Ford quote. The funny thing is, Henry Ford never said this.
I did some research, and the first time it was attributed to him was in 1999, in the Cruise Industry News Quarterly. Then other people started using it, and now people say it on stages like this. Sometimes it's paired with, like, the Steve Jobs quote, and it kind of creates this genius complex of "they don't know what they want, we have to show them," or whatever. But I think there's something that people miss about this quote in particular. I think it resonates for a reason. My interpretation is that big progress isn't necessarily just an iteration of the last thing; it's more like a mutation of something that happened before. Maybe a little bit like this: we could accidentally combine a few unrelated things to find something new. This is Tim Berners-Lee talking about inventing the World Wide Web. He said, "I just had to take the hypertext idea and connect it to the TCP and DNS ideas and, ta-da, I had the World Wide Web." There's an old LinkedIn profile page of his that just said, like, "web developer." But in this interview, and I recommend listening to or reading interviews with him, he goes on to credit all these other inventions and says that if those hadn't happened, the web, at least, he wouldn't have created at that time. I don't know what would have happened. So, the dictionary definition of mutation is the changing of a structure, resulting in a variant form that may be transmitted to subsequent generations. Hendrix famously took right-handed guitars, flipped them upside down, and eventually changed music. And he could do that in part because, before him, Les Paul had invented the electric guitar. They didn't invent it just to invent it, but because they wanted the acoustic guitar to be louder. And Grace Hopper is one of the inventors of what we now call programming, a big reason we're here today. And she started with knobs and switches on the Mark I. All of these were mutations; each was different enough. Hopper's was a mutation. And I think AI is a bit of a mutation, at least how we talk about it today. There's much more data, there are advances in machine learning, and there's compute power thanks to Moore's Law, and that kind of created the opportunity for something like AI to work. This is what I get when I search "AI" on Google. I don't know about you, but this isn't very helpful for me. So, I'll ask a slightly different question today. I want to ask: what are intelligent machines? We might be able to define this as just intelligence plus machines. So, let's define intelligence. This is a quote; I'm just going to read it really quick: "People generally distrust the concept of machines that approach (and thus, why not pass?) our own human intelligence." I think a lot of people feel like this today. And this quote was actually written in 1970, in a book called The Architecture Machine, by Nicholas Negroponte, the person who went on to found the MIT Media Lab. And it goes on to say that machines must be aware of their context in order to be intelligent. So, you can't have a machine that isn't using context, interacting with the world; it's not intelligent in that case. There's no lack of context in the new Tesla Roadsters. So, for our purposes, I'm just gonna say intelligence means using context. Now we can define machines. This should be pretty easy. We go to the dictionary and find: a mechanically, electronically, or electrically operated device for performing a task. Sounds good to me. Okay. So, with intelligence and machines defined, I would like to introduce you to the concept of somewhat intelligent machines.
And this is what we're gonna build today: something that uses context and rapidly completes something a human could not. And we're gonna do all of it in JavaScript. So, this is the actual machine, instrument, musician, AI, whatever you want to call it. This is what we're gonna build today. I'm going to walk through how to generate drumbeats using pre-trained machine learning models, APIs, libraries, stuff like that, and we're going to piece it all together. I find it a little hard to follow tiny code, so it's gonna be a little pseudo-codey. Like I said, the first thing we need is a couple of libraries. We're going to use Magenta; if you haven't heard of Magenta already, please check it out, it's incredible. A couple of people have talked about it already here at JSConf. And then we're going to use Tone, which is actually a dependency of Magenta and gives us an easier interface for coding musical stuff. All right. Let's play some drums. This is what the data structure for the drums will look like. You can set it up as a step sequence; this is a note sequence in Magenta. There's a pitch, there's an attribute that says it's a sample-based pitch, not a tonal, keyboard-like thing, and there's quantization info. There's a method that does the quantization for you, so you don't have to worry about it. Okay. So, all we need to play that note sequence is two lines of code: we create a new instance of the Magenta music player and call player.start on it with our sequence. And we're gonna get something like this. [drumbeats] This is just the basic pattern that we plugged in, right? It's not that exciting.
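As a rough sketch of what that step looks like in code (not the talk's exact source; the pattern, the pitch numbers, and the variable names are just illustrative, using the standard Magenta.js note sequence shape):

```javascript
// Hypothetical drum pattern as a Magenta note sequence (General MIDI drum
// pitches: 36 = kick, 38 = snare). isDrum marks these as sample-based hits
// rather than tonal, keyboard-like notes.
const drums = {
  notes: [
    { pitch: 36, startTime: 0.0, endTime: 0.5, isDrum: true }, // kick on beat 1
    { pitch: 38, startTime: 0.5, endTime: 1.0, isDrum: true }, // snare on beat 2
    { pitch: 36, startTime: 1.0, endTime: 1.5, isDrum: true }, // kick on beat 3
    { pitch: 38, startTime: 1.5, endTime: 2.0, isDrum: true }, // snare on beat 4
  ],
  tempos: [{ time: 0, qpm: 120 }],
  totalTime: 2,
};

// Magenta's helper adds the quantization info for you.
const quantized = mm.sequences.quantizeNoteSequence(drums, 4);

// The two lines that actually play it.
const player = new mm.Player();
player.start(quantized);
```

In a browser, the playback call usually needs to come from a user gesture, like a button click, so the audio context is allowed to start.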
We want something a little bit more like this: feed our beat in and get something better back. All right. But in order to do that, let's do a super quick ML crash course. I'm not the one to go in depth on this, but let's all get on the same page. Okay. So, we usually write functions something like this: we want something, we put something in, we get something back. Machine learning is a little bit more abstract, right? We don't know exactly how we would get there. Like, is this image a dog? Here's the image. I don't know. Some of you might have seen memes like this. [ Laughter ] This is one of my favorites: fried chicken and goldendoodles, I think. I don't know how dogs and cats became, like, the hello world of machine learning, but I'm not mad about it. So, here's what we call training data. If it's training data, it's probably labeled. So, this is a dog. I probably should have said fried chicken; that's not an actual chicken. You feed all of that to the machine. The machine says all of these are dogs, and they have this weird thing on their face. We call that a feature. That feature, to us, looks like a nose. The machine goes: okay, there's a nose, it's probably a dog. So, we feed it the image and it's gonna guess "dog." All of these are just probabilities. For our purposes, we want to give it some drums and get some better drums in return. So, that's where Magenta comes in. Magenta has a couple of different models available. All of these are super cool, and it seems like they're coming out with more every week or every month. There's a MusicRNN model, a MusicVAE, and Piano Genie; Piano Genie is a VAE as well. Real quick: RNN stands for recurrent neural network, a bunch of nodes like a regular neural network, but it loops through itself. And a VAE is a variational autoencoder; if you're familiar with encoding and decoding, it works similarly to that. For our purposes, we're going to use the MusicRNN model, because in the context of Magenta it has a little better support for individual instruments like drums. And this is kind of what that might look like: you have the nodes of the network, looping through themselves, and you have an in and an out. For us, we're going to put in our initial drumbeat and expect a generated drumbeat in return. Okay. So, we picked our MusicRNN model. This is the actual checkpoint, which is a pre-trained model. It was trained with millions of drumbeats, so it has a sense of what drumbeats are: there's a kick on one, there's a snare on two, something like that. These are the three lines of code we need to generate a new drumbeat. We create a new instance of the MusicRNN model with the checkpoint we had; we initialize the model so it loads itself up; and then we call this method, continueSequence. We feed it our note sequence, a number of steps, which is kind of arbitrary (could be 16, 32, whatever), and a number from zero to two, the temperature, which we'll go over a little bit later. After we do that, we just get a sample in return and play it the same way we played the other one. So, this is what that sounds like: a generated beat with a temperature of 1.5. [drumbeat] And if you generate it again, it comes up with new beats that we've never heard. Cool. So, yeah. [ Applause ] All right. That was cool, but it was a little bit of a black box, so I want to go through what happens when we call continueSequence. We call it here in the three lines of code. What's happening behind the scenes is that we convert the note sequence, that drum thing, into a tensor, and then we encode the tensor to match the model, the checkpoint that you have. If you're wondering what a tensor is, you probably already know. If you remember from math, there are scalars and vectors; a tensor has three dimensions. That's why you hear the word "shape" when talking about machine learning, and especially TensorFlow. These are all tensors of different ranks, but the last one is what we usually mean by "tensor." And then there's an internal method called sampleRNN. The inputs go into the TensorFlow library and it generates the next notes. If you want to get into the nitty-gritty, TensorFlow.js is a great place to actually get your hands dirty. It helps me to visualize it like this: once more, we call continueSequence, take the noteSequence, convert it for the model, call sampleRNN, and get the new drums back. I told you we were gonna talk about temperature. It's interesting to me because it's one of the few inputs we have available; we could also retrain the model with different drumbeats, which would be kind of cool, but temperature is like the level of entropy in the system. The lower the temperature, the more predictable the result we're gonna get; the higher, the less predictable it will be. Just as an example, let's drop it down here to 0.2. [drumbeat] It sounds really similar to the original drumbeat, and if we keep generating, it's pretty much the same, right? So, now we're gonna try cranking it up to 1.5. [drumbeats] A little bit more exciting, for sure. This is the temperature I like. More fun. And after we do that, we just have a little demo button here that will generate a new file.
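Pulling those three lines together, here's a minimal sketch of the generation step. It assumes the publicly hosted drum_kit_rnn checkpoint and the quantized sequence from the earlier sketch; the generateBeat helper name is just for illustration, not code from the talk:

```javascript
// MusicRNN with a pre-trained drum checkpoint hosted by the Magenta team.
const model = new mm.MusicRNN(
  'https://storage.googleapis.com/magentadata/js/checkpoints/music_rnn/drum_kit_rnn'
);
const ready = model.initialize(); // loads the checkpoint; returns a Promise

// steps is fairly arbitrary (16, 32, ...); temperature ranges from 0 to 2.
async function generateBeat(seed, steps = 32, temperature = 1.5) {
  await ready;
  // Behind the scenes this converts the note sequence to tensors, runs the
  // RNN (the internal sampleRNN step), and decodes a new note sequence.
  return model.continueSequence(seed, steps, temperature);
}

// Low temperature stays close to the seed; high temperature gets weirder.
generateBeat(quantized, 32, 1.5).then(sample => new mm.Player().start(sample));
```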
And then sometimes what I'll do is drop it into GarageBand and use it as a musician for my band. If you're wondering why there's no audio right now, it's because we're not judging my music skills today; we're talking about JavaScript. Okay. So, that was cool. It was almost somewhat intelligent. But I wanted to take it one step further. I wanted to give the machine a little bit of motivation with applause. So, depending on how much you applaud it, the machine generates a new temperature, based on the average amplitude over a couple of seconds. I wanted more context, to better fit our definition of a somewhat intelligent machine, so I literally injected more context into it. This is pretty simple. I'm just getting the user's microphone, and I have this little method here called analyzeSound. I use createScriptProcessor and just take the average volume over a couple of seconds. Okay. And against my better judgment, we're gonna do a live demo. Okay. So, this is the drumbeat; that's the normal drumbeat we programmed in. Then we can generate one. Drop the temperature down a little bit. So, this is pretty cool: it generates a new one every time. Okay. Now I need your help. I've created this little perform feature. When I click the button, it's going to wait for applause for a couple of seconds, take the average amplitude over that period of time, decide on what temperature to play, and then generate the beat based on that. I promise I'm not trying to manufacture applause for myself. Maybe a little bit. Okay, so let's try this. On the count of three, be nice, but loud. On the count of three, start applauding; I'm going to hit the button right after you start applauding and then we'll see what happens. Live demos always work, so this should be great. On the count of three: one, two, three. [ Applause ] Yeah! All right. So, that's that. The temperature actually goes from zero to two, and it only went partway up, but it's still morning and you're being considerate, so I'm fine with that. Cool. That is our somewhat intelligent machine. So, did we use context? I think so. We put in our drumbeat, we took applause, we told it the steps we wanted. It definitely rapidly completed something that we couldn't do on our own, right? We can generate a dozen or so drumbeats in just a couple of seconds. So, I think we did it.
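A minimal sketch of that applause feature, assuming the generateBeat helper and quantized seed from the earlier sketches; the amplitude-to-temperature scaling below is a made-up heuristic, not the talk's actual numbers:

```javascript
// Listen to the microphone for a few seconds, average the amplitude, map it
// to a temperature between 0 and 2, then generate and play a beat.
async function performWithApplause(seconds = 3) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const ctx = new AudioContext();
  const source = ctx.createMediaStreamSource(stream);
  // createScriptProcessor is deprecated in favor of AudioWorklet, but it is
  // what the talk mentions and it is still widely supported.
  const processor = ctx.createScriptProcessor(4096, 1, 1);

  let sum = 0;
  let count = 0;
  processor.onaudioprocess = (e) => {
    for (const s of e.inputBuffer.getChannelData(0)) {
      sum += Math.abs(s);
      count++;
    }
  };
  source.connect(processor);
  processor.connect(ctx.destination); // keeps the processor node running

  await new Promise((resolve) => setTimeout(resolve, seconds * 1000));
  processor.disconnect();
  source.disconnect();

  const average = count ? sum / count : 0;      // ~0 for silence, higher for applause
  const temperature = Math.min(2, average * 8); // arbitrary scaling, tuned by ear
  const sample = await generateBeat(quantized, 32, temperature);
  new mm.Player().start(sample);
}

// Wired to a hypothetical "perform" button:
// document.querySelector('#perform').addEventListener('click', () => performWithApplause());
```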
Other people have created some really cool things. This is called a neural computer: you play a couple of notes, an arpeggio bounces back and forth, and it takes the temperature into account. It uses the ImprovRNN model from Magenta. I really like it. The Magenta team created something kind of like what we just did, but inside of Ableton; if you use Ableton, you can do what we just did and generate right inside it. And the Flaming Lips and Magenta created this thing called Fruit Genie. It was actual fruit: you would touch, say, an orange, and it would feed the model. And then they created these giant pool-toy-type things that had sensors on them, threw them into the audience, and asked people to feed the same model and create this, like, melody. This is a little clip of what that looked like. >> We've written this song especially for tonight's occasion. THOMAS: So, they threw these things out into the audience, and you could hear it cycle back in the melody. >> Apple! THOMAS: So, all of these things, all the stuff we just talked about: all of it was just Tone.js and Magenta, and we created our own as well. We used a couple of other previous inventions, sure, but that was kind of the point, right? Combining these simple machines to create something more complex. We didn't reinvent the wheel, by any means. We didn't have to. We just created something a little bit smarter than it was before, with the tools we had at our disposal. I think we can keep doing this. We can keep flipping our tools around and creating things that are new and useful and helpful and interesting for people. And hopefully, with the inventions we piece together, the sum will be greater than the parts. This is such an exciting time to be building stuff, and I can't wait to see what we all build next. So, thank you. [ Applause ] KATIE: Wow. Oh, my gosh. All right, I'm gonna gush for a second about the Flaming Lips. They're one of my favorite bands. I've seen them live four or five times. If you haven't seen them, even if you don't particularly love their music, it's an amazing experience. You should go and do it. Okay, I'm going to stop gushing about the Flaming Lips and now I'm going to gush about Thomas. That was really cool, and I really love his message that, you know, he's not some kind of crazy genius. He's just a person who is really into music and really wanted to try something cool, and we all could do this with JavaScript. It's amazing, right? Anyway, coming up next, Sophia Shoemaker is going to talk about building a PWA that had to work off the grid in an African country (I can't remember which one), and we need to be back here at 10:30 for that. So, you have a couple of minutes to go out and switch rooms if you want. But you shouldn't. You should stay here. All right. Thanks, everybody. [ Applause ]