(ding) - Hello, and welcome to a Coding Challenge, Quick, Draw edition. Now, I have been talking about doing this for a very long time, and I'm excited to finally try this on my channel. One of my favorite data sets that is out there in the world is the Quick, Draw dataset. Now, one of the reasons why I'm interested in this is not just this dataset of 50 million drawings, which is interesting and fun to play with on its own, but there is something called Sketch RNN, which was developed by a set of researchers at Google, Google Brain, and you can see some of them here who wrote this paper, and explained how Sketch RNN is a neural network, a recurrent neural network, that learned how to draw various things from the Quick, Draw dataset and then can try to imagine and create new drawings based on what it learned, and can even interact and draw with you. So many possibilities. So, this is where I'm going with this. Sketch RNN has recently been added to the ML5 library, (ding) and I'm going to show you an example, and I'm going to build that with Sketch RNN in ML5, but I feel like before we start making the artificially intelligent system that generates the drawings, let's look at the actual data itself that it was trained on. So first, where did that data come from? And apologies if I get anything wrong, please let me know in the comments, 'cause this is not my project, I am just inspired and enthused by it. So the Quick, Draw project is an AI experiment made by friends from Google, and it is a game that you can play where it says draw a pencil in under 20 seconds. Okay, here we go, (vocalizing), - [Robot] I see marker or lipstick. - No. - [Robot] Or crayon. - No, no, that's really like a pencil. If I put an eraser here. - [Robot] I see rocket. - No, rocket, I'm the worst. Aah. - [Robot] I'm not sure what that is. - Yeah, I don't know what that is either. (ticking) Time is runnin' out.
- [Robot] Sorry, I couldn't guess it. - All right, let's try a basketball. - [Robot] I see nose or moon or blueberry or baseball or bracelet. (laughs) - [Robot] Oh, I know, it's basketball. (ding) - All right, I win, okay, so you get the idea. I could be stuck here for quite a while. Now, what you might not realize is that when you are playing this game, your doodles are being collected, and over 15 million players have contributed millions of drawings playing Quick, Draw. Oh, and I've used this before, right, I made an example with a neural network that tried to recognize your drawings. This has been done on my channel before, but what I looked at before was all the drawings as pixels. What's actually interesting about the data, which you can find information about here on GitHub, is that it's not pixels, it's actually the paths of the people making the drawings, with timing information. So you could load that data and replay any drawing back, and each drawing has the word that was associated with it, the country where the person who drew it is from, presumably based on the IP address at least, and then whether it was recognized, and then the actual drawing itself. So, what I want to do, and you can see here that the format of the data is a whole bunch of XY positions, XY, XY, XY, with timing, what time was I at the first point, the second point, the third point. Then, I might have lifted up my pen, moved and started doing another one, so it's a bunch of strokes. It's a little tricky 'cause I can't use the word stroke as a variable name in P5, 'cause stroke is a function that actually sets the pen color, but the idea is that if I do this, it's sampling a bunch of my points as I drew along that path, each one of these is an XY point associated with a given time, and then there is an array with all of the Xs, all of the corresponding Ys and the corresponding times.
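As a rough sketch of that raw format, each stroke holds three parallel arrays (all the Xs, all the Ys, all the times). The record below is invented for illustration, and the helper names `strokeToTimedPoints` and `replayOrder` are hypothetical, not part of the dataset or any library. It shows one way the parallel arrays could be turned into timed points for replaying a drawing:

```javascript
// Hypothetical raw-format record; the coordinates and times are made up.
// Each stroke is [xs, ys, ts]: parallel arrays of x, y and time values.
const record = {
  word: "rainbow",
  countrycode: "US",
  recognized: true,
  drawing: [
    [[10, 20, 30], [50, 40, 50], [0, 16, 33]], // stroke 0
    [[12, 28], [60, 60], [400, 450]],          // stroke 1
  ],
};

// Turn one stroke's parallel arrays into a list of timed points.
function strokeToTimedPoints(stroke) {
  const [xs, ys, ts] = stroke;
  return xs.map((x, i) => ({ x, y: ys[i], t: ts[i] }));
}

// Flatten a whole drawing into replay order, remembering which stroke
// each point belongs to so separate strokes are not joined together.
function replayOrder(drawing) {
  return drawing
    .flatMap((stroke, strokeIndex) =>
      strokeToTimedPoints(stroke).map((p) => ({ ...p, strokeIndex })))
    .sort((a, b) => a.t - b.t);
}

console.log(replayOrder(record.drawing));
```

The variable is called `record` rather than `stroke` on purpose, for the reason mentioned above: in a P5 sketch, `stroke` is already the function that sets the pen color.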
Now, what I'm actually going to use in this video is, there are a bunch of different versions of the data, and I'm going to use a simplified version of it because these are huge data files, but I encourage you as an exercise to try to do what I'm going to do but with the non-simplified version, maybe with the timing aspect of it. The simplified drawing files are the same exact thing, but with no timing information, and also they have been subsampled, meaning in theory, as the user is drawing, a lot of points are being captured, but maybe you don't need that level of detail, and that's often referred to as pixel factor or scale factor, I believe, or epsilon value, I guess. It says simplify all strokes using the Ramer-Douglas-Peucker algorithm, I don't know if I pronounced that correctly, with an epsilon value of two. So, these are available as something called ndjson. Now, if you've watched my videos before, you're probably familiar with json, JavaScript object notation, that is a format for storing data as JavaScript objects. I have some videos about what is json. Ndjson is a funny thing, ha ha, so hilarious, it's the funniest version of json, no, it is actually a set of multiple json elements, each on a different line in a file, so it makes sense that each drawing is its own json object on a different line in a file. So, let's go grab one of these files. So, getting the data, we can actually go to the public data sets. Oops, no, I'm sorry, I just want to go to the list of files in the cloud console, which is right here. I'm going to say I agree, and I don't want any email updates, but I accept, okay. Accept! So I'm going to go to full. I realize you can't see anything here, so let's try to make this bigger. Let me dismiss this right now, and come on. I guess I'll make this smaller, and I'll just zoom in.
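To make the ndjson idea concrete, here is a minimal sketch of parsing it from a string that is already in memory. The function name `parseNdjson` and the two sample lines are assumptions for illustration, not real records from the dataset:

```javascript
// Parse NDJSON text: one JSON value per line, blank lines ignored.
function parseNdjson(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Two made-up lines in the simplified shape: each stroke is [xs, ys].
const sample = [
  '{"word":"rainbow","recognized":true,"drawing":[[[0,10],[5,5]]]}',
  '{"word":"rainbow","recognized":false,"drawing":[[[3,9],[7,2]]]}',
].join("\n") + "\n";

const drawings = parseNdjson(sample);
console.log(drawings.length, drawings[0].word);
```

This only suits small inputs, since the whole file has to sit in one string first. The file as a whole is not valid json, which is why `JSON.parse` is applied per line rather than to the entire text.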
So these are the different formats, there's actually all the data in binary, there's this numpy_bitmap, which is useful for other kinds of machine learning, different things you might want to try. There's the raw data, but let's look at the simplified data, and let's pick, oh, I don't know, which model should I pick? There's so many, banana, bandage, baseball, basketball, bat, beach, bear, beard, I guess I should do beard. Right? That's kind of lame though. Birthday cake, is there a unicorn? Maybe there's a unicorn. No, was there a rainbow? Yes, there's a rainbow, (ding), all right, so we'll use the rainbow. So I am going to, oops, download this file. So here's the thing. This is a very large file. I had a reason why I was doing this challenge also. This is a 43 megabyte file. Now I could just use some code in my client side JavaScript to load that file and put it on the web, and at some point, I might show you some techniques for doing that, stay tuned in the future, but I think this is a good case for my video series, sort of a module for my Programming from A to Z class, or the Programming with Text class, on building an API with Node and Express. What if I wanted to have every drawing, just millions of them? I don't want to load hundreds of megabytes and gigabytes of files in my client side JavaScript. I could write a little Node program whose sole purpose is to hold on to all that data, and my client side JavaScript could just request it. So this could be because what I want to do is create an API out in the world for people to get drawing information, but this isn't data that I own in a way that I would necessarily do that. We'd have to look at the licensing to see if that's even something reasonable to do. Where is that eraser?
But what I can do is, on my computer here, right, the idea here is, oh, I'm going to make a server, and the server is going to hold all of the drawings, and then my P5 sketch can just make a request, like a get request, please, could I have a rainbow? And then the server's going to send back just a single drawing. It's not going to send back hundreds of megabytes of data, it's storing all the data, but it's going to send back just one piece. The interesting thing is this server can easily just also run on the laptop. And I could connect to it, so there's a variety of ways you could deploy this and use this, but I'm going to do it all from this laptop. All right, so, to run a server with Node and Express, you could go back and watch some of these videos where I step through this in more detail. I'm just going to start in the directory in my console, then I'm going to say npm init, and I'm going to call this codingtrain_quickdraw_example, and it's version 0.0.1, it is an example that I am making on the Coding Train, and you know, whatever, I'm going to skip through a lot of this stuff. Yes. Okay, so now, if I go to my code, you can actually see I have this package.json file. The package.json file has all that information that I just entered. This is the configuration file for my project. Notice this, npm is essentially the manager of this project now. So, I need a couple of Node packages to be able to make this work. I need to use express, express is what I'm going to use to handle that get request, this http get request. So I'm going to say npm install express, and then I also need something to load that ndjson file. So ndjson Node, I've actually used this before, but let's look. So this is a Node package for loading an ndjson file, so I'm going to say npm install ndjson.
Great, there we go, and now, I meant to show you what does that ndj, oh, I've got to grab that file now, so I'm just going to rename this to rainbow.ndjson, and I'm going to drag it here into my project. So now this is a huge file, and you can see that Visual Studio Code is freaking out, it's like I don't want to deal with this file because it's too big, but you can see that what this is is every single drawing on one line, so it's like this is my database, essentially, a database of rainbow drawings. I have a database of rainbow drawings, what could be better? Okay, so what was I doing? Back to the code in the server. Oh, I don't have a server yet, I'm going to add one, I'm going to call it server.js, I could call it app.js or index.js, and here, I'm going to go back to this, and basically I just want to do exactly this. So the first thing, I need the file system module, so I'm going to say const fs = require fs, the file system. File system is a module that comes with Node, I don't have to install it, but I also want the ndjson module, which doesn't come with Node, but I've added it. And, here we go, and we can see, by the way, that when I install those, they are now dependencies in the package.json file. And now, (vocalizing) ah, there we go, now this is, so what is this doing? This is streaming it, so this is really useful. It's a huge file. Rainbow.ndjson, I certainly could load it by reading the file into a big string, chopping it up and parsing it, but when you have a big file, like an ndjson file, you want to read it as a stream, essentially one line at a time, 'cause it could be a gigabyte file. In this case, I'm just going to make an empty array, and every single object, I'm just going to push into that array, but let's console log them just to see that this is working. So this is the stream.
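To illustrate why streaming works for a file like this, here is a sketch of the bookkeeping a line-by-line reader has to do. It is not the ndjson package's implementation, and `createNdjsonReader` is a hypothetical name: it buffers incoming text, emits one object per complete line, and keeps any half-finished line until the next chunk arrives.

```javascript
// Incremental NDJSON reader: feed it chunks of text as they arrive,
// and it calls onObject once for every complete line.
function createNdjsonReader(onObject) {
  let buffer = "";
  const emit = (line) => {
    if (line.trim() !== "") onObject(JSON.parse(line));
  };
  return {
    write(chunk) {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop(); // the last piece may be an incomplete line
      lines.forEach(emit);
    },
    end() {
      emit(buffer); // flush a final line that had no trailing newline
      buffer = "";
    },
  };
}

const drawings = [];
const reader = createNdjsonReader((obj) => drawings.push(obj));

// The chunks deliberately split the second record in the middle.
reader.write('{"word":"rainbow","key_id":"1"}\n{"word":"rai');
reader.write('nbow","key_id":"2"}\n{"word":"rainbow","key_id":"3"}');
reader.end();

console.log(drawings.length);
```

With a real file, the chunks could come from `fs.createReadStream("rainbow.ndjson", "utf8")`, calling `reader.write(chunk)` in its `data` event and `reader.end()` in its `end` event, so only one chunk of raw text is held at a time even though the parsed objects still accumulate in the array.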
As it reads line by line by line, the ndjson file, it's going to console log that object, okay, so let's go here, and I'm going to say node server.js, and there you can, you can see this is it, this is every single drawing, it's going to take quite a while 'cause there's thousands and thousands and thousands of them, but you can see, this is the word, this was the country code, this is whether it was recognized, it has an ID, and then drawing is in these arrays, which aren't console logging, but I can get access to them, wonderful. So I now have an array that has every single drawing in it, now how do I get access to that? I need to be able to make a get request to the server. So let's see how we would do that. So I need to make an express servery thing, let's just look up express Node, and go to the kind of like quick getting started, hello world, the hello world express example is all we need, basically. I'm going to grab all of this, and I'm going to put it into my code. So what's going on? Number one is I need to require the express library, I need to create an app, which is calling the express function. I'm adding the semicolons, gosh darn it, I need semicolons to live, I can't, I can't do without them. I need to pick a port, so port, this is somewhat arbitrary, but I'm going to use the port 3000, and then I'm going to setup a route, so the idea, and I prefer to be a little more long winded about this, this is using the arrows syntax, which is a kind of ES6 JavaScript syntax. And I'm just going to, I just have to do things the way that I do them. So there's two functions that I care about with my app, one is that I needed to listen on the port, so this, I'm setting up the, creating a server, and that server is listening, 'cause ultimately, I got to get to that P5 sketch, it's going to make the drawing, I haven't even gotten there yet. Now, I then want to setup a route, and then when the user makes a request to that route, send something back. 
So in this hello world example, if I run the server, and go to localhost 3000, it says hello world, but that's not what I want. I don't care about sending hello world. What I want to do is make a route called rainbow. Then what I'm going to do is I'm going to say let r, a random number, = math.floor, math.random, times drawings.length, so however many drawings have been loaded when someone goes to this route, pick a random one, and this could be a const, I guess, and I'm going to say response, send drawings index r, and I suppose I should call this index, so now, whoops, index. Let's rerun the server, and there is a tool called nodemon, which will restart the server for you, but I'm going to do this manually, and then I'm going to go here. Cannot get slash, because there is no route anymore at slash, but if I go to slash rainbow, there we go. There is the drawing. All right, I just installed a Chrome extension to format the json so I could see it, so here's a random drawing, and this is all the information. Now, all I need to do is have P5 request json from this route and then render the drawing. So now the question is where do I run my P5 sketch, and there are a variety of ways. In theory, this is an API that anyone could make a request to. Whether or not I'm opening it up for other people to make requests to it is a complicated question, but one way that I could use it is just have this particular server host a P5 sketch in the first place. So the way to do that, if I go back to my files, and I go to desktop, quick, draw, this is where all the files are, I have a P5 HTML file and a sketch.js file in here, but I'm going to make another directory called public, so this is where I want the files that are hosted by the server to live, public, and then I'm going to say something in my code, app., I don't remember, static file hosting express.
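The random pick inside that route can be sketched as a small standalone function. The name `randomDrawing` and the injectable `random` parameter are assumptions added so the choice can be checked without a running server, and the three records are made up:

```javascript
// Pick one random element; `random` is injectable so the choice can be tested.
function randomDrawing(drawings, random = Math.random) {
  if (drawings.length === 0) return undefined; // nothing loaded yet
  const index = Math.floor(random() * drawings.length);
  return drawings[index];
}

// In the server, the route handler would be roughly:
//   app.get("/rainbow", (request, response) => {
//     response.send(randomDrawing(rainbows));
//   });

// Made-up records standing in for the loaded rainbow drawings.
const rainbows = [{ key_id: "a" }, { key_id: "b" }, { key_id: "c" }];
console.log(randomDrawing(rainbows));
```

Because `Math.random()` is always less than 1, `Math.floor(random() * drawings.length)` never reaches `drawings.length`, so the index stays in range. The empty-array guard covers a request that arrives before the file has finished streaming in.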
Serving static files in express, it's just this, so basically, what I want to do is serve up the HTML and the JavaScript files as well. So I'm going to do that here, I'm going to add this. So now, look at this, let's go to the P5 code, and let's say background zero. So all this P5 code does is create a 100 by 100 canvas with a background of zero, so now, guess what? If I go to localhost:3000/rainbow, I get a drawing, 'cause I'm handling that rainbow route by sending back a drawing, but if I go to just slash, oh, I didn't restart the server, did I? Restart the server, go to slash, there's the P5 sketch. So now, my P5 sketch can finally ask the server for the drawing, okay. I'm going to go over here, and first of all, one thing is, by the way, in the simplified version of the Quick, Draw dataset, all of the drawings were simplified or scaled to 255 by 255 pixels, so that makes things easier to work with. I'm going to call the function load json, and guess what? I'm just going to say load json rainbow gotRainbow, all right, and then I'm going to write a function gotRainbow that gets some data, and I'm going to say console log data. So this is the idea. Now, if you've seen load json before, maybe I've used it to load an actual json file, or maybe I said load json from an API like Wordnik. Now, I'm going to the slash rainbow route, which is local to this particular server, and guess what? I don't actually even need to restart the server, 'cause this will be loaded dynamically. So let's go here. And we can see, there it is. This is the rainbow drawing right here. I'm going to give myself some more room, and here's the drawing itself. So all I need to do now is write an algorithm to go through and draw this drawing. All right, we're ready. So let me make the background 200, let me say the drawing is in data.drawing, is that right? Console log drawing, let's look at that.
Yeah, so this is the actual drawing. It's just two arrays 'cause it was just two strokes. Now I am going to say for let i = 0, i is less than drawing dot, oh, let me figure this out, this is an array, oh right, oh weird. Sorry I'm lost, oh right, okay, so, ah, the drawing, this was only one stroke, that's why this was confusing here. Some of these rainbows, there we go, this is what I want to look at. I have three different strokes. So first I need to look at all the strokes, sorry. So I want to say let, and I'm going to call it a path. So for let path of drawing, this is each and every path, path zero, path one, path two, then each path has a bunch of points, path zero, it has 15, path one has 10, path two has six. I'm going to say for let i = 0, i is less than path, path index 0.length 'cause this, and then, the x is path index 0 ,index 1, wait no, index i, sorry, this is confusing, and the y is path index 1, index i, right, so this is what I'm doing. I am looping through zero, one, two, that's the outer loop, each path, each path is two arrays. Path zero is all the Xs, path 1 is all the Ys. I need to look at all the Xs and all the Ys, and then set a vertex X, Y, so I can say begin shape, end shape, I can say no fill, stroke 0, whoops, stroke 0, and maybe I'll say stroke weight 3, just to make the lines a little bit thicker, and let's see what I see. There we go. Rainbows, rainbows galore, these are everybody's rainbows. Each time I hit refresh. You know one thing I could do now is when it finishes, I could just say load json again. Ooh, maybe I would want to redraw the background every time, that might make sense. And here we go, this is a random drawing over and over and over again, so, I could start to do things like request a specific drawing from a certain country, I could download different, I mean, you know, different models, let's just, let me pause for a second and grab another model. Okay, so I downloaded one more set of drawings, the cat files. 
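The nested loop described above can be sketched as a plain function that turns the simplified format into lists of points. The name `drawingToPaths` and the two-stroke sample are invented for illustration; the P5 calls appear only in a comment so the block runs outside a sketch:

```javascript
// Simplified format: drawing = [ [xs, ys], [xs, ys], ... ]
// Element 0 of each path holds all the Xs, element 1 all the Ys.
function drawingToPaths(drawing) {
  return drawing.map(([xs, ys]) => xs.map((x, i) => ({ x, y: ys[i] })));
}

// Hypothetical two-stroke drawing; coordinates made up, within 0..255.
const drawing = [
  [[0, 128, 255], [200, 40, 200]],
  [[30, 128, 225], [200, 80, 200]],
];

const paths = drawingToPaths(drawing);
for (const path of paths) {
  // In a P5 sketch, each path would be drawn as its own shape:
  //   beginShape(); for each point p: vertex(p.x, p.y); endShape();
  console.log(path.map((p) => `(${p.x}, ${p.y})`).join(" -> "));
}
```

Starting a new shape for every path is what keeps separate strokes from being joined by a stray line.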
So I'm going to, the cat drawings, so I'm going to copy that into here, and we can see now I have cat ndjson. If I go back to my server, I could do, I'm going to say, I'm going to call this rainbows, and I'm going to do a different one for cat. And I'm also going to do cats, cats push object, and then I'm going to make another route for, for cats. So now, if I rerun the server, and I go back to my actual sketch, and I switch to going to the cat route, now where was that? Here I am, I'm going to hit enter, ooh, I got some issue. Cat, internal server error, so what's going on here? Drawings is not defined, so I made a mistake in my server. Oh, this is, over here is rainbows.length, and this is cats.length, and I would have seen that error here if I was paying closer attention. There, now I've got cats, and now, ah, let's look at a lot of cats. Ooh. Ooh, it's still giving me rainbows. Did I not hit save? Load json cat, oh, load json cat, whatever, I'm not being too thoughtful about this, give me the cats, I want to see the meow meow. What's going on? Aah. (buzz) this is what I get for trying to code so quickly. This is supposed to say cat.json, cat.ndjson, now here we go. Oh wait, I have to restart the server. And, here we go. Finally, cats, there's a lot of different cat drawings. I really should slow this down, let me just slow this down a little bit. Here's what I want to do actually. Oh, this video should really be over, but why not? You've already watched this much, you could watch a little bit more, right? I really want to draw the drawing in sequence. Now, I'm not, I don't have the timing information, and that would be useful to have, but let's make it actually animate. So I'm going to add a draw function. I'm not going to add a page transition event, and so when I've got a cat, and I'll just change this. What I'm actually going to do is just set current, I'm going to just set cat equal to data, so I'm going to take out all of this, cat equal to data.drawing. 
So I'm going to comment this out. Let's think about this, and then I'm going to say let x, y, and I'm going to say if cat, then I now need to keep track of where I am. Let stroke index = 0, let pen index = 0. So I need to keep track of two indices, right? Because I'm going to walk through, one at a time, each vertex of the first stroke, and then the stroke index is going to go from zero to one and go through each of the other ones. So if there's a cat, the first thing I need to do is say x =, and what was this stuff? It is path, oh, drawing, so cat, index stroke index, index pen index, index 0. Boy, this is really awkward how it's using just arrays for everything, but in the first stroke, in the first, pen is not the right term, I don't know what to call it, vertex, but whatever, I could actually just call this index maybe. The stroke index and index, then 0 is for x, and then one is for y, and just to see that this works, let me say point x, y, and these don't need to be global. So let's see what this does. So first of all, let's just run this. Oh boy, I freaked it out, it won't ever stop. I think, by the way, I killed this, I need to build in a little more of a delay with these API calls. So cat is not defined, sketch.js line 12. If cat, that needs to be a global variable. So, let me just say here, console log x, y, let's see, did I get x, y? Yes, so I've got that first point over and over again, and presumably, 52, 48, I don't know why I don't see it, I guess I need to say stroke 0, stroke weight 3. There we go, so there it is, that's the first point, so now what I need to do is say index ++, if index is greater than or equal to cat stroke index dot length, then stroke index ++ and index = 0. So this is me marching through them one at a time. So, ooh, and I don't have the Y. Right, you can see that something's wrong here.
(ding) Okay, something is terribly wrong here, and actually, I have not been carefully looking at how those arrays are organized, it's very confusing to store all these data as arrays, but there are 11 strokes, and this stroke has 23 points, this stroke has nine points, but notice that the, I have the order wrong, right? This is an array of an array of arrays, and so basically, the stroke, the zero element of the stroke is all the different x values, and this one element of the stroke is all the different y values. I had those out of order, and then here, the number of points is not the number of strokes, but rather, the number of Xs. So now, if I redo this, you can see the outline of a cat there. You can start to see the outline of a cat here. Of course it gets stuck at the end, it's giving me an error, so first let me fix that error. So the error that I need to check is if stroke index equals cat.length, then I'm done, then I'm going to say cat = null. I'm going to say no more to the cat, and there we go. So this is the drawing of the cat. Now, of course, I'm just drawing all the points, I need to connect the previous points to the other points, so I'm going to add a previous x, previous y, and then I'm going to say here down here, previous x = x, previous y = y, and then here, I'm going to say a line between previous x, previous y and x and y. Now, it should do nothing when those values are null, so now we see there, ooh, oh wait a sec, no, no, no, no, no, no, ah, when I get to the next stroke, then I need to say previous x = undefined again, and previous y = un, 'cause I don't want to draw, I don't want to connect the strokes. So that's a little bit of an awkward way of doing it. It's still doing that, isn't it? So, and then I want to say if previous x, maybe if I do this, does not equal undefined, then draw the line, let's see if this works, whoops, sketch line 19, I always have this extra equals there. Oh, weird. 
It's still connecting everything, a lovely little cat there. What am I missing? Right, I don't want to draw the line. Uh, these are undefined at the beginning, oh it gets set to here, so I need an else here. Else, don't set it if it's at the end, okay. There we go, finally, we are drawing cats. Now, all I have to do is then when I reset there, I can just ask for a new one. So let's ask for a new cat, and whenever I've got a cat, let's draw a white background, let's make, a little bit gray, we'll set it gray at the beginning, there we go, now, here we go, we're now going to draw lots of cats, it should finish one. Ooh, didn't get another one, sketch on line 13. Uh... Cat is undefined, and then, there should be no more cat until I've got a cat. Try that again. There we go, I don't know what I did wrong. Ooh. (ding) Thank you to BIMSoMe and Louise, both in the chat, who just pointed out that my technique here is correct, but the issue is that I need to reset everything back to zero, so here, I need to set stroke index back to zero, and I think index will already be zero, yeah index is already zero, so yes, the stroke index needs to go back to the beginning, and now I think we're ready to enjoy a whole bunch of cats. (upbeat music) Cat drawings. All right, thanks for watching this Coding Challenge with the Google Quick, Draw dataset. Stay tuned for a future video where I show how to, what do I do? This is where I show how to create new drawings with the Sketch RNN model, the machine learning model that was trained on these drawings, and if this was one of your drawings, (blows kiss) thank you for making this beautiful cat, and I'll see you in a future Coding Challenge. Good bye (ding). (upbeat music)
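The animation logic worked out above (two indices, a remembered previous point that is cleared at the start of each stroke, and a reset for every new drawing) can be sketched as a small stepper. This is one possible arrangement, not the code typed in the video; `createDrawingStepper` is a hypothetical name and the sample cat is made up. It assumes the simplified format and that no stroke is empty:

```javascript
// Step through a simplified drawing one point at a time.
// next() returns the line segment to draw on this step, or null when there is
// nothing to draw (first point of a stroke, or the drawing is finished).
function createDrawingStepper(drawing) {
  let strokeIndex = 0;
  let index = 0;
  let prev = null;
  return {
    done: () => strokeIndex >= drawing.length,
    next() {
      if (strokeIndex >= drawing.length) return null;
      const [xs, ys] = drawing[strokeIndex]; // [0] is all Xs, [1] is all Ys
      const point = { x: xs[index], y: ys[index] };
      const segment = prev ? { from: prev, to: point } : null;
      prev = point;
      index++;
      if (index >= xs.length) {
        // End of this stroke: move on, and forget the previous point
        // so the next stroke is not connected to this one.
        strokeIndex++;
        index = 0;
        prev = null;
      }
      return segment;
    },
  };
}

// Made-up drawing with two strokes (3 points, then 2 points).
const cat = [
  [[10, 20, 30], [10, 15, 10]],
  [[50, 60], [50, 55]],
];

const stepper = createDrawingStepper(cat);
const segments = [];
while (!stepper.done()) {
  const segment = stepper.next();
  if (segment) segments.push(segment);
}
console.log(segments.length);
```

In a P5 sketch, `draw()` would call `next()` once per frame and pass a non-null segment to `line()`. Creating a fresh stepper for each newly loaded drawing starts both indices back at zero, which is the reset that was missing in the version debugged above.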
Coding Challenge #122: Quick, Draw! (posted by 林宜悉 on 2020/03/27)