(ding)
- Hello, and welcome to a Coding Challenge,
Quick, Draw edition.
Now, I have been talking about doing this
for a very long time, and I'm excited
to finally try this on my channel.
One of my favorite data sets that is out there
in the world is the quick draw dataset.
Now, here's the reason, one of the reasons
why I'm interested in this is not just this dataset
of 50 million drawings, which is interesting
and fun to play with on its own,
but there is something called Sketch RNN,
which was developed by a set of researchers
at Google, Google Brain,
and you can see some of them here who wrote this paper,
and explained how Sketch RNN is a neural network,
a recurrent neural network that learned about how
to draw various things from the quick draw dataset
and then can try and imagine and create new drawings
based on how it learned and can even interact
and draw with you.
So many possibilities.
So, this is where I'm going with this.
I am going to make...
Sketch RNN has recently been added to the ML5 library,
(ding)
and I'm going to show you an example,
and I'm going to build that with Sketch RNN ML5,
but I feel like before we start making
the artificially intelligent system that generates
the drawings, let's look at the actual data itself
that it was trained on.
So first, where did that data come from?
So, and apologies if I get anything wrong,
please let me know in the comments,
'cause this is not my project, I am just inspired
and enthused by it, so the quick draw project
is a project, the AI experiment,
made by friends from Google,
and it is a game that you can play
where you say draw a pencil in under 20 seconds,
okay here we go, (vocalizing),
- [Robot] I see marker or lipstick.
- No. - Or crayon.
- No, no that's really like a pencil.
If I put an eraser here. - I see rocket.
- No, rocket, I'm the worst.
Aah.
- [Robot] I'm not sure what that is.
- Yeah, I don't know what that is either.
(ticking)
Time is runnin' out.
- [Robot] Sorry, I couldn't guess it.
- All right, let's try a basketball.
- [Robot] I see nose or moon or blueberry
or baseball or bracelet.
(laughs)
- [Robot] Oh, I know, it's basketball
(ding)
- All right, I win, okay, so you get the idea.
I could be stuck here for quite a while.
Now, what you might not realize is that when you are playing this game,
your doodles are being collected,
and over 15 million players have contributed
millions of drawings playing Quick, Draw,
oh and I've used this before, right,
I made an example with a neural network
that tried to recognize your drawings.
This has been done on my channel before,
but what I haven't actually looked at,
what I looked at before was I looked
at all the drawings as pixels.
What's actually, what's interesting about the data,
is that the data which you find here,
information about it on GitHub, is not pixels,
it's actually the pixel paths of the people making
the drawings with timing information.
So you could load that data and replay any drawing back,
and each data, each drawing, has the word
that was associated with it,
the country where the person is from who drew,
at least the IP address presumably,
and then whether it was recognized
and then the actual drawing itself.
So, what I want to do, and you can see here that the format
of the data is a whole bunch of XY positions,
XY, XY, XY, with timing, what time was I at the first point,
the second point, the third point.
Then, I might have lifted up my pen, moved and started doing
another one, so it's a bunch of strokes.
So this is, it's a little tricky 'cause I can't use
the word stroke as a variable name in P5,
'cause stroke is a function that actually sets
the pen color, but the idea is that if I do this,
it's sampling a bunch of my points,
as I drew along that path,
each one of these is an XY point associated
with a given time, and then there is an array
with all of the Xs, all of the corresponding Ys
and the corresponding times.
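To make that layout concrete, here is a minimal sketch of a raw drawing as described above: an array of strokes, each holding parallel arrays of Xs, Ys, and times. The numbers and the helper name `pathToPoints` are made up for illustration, and the variable is called `path` rather than `stroke` to stay clear of the p5 function.

```javascript
// Hypothetical raw drawing: an array of strokes, where each stroke is
// three parallel arrays [xs, ys, ts]. The values are invented.
const rawDrawing = [
  [[10, 20, 30], [5, 15, 25], [0, 16, 33]], // first stroke
  [[40, 50], [60, 70], [120, 140]]          // second stroke, after lifting the pen
];

// Turn one stroke's parallel arrays into a list of { x, y, t } points.
// The simplified files have no times, so t is undefined there.
function pathToPoints(path) {
  const [xs, ys, ts] = path;
  return xs.map((x, i) => ({ x, y: ys[i], t: ts ? ts[i] : undefined }));
}

const points = rawDrawing.map(pathToPoints);
console.log(points[0][1]); // { x: 20, y: 15, t: 16 }
```

Keeping the data as parallel arrays is compact for storage; converting to point objects is only a convenience when reading the code.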
Now, what I'm actually going to use in this video,
there are a bunch of different versions of the data,
and I'm going to use a simplified version of it
because these are huge data files,
but I encourage you as an exercise to try to do
what I'm going to do but with the non simplified version,
maybe with the timing aspect of it,
but the simplified drawing files are
the same exact thing, the same exact thing,
but no timing information,
and also they have been sub sampled,
meaning in theory, as the person is drawing,
as the user is drawing, a lot of points are being captured,
but maybe you don't need that level of detail,
and that's often referred to as pixel factor
or scale factor, I believe, or epsilon value, I guess.
You can say simplify all strokes using
the Ramer-Douglas-Peucker algorithm,
I don't know if I pronounced that correctly.
With epsilon value of two.
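For readers curious about what that simplification step does, here is a small sketch of the Ramer-Douglas-Peucker idea: keep the two endpoints, find the point farthest from the line between them, and recurse only if it is farther than epsilon. This is an illustration written for this transcript, not the code used to prepare the dataset, and the function names are assumptions.

```javascript
// Perpendicular distance from point p to the line through a and b.
function perpendicularDistance(p, a, b) {
  const dx = b.x - a.x;
  const dy = b.y - a.y;
  const length = Math.hypot(dx, dy);
  if (length === 0) return Math.hypot(p.x - a.x, p.y - a.y);
  return Math.abs(dy * p.x - dx * p.y + b.x * a.y - b.y * a.x) / length;
}

// Ramer-Douglas-Peucker: drop points that lie within epsilon of the
// straight line between the endpoints of their segment.
function simplify(points, epsilon) {
  if (points.length < 3) return points.slice();
  const first = points[0];
  const last = points[points.length - 1];
  let maxDistance = 0;
  let maxIndex = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const d = perpendicularDistance(points[i], first, last);
    if (d > maxDistance) {
      maxDistance = d;
      maxIndex = i;
    }
  }
  // Everything is close enough to the line: keep only the endpoints.
  if (maxDistance <= epsilon) return [first, last];
  // Otherwise keep the farthest point and simplify both halves.
  const left = simplify(points.slice(0, maxIndex + 1), epsilon);
  const right = simplify(points.slice(maxIndex), epsilon);
  return left.slice(0, -1).concat(right);
}

// Hypothetical stroke with one sharp peak at (3, 5).
const sample = [
  { x: 0, y: 0 }, { x: 1, y: 0.5 }, { x: 2, y: 0 }, { x: 3, y: 5 }, { x: 4, y: 0 }
];
console.log(simplify(sample, 2).length); // 3
```

A larger epsilon throws away more points, so the stroke gets coarser but the file gets smaller.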
So, these are available as something called ndjson.
Now, if you've watched my videos before,
you're probably familiar with json,
JavaScript object notation, that is a format
where you can store data
that's in JavaScript object notation.
I have some videos about what is json.
Ndjson is a funny thing, ha ha, so hilarious,
it's the most, the funniest version of json, no,
and it is actually a set of multiple json elements,
each on a different line in a file,
so it makes sense to do it that each drawing's
its own sort of json object on a different line in a file.
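As a rough illustration of the format, here is a sketch that parses a tiny ndjson string by splitting on newlines. The two records are invented; only the idea of one JSON object per line comes from the description above, and `parseNdjson` is a hypothetical helper name.

```javascript
// Two hypothetical records, one JSON object per line. The values are made up.
const text = [
  '{"word":"rainbow","countrycode":"US","recognized":true,"drawing":[[[0,10],[5,5]]]}',
  '{"word":"rainbow","countrycode":"DE","recognized":false,"drawing":[[[1,2,3],[4,5,6]]]}'
].join('\n') + '\n';

// Split into lines, skip blanks, and parse each line on its own.
function parseNdjson(source) {
  return source
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

const records = parseNdjson(text);
console.log(records.length, records[1].countrycode); // 2 DE
```

This whole-string approach is fine for small inputs; for a file the size of the real dataset, reading it as a stream is the safer choice.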
So, let's go grab one of these files,
so getting the data, we can actually go
to the public data sets.
Oops, no, I'm sorry, I just want to go to the list
of files in the cloud console, which is right here.
I'm going to say I agree, and I don't want email updates,
but I accept, okay.
Accept!
So I'm going to go to full.
I realize you can't see anything here,
so let's try to make this bigger.
Let me dismiss this right now, and come on.
I guess I'll make this smaller, and I'll just zoom in.
So these are the different formats,
there's actually all the data in binary,
there's this numpy_bitmap, which is useful for other kinds
of machine learning, different things you might want to try.
The raw data, but let's look at the simplified data,
and let's pick, oh, I don't know,
which model should I pick?
There's so many, banana, bandage, baseball, basketball,
bat, beach, bear, beard, I guess I should do beard.
Right?
That's kind of lame though.
Birthday cake, is there a unicorn?
Maybe there's a unicorn.
No, was there a rainbow?
Yes, there's a rainbow, (ding), all right,
so we'll use the rainbow.
So I am going to, oops, download this file.
So here's the thing.
This is a very large file.
I had a reason why I was doing this challenge also.
This is a 43 megabyte file.
Now I could just use some code in my client side JavaScript
to load that file and put it on the web,
and at some point, I might show you some techniques
for doing that, stay tuned in the future,
but I think this is a good case for my video series,
sort of a module for my Programming from A to Z class,
or the programming with text class, on building an API
with Node and Express. This is a case where,
what if I wanted to have every drawing,
just millions of them?
I don't want to load hundreds of megabytes
and gigabytes of files in my client side JavaScript.
I could write a little Node program whose sole purpose
is to hold on to all that data,
and my client side JavaScript could just request it.
So this could be because what I want to do is create an API
out in the world for people to get drawing information,
but this isn't data that I own in a way
that I would necessarily do that.
We'd have to look at the licensing
to see if that's even something reasonable to do.
Where is that eraser?
But, what I can do,
is on my computer here, right, the idea here is oh,
I'm going to make a server, and the server is going to hold all
of the drawings, and then my P5 sketch can just say,
hey, it can make a request, like a GET request,
please, could I have a rainbow?
And then the server's going to send back just a single drawing.
It's not going to send back hundreds of megabytes of data,
it's storing all the data, but it's going to send back
just one piece.
The interesting thing is this server
can easily just also run on the laptop.
And I could connect to it, so there's a variety of ways
you could deploy this and use this,
but I'm going to do it all from this laptop.
All right, so, to run a server with Node and Express,
you could go back and watch some of these videos
where I step through this in more detail,
I'm just going to start it in the directory in my console,
then I'm going to say npm init,
and I'm going to call this codingtrain_quickdraw_example,
and it's version 0.0.1, it is an example that I am making
on the Coding Train, and you know, whatever,
I'm going to skip through a lot of this stuff.
Yes.
Okay, so now, if I go to my code,
you can actually see I have this package.json file.
The package.json file has all that information
that I just entered.
This is the configuration file for my project.
Notice this, we're central manager of this project now.
So, I need a couple Node packages
to be able to make this work.
I need to use express, express is what I'm going to use
to handle that get request, this http get request.
So I'm going to say npm install express,
and then I also need something to load that ndjson file.
So ndjson Node, let's just,
I've actually used this before, but let's look.
So this is a Node package for loading an ndjson file,
so I'm going to say npm install ndjson.
Great, there we go, and now,
I meant to show you what does that ndj,
oh I got to grab that file now,
so I also need, I'm just going to change,
rename this to rainbow.ndjson, I'm going to drag it here
into my project, so now this is a huge file,
and so you can see that Visual Studio Code
is freaking out, it's like I don't want to deal
with this file because it's too big,
but you can see that what this is
is every single drawing on one line,
so it's like this is my database, essentially,
database of rainbow drawings.
I have a database of rainbow drawings, what could be better?
Okay, so what was I doing?
Back to the code in the server.
Where, oh, I don't have a server yet, I'm going to add one,
I'm going to call it server.js, I could call it app.js
or index.js, and here, I'm going to go back to this,
and basically I just want to do exactly this.
So the first thing, I want to use this,
I need the file system module,
so I'm going to say const fs = require file system.
File system is a module that comes with Node,
I don't have to install it, but I also want the ndjson
module, which, it doesn't come with Node,
but I've added it.
And,
here we go, and we can see, by the way,
that when I install those, they are now dependencies
in the package.json file.
And now,
(vocalizing)
ah, there we go, now this is, so what is this doing?
This is streaming it, so this is really useful.
It's a huge file.
Rainbow.ndjson, I certainly could load it just using,
loading the file into a big string,
chopping it up and parsing it,
but when you have a big file, like an ndjson file,
you want to read it as a stream,
essentially one line at a time 'cause it could be
a gigabyte file.
I'm not going to, in this case, I'm just going to say,
I'm going to make an empty array, and every single object,
I'm just going to
push into that array, but let's console log them
just to see that this is working.
So this is the stream.
As it reads line by line by line,
the ndjson file, it's going to console log that object,
okay, so let's go here, and I'm going to say node server.js,
and there you can, you can see this is it,
this is every single drawing, it's going to take quite a while
'cause there's thousands and thousands and thousands
of them, but you can see, this is the word,
this was the country code, this is whether
it was recognized, it has an ID, and then drawing
is in these arrays, which aren't console logging,
but I can get access to them, wonderful.
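Here is a hedged sketch of the same read-line-by-line idea. The video pipes the file through the ndjson package; this version swaps in Node's built-in `readline` module so it runs without installing anything. The function name `loadDrawings` is an assumption, and a malformed line would throw rather than be skipped.

```javascript
const fs = require('fs');
const readline = require('readline');

// Read an .ndjson file one line at a time and collect the parsed objects.
// Resolves with the full array once the end of the file is reached.
function loadDrawings(filePath) {
  return new Promise((resolve, reject) => {
    const drawings = [];
    const input = fs.createReadStream(filePath, { encoding: 'utf8' });
    input.on('error', reject);
    const lines = readline.createInterface({ input, crlfDelay: Infinity });
    lines.on('line', (line) => {
      // Each non-empty line is one complete JSON object (one drawing).
      if (line.trim() !== '') drawings.push(JSON.parse(line));
    });
    lines.on('close', () => resolve(drawings));
  });
}

// Usage, assuming rainbow.ndjson sits next to this script:
// loadDrawings('rainbow.ndjson').then((drawings) => console.log(drawings.length));
```

Only one line is parsed at a time, so memory use is driven by the array you choose to keep, not by the parsing itself.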
So I now have
an array that has every single
drawing in it, now how do I get access to that?
I need to be able to make a get request to the server.
So let's see how we would do that.
So I need to make an express servery thing,
let's just look up express Node, and go to
the kind of like quick getting started, hello world,
the hello world express example is all we need, basically.
I'm going to grab all of this,
and I'm going to put it into my code.
So what's going on?
Number one is I need to require the express library,
I need to create an app,
which is calling the express function.
I'm adding the semicolons, gosh darn it,
I need semicolons to live, I can't, I can't do without them.
I need to pick a port, so port, this is somewhat arbitrary,
but I'm going to use the port 3000,
and then I'm going to setup a route, so the idea,
and I prefer to be a little more long winded about this,
this is using the arrow syntax,
which is a kind of ES6 JavaScript syntax.
And I'm just going to,
I just have to do things the way that I do them.
So there's two functions that I care about with my app,
one is that I needed to listen on the port,
so this, I'm setting up the, creating a server,
and that server is listening,
'cause ultimately, I got to get to that P5 sketch,
it's going to make the drawing,
I haven't even gotten there yet.
Now, I then want to setup a route,
and then when the user makes a request to that route,
send something back.
So in this hello world example, if I run the server,
and go to local host 3000,
it says hello world, but that's not what I want.
I don't care about sending hello world.
What I want to do is let me make a route called rainbow.
Then what I'm going to do is I'm going to say
let r, a random number, = Math.floor of Math.random
times drawings.length, so however many drawings
have been loaded when someone goes to this route,
pick a random one, and this could be a const, I guess,
and then I'm going to say response.send drawings index r,
and I suppose I should call this index,
so now,
whoops, index, let's rerun the server,
and there is a tool called nodemon, which will restart
the server for you, I'm going to do this manually,
and then I'm going to go here.
Cannot get slash because there is no route anymore at slash,
but if I go to slash rainbow, there we go.
There is the drawing.
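The route logic described above can be sketched as a plain handler function. The `drawings` array here is a two-item stand-in for the array filled from rainbow.ndjson, the names `randomIndex` and `sendRandomDrawing` are assumptions, and the Express wiring is left as a comment so the snippet runs on its own.

```javascript
// Stand-in for the array filled while reading rainbow.ndjson.
const drawings = [
  { word: 'rainbow', key_id: 'a' },
  { word: 'rainbow', key_id: 'b' }
];

// Pick a random whole-number index from 0 up to (but not including) length.
function randomIndex(length) {
  return Math.floor(Math.random() * length);
}

// Route handler in Express's (request, response) shape:
// send back a single drawing rather than the whole array.
function sendRandomDrawing(request, response) {
  const index = randomIndex(drawings.length);
  response.send(drawings[index]);
}

// Wiring, assuming `app` was created with express():
// app.get('/rainbow', sendRandomDrawing);
```

Because the index is computed from `drawings.length` at request time, the route works even while the file is still being read; it just picks from whatever has loaded so far.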
All right, I just installed a Chrome extension to format
the json so I could see it, so here's a random drawing,
and this is all the information.
Now, all I need to do is have P5 request json
from this route and then render the drawing.
So now the question is where do I run my P5 sketch,
and there are a variety of ways.
In theory, this is an API that anyone could make
a request to, whether or not I'm opening it up
for other people to request to it or not,
is a complicated question, but one way that I could use it
is just have this particular server host a P5 sketch
in the first place, so the way to do that,
if I go back to my files, and I go to desktop, quick, draw,
this is where all the files are, I'm actually going,
I have a P5, the HTML file and a sketch.js file in here,
but I'm going to make another directory called public,
so this is where I want files that are hosted
by the server to live, in public,
and then I'm going to say something in my code,
app., I don't remember, static file hosting express.
Serving static files in express, it's just this,
so basically, what I want to do is serve up
the HTML and the JavaScript files as well.
So I'm going to do that here, I'm going to add this.
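For reference, a minimal sketch of static file hosting with Express, assuming the package is installed and that the HTML and sketch.js files live in a folder named public next to the server file, as described above.

```javascript
const express = require('express');

const app = express();
const port = 3000;

// Serve index.html, sketch.js, and anything else inside ./public.
app.use(express.static('public'));

app.listen(port, () => console.log(`Listening on port ${port}`));
```

With this in place, a request for the root path returns public/index.html, while routes such as the rainbow one are still handled by their own handlers.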
So now, look at this, now, and let's go to the P5 code,
and let's say background zero.
So I, all this P5 code does is create a 100 by 100 canvas
with a background of zero, so now, guess what?
If I go to localhost 3000/rainbow,
I get a drawing 'cause I'm handling that rainbow route
by sending back a drawing, but if I go
to just slash, oh I didn't restart the server, did I?
Restart the server, go to slash, there's the P5 sketch.
So now, my P5 sketch can finally ask the server
for the drawing, okay.
I'm going to go over here, and I'm going to say,
first of all, one thing is, by the way,
that simplified dataset, all of the simplified version
of the quick, draw dataset, all of the drawings
were simplified or scaled to 255 by 255 pixels,
so that makes things easier to work with.
I'm going to call the function load json, and guess what?
I'm just going to say load json rainbow gotRainbow, all right,
and then I'm going to write a function gotRainbow
that gets some data, and I'm going to say console log data,
so this is the idea, now, if you've seen load json before,
maybe before I've used it to load an actual json file,
or maybe I said load json from an API like Wordnik.
Now, I'm going to the slash rainbow route,
which is local to this particular server, and guess what?
I don't actually even need to restart the server
'cause this will be loaded dynamically.
So let's go here.
And we can see, there it is.
This is the rainbow drawing right here.
I'm going to give myself some more room,
and here's the drawing itself.
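A minimal sketch of the client side at this point, assuming p5.js is loaded on the page and the server exposes the route at /rainbow. The 255 by 255 canvas size follows the scale mentioned above for the simplified drawings.

```javascript
// p5.js calls setup() once when the page loads.
function setup() {
  createCanvas(255, 255);
  background(0);
  // Relative URL: the request goes to the same server that served the sketch.
  loadJSON('/rainbow', gotRainbow);
}

// Callback that p5 invokes with the parsed JSON for one drawing.
function gotRainbow(data) {
  console.log(data);
}
```

Since the sketch and the route come from the same server, no full URL or cross-origin setup is needed.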
So all I need to do now is write an algorithm to go through
and draw this drawing.
All right, we're ready.
So let me make the background 200,
let me say the drawing
is
in
data.drawing,
is that right?
Console log drawing, let's look at that.
Yeah, so this is the actual drawing.
It's just two arrays 'cause it was just two strokes.
Now I am going to say for
let i = 0,
i is less than drawing dot,
oh, let me figure this out, this is an array,
oh right, oh weird.
Sorry I'm lost, oh right, okay, so, ah, the drawing,
this was only one stroke,
that's why this was confusing here.
Some of these rainbows, there we go,
this is what I want to look at.
I have three different strokes.
So first I need to look at all the strokes, sorry.
So I want to say let, and I'm going to call it a path.
So for let path of drawing, this is each and every path,
path zero, path one, path two, then each path
has a bunch of points, path zero, it has 15,
path one has 10, path two has six.
I'm going to say for let i = 0,
i is less than path,
path index 0.length 'cause this,
and then, the x is
path index 0 ,index 1,
wait no, index i, sorry, this is confusing,
and the y is path index 1, index i, right,
so this is what I'm doing.
I am looping through
zero, one, two, that's the outer loop,
each path, each path is two arrays.
Path zero is all the Xs, path 1 is all the Ys.
I need to look at all the Xs and all the Ys,
and then set a vertex X, Y, so I can say begin shape,
end shape,
I can say no fill, stroke 0,
whoops,
stroke 0, and maybe I'll say stroke
weight 3, just to make the lines a little bit thicker,
and let's see what I see.
There we go.
Rainbows, rainbows galore, these are everybody's rainbows.
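The nested loop just described can be sketched as follows, assuming p5.js is loaded and `data` is one record from the server. The function name `showDrawing` is an assumption; the outer loop walks the paths and the inner loop walks the points of one path.

```javascript
// Draw every path of one Quick, Draw record using p5.js shape functions.
function showDrawing(data) {
  background(200);
  const drawing = data.drawing;
  noFill();
  stroke(0);
  strokeWeight(3);
  // Outer loop: each path is a pair of arrays, [xs, ys].
  for (const path of drawing) {
    beginShape();
    // Inner loop: path[0] holds all the Xs, path[1] all the Ys.
    for (let i = 0; i < path[0].length; i++) {
      const x = path[0][i];
      const y = path[1][i];
      vertex(x, y);
    }
    endShape();
  }
}
```

Calling beginShape and endShape once per path is what keeps separate pen strokes from being joined together.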
Each time I hit refresh.
You know one thing I could do now is when it finishes,
I could just say load json again.
Ooh, maybe I would want to
redraw the background every time, that might make sense.
And here we go, this is a random drawing over
and over and over again, so, I could start to do things
like request a specific drawing from a certain country,
I could download different, I mean, you know,
different models, let's just, let me pause for a second
and grab another model.
Okay, so I downloaded one more set of drawings,
the cat files.
So I'm going to, the cat drawings, so I'm going to copy that
into here, and we can see now I have cat.ndjson.
If I go back to my server, I could do,
I'm going to say, I'm going to call this rainbows,
and I'm going to do a different one for cat.
And I'm also going to do cats,
cats push object, and then I'm going to make another
route for,
for cats.
So now,
if I rerun the server,
and I go back to my actual sketch,
and I switch to going to the cat route,
now where was that?
Here I am, I'm going to hit enter, ooh, I got some issue.
Cat, internal server error, so what's going on here?
Drawings is not defined, so I made a mistake in my server.
Oh, this is,
over here is rainbows.length, and this is cats.length,
and I would have seen that error here
if I was paying closer attention.
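One way to avoid the leftover `drawings` reference is to build each route from the array it should read. This is a sketch of that idea rather than what the video does; `datasets`, `randomFrom`, and `randomRoute` are hypothetical names, and the Express wiring is left as comments.

```javascript
// One array per category, keyed by the name used in the URL.
const datasets = { rainbow: [], cat: [] };

// Return a random element of a list.
function randomFrom(list) {
  return list[Math.floor(Math.random() * list.length)];
}

// Build a handler bound to one category, so each route reads the right array.
function randomRoute(name) {
  return (request, response) => response.send(randomFrom(datasets[name]));
}

// Wiring, assuming `app` was created with express():
// app.get('/rainbow', randomRoute('rainbow'));
// app.get('/cat', randomRoute('cat'));
```

Adding another category then means adding one key and one route, with no copied handler body to forget to rename.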
There, now I've got cats, and now, ah, let's look at
a lot of cats.
Ooh.
Ooh, it's still giving me rainbows.
Did I not hit save?
Load json cat,
oh, load json cat, whatever, I'm not being too thoughtful
about this, give me the cats, I want to see the meow meow.
What's going on?
Aah.
(buzz)
this is what I get for trying to code so quickly.
This is supposed to say cat.json,
cat.ndjson, now here we go.
Oh wait, I have to restart the server.
And, here we go.
Finally, cats, there's a lot of different cat drawings.
I really should slow this down,
let me just slow this down a little bit.
Here's what I want to do actually.
Oh, this video should really be over, but why not?
You've already watched this much, you could watch
a little bit more, right?
I really want to draw the drawing in sequence.
Now, I'm not, I don't have the timing information,
and that would be useful to have,
but let's make it actually animate.
So I'm going to add a draw function.
I'm not going to add a page transition event,
and so when I've got a cat, and I'll just change this.
What I'm actually going to do
is just set
current,
I'm going to just set cat equal to data,
so I'm going to take out all of this,
cat equal to data.drawing.
So I'm going to comment this out.
Let's think about this, and then I'm going to say
let x, y, and I'm going to have,
I'm going to say if cat, then I now need to keep track
of where I am.
Let stroke index
= 0,
let pen index = 0.
So I need to keep track of two indices, right?
Because I'm going to walk through one at a time,
each vector of the first stroke, and that stroke's going to go
from zero to one and go through each of the other ones,
so if there's a cat, the first thing I need to do is say,
so if, I'm going to say x =, and what was this stuff?
It is path,
oh, drawing, so cat index,
stroke index,
index,
pen index,
index 0.
Boy, this is really awkward about how it's using just arrays
for everything, but in the first stroke,
in the first, pen is not the right term,
I don't know what to call it, vertex, but whatever,
I could actually just call this index maybe.
The stroke index and index, then 0 is for x,
and then
one is for y, and let me just, just to see that this works,
let me say point,
let me say point x, y, and these don't need to be global.
So let's see what this does.
So first of all, let's just run this.
Oh boy, I freaked it out, it won't ever stop.
I think, by the way, I killed this,
I need to build in a little more of a delay
with these API calls.
So cat is not defined, sketch.js line 12.
If cat, that needs to be a global variable.
So, and let me just say here,
console log x,y, let's see, did I get x, y?
Yes, so I've got that first point over and over again,
and presumably, 52, 48, I don't know why I don't see,
I guess I need to say stroke 0,
stroke
weight
3.
There we go, so there it is, that's the first point,
so now what I need to do is say
index ++, if index is greater than or equal
to cat stroke index
dot length,
then stroke index
++
and index = 0.
So this is me marching through them one at a time.
So, ooh, and I don't have the Y.
Right, you can see that something's wrong here.
(ding) Okay, something is terribly wrong here,
and actually, I have not been carefully looking
at how those arrays are organized, it's very confusing
to store all these data as arrays, but there are 11 strokes,
and this stroke has 23 points, this stroke has nine points,
but notice that the, I have the order wrong, right?
This is an array of an array of arrays,
and so basically, the stroke, the zero element
of the stroke is all the different x values,
and this one element of the stroke
is all the different y values.
I had those out of order, and then here,
the number of points is not the number of strokes,
but rather, the number of Xs.
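The corrected index order can be sketched like this. The sample drawing is invented (its first point borrows the 52, 48 seen in the console), and `pointCount` and `pointAt` are hypothetical helper names.

```javascript
// Hypothetical simplified drawing with two strokes.
// Each stroke is [xs, ys]: element 0 is all the Xs, element 1 is all the Ys.
const cat = [
  [[52, 60, 75], [48, 40, 44]],
  [[10, 20], [30, 35]]
];

// The number of points in a stroke is the length of its X array,
// not the length of the stroke itself (which is always 2 here).
function pointCount(drawing, strokeIndex) {
  return drawing[strokeIndex][0].length;
}

// Order of indices: which stroke, then x-or-y, then which point.
function pointAt(drawing, strokeIndex, index) {
  const x = drawing[strokeIndex][0][index];
  const y = drawing[strokeIndex][1][index];
  return { x, y };
}

console.log(pointAt(cat, 0, 0)); // { x: 52, y: 48 }
console.log(pointCount(cat, 1)); // 2
```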
So now, if I redo this,
you can see the outline of a cat there.
You can start to see the outline of a cat here.
Of course it gets stuck at the end, it's giving me an error,
so first let me fix that error.
So the error that I need to check is if stroke index
equals
cat.length,
then I'm done, then I'm going to say cat = null.
I'm going to say no more to the cat, and there we go.
So this is the drawing of the cat.
Now, of course, I'm just drawing all the points,
I need to connect the previous points to the other points,
so I'm going to add a previous x, previous y,
and then I'm going to say here down here, previous x = x,
previous y = y, and then here, I'm going to say a line
between previous x, previous y and x and y.
Now, it should do nothing when those values are null,
so now we see there, ooh, oh wait a sec,
no, no, no, no, no, no, ah, when I get to the next stroke,
then I need to say previous x = undefined again,
and previous y = un, 'cause I don't want to draw,
I don't want to connect the strokes.
So that's a little bit of an awkward way of doing it.
It's still doing that, isn't it?
So, and then I want to say if previous x,
maybe if I do this,
does not equal undefined, then draw the line,
let's see if this works, whoops, sketch line 19,
I always have this extra equals there.
Oh, weird.
It's still connecting everything, a lovely little cat there.
What am I missing?
Right, I don't want to draw the line.
Uh, these are undefined at the beginning,
oh it gets set to here, so I need an else here.
Else,
don't set it if it's at the end, okay.
There we go, finally, we are drawing cats.
Now, all I have to do is then when I reset there,
I can just ask for a new one.
So let's ask for a new cat,
and whenever I've got a cat, let's draw a white background,
let's make,
a little bit gray, we'll set it gray at the beginning,
there we go, now, here we go, we're now going to
draw lots of cats, it should finish one.
Ooh, didn't get another one, sketch on line 13.
Uh...
Cat is
undefined,
and then, there should be no more cat until I've got a cat.
Try that again.
There we go, I don't know what I did wrong.
Ooh.
(ding)
Thank you to BIMSoMe and Louise, both in the chat,
who just pointed out that my technique here is correct,
but the issue is that I need to reset everything
back to zero, so here, I need to set stroke index
back to zero, and I think index will already be zero,
yeah index is already zero, so yes, the stroke index needs
to go back to the beginning, and now I think we're ready
to enjoy a whole bunch of cats.
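Pulling the animation logic together, here is a sketch of the point-by-point walk as a small state object that is independent of p5. It is a restructured version of the idea, not the code typed in the video, and `createWalker` and `step` are assumed names. Creating a fresh walker for each new drawing plays the role of resetting the stroke index back to zero.

```javascript
// Fresh state for one drawing: start at the first point of the first stroke.
function createWalker(drawing) {
  return { drawing, strokeIndex: 0, index: 0, prevX: undefined, prevY: undefined };
}

// Advance by one point. Returns a segment { x1, y1, x2, y2 } to draw,
// null when the point starts a new stroke (nothing to connect yet),
// or undefined once every stroke has been walked.
function step(walker) {
  const { drawing } = walker;
  if (walker.strokeIndex >= drawing.length) return undefined;
  const path = drawing[walker.strokeIndex];
  const x = path[0][walker.index];
  const y = path[1][walker.index];
  const segment = walker.prevX === undefined
    ? null
    : { x1: walker.prevX, y1: walker.prevY, x2: x, y2: y };
  walker.index++;
  if (walker.index >= path[0].length) {
    // End of this stroke: move to the next one and lift the pen,
    // so the strokes are not connected to each other.
    walker.strokeIndex++;
    walker.index = 0;
    walker.prevX = undefined;
    walker.prevY = undefined;
  } else {
    walker.prevX = x;
    walker.prevY = y;
  }
  return segment;
}
```

In a p5 sketch, one possible use is to call `step` once per frame inside `draw`, pass a returned segment to `line`, and request a new drawing when the result is `undefined`.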
(upbeat music)
Cat drawings.
All right, thanks for watching this Coding Challenge
with the Google Quick, Draw dataset.
Stay tuned for a future video where I show how to,
what do I do?
This is where I show how to create new drawings
with the Sketch RNN model, the machine learning model
that was trained on these drawings,
and if this was one of your drawings, (blows kiss)
thank you for making this beautiful cat,
and I'll see you in a future Coding Challenge.
Good bye (ding).
(upbeat music)