(bell dings) - Hello, and welcome to word2vec tutorial number three. I am not yet actually at the part where I'm going to use the word2vec algorithm model itself. I'm still just in a place where I'm looking at, okay, well I have words, and I have numbers. What does that mean, what kind of things can I do with that? And the scenario that I'm using is from Allison Parrish's excellent Understanding Word Vectors tutorial, available under the Creative Commons 4.0 License, so please, if you're basing stuff off what I'm doing, also attribute this. The scenario we're looking at is colors. You can find that down here. So what I have, is I have a p5 sketch, which is loading this color database of the 954 most common RGB monitor colors from an XKCD survey, as in the previous tutorial. So here's my idea, this is what I'm going to do in this video. I am going to take some text from the rainbow Wikipedia page, and I'm going to say, let's sprinkle some red on it, or sprinkle some blue, or sprinkle some of this color. What does it mean to add some color to it, and how does the text change? So, I've written a tiny bit of code here, basically nothing new since the previous video: I introduced a variable called lines. I'm loading this text file, into which I just copy-pasted the Wikipedia content. You can think of a different way of doing this. And then, now what I'm going to do in setup is, I have a global variable called rainbow. I'm going to say rainbow equals join. When you load a text file in p5 with loadStrings, it gives you an array, where every line is in a different element of the array, so I'm going to say join lines, and I'm going to give it a delimiter, like the br tag, and then, still in setup, I'm going to say createP(rainbow).
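The join step described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the `lines` array is made-up sample text standing in for what `loadStrings` would return, `joinLines` is a hypothetical helper name, and the built-in `Array.prototype.join` is used in place of the p5 `join` function so the snippet runs outside a sketch.

```javascript
// Stand-in for the array loadStrings() would produce: one element per line.
// The sample sentences are invented for illustration.
const lines = [
  'A rainbow is an optical phenomenon.',
  'It appears as a multicoloured arc in the sky.',
];

// Hypothetical helper: glue the lines into one string with a chosen delimiter.
function joinLines(lineArray, delimiter) {
  return lineArray.join(delimiter);
}

// A space is used here instead of a br tag, so later word splitting
// does not have to deal with markup.
const rainbow = joinLines(lines, ' ');
console.log(rainbow);
```

Joining with a plain space rather than a br tag is a design choice: it loses the original line breaks, but it keeps the string free of HTML when it is split into words afterwards.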
So right now we should have, if I go here, we should see, and I'm going to say noCanvas, although I might want to actually use the canvas a little later, we can see here I have all that text. Now, what I want to do actually is, I want to highlight and color anything in the text that appears in the XKCD color database. So I think actually, what I want to do is first split this up into words, so I'm going to use rainbow.split, and you could check out my tutorials about regular expressions and word counting and all that stuff where I do this a lot. I could definitely come up with a better regular expression, but I'm just going to split by anything that's not a character that's A through Z, or zero through nine. So I'm going to do this, then I'm going to say i equals zero, i is less than words.length, i++, and I'm going to say createSpan(words[i]). The br stuff is going to mess up. I'm going to say this, or the br. Will this work? Oh boy, didn't like that, did it? The br stuff is going to mess that up, so I'm just going to make that a space instead, and now if I refresh this page, there, we can see all the words. Whoo! Oh, whoops. All the words are together in one line, I need to put spaces between them. This is kind of a terrible idea, but I'll just do this, and now, okay, so here's the thing. So now, here's the text. What if one of those words appears in the color database? So let's look. Let me just say, and actually another way that I could do this is use a for...of loop, let word of words, it's just a little bit nicer to do it this way, and then, if (vectors[word]), if it exists, then what do I want to do? I want to get the color. I want to say, let color equals that (vectors[word]). And then I want to say, I'm going to say let span equals createSpan. And I'm going to say span.style, I'm using, once again, the p5 DOM library, background-color.
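The split-and-highlight step can be sketched without the DOM. This is a minimal sketch, assuming a tiny `vectors` table with a few illustrative RGB values (the real sketch loads the full XKCD list), and `splitWords` and `highlight` are hypothetical helper names. Instead of calling createSpan, it builds the span markup as a string so the result can be inspected anywhere.

```javascript
// Illustrative subset of a color table: word -> RGB stored as x, y, z.
const vectors = {
  red: { x: 229, y: 0, z: 0 },
  sky: { x: 130, y: 202, z: 252 },
  violet: { x: 154, y: 14, z: 234 },
};

// Split on any run of characters that is not a letter or a digit,
// then drop the empty strings left at the edges.
function splitWords(text) {
  return text.split(/[^a-z0-9]+/i).filter((w) => w.length > 0);
}

// Wrap every word found in the table in a span with a background color.
// Lowercasing the lookup key is an addition not in the video.
function highlight(wordList, table) {
  const matched = [];
  const pieces = wordList.map((word) => {
    const key = word.toLowerCase();
    // hasOwnProperty avoids false hits on inherited names like "constructor".
    if (!Object.prototype.hasOwnProperty.call(table, key)) return word;
    const c = table[key];
    matched.push(key);
    return `<span style="background-color: rgb(${c.x}, ${c.y}, ${c.z})">${word}</span>`;
  });
  return { html: pieces.join(' '), matched };
}

const result = highlight(splitWords('The sky shows red, then violet.'), vectors);
console.log(result.matched);
console.log(result.html);
```

Like the version in the video, this only matches single words, so a two-word color name would be missed.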
Wish I'd saved my code from the previous video, going to write exactly the same thing, r,g,b and then I'm going to take, sorry, c.x, c.y, and c.z. So let's see if now we can see some of those highlighted, there, great. So anything that was in that database, I am now highlighting: sky, sky, red, violet, right. Now the other thing I might as well do is I might as well store those things. So let me keep track of color spans as an array. And I'm going to add span into this array. And now I can start doing stuff. So what might I want to do? First of all, one thing I could do, which is kind of interesting is, let me just get the average color of all those things. So what if I say, I guess I'll say, let keys equal an array of keys. I'm also going to save all of those. The word. So let's just take a look. I'm just curious here, console.log keys. So we can see these are the only things in this text that matched, and again I should have checked for two-word pairs and things. I'm missing a lot of steps here. But you can improve my code. That would be wonderful. Make your own version of this. But let's at least get the average color here. So let's see what the average color is. So now I can say average equals createVector(0, 0, 0). So any math that you can do with numbers, you can now do with those words, because now I can say, let key of keys and I can say v is vectors, that's associated with that key, average.add, and then at the end, I can say average.div by the length of all those keys. Then all I need to do is findNearest for that particular vector, and then console.log nearest: purplish. So the average color of this text, and this is directly again from Allison's tutorial, she has something very similar in it, is purplish, and if I wanted to now I could create a canvas. You know, a very small one. 50 comma 50, and then I could say, background avg.x, avg.y, avg.z and we would see, there it is. So that's the average color and its label is purplish.
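The averaging and nearest-color lookup can be sketched as follows. It is a rough sketch under assumptions: the four-entry `vectors` table and its RGB values are illustrative, and `averageVector`, `distance`, and `findNearest` are hypothetical plain-JavaScript stand-ins for the p5.Vector math and the lookup function from the previous video, whose exact code is not shown in this transcript.

```javascript
// Illustrative color table: word -> RGB stored as x, y, z.
const vectors = {
  red: { x: 229, y: 0, z: 0 },
  blue: { x: 3, y: 67, z: 223 },
  purple: { x: 126, y: 30, z: 156 },
  green: { x: 21, y: 176, z: 26 },
};

// Add up the vectors for the given keys, then divide by how many there are.
// An empty key list would yield NaN components; callers should guard for that.
function averageVector(keys, table) {
  const sum = { x: 0, y: 0, z: 0 };
  for (const key of keys) {
    sum.x += table[key].x;
    sum.y += table[key].y;
    sum.z += table[key].z;
  }
  return { x: sum.x / keys.length, y: sum.y / keys.length, z: sum.z / keys.length };
}

// Euclidean distance between two colors in RGB space.
function distance(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Linear scan for the table entry closest to v.
function findNearest(v, table) {
  let bestKey = null;
  let bestDist = Infinity;
  for (const [key, candidate] of Object.entries(table)) {
    const d = distance(v, candidate);
    if (d < bestDist) {
      bestDist = d;
      bestKey = key;
    }
  }
  return bestKey;
}

const avg = averageVector(['red', 'blue'], vectors);
const label = findNearest(avg, vectors);
console.log(avg, label); // the average of red and blue lands nearest to purple
```

A linear scan is fine for a table of a few hundred colors; a larger vocabulary would likely want a spatial index instead.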
Now what if what I wanted to do was actually add some color or subtract some color and change the text? So let's try to do this dynamically, this is going to be hard but this will be extra fun. (laughs) So I'm going to create three sliders. rSlider, gSlider, bSlider, and I'm going to hope that one of you watching this video gets inspired to make a really interesting wonderful interface. I'm going to completely ignore anything about interface design here and just kind of do this raw. rSlider equals createSlider between zero and 255. And starting with zero, and I'ma do the same thing. And again these are p5 functions and I could obviously just write these down into the HTML directly. But now I have three sliders. And what I want to do is, anytime I change any of these sliders, so I'm going to say rSlider.input(sliderChanged). I'm going to call the same function if any of the sliders are changed. So I'm going to assign this sliderChanged event to all of these. And then, I'm going to write this function sliderChanged and now I need to get the values. r equals rSlider.value, g equals gSlider.value, and b equals bSlider.value. Now, let me just make sure this is working, console.log(r, g, b). Okay, so let me run this. And now as I move these sliders, you can see anytime I move the slider, I need these sliders. I'm getting this color value. Now I'm going to take that color value and add it to these colors and then have the words change with the new color mapping. I probably should make it so I could subtract color too. So let's actually make it, let's just make it so I can add or subtract some amount between negative 100 and 100. Negative 100 and 100, and negative 100 and 100. OK, so now in this sliderChanged what I'm going to do is I'm going to go through all of the spans and keys I guess. I need to, I could probably be more thoughtful. We'll refactor this later.
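The one-handler-for-three-sliders pattern can be sketched without a browser. Everything here is an assumption made for illustration: `FakeSlider` is a hypothetical stand-in that only mimics the `input(callback)` and `value()` shape used in the video, and its `set` method simulates a user dragging the handle. It is not the p5 slider itself.

```javascript
// Hypothetical stand-in for a slider: holds a value and notifies one listener.
class FakeSlider {
  constructor(min, max, start) {
    this.min = min;
    this.max = max;
    this.current = start;
    this.listener = null;
  }

  // Register the function to call whenever the value changes.
  input(callback) {
    this.listener = callback;
  }

  // Read the current value.
  value() {
    return this.current;
  }

  // Simulate a drag: clamp to the slider's range, then fire the listener.
  set(v) {
    this.current = Math.min(this.max, Math.max(this.min, v));
    if (this.listener) this.listener();
  }
}

// Range of -100 to 100 so color can be added or subtracted.
const rSlider = new FakeSlider(-100, 100, 0);
const gSlider = new FakeSlider(-100, 100, 0);
const bSlider = new FakeSlider(-100, 100, 0);

let latest = { r: 0, g: 0, b: 0 };

// One shared handler: whichever slider moved, re-read all three values.
function sliderChanged() {
  latest = { r: rSlider.value(), g: gSlider.value(), b: bSlider.value() };
}

for (const slider of [rSlider, gSlider, bSlider]) {
  slider.input(sliderChanged);
}

rSlider.set(40);
bSlider.set(-250); // out of range, so it is clamped to -100
console.log(latest);
```

Re-reading all three values in a single handler keeps the offset consistent no matter which slider triggered the change.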
(bell ringing) But then make these global variables, and I'm going to go through all of the spans, let span of colorSpans, and I'm going to get the key. The word is the HTML of that span, actually, so you don't have to look it up otherwise. Then the vector is the vector associated with that word. Then I'm going to say, vector.add(r, g, b). I think this will work, right? This takes that vector and adds these amounts to it. Then I need to say nearest equals findNearest. Oh, I need to make a copy of it. So I don't actually want to change that vector. I need to make a copy of it. So I'm going to say copy, because I'm pulling just a reference to the vector that's in that object, and you don't want to adjust it. That's the actual vector associated with the word. I want a copy of it that I'm going to mess with. And then I want to find nearest to that. And then I'm going to say span.html of that word. So look at this. These are, let's take a look at these. These are all the color words. Let me add some more red to them. This doesn't seem to be working. I was so excited to see this work. What did I do wrong? findNearest(v), span.html, oh, nearest, nearest, nearest. Right, word is the original word. Nearest is the new one, after I added that color. I think this is going to work. It's very exciting. So I need more space. Okay, ready? Here we go. I add a lot of red and I got bright red and bright magenta. Let's add a lot of green, I got white already. And a lot of off white, white, right. If I take away, let's subtract color. I subtract color, now I've got black, everything is black. So look at this, you could take a whole text, any novel you like, you could say, let's just re-write it with a little bit more red and find any instance of anytime a color is referenced. And guess what, when we do this with a more generalized word vector system, we could actually start to add this idea of adding red to words that aren't actually colors.
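The copy-then-shift step, and the bug it avoids, can be sketched like this. Again a sketch under assumptions: the small `vectors` table and its values are illustrative, and `shifted`, `findNearest`, and `recolor` are hypothetical plain-JavaScript helpers rather than the p5.Vector `copy` and `add` calls used in the video.

```javascript
// Illustrative color table: word -> RGB stored as x, y, z.
const vectors = {
  red: { x: 229, y: 0, z: 0 },
  purple: { x: 126, y: 30, z: 156 },
  black: { x: 0, y: 0, z: 0 },
  white: { x: 255, y: 255, z: 255 },
};

// Return a NEW vector with the offset applied; the table entry is untouched.
// Values are not clamped to 0..255, which the lookup below tolerates.
function shifted(v, offset) {
  return { x: v.x + offset.r, y: v.y + offset.g, z: v.z + offset.b };
}

// Linear scan for the table entry closest to v (Euclidean distance).
function findNearest(v, table) {
  let bestKey = null;
  let bestDist = Infinity;
  for (const [key, c] of Object.entries(table)) {
    const d = Math.hypot(v.x - c.x, v.y - c.y, v.z - c.z);
    if (d < bestDist) {
      bestDist = d;
      bestKey = key;
    }
  }
  return bestKey;
}

// Map a color word to the word nearest its shifted color.
function recolor(word, offset, table) {
  return findNearest(shifted(table[word], offset), table);
}

const redder = recolor('purple', { r: 100, g: 0, b: -100 }, vectors);
const darker = recolor('purple', { r: -100, g: -100, b: -100 }, vectors);
console.log(redder, darker); // red black
console.log(vectors.purple); // still the original purple
```

Mutating the stored vector instead of a copy would make every slider move compound on the previous one, so the words would drift and never return to their original colors.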
We might have a corpus that includes way more than just these, you know, 496 or whatever that number is color values. So hopefully you're starting to see. The idea here is that with the idea of word embedding, with words associated with vectors, with text associated with numbers, if I can do math to the numbers, I can go translate back into words, I can always translate those words as numbers, do more math and translate back and do all sorts of strange transformations like this. So make a better version of this. Think about what text you're using, fix that problem for me, where I look also for, like, pairs of words that match. Make a nice interface. I don't know, think about the way you design this. There's so many possibilities. I look forward to seeing what you make with this, and see you in the next video, where I actually now go to look at the ml5 library, which has a built-in class, basically a built-in feature that allows you to work with word2vec without having to write all the math yourself, which is what I'm mostly doing in this particular video. Okay, goodbye. Oh, change the background as well. Stop, hold the presses. I totally forgot that I must also change, change this as well. v.x, v.y, v.z. Here it comes. Now as I'm adding the colors it's also changing what's actually the background color as well. We can see that that's actually changing as well. Which is a nicer way of looking at it. Okay now I'm really going, goodbye. (lips smacking) (bell ringing) (upbeat music)
12.3: Color Vectors cont'd - Programming with Text. Posted by 林宜悉 on 2020/03/28.