We'll talk about, uh, Deep Learned Super-Sampling. So yeah, it's got a fancy name. It sounds cool. It is quite cool. Let's imagine you're running a game, right? I don't do that as often as I'd like anymore. But maybe you're pushing your graphics card right to the limit of where it's happy, right? You've got a 4K monitor, you're pushing a lot of pixels to the screen, the game is a big game with lots of effects. The problem is that then your frame rate's going to start suffering. So maybe what we could do is run at a lower resolution, which is going to be that much easier for your graphics card, and then use deep learning to recreate that 4K image, and if the deep learning is good enough, you won't be able to tell the difference. So the most recent generation of Nvidia graphics cards have these Tensor Cores on board, right? Tensor Cores are basically very quick matrix multiplication circuitry. Matrix multiplication comes up a lot in deep learning, which is kind of what these were designed around. But it has some applications to games, a little bit. I'm going to betray my lack of knowledge about modern games here, because I don't really know much about them. I play some games, but very easy ones. So, you know, I'm not running games at 4K and worrying about frame rates, right? But some people do concern themselves with these things and they spend a lot of money on these graphics cards, and they want them to run as well as they can. The problem is that maybe some game comes out, and it has a huge demand on your GPU, right? For every pixel in your game, the GPU's gotta work out which triangles in the world it needs to render, what color they're going to be, it's got to include lighting and shadows, you know, blurs and depth of field effects, like I was talking about last video. You know, this takes a long time, and the more pixels you use, the worse it gets.
OFF-SCREEN: And motion blur, of course? And motion blur, I know people love motion blur. I myself can't get enough of it. What do you do about this? One thing you do is you just make your graphics cards faster, right? This is something that happens every generation, not just on NVIDIA cards but, you know, all graphics cards, and that helps a lot. All right, but the problem is we've had a jump from 1080p to 4K, and that is not a slight increase in the number of pixels. It's four times the number of pixels, which on a simple level means four times the amount of work to render this screen. And you've got it also when we talk about things like supersampling: sometimes we're looking at a pixel more than once, right, and that means it's getting really slow. So what Nvidia have tried to do here is say, well, maybe we can save quite a lot of performance by, let's say, running our game at 1080p, right? But if you upsample that to 4K, maybe it won't look very good, because basically it's just going to be blurred. So maybe we could do a slightly smarter upsampling using these sort of deep-learned super-resolution techniques.
OFF-SCREEN: As I understand it, even modern TVs these days do scaling, they scale things up, and Blu-ray recorders, DVD recorders have always had an element of doing this scaling. Is this just a more advanced version? Yeah, and some TVs are starting to bring in, as far as I know, deep-learned, you know, smart AI-driven upsampling, right? The idea is that, so what happens when you run a game is you're adding all these effects on top of one another. Now if your performance, if your frame rate, starts to drop, what you're going to do probably is either drop your resolution or, if you don't want to drop your resolution because you don't like it, you start to remove some of the effects, right? So you drop your shaders down from high to
Something else, and that will just reduce the amount of work per pixel and slightly increase your frame rate. But some people don't want to do this, right? They spent a lot of money on their computer and they want to run it on full graphics. But maybe a game has come out that's just really, really demanding. There's kind of two problems we want to solve here. One is this problem of aliasing, right, which is that if you rasterize a scene with triangles, they don't always line up exactly where the pixels are, so you get a kind of jagged edge. That's one problem, which doesn't look very nice, and there are techniques to get around this. The other problem is this issue of resolution, right? If you drop down from 4K to 1080p you've, you know, gone four times faster. That's a huge benefit, right? If you could do that without noticing the difference in appearance, well, that's a winner, that's going to be great. And then you can start putting even more computational time into shader effects and things. So yes, you're running at a lower resolution, but those pixels have had a good deal of time spent on them. They look really nice. One of the problems we have is this problem called aliasing, right? Now, I will just talk about this very briefly because it's not really what this video is about. But if you're rasterizing an image and your triangle falls like this, then this pixel actually isn't all of this object here. Maybe this object is dark and this object is light. It's not all of this object, it's like 70% dark and 30% light. Now, the problem is that there's no way of doing this with a single sample. So suppose you sample this pixel here, this pixel here and this pixel here.
You're going to get an edge that goes light, dark, light like this, right, and that looks ugly. So what you'd use is a technique usually called multisample anti-aliasing, where essentially you take multiple readings from this pixel, like these four here, or something more dense than that. There's lots of different ways to do this. Then you have those values, and the nice thing is you've got three readings of dark, one reading of light, and you come out with a reading of about 75% dark, and you get a smooth edge. This is one thing that graphics cards do to try and make things look a little bit better. If you turn off all these anti-aliasing approaches, then what you'll get is your coarse, jaggedy lines. It'll run nice and quickly. If you're sampling four times per pixel, that could be a four-fold decrease in speed, right, and that has a performance hit unless your graphics card is just amazing. Mine is not. That's one problem, right? The other problem is, you know, when you go up to 4K it's just four times the number of pixels. Whatever you're doing per pixel is multiplied by four, because there's four times as many pixels. You know, that's a huge problem. So if you're running 4K and four samples per pixel, for example, that's a lot more computation than if you were just running without anti-aliasing at 1080p. And so you inevitably have to drop down some of these settings so that you can get a good frame rate for your game, and it doesn't look as nice. So one option is just to make the graphics cards faster, right? This isn't the answer to everything, and they get more and more expensive, right?
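The multi-sample idea described above can be sketched in a few lines of Python. This is only an illustration, not how a GPU implements anti-aliasing: the scene function `is_dark`, the edge position and the helper names are all made up for the example, and the resolutions 1920x1080 and 3840x2160 are the usual pixel counts assumed for "1080p" and "4K".

```python
def is_dark(x: float, y: float) -> bool:
    """Hypothetical scene: everything below the line y = 0.5 * x + 0.4 is dark."""
    return y < 0.5 * x + 0.4


def pixel_darkness(px: int, py: int, n: int) -> float:
    """Fraction of an n x n grid of sample points inside pixel (px, py) that is dark."""
    hits = 0
    for i in range(n):
        for j in range(n):
            # Sample at the centre of each sub-cell of the pixel.
            x = px + (i + 0.5) / n
            y = py + (j + 0.5) / n
            hits += is_dark(x, y)
    return hits / (n * n)


def relative_work(width: int, height: int, samples_per_pixel: int) -> int:
    """Crude cost model: work is proportional to the number of samples taken."""
    return width * height * samples_per_pixel


one_sample = pixel_darkness(0, 0, 1)     # a single reading: all dark or all light
four_samples = pixel_darkness(0, 0, 2)   # 2 x 2 = 4 readings
many_samples = pixel_darkness(0, 0, 8)   # 8 x 8 = 64 readings

# 4K with four samples per pixel versus 1080p with one sample per pixel.
cost_ratio = relative_work(3840, 2160, 4) / relative_work(1920, 1080, 1)

print(one_sample, four_samples, many_samples, cost_ratio)
```

With this particular edge, the single reading calls the pixel fully dark, the four readings give three dark and one light (75% dark), and the denser grid gets closer to the true covered area of 65%. Under the crude cost model, 4K with four samples per pixel is sixteen times the work of 1080p with one.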
That's also a problem. These new cards have in them these Tensor Cores, which are very good at specific operations, namely matrix multiplications. And so there is a chance that what we could do is use a deep network to kind of just clean this image up for us very quickly before it's presented to the screen, and that will give us a nice presentation for our image without the huge performance hit of doing all this anti-aliasing and all of this massive resolution, right? So broadly speaking it works like this. Your computer will render a raw frame with aliasing and all the kind of problems that it has, and this may or may not be at the same resolution as your monitor. It will then be passed through a deep network, which is not very deep because this has got to happen pretty quickly, which utilizes the Tensor Cores of these new graphics cards to work very, very quickly, and that will produce your nice 4K anti-aliased shot, which theoretically looks really, really nice. The way they train this network, when they take the lower-resolution aliased version and the higher-resolution anti-aliased version, is they're going to train it basically to try and recreate the image on a pixel level as closely as possible. But they'll also add additional terms, like that it looks perceptually nice. So basically perceptual loss functions, which try and make things look aesthetically closer to something, right? Now, different loss functions are going to have different effects. So they'll be trying out all these different loss functions. They might even use adversarial loss functions, which are these adversarial networks that Miles talked about, right? There's loads of different ways to train these, and how you do that is going to influence the actual result you get, right, because it's not going to work perfectly.
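To make the "pixel-level loss plus something perceptual" idea concrete, here is a minimal sketch in plain Python. It is an assumption-laden toy: the transcript does not say which loss functions Nvidia actually uses, and the edge-difference term below is only a crude stand-in for a real perceptual loss. The function names and the `edge_weight` parameter are invented for the example.

```python
def pixel_loss(pred, target):
    """Mean squared error over all pixels of two equally sized greyscale images."""
    count = len(pred) * len(pred[0])
    total = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    return total / count


def horizontal_edges(img):
    """Differences between horizontally neighbouring pixels (a very simple edge map)."""
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in img]


def edge_loss(pred, target):
    """Stand-in 'perceptual' term: penalise edges that differ from the target's edges."""
    return pixel_loss(horizontal_edges(pred), horizontal_edges(target))


def combined_loss(pred, target, edge_weight=0.5):
    """Pixel-level term plus a weighted extra term, as the transcript describes loosely."""
    return pixel_loss(pred, target) + edge_weight * edge_loss(pred, target)


# Target: a sharp vertical edge. Prediction: the same edge, but blurred.
sharp = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
blurry = [[0.0, 0.25, 0.75, 1.0], [0.0, 0.25, 0.75, 1.0]]

print(pixel_loss(blurry, sharp), edge_loss(blurry, sharp), combined_loss(blurry, sharp))
```

The blurred prediction has a fairly small per-pixel error but a larger edge error, so changing `edge_weight` changes how strongly blur is punished. That is the sense in which the choice of loss function influences the result you get.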
So there's kind of two unanswered questions here. I mean, firstly, does this work, right? And personally I don't know, because I haven't tried this, but I think the results vary, right? That's certainly true. But also, how do we train this neural network, right? Because what you don't want to have happen is like the face unlocking the phone thing. What was it, unlocking your phone with your face? You don't want users to have to do this training, right? This is something for NVIDIA to do if they're gonna, you know, sell this and make money off this technology. And that's exactly what happens. Sometime shortly before a game's release, the game developers will send an early copy to Nvidia, and Nvidia will start generating training data and train a neural network to do this process: to take an image that isn't quite as nice, it's got aliasing, it's lower resolution, and perform this smart upsampling up to 4K anti-aliased, right? That's the idea. And they do this by generating essentially a perfect representation of the game using 64-samples-per-pixel anti-aliasing, right? So that is, for every pixel they do 64 samples from that pixel instead of just one. Really nice, sixty-four times slower than normal. And then they take that as their output, and the input is just the raw frame with no anti-aliasing at all, maybe lower resolution, and they train this network to take the raw frame and output the 64-samples-per-pixel really nice frame, right? And so really what it comes down to is whether in practice this works, right? And the answer, I think, is probably some of the time yes, some of the time no. This is true of most deep learning, right? People don't tend to say this as much as they should, but, you know, will it generalize if you take 10 million frames of Battlefield V and train this network on them to get as close to this output as possible, and then you generate the ten-million-and-first?
Or even just the next frame, right? If you generate the next frame, will it have as good a performance on that unseen frame? And the answer is usually pretty good, but it won't ever be perfect, right, especially if you're going from 1080p to 4K. So I think NVIDIA kind of made the point here that actually this is about when you're running at the very top end of your graphics card's capability, and so in some sense they're not talking about people who are barely struggling to run the game at 1080p. You should already barely run the game at 4K, and then maybe this will make it look slightly nicer. There's kind of two ways of doing this. One is you take a 4K input and you use this to perform anti-aliasing, and the other is you take a low-resolution input and you use this to perform both anti-aliasing and upsampling. And that's a harder job to do, because if you imagine that you've got a 1080p source, then what actually you're going to have is a series of pixels like this, and you've got to come up with all of these pixels in between, right? And this is just like increasing the size of an image, "enhance, enhance". You know, will it work? I don't know. It's going to be better than bilinear or bicubic upsampling, right, because it's going to be bearing in mind this local area. It's going to say, well, look, there's an edge coming down here, so this needs to be sharper, this doesn't need to be as sharp, things like this. But this is not an easy problem to solve and, you know, by Nvidia's own admission, this is an ongoing process. They continually train these networks on a supercomputer and then, you know, hopefully they get better and better. We shall see, right? One thing I think is quite interesting is that it means that essentially a deep network is part of your game experience on your GPU, and so the weights for this network, the parameters of this network, are actually going to be shipped out with drivers, which I think is quite neat, right?
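For comparison, this is roughly what the "dumb" baseline looks like when it has to come up with the pixels in between. The sketch below works on a single row of pixels and uses linear interpolation, which is the one-dimensional version of bilinear upsampling; bicubic uses a wider neighbourhood but blurs edges in the same spirit. The function names and the pixel-centre alignment convention are choices made for this example.

```python
def upsample_nearest(row, factor=2):
    """Repeat every source pixel `factor` times."""
    return [value for value in row for _ in range(factor)]


def upsample_linear(row, factor=2):
    """Linear interpolation between neighbouring source pixels (pixel-centre aligned)."""
    out = []
    last = len(row) - 1
    for k in range(len(row) * factor):
        # Position of output pixel k measured in source-pixel coordinates.
        src = (k + 0.5) / factor - 0.5
        src = min(max(src, 0.0), float(last))  # clamp at the borders
        lo = int(src)
        hi = min(lo + 1, last)
        frac = src - lo
        out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out


edge = [0, 0, 1, 1]  # a hard dark-to-light edge in the low-resolution row

print(upsample_nearest(edge))  # blocky: the edge stays hard
print(upsample_linear(edge))   # smooth: the edge is smeared over in-between pixels
```

Neither method knows there is an edge there. Nearest-neighbour keeps the blockiness and linear interpolation smears every edge by the same amount. The learned upsampler described above is meant to look at the local area and decide where to stay sharp.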
So you're no longer just getting graphics drivers which have performance optimizations for games and, of course, the hardware control software. You've also got these network weights being shipped around, and they're quite big, you know. And so that's why you get limited support for games early on, because they're training these; maybe they haven't got an early copy of the game, right? So it's down to Nvidia to just take these games, render these super-high-resolution, 64-sample anti-aliased scenes and train these networks.
OFF-SCREEN: Strikes me that even if it's been trained, running a deep network is not computationally cheap though, is it? Is it worth it? I suppose, what's the trade-off? So, I mean, I guess that's the question people are asking at the moment, right? So the trade-off is, so the nice thing about a neural network is it takes the exact same amount of time every time, right? On these Tensor Cores there is a fixed amount of time it takes to take an image of a certain resolution and output this image of another resolution, some amount of milliseconds. So that is, per frame, a fixed load that's going to have to happen. Games aren't a fixed load; they take different amounts of time depending on what's in the scene. And the argument basically is, if your graphics card is struggling, you can drop from 4K to 1080p for a massive increase in performance, and then decrease the performance slightly by tacking this neural network on the end, right? But your overall performance will be better for it. That's the idea. So if you can already run at 4K at 60 frames a second, there's very little reason to add this on, right, which is why sometimes it gets disabled in the options. If your computer is already fine running this game, dropping down to 1080p is only going to make your experience worse. Don't bother doing it. I guess the question is, how does this network actually look, right? And it's something called an autoencoder, or I would call this an encoder-decoder, right?
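The trade-off argument is just arithmetic on frame times, so it can be written down directly. The numbers below are invented for illustration (they are not measurements of any game or card), and the model assumes render time scales linearly with pixel count, which is the "simple level" view taken earlier in the video.

```python
def fps(frame_ms: float) -> float:
    """Frames per second for a given time per frame in milliseconds."""
    return 1000.0 / frame_ms


def frame_time_with_upscaler(native_ms: float, pixel_ratio: float, network_ms: float) -> float:
    """Render at 1/pixel_ratio of the pixels, then pay a fixed cost for the network."""
    return native_ms / pixel_ratio + network_ms


# Hypothetical heavy scene: 25 ms per frame at native 4K.
native_4k_ms = 25.0
# Hypothetical fixed network cost per frame on the Tensor Cores.
network_ms = 3.0

upscaled_ms = frame_time_with_upscaler(native_4k_ms, pixel_ratio=4, network_ms=network_ms)

print(f"native 4K:        {native_4k_ms:.2f} ms -> {fps(native_4k_ms):.0f} fps")
print(f"1080p + network:  {upscaled_ms:.2f} ms -> {fps(upscaled_ms):.0f} fps")
```

Because the network cost is fixed while the render cost varies with the scene, the saving is largest when the native frame time is long. If the card already hits its target at native 4K, the option only trades away image quality, which matches the point about it sometimes being disabled.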
I talked about these before, but you have an image coming in. You have some amount of network which is going to perform downsampling. It's going to make the image smaller, but it's also going to learn interesting things about that image as it does it. So this image is going to get downsampled through some network layers, down to about half resolution, something like that, and then it's going to get upsampled back to 4K or whatever the resolution of the output is, like this. Now, it's quite typical in these kind of networks to go much further than this, right? Normally a network I'd use to do something like this would go down to a few pixels wide, because I'm using it to segment objects. And so this network won't learn where all the people are, because it doesn't go deep enough, right, but it will learn on a local level kind of what's going on. This is a dark edge here, this is a light bit, there's a bit of sunlight coming in, you know, and it can start to piece some of these things together and work out in a slightly smart way what these pixels are going to be doing, right? The other nice thing about this being only a few layers, followed by quite a high-resolution image, followed by a few layers, is that this is gonna be quite fast, right? I mean, let's not underestimate how much computation this involves. If this is 1080p or 4K, a staggering amount of maths has to happen very, very quickly. But that's exactly what these Tensor Cores are for. They perform extremely fast 4x4 matrix multiplications and additions, which is exactly what this generalizes into. So you essentially pass over the image performing these matrix multiplications over these layers, and it happens really, really fast. We're gonna see more and more of this kind of stuff.
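The claim that a network layer "generalizes into" matrix multiplication can be checked on a tiny case. The sketch below computes one convolution-style layer two ways: directly with a sliding window, and by unrolling every window into a row of a matrix and doing a single matrix multiply (often called im2col). It is plain Python on a 4x4 image; the splitting into 4x4 hardware tiles mentioned in the transcript is not modelled, and all function names are made up for the example.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def conv_direct(img, ker):
    """Sliding-window filter (cross-correlation, as used in deep learning layers)."""
    kh, kw = len(ker), len(ker[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [
        [
            sum(img[y + dy][x + dx] * ker[dy][dx] for dy in range(kh) for dx in range(kw))
            for x in range(ow)
        ]
        for y in range(oh)
    ]


def im2col(img, kh, kw):
    """Unroll every kh x kw window of the image into one row of a matrix."""
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [
        [img[y + dy][x + dx] for dy in range(kh) for dx in range(kw)]
        for y in range(oh)
        for x in range(ow)
    ]


def conv_as_matmul(img, ker):
    """The same filter expressed as a single matrix multiplication."""
    kh, kw = len(ker), len(ker[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    patches = im2col(img, kh, kw)                                       # (oh*ow) x (kh*kw)
    weights = [[ker[dy][dx]] for dy in range(kh) for dx in range(kw)]   # (kh*kw) x 1
    flat = matmul(patches, weights)                                     # (oh*ow) x 1
    return [[flat[y * ow + x][0] for x in range(ow)] for y in range(oh)]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
edge_kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]  # responds to left-to-right brightening

print(conv_direct(image, edge_kernel))
print(conv_as_matmul(image, edge_kernel))
```

Both routes give the same 2x2 result. With many filters the weights become a wider matrix rather than a single column, and that one large multiplication is the kind of work matrix-multiply hardware is built to do quickly.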
So yes, this is one way of doing it. It's used in rendering for denoising of ray tracing in, you know, big movies like Pixar movies and things. It's used in upsampling on TVs. Using a shallow but powerful deep network to try and tidy up something that's not perfect is going to happen a lot, right? We've already seen these GANs, these generative adversarial networks, turning up that are trying to produce new people's faces and things. This is a big deal at the moment and there's going to be a lot of it. So, you know, Nvidia have started this process, but we're going to see more and more, and I'd imagine it will become a kind of standard approach, you know, in a few years' time.
OFF-SCREEN: It is staggering that this is happening 60 times a second. Yeah, I mean, I think that the school should buy me one of these cards and we'll give it a fair test. We just need to test it on, you know, all these games. So, copies of these games too, please. Copies of the games, a machine to run them on, one of these cards, and I'll do a very thorough in-depth research on it.
Deep Learned Super-Sampling (DLSS) - Computerphile. Posted by 林宜悉 on 2020/03/27.