We'll talk about, uh, Deep Learned Super-Sampling.
So yeah, it's got a fancy name. It sounds cool. It is quite cool.
Let's imagine you're running a game, right?
I don't do that as often as I'd like anymore.
But- but- maybe you're pushing your graphics card right to the limit of where it- where it's happy, right?
You've got a 4K monitor, you're pushing a lot of pixels to the screen,
the game is a big game with lots of- lots of effects.
The problem is that then your frame rate's going to start suffering.
So maybe what we could do is run at a lower resolution,
which is going to be that much easier for your graphics card, and then use deep learning
to recreate that 4K image - and if the deep learning is good enough, you won't be able to tell the difference.
So the most recent generation of Nvidia graphics cards have these Tensor Cores on board, right?
Tensor Cores are basically very quick matrix multiplication circuitry.
Matrix multiplication comes up a lot in deep learning, which is kind of what these were designed around.
But it has some applications to games, a little bit.
I'm going to betray my lack of knowledge about modern games, but I don't really know much about them.
I play some games, but very easy ones.
So, I- you know, I'm not running games at 4K and worrying about framerates, right?
But some people do concern themselves with these things and they spend a lot of money on these graphics cards,
and they want them to run as well as they can.
The problem is that maybe some game comes out,
and it has a huge demand on your- on your GPU, right? For every pixel in your game,
the GPU's gotta work out which triangles in the world
it needs to render - you know - what color they're going to be,
it's got to include lighting and shadows,
you know, blurs and - you know - depth of field effects, like I was talking about last- last video.
You know, this takes a long time and the more pixels you use, the worse it gets.
OFF-SCREEN: And motion blur, of course?
And motion blur, I know people love motion blur. I myself can't get enough of it.
What do you do about this? One thing you do is you just make your graphics cards faster, right?
This is something that happens every generation, not just on NVIDIA cards, but - you know - all graphics cards, and that helps a lot.
All right, but the problem is we've had a jump from 1080p to 4K, and that is not a slight increase in the number of pixels.
It's four times the number of pixels, which on a simple level means four times the amount of work to render this screen.
And when we talk about things like supersampling, sometimes we're looking at a pixel more than once, and that means it's getting really slow.
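Just to put numbers on that "four times" claim, here's the arithmetic as a quick Python sanity check:

```python
# Pixel counts behind the "four times the work" claim.
w1080, h1080 = 1920, 1080
w4k, h4k = 3840, 2160

pixels_1080p = w1080 * h1080   # 2,073,600 pixels
pixels_4k = w4k * h4k          # 8,294,400 pixels

print(pixels_4k / pixels_1080p)  # 4.0 -- four times the pixels to shade
```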
So what Nvidia have tried to do here is say, well, maybe we can save quite a lot of performance by, let's say, running our game at 1080p. But if you just upsample that to 4K, maybe it won't look very good, because basically it's just going to be blurred. So maybe we could do a slightly smarter upsampling, using these sort of deep learned super-resolution techniques.
OFF-SCREEN: As I understand it, even modern TVs these days do scaling, they scale things up, and Blu-ray recorders, DVD recorders have always had an element of doing this scaling. Is this just a more advanced version?
Yeah, and some TVs are starting to bring in, as far as I know, deep learned, you know, smart AI-driven upsampling, right? The idea is... so what happens when you run a game is you're adding all these effects on top of one another.
Now, if your performance, if your frame rate starts to drop, what you're going to do, probably, is either drop your resolution or, if you don't want to drop your resolution because you don't like that, you start to remove some of the effects, right? So you drop your shaders down from high to something else, and that will just reduce the amount of work per pixel and slightly increase your frame rate.
But some people don't want to do this, right? They've spent a lot of money on their computer and they want to run it on full graphics. But maybe a game has come out that's just really, really demanding.
There's kind of two problems we want to solve here. One is this problem of aliasing, right, which is that if you rasterize a scene with triangles, they don't always line up exactly where the pixels are, so you get a kind of jagged edge, right? That's one problem, which doesn't look very nice, and there are techniques to get around this.
The other problem is this issue of resolution, right? If you drop down from 4K to 1080p, you've gone four times faster. That's a huge benefit, right? If you could do that without noticing any difference in appearance, well, that's a winner; that's going to be great. And then you can start putting even more computational time into shader effects and things. So yes, you're running at a lower resolution, but those pixels have had a good deal of time spent on them. They look really nice.
One of the problems we have is this problem called aliasing. Now, I'll just talk about this very briefly, because it's not really what this video is about. But if you're rendering an image and your triangle falls like this, then this pixel actually isn't all of this object here. Maybe this whole bit is dark and this object is light; the pixel is not all of this object, it's like 70% dark and 30% light. Now, the problem is that a single pixel can't show that. So if you sample this pixel here, this pixel here and this pixel here, you're going to get an edge that goes light, dark, light like this, right? And that looks ugly.
So what you'd use is a technique usually called multisample anti-aliasing, where essentially you take multiple readings from this pixel, like these four here, or something more dense than that - and there are lots of different ways to do this. Then you have those values, and the nice thing is you've got three readings of dark and one reading of light, and you come out with a reading of about 75% dark, and you get a smooth edge.
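As a minimal sketch of that averaging step (real MSAA runs in fixed-function hardware and is cleverer about when it re-shades, but the averaging idea is the same):

```python
import numpy as np

def msaa_pixel(sub_samples):
    """Average several sub-pixel readings into one final pixel value."""
    return np.mean(sub_samples)

# Three sub-samples land on the dark object (0.0) and one on the light
# background (1.0), so the edge pixel comes out about 75% dark: a smooth
# grey step instead of a hard light/dark jag.
print(msaa_pixel([0.0, 0.0, 0.0, 1.0]))  # 0.25
```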
This is one thing that graphics cards do to try and make things look a little bit better. If you turn off all these anti-aliasing approaches, then what you'll get is your coarse, jaggedy lines, but it'll run nice and quickly. If you're sampling four times per pixel, that could be a four-fold decrease in speed, and that has a performance hit, unless your graphics card is just amazing. Mine is not. So that's one problem, right?
The other problem is that, you know, when you go up to 4K, it's just four times the number of pixels. Whatever you're doing per pixel is multiplied by four, because there's four times as many pixels, and that's a huge problem. So if you're running 4K and, for example, four samples per pixel, that's a lot more computation than if you were just running without anti-aliasing at 1080p. And so you inevitably have to drop down some setting so that you can get a good frame rate for your game, and it doesn't look as nice.
So one option is just to make the graphics cards faster, right? But this can't always... you know, this isn't the answer to everything, and they get more and more expensive, right? That's also a problem.
These new cards have in them these Tensor Cores, which are very good at specific operations, namely matrix multiplications. And so there's a chance that what we could do is use a deep network to kind of just clean this image up for us, very quickly, before it's presented to the screen. That would give us a nice presentation for our image without the huge performance hit of doing all this anti-aliasing and all of this massive resolution, right?
So, broadly speaking, it works like this: your computer will render a raw frame, with aliasing and all the kind of problems that it has, and this may or may not be at the same resolution as your monitor. It will then be passed through a deep network - which is not very deep, because this has got to happen pretty quickly - that utilizes the Tensor Cores of these new graphics cards to work very, very quickly. And that will produce your nice 4K, anti-aliased shot.
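As a toy sketch of that per-frame flow (the name `dlss_net` and the tensor shapes are my assumptions; the real pipeline lives inside the driver and hasn't been published in detail):

```python
import torch

def present_frame(raw_frame, dlss_net):
    """Render small and aliased, then let a trained network produce
    the full-resolution, anti-aliased frame for display.

    raw_frame: e.g. a (3, 1080, 1920) tensor in [0, 1]
    returns:   e.g. a (3, 2160, 3840) tensor for the monitor
    """
    with torch.no_grad():  # inference only; this has to be fast
        return dlss_net(raw_frame.unsqueeze(0)).squeeze(0)
```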
And theoretically that looks really, really nice. The way they train this network, when they take the lower resolution, aliased version and the higher resolution, anti-aliased version, is they're going to train it basically to try and recreate the image on a pixel level as closely as possible. But they'll also add additional criteria, like that it looks perceptually nice - so basically perceptual loss functions, which try and make things look aesthetically closer to something, right? Now, different loss functions are going to have different effects, so they'll be trying out all these different loss functions. They might even use adversarial loss functions - these generative adversarial networks that Rob Miles has talked about, right? There are loads of different ways to train these, and how you do it is going to influence the actual result you get.
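Nvidia haven't published their exact losses, but a generic "pixel loss plus perceptual loss" combination of the kind described might look like this (the `feature_extractor` and the weighting are placeholder assumptions):

```python
import torch.nn.functional as F

def training_loss(pred, target, feature_extractor, perceptual_weight=0.1):
    """pred/target: network output and the high-quality reference frame.
    feature_extractor: a frozen pretrained network whose activations stand
    in for 'does it look right', rather than exact pixel agreement."""
    pixel_loss = F.l1_loss(pred, target)  # per-pixel closeness
    perceptual_loss = F.l1_loss(feature_extractor(pred),
                                feature_extractor(target))
    return pixel_loss + perceptual_weight * perceptual_loss
```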
All right, because it's not going to work perfectly. So there's kind of two unanswered questions here. I mean, firstly, does this work? And on that one, personally, I don't know, because I haven't tried it, right? But I think the results vary - that's certainly true. But also, how do we train this neural network? Because what you don't want to have happen, right, is like the unlocking-your-phone-with-your-face thing: you don't want users to have to do this themselves, right? This is something for Nvidia to do if they're going to, you know, sell this and make money off this technology - and that's exactly what happens.
Sometimes, shortly before a game's release, the game developers will send an early copy to Nvidia, and Nvidia will start generating training data and train a neural network to do this process: to take an image that isn't quite as nice - it's got aliasing, it's lower resolution - and perform this smart upsampling, up to 4K, anti-aliased. Right, that's the idea.
And they do this by generating essentially a perfect representation of the game, using 64-samples-per-pixel anti-aliasing. So that is, for every pixel, they do 64 samples from that pixel instead of just one. Really nice, and sixty-four times slower than normal. Then they take that as their output, and the input is just the raw frame with no anti-aliasing at all, maybe at a lower resolution, and they train this network to take the raw frame and output the 64-samples-per-pixel, really nice frame, right?
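So the training pairs, in spirit, look like this; `render` here is just a stand-in for the game engine producing the same frame twice at different quality settings:

```python
def make_training_pair(render, frame_state):
    # Input: cheap, aliased frame, as the player's GPU would produce it.
    x = render(frame_state, samples_per_pixel=1, resolution=(1920, 1080))
    # Target: the same frame at 64 samples per pixel, roughly 64x slower
    # to produce, rendered offline on Nvidia's hardware.
    y = render(frame_state, samples_per_pixel=64, resolution=(3840, 2160))
    return x, y
```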
And so really what it comes down to is whether in practice this works, right? And the answer, I think, is probably some of the time yes, some of the time no. This is true of most deep learning, right? People don't tend to say this as much as they should. But, you know, will it generalize? If you take 10 million frames of Battlefield V and train this network on them, to get as close to this output as possible, and then you generate the ten-million-and-first frame - the next frame - will it have as good a performance on that unseen frame? And the answer is usually pretty good, but it won't ever be perfect, right? Especially if you're going from 1080p to 4K.
So I think Nvidia have kind of made the point here that actually this is about when you're running at the very top end of your graphics card's capability. So in some sense they're not talking about people who are barely struggling to run the game at 1080p; you should already be able to just about run the game at 4K, and then maybe this will make it look slightly nicer.
There's kind of two ways of doing this. One is you take a 4K input and you use this to perform anti-aliasing; the other is you take a low resolution input and you use this to perform both anti-aliasing and upsampling, and that's a harder job to do. Because if you imagine that you've got a 1080p source, then what you're actually going to have is a series of pixels like this, and you've got to come up with all of these pixels in between, right? And this is just like increasing the size of an image - "enhance, enhance" - you know, will it work? I don't know. It's going to be better than bilinear or bicubic upsampling, because it's going to be bearing in mind this local area. It's going to say, well, look, there's an edge coming down here, so this needs to be sharper, this doesn't need to be as sharp, things like this. But this is not an easy problem to solve, and, you know, by Nvidia's own admission this is an ongoing process.
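For comparison, here's the dumb baseline the network has to beat: plain bicubic interpolation, which just blends neighbouring pixels and so tends to soften edges:

```python
import torch
import torch.nn.functional as F

frame_1080p = torch.rand(1, 3, 1080, 1920)  # stand-in for a rendered frame

# Bicubic upsampling to 4K: no knowledge of edges or scene content,
# so sharp boundaries come out blurred.
frame_4k = F.interpolate(frame_1080p, size=(2160, 3840),
                         mode="bicubic", align_corners=False)
print(frame_4k.shape)  # torch.Size([1, 3, 2160, 3840])
```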
They continually train these networks on a supercomputer, and then, you know, hopefully they get better and better. We shall see, right?
Well, one thing I think is quite interesting is that it means that essentially a deep network is part of your game experience, on your GPU. And so the weights for this network, the parameters of this network, are actually going to be shipped out with drivers, which I think is quite neat, right? So you're no longer just getting graphics drivers which have performance optimizations for games and, of course, the hardware control software; you've also got these network weights being shipped around, and they're quite big. And that's why you get limited support for games early on: because they're still training these, or maybe they haven't got an early copy of the game, right?
So it's down to Nvidia to just take these games, render these super high resolution, 64-samples-per-pixel amazing scenes, and train these networks.
OFF-SCREEN: It strikes me that, even if it's been trained, running a network - a deep network - is not computationally cheap though, is it? Is it worth it? I suppose, what's the trade-off?
I mean, I guess that's the question people are asking at the moment, right?
So, the trade-off is this. The nice thing about a neural network is that it takes the exact same amount of time every time, right? On these Tensor Cores, there is a fixed amount of time it takes to take an image of a certain resolution and output an image of another resolution: some number of milliseconds. So that is, per frame, a fixed load that's going to have to happen. Games aren't normally a fixed load; they take different amounts of time depending on what's in the scene.
And the argument basically is: if your graphics card is struggling, you can drop from 4K to 1080p for a massive increase in performance, and then decrease the performance slightly by tacking this neural network on the end, right? But your overall performance will be better for it. That's the idea. So if you can already run at 4K with 60 frames a second, there's very little reason to add this on, which is why sometimes it gets disabled in the options: your computer is already fine running this game, and dropping down to 1080p is only going to make your experience worse. Don't bother doing it.
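With made-up numbers, that frame-time argument looks like this:

```python
# Illustrative frame budget at 60 fps (all timings invented for the example).
budget_ms = 1000 / 60            # ~16.7 ms per frame

native_4k_ms = 25.0              # struggling: 4K render misses the budget
render_1080p_ms = 8.0            # roughly a quarter of the shading work
network_ms = 3.0                 # fixed cost of the upsampling pass

print(native_4k_ms <= budget_ms)                  # False -- under 60 fps
print(render_1080p_ms + network_ms <= budget_ms)  # True  -- 11 ms, fits
```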
I guess the question is, how does this network actually look, right? And it's something called an autoencoder, or I would call this an encoder-decoder, right? I've talked about these before, but you have an image coming in, and you have some amount of network which is going to perform downsampling: it's going to make the image smaller, but it's also going to learn interesting things about that image as it does it. So this image is going to get downsampled through some network layers, down to about half resolution, something like that, and then it's going to get upsampled back to 4K, or whatever the resolution of the output is, like this.
Now, it's quite typical in these kinds of networks to go much further than this, normally. So a network I'd use to do something like this would go down to a few pixels wide, because I'd be using it to segment objects. So this network won't learn where all the people are, because it doesn't go deep enough, right? But it will learn, on a local level, kind of what's going on: this is a dark edge here, this is a light bit, that's a bit of sunlight coming in, you know. And it can start to piece some of these things together and work out, in a slightly smarter way, what these pixels are going to be doing, right?
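A deliberately tiny encoder-decoder along those lines might look like this; the real DLSS architecture isn't public, so the layer sizes here are purely illustrative:

```python
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: downsample once to half resolution, learning local structure.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Decoder: upsample twice, ending at double the input resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

net = TinyEncoderDecoder()
out = net(torch.rand(1, 3, 1080, 1920))  # 1080p in...
print(out.shape)                         # ...torch.Size([1, 3, 2160, 3840]) out
```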
The other nice thing about this being only a few layers, followed by quite a high resolution image, followed by a few layers, is that this is going to be quite fast, right? I mean, let's not underestimate how much computation this involves: if this is 1080p or 4K, a staggering amount of maths has to happen very, very quickly. But that's exactly what these Tensor Cores are for. They perform extremely fast 4x4 matrix multiplications and additions, which is exactly what this generalizes into, so you essentially pass over the image performing these matrix multiplications, over these layers, and it happens really, really fast.
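That "generalizes into" step is worth seeing: a convolution layer can be rewritten as one big matrix multiplication, which is the form Tensor Cores are built to chew through as tiles of small matrix products:

```python
import torch
import torch.nn.functional as F

x = torch.rand(1, 3, 8, 8)    # tiny stand-in image
w = torch.rand(16, 3, 3, 3)   # 16 conv filters of size 3x3

# The convolution as the network defines it...
conv_out = F.conv2d(x, w, padding=1)

# ...and the same thing as a single matrix multiplication (im2col):
cols = F.unfold(x, kernel_size=3, padding=1)        # (1, 27, 64) patch matrix
mm_out = (w.view(16, -1) @ cols).view(1, 16, 8, 8)  # (16, 27) x (1, 27, 64)

print(torch.allclose(conv_out, mm_out, atol=1e-5))  # True
```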
We're going to see more and more of this kind of stuff. So yes, this is one way of doing it. It's used in rendering, for denoising of ray tracing in, you know, big movies, like Pixar movies and things. It's used in upsampling on TVs. Using a shallow but powerful deep network to try and tidy up something that's not perfect: that's going to happen a lot, right? We've already seen these GANs, these generative adversarial networks, turning up, trying to produce new people's faces and things. This is a big deal at the moment, and there's going to be a lot of it. So, you know, Nvidia have started this process, but we're going to see more and more of it, and I'd imagine it will become a kind of standard approach, you know, in a few years' time.
OFF-SCREEN: It is staggering that this is happening 60 times a second.
Yeah. I mean, I think that the school should buy one of these cards and we'll give it a fair test. We just need to test it, we just need to test it on, you know, all these games. So, copies of these games, please, a machine to run them on, and one of these graphics cards, and I'll do some very thorough, in-depth research on it.