How Nvidia DLSS 3 works and why FSR can’t catch up just yet

Nvidia’s RTX 40-series graphics cards are coming in a few weeks, but among all the hardware improvements is what could be Nvidia’s golden egg: DLSS 3. It’s more than just an update to Nvidia’s popular DLSS (Deep Learning Super Sampling) feature, and it could end up defining Nvidia’s next generation much more than the graphics cards themselves.

AMD has worked hard to bring its FidelityFX Super Resolution (FSR) up to par with DLSS, and over the past few months, it has succeeded. DLSS 3 looks set to change that dynamic – and this time, FSR might not be able to catch up anytime soon.

How DLSS 3 works (and how it doesn’t)


You’d be forgiven for thinking DLSS 3 is a whole new version of DLSS, but it’s not. Or at least it’s not entirely new. The backbone of DLSS 3 is the same super-resolution technology that’s available in DLSS titles today, and Nvidia will likely continue to improve upon it with new releases. Nvidia says you’ll now see the super-resolution part of DLSS 3 as a separate option in the graphics settings.

The new part is frame generation. DLSS 3 generates an entirely new frame every second frame, essentially generating seven out of every eight pixels you see. You can see an illustration of this in the flowchart below. In the case of 4K, your GPU only renders the pixels for a 1080p image, and it uses that information not only for the current frame but for the next frame as well.

A graph showing how DLSS 3 reconstructs frames. (Image: Nvidia)
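The math behind that seven-out-of-eight claim checks out. Here’s a quick back-of-the-envelope sketch, assuming super resolution runs in its Performance mode, which renders at a quarter of the output resolution:

```python
# Back-of-the-envelope math behind the "seven out of eight pixels" figure,
# assuming DLSS Performance mode (1080p rendered internally, output at 4K)
# with frame generation producing every other displayed frame.
rendered_pixels = 1920 * 1080    # pixels the GPU actually shades per rendered frame
output_pixels = 3840 * 2160      # pixels in each 4K frame you see

# Over two displayed frames, only one contains any rendered pixels at all.
rendered_share = rendered_pixels / (2 * output_pixels)
print(f"rendered share: {rendered_share:.3f}")        # 0.125, i.e. 1/8
print(f"generated share: {1 - rendered_share:.3f}")   # 0.875, i.e. 7/8
```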

Frame generation, according to Nvidia, will be a separate toggle from super resolution. That’s because frame generation currently only works on RTX 40-series GPUs, while super resolution will continue to work on all RTX graphics cards, even in games that have been updated to DLSS 3. It should go without saying, but if half of your frames are fully generated, that boosts your performance by a lot.

Frame generation isn’t just some secret AI sauce, though. In DLSS 2 and tools like FSR, motion vectors are a key input for upscaling. They describe where objects move from one frame to the next, but motion vectors only apply to the geometry in a scene. Elements without 3D geometry, such as shadows, reflections, and particles, have traditionally been masked out of the upscaling process to avoid visual artifacts. A rough sketch of that reprojection-plus-masking idea follows the illustration below.

An illustration of motion tracking through Nvidia’s DLSS 3. (Image: Nvidia)
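Here’s a minimal sketch of what motion-vector reprojection with masking looks like. The function, the array shapes, and the names are illustrative inventions for this column, not any vendor’s actual API:

```python
import numpy as np

def reproject(prev_frame, motion_vectors, mask):
    """Warp the previous frame toward the current one using per-pixel
    motion vectors. Masked-out pixels (shadows, reflections, particles)
    carry no geometric motion, so they are zeroed here and left for the
    upscaler to fill from the current frame's raw data instead."""
    h, w, _ = prev_frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Motion vectors say where each pixel came from in the previous frame.
    src_x = np.clip(xs - motion_vectors[..., 0], 0, w - 1).astype(int)
    src_y = np.clip(ys - motion_vectors[..., 1], 0, h - 1).astype(int)
    warped = prev_frame[src_y, src_x]
    warped[~mask] = 0.0   # exclude non-geometry elements from reprojection
    return warped
```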

Masking isn’t an option when an AI is generating an entirely new frame, which is where the RTX 40-series’ optical flow accelerator comes in. It works like motion vectors, except the graphics card tracks the movement of individual pixels from one frame to the next. This optical flow field, along with motion vectors, depth, and color, feeds into the AI-generated frame.
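Conceptually, frame generation boils down to warping two rendered frames toward a midpoint along the flow field and blending the results. The toy sketch below shows only that underlying idea; Nvidia’s real implementation is a trained neural network running on dedicated hardware, and this function is a hypothetical illustration:

```python
import numpy as np

def interpolate(frame_a, frame_b, flow_ab):
    """Synthesize a frame halfway between frame_a and frame_b.
    flow_ab[y, x] is the per-pixel (dx, dy) motion from a to b."""
    h, w, _ = frame_a.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Pull pixels halfway forward from frame_a and halfway back from frame_b.
    ax = np.clip(xs - 0.5 * flow_ab[..., 0], 0, w - 1).astype(int)
    ay = np.clip(ys - 0.5 * flow_ab[..., 1], 0, h - 1).astype(int)
    bx = np.clip(xs + 0.5 * flow_ab[..., 0], 0, w - 1).astype(int)
    by = np.clip(ys + 0.5 * flow_ab[..., 1], 0, h - 1).astype(int)
    return 0.5 * frame_a[ay, ax] + 0.5 * frame_b[by, bx]
```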

That sounds like all upside, but there’s a big problem with AI-generated frames: they increase latency. The AI-generated frame never passes through the game engine on your PC – it’s a “fake” frame, so you won’t see it in traditional fps readouts from games or tools like FRAPS. So latency doesn’t go down despite all the extra frames, and given the computational overhead of optical flow, it actually goes up. That’s why DLSS 3 requires Nvidia Reflex to offset the higher latency.

Normally, your CPU builds up a render queue of work for your graphics card, ensuring your GPU never sits idle waiting for work (which would cause stuttering and frame rate drops). Reflex removes the render queue and synchronizes your CPU and GPU so that as soon as your CPU can send instructions, your GPU starts processing them. Applied on top of DLSS 3, Nvidia says Reflex can sometimes even result in a net latency reduction.
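A toy calculation shows why the render queue matters so much for latency. The 60-fps frame time and queue depths below are illustrative assumptions, not measurements:

```python
# Toy latency arithmetic: every frame waiting in the render queue adds
# roughly one frame time between your input and the pixels on screen.
frame_time_ms = 1000 / 60            # ~16.7 ms per frame at 60 fps

for queue_depth in (2, 0):           # typical queue vs. a Reflex-style empty queue
    latency_ms = (queue_depth + 1) * frame_time_ms
    print(f"queue depth {queue_depth}: ~{latency_ms:.1f} ms of render latency")
```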

Where AI makes the difference

AMD’s FSR 2.0 doesn’t use AI, and as I wrote a while back, it proves you can match DLSS’s quality with algorithms instead of machine learning. DLSS 3 changes that with its unique frame generation capabilities, as well as its use of optical flow.

Optical flow isn’t a new idea – it’s been around for decades and has applications in everything from video editing apps to self-driving cars. Computing optical flow with machine learning, however, is relatively new, thanks to the growth of datasets on which to train AI models. The reason you’d want to use AI is simple: with enough training, it produces fewer visual errors, and it doesn’t carry as much overhead at runtime.

DLSS executes at runtime. It’s possible to develop an algorithm, free of machine learning, that estimates how each pixel moves from one frame to the next, but it’s computationally expensive, which defeats the purpose of supersampling in the first place. With an AI model that doesn’t require a lot of horsepower and plenty of training data – and rest assured, Nvidia has plenty of training data to work with – you can achieve high-quality optical flow that executes at runtime.
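For a sense of what non-machine-learning optical flow looks like, here’s a minimal example using Farneback’s classic algorithm as implemented in OpenCV (the frame filenames are placeholders). Producing a dense, per-pixel flow field like this for every frame is exactly the kind of work that’s too expensive in the handful of milliseconds a game has per frame:

```python
import cv2

# Classic, non-ML dense optical flow (Farneback's algorithm in OpenCV).
# Two consecutive frames in, a per-pixel motion field out.
prev = cv2.cvtColor(cv2.imread("frame_0.png"), cv2.COLOR_BGR2GRAY)
curr = cv2.cvtColor(cv2.imread("frame_1.png"), cv2.COLOR_BGR2GRAY)

flow = cv2.calcOpticalFlowFarneback(
    prev, curr, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0,
)
print(flow.shape)  # (height, width, 2): per-pixel (dx, dy) motion
```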

This leads to improved frame rates even in CPU-limited games. Supersampling only applies to your resolution, which depends almost exclusively on your GPU. With a new frame that bypasses CPU processing entirely, DLSS 3 can double frame rates in games even when you have a hard CPU bottleneck. That’s impressive, and it’s currently only possible with AI.

Why FSR 2.0 can’t catch up (yet)

Comparison of FSR and DLSS image quality in God of War.

AMD really did the impossible with FSR 2.0. It looks fantastic, and the fact that it’s brand-agnostic is even better. I’ve been willing to ditch DLSS for FSR 2.0 since I first saw it in Deathloop. But as much as I like FSR 2.0 and think it’s a great piece of kit from AMD, it isn’t going to catch up with DLSS 3 anytime soon.

To begin with, developing an algorithm that can track every pixel between frames without artifacts is difficult enough, especially in a 3D environment with fine, dense detail (Cyberpunk 2077 is a great example). It’s possible, but hard. The bigger issue, though, is how bloated such an algorithm would need to be. Tracking every pixel through 3D space, running the optical flow calculation, generating a frame, and cleaning up any mishaps along the way – that’s a lot to ask.

Getting all of that to run while a game is executing, and still delivering a frame rate boost on the level of FSR 2.0 or DLSS, is asking even more. Nvidia, even with dedicated processors and a trained model, still has to use Reflex to offset the higher latency that optical flow imposes. Without that hardware or software, FSR would likely trade too much latency for its generated frames.

I have no doubt that AMD and other developers will eventually get there – or find some other way around the problem – but it could take a few years. It’s hard to say now.

What’s easy to say is that DLSS 3 sounds very exciting. Of course, we’ll have to wait until it arrives to validate Nvidia’s performance claims and see how the image quality holds up. For now, we only have a short video from Digital Foundry showing DLSS 3 footage, which I highly recommend watching until we see further third-party testing. From our current vantage point, though, DLSS 3 certainly looks promising.

This article is part of ReSpec – an ongoing bi-weekly column that features in-depth discussions, tips and reports on the technology behind PC gaming.
