-
-
Notifications
You must be signed in to change notification settings - Fork 22k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve SSAO #101961
base: master
Are you sure you want to change the base?
Improve SSAO #101961
Conversation
Btw, I noticed a similar artifact with SSIL - once we find an acceptable solution for SSAO I can look into how much of it can be ported over. |
Update: I seem to have found a mistake in Godot's implementation of Intel's ASSAO! Comparing these two lines, Godot uses the half screen pixel size whereas the reference implementation uses the full resolution pixel size:
https://github.com/GameTechDev/ASSAO/blob/c21d57a95b43c700ee01a2eb18f5589d7becefb2/Projects/ASSAO/ASSAO/ASSAO.hlsl#L648 Fixing this has a drastic impact on the image. Not necessarily an aesthetic improvement and it doesn't remove the artifact I'm mostly worried about, but perhaps it's more physically correct this way? Was there an intentional reason to deviate from the reference here, or am I misunderstanding the meaning of the variables? Before and after fixing this (by using |
7437035
to
5ef661c
Compare
Glad to hear that! :D I will continue looking for avenues for improvement. Another one I found earlier is that the sample radius does not seem to vary based on distance to the fragment as far as I can tell. There is a line for that in the reference implementation but it is commented out for "performance reasons" - I'll try to measure performance and compare the outcomes later so we can decide if it's worth the cost. Also, there currently are artifacts near the screen border because the depth texture ends there. There are workarounds in the code to make it less noticeable, but ideally the depth would be rendered at a greater resolution beyond the main viewport so those regions can be shaded correctly. How do we feel about doing that? Perhaps tied to the High or Ultra quality setting? Not my main priority at the moment, but I figure if we change the look of SSAO we should do it in a single PR if possible. |
This is good opportunity to do the same as #97428 when TAA or FSR2 is enabled. Offseting
See godotengine/godot-proposals#3854 which covers precisely this. I feel it should be a separate Viewport property (and project setting), as rendering outside the viewport will benefit all screen-space effects, not just SSAO. This would cover all buffers (color/depth/normal/roughness), not just depth, so that effects like SSR can benefit too. |
Quality comparison on https://github.com/Jamsers/Bistro-Demo-Tweaked using the SSAO debug draw mode. All other settings are left to their default values. Very Low
Low
Medium (default)
High
Ultra (Custom with default settings)
|
Thanks for taking these screenshots! Just note that at the moment it's not a completely fair comparison yet since my version has a significant performance impact too, hoping to improve on that soon. (Although the majority of the difference in quality might be due to the pixel_size fix which is just a single additional multiplication by 2 per fragment) |
I have noticed artifacts in the current SSAO implementation which for me at least are too distracting during gameplay, especially when moving the camera because the pattern moves with it in a noticeably weird way. I'm currently trying to improve the quality without increasing performance costs too much.
Here are screenshots of the artifact, as viewed through the SSAO debug mode: a strange repeating pattern of the shaded region on the wall. Taken at the default SSAO configuration with medium quality, a radius of 1.0, and rendered at full resolution instead of half. The radius is too high, but I deliberately stuck with it so the artifact is more visible and I can track progress on removing it more easily. Note that a blank wall with a foreground object and high radius is somewhat of a worst case scenario for SSAO.
With blur passes disabled:

With the default two blur passes:

From what I could figure out the main problem is not the number of samples taken but the randomness of those samples. The shader contains a fixed array of 32 offset vectors which are used as sample locations, and twenty "random" rotation matrices which are applied to the sample pattern for more diverse samples.
The image is "de-interleaved" into four half-resolution images and the shader runs on each of them before they are stitched back together and subsequently blurred (this apparently helps with cache hits when sampling the depth texture). Each of those four "passes" has a set of five rotation matrices where the angle varies within a certain range. The matrices are generated on the CPU side in
ss_effects.cpp
and sent to the shader in a uniform.My first observation was that the rotation matrix does not much make sense to me - in abbreviated form, where
sa
is the sine of a randomly chosen angle andca
the cosine, the code is basically:But from my understanding of math and GLSL the correct way to set the values of a rotation matrix in column-major format is:
The second major thing I noticed is that we really do not need to construct the whole matrix on the CPU - we can send pairs of sines and cosines and then construct the matrix with
mat2(ca, sa, -sa, ca)
in the shader. I think this is pretty cheap, and this way we can send twice as many rotation matrices, creating more variation in the samples.However, even addressing those points I couldn't get rid of the artifact. So as a proof of concept I tried doing something that is not free in terms of performance - generating a random angle and the entire rotation matrix in the shader. This comes at the cost of two invocations of a
quick_hash
function (also used for soft shadows), and one invocation each ofsin
andcos
per fragment (not per sample).For some more cheap additional variation, I start iterating through the fixed sample array from a different index depending on the current pass.
This is the current status of my experiments. With blur passes disabled:

With the default two blur passes:

The performance impact of the current state of the PR is 0.2 additional ms on the GPU at 1080p, for a total of 0.6 ms. I haven't tested it yet but I think this cost should only scale with resolution, not the quality level / sample count used.
The artifact is mostly gone now. However, the blur filter has a harder time smoothing over the resulting noise. I noticed the blur shader only takes directly neighboring pixels into account - I think much better results could be achieved if we use a higher blur radius (the cost of that might be partially regained by not having to use multiple blur passes). Not completely sure yet how it would be done while retaining edge-awareness, but my next step would be to look into that. Another approach worth investigating might be to stick to generating rotation matrices on the CPU but just quadruple the amount (for twice the memory, since the current version has redundant data).
If anyone with more of a clue about SSAO has any other suggestions for how to proceed please let me know! I'm not really sure if I'm going in the right direction or missing anything obvious.
Further reading