Approximate Ambient Occlusion

Posted in Development by brecht
Some well known animation studios are now using approximate global illumination in feature films: Pixar used ambient occlusion in Ratatouille, DreamWorks used single-bounce indirect lighting in Shrek 2, and ILM used ambient occlusion in Pirates of the Caribbean 2 & 3. So I looked for an ambient occlusion method that is reasonably simple to implement, yet still fast enough to handle the high-poly scenes we are dealing with.
Not considering leaves and fur (which are much too noisy to efficiently compute standard ambient occlusion for), our scenes will still have millions of polygons. Raytraced ambient occlusion as in Blender now would take a long time to render, especially since for animation the result must be noise free, which requires a large number of samples. Instead I worked on an alternative method from NVidia based on approximating all surfaces as disks (Dynamic Ambient Occlusion and Indirect Lighting [pdf]), which ILM used to deal with very high poly meshes in Pirates of the Caribbean 2 & 3, and which is also available in PRMan as “point based occlusion”. An advantage of this approach is that it is inherently noise free, though it is not accurate and does suffer from other artifacts.
Surface approximated by disks.
The basic idea is quite simple: treat each vertex as a disk, and compute how much each point is occluded by summing, over all disks, how much each disk occludes that point. Going over all disks in the scene to compute occlusion at one point would be quite slow, so we cluster disks into bigger disks, and as we get further from the point we are shading, we use those bigger disks instead. Another issue is that regions can be shadowed too much, since we don't take into account whether one disk is behind another, counting shadowing from the same direction twice. In practice the method still works quite well even with this approximation, and multiple passes can be used to reduce the overshadowing.
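To make the summing step concrete, here is a minimal Python sketch of a disk-to-point occlusion term in the spirit of the NVidia paper. The exact constants and clamping there may differ; `disk_occlusion`, `ambient_occlusion`, and their arguments are names I made up for illustration.

```python
import math

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def sub(a, b):
    return (a[0]-b[0], a[1]-b[1], a[2]-b[2])

def disk_occlusion(p, n, disk_center, disk_normal, disk_area):
    """Approximate occlusion of point p (with normal n) by one emitter disk.
    This is a sketch of a disk form factor of the shape used in the paper."""
    v = sub(disk_center, p)
    d2 = dot(v, v) + 1e-16          # avoid division by zero at the disk itself
    d = math.sqrt(d2)
    v = (v[0]/d, v[1]/d, v[2]/d)    # unit direction from p to the disk
    # The first factor falls off with distance relative to disk area; the two
    # clamped dot products account for emitter and receiver orientation.
    return ((1.0 - 1.0/math.sqrt(disk_area/(math.pi*d2) + 1.0))
            * max(dot(disk_normal, (-v[0], -v[1], -v[2])), 0.0)
            * max(dot(n, v), 0.0))

def ambient_occlusion(p, n, disks):
    """Sum occlusion over a list of (center, normal, area) disks."""
    occ = sum(disk_occlusion(p, n, c, dn, a) for (c, dn, a) in disks)
    # Crude clamp; the overshadowing described above is why multiple
    # passes can give better results than a single clamped sum.
    return min(occ, 1.0)
```

A disk of radius 1 floating one unit directly above a point and facing it occludes it partially; a disk behind the shading normal contributes nothing.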
In my experience this basic method works fairly well, but it still gave a number of artifacts. Some of those could be reduced by increasing the accuracy when traversing the disks in the scene, though that made things quite a bit slower and still didn't get rid of all of them. Fortunately, a recent method published in the GPU Gems 3 book (chapter High Quality Ambient Occlusion, sorry, no link) solves many of these problems by treating nearby triangles directly as triangles, instead of as disks. My implementation uses part of the code provided by the authors (Jared Hoberock and Yuntao Jia) with the book, thanks!
Spherical harmonics up to the 2nd order, as used in our implementation.
However, traversal of the scene still required fairly high accuracy, because of the way bigger disks are created from smaller ones. If two disks point in quite different directions, the averaged disk can give quite different results than the two disks used individually. The first solution I tried was to cluster disks together not only by position, but also by normal. While this worked well to get rid of the artifacts, it made traversal slower than necessary, since the disks were no longer clustered together as well spatially. The second approach, which I'm using now, approximates the sum of disks with spherical harmonics, as used in PRMan and explained in the Point-Based Graphics book, chapter 8.4 (sorry, again no direct link). This means we don't have to cluster by normal anymore. In the end it does not seem much faster, but it does avoid some artifacts at lower accuracy settings.
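As a rough illustration of the spherical harmonics idea, the sketch below projects area-weighted disk normals onto the standard 9-coefficient real SH basis (up to 2nd order, with the usual normalization constants). This is only my own toy version of the concept, not the code that went into Blender: a cluster stores one set of coefficients for all its disks, so no clustering by normal is needed.

```python
def sh9_basis(d):
    """Real spherical harmonics basis up to 2nd order (9 coefficients),
    evaluated for a unit direction d, using the standard constants."""
    x, y, z = d
    return (
        0.282095,                       # l=0
        0.488603*y, 0.488603*z, 0.488603*x,            # l=1
        1.092548*x*y, 1.092548*y*z,                    # l=2
        0.315392*(3.0*z*z - 1.0),
        1.092548*x*z, 0.546274*(x*x - y*y),
    )

def project_disks(disks):
    """Accumulate area-weighted disk normals of a cluster into one set of
    SH coefficients (disks is a list of (unit_normal, area) pairs)."""
    coeffs = [0.0]*9
    for normal, area in disks:
        basis = sh9_basis(normal)
        for i in range(9):
            coeffs[i] += area*basis[i]
    return coeffs

def eval_sh9(coeffs, d):
    """Evaluate the summed distribution in direction d."""
    return sum(c*b for c, b in zip(coeffs, sh9_basis(d)))
```

Evaluating the projected coefficients in the direction of a disk's normal gives a larger response than evaluating them in the opposite direction, which is the directionality the averaged-disk approximation loses.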
So, having started with disks used for everything, the implementation finally uses no disks directly anymore, but triangles and spherical harmonics approximations of disks. Also note that the way this works is similar to the way subsurface scattering works in Blender, with the extra difficulty here that the occlusion is directional, rather than uniformly distributed with a quick falloff. Both approximate far away surfaces by assuming the contribution of many small elements can be replaced by one big element, which in physics is known as the superposition principle. In this case it only holds approximately, but in computer graphics we are allowed to cheat.
Further work that is needed is making this method efficient for fur. The standard trick appears to be to compute ambient occlusion at the surfaces and use that on the strands. However, computing it for all of the strands in the scene (which can number more than 10 million) is probably still not efficient enough, so a method to extrapolate ambient occlusion results over multiple strands is needed. Ambient occlusion could also be sped up by some form of irradiance caching (this is true for the raytraced ambient occlusion as well), though there is probably not enough time for a full implementation of that. Instead I'm trying a simple screen space method to upsample the results from fewer pixels.
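The upsampling idea can be illustrated with a toy 1D version: each high-resolution pixel blends the nearest low-resolution AO samples, weighted by depth similarity so occlusion doesn't bleed across depth discontinuities. Everything here (the function name, the Gaussian depth weight, the `sigma` parameter) is a hypothetical sketch, not the actual screen space method being tried.

```python
import math

def upsample_ao(lowres_ao, lowres_depth, hires_depth, sigma=0.1):
    """Toy 1D joint upsampling of AO computed at fewer pixels.
    Blends the two nearest low-res samples, weighted by linear
    interpolation and by similarity of the stored depths."""
    n_lo, n_hi = len(lowres_ao), len(hires_depth)
    out = []
    for i, d in enumerate(hires_depth):
        t = i*(n_lo - 1)/(n_hi - 1)       # position in low-res coordinates
        j = min(int(t), n_lo - 2)          # left low-res neighbour
        result = wsum = 0.0
        for k, base in ((j, 1.0 - (t - j)), (j + 1, t - j)):
            # Gaussian depth weight suppresses samples across an edge.
            w = base*math.exp(-((lowres_depth[k] - d)/sigma)**2)
            result += w*lowres_ao[k]
            wsum += w
        out.append(result/wsum if wsum > 1e-8 else lowres_ao[j])
    return out
```

With two low-res samples lying on different depth layers, each high-res pixel picks up the AO value from its own layer instead of a blurred average across the edge.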
Here are some simple results with Suzanne, on an Intel Core 2 Duo at 3.0 GHz with 2 threads:
Further work could also make this method usable for a single bounce of indirect lighting. A quick prototype implementation showed that the method could indeed support it fairly well, but it is likely not fast enough for Peach.
Currently this new ambient occlusion is not in SVN yet, so you will have to wait a bit to test it, but it will be there soon.