When I started writing Cloud Racer, one of my main goals was to implement a convincing ambient lighting solution. Ideally, I wanted this lighting to not require any (or very much) pre-processing, but to still have a smooth “global illumination” look to it. A fairly big part of the game is the Track Editor, which allows the user to create their own tracks by piecing together small chunks of geometry – cubes, ramps, tunnels, etc. I wanted the lighting to update on-the-fly as the user places these building blocks.
I looked into various approaches in order to achieve this. Screen-space Ambient Occlusion (SSAO) is the obvious choice, and is widely used in commercial titles these days. However, this only really solves the problem of small-scale occlusion, due to it operating on a small area of the screen – fine details are darkened convincingly, but larger-scale occlusion doesn’t occur. For example, in my game I wanted to be able to place an arrangement of cubes to form a tunnel, and to have a clearly darkened area inside this tunnel from their combined occlusion of light. SSAO wasn’t going to help me with that.
The next item on my list was radiosity. Despite the computational cost usually associated with this, I figured that a low-resolution map (for my fairly simple geometry) would be quite fast to generate if it were GPU-accelerated. Additionally, I planned to only regenerate the affected parts of the track when a new block was placed. Unfortunately, it was just too slow – even for a small track, my initial tests took minutes to generate, and the results still didn’t look good. This wasn’t the solution I was looking for.
I then came across this paper (by Kontkanen & Laine), and this presentation by Nathan Reed, which introduce the idea of Ambient Occlusion Fields. These suited my requirements fairly well: once a mesh has been pre-processed, it can be placed into any environment without any extra work required. In both of these papers, the AO fields are applied as a post-process in a deferred setup, much like a dynamic light would be. Unfortunately, every single one of my blocks would require its own AO field, meaning that the fill-rate cost would be massive if each one needed a pass to cast occlusion on the objects around it.
I decided to try to bake the AO Field results into textures. The run-time cost would then just be the same as using a lightmap, and the process of generating the textures should be fast – they can be rendered in real-time as a post-process, so baking them to a texture should be almost instant.
Generating the AO fields
Although the process of applying AO fields is very quick, generating them can take quite a while. Because of this, I wrote a command-line pre-processor application that runs every time I check in a new model.
The algorithm for generating the AO fields themselves is quite simple: create a bounding box that encompasses the model (and extends out a bit), divide your box into a low-resolution grid of cells, render a cubemap from each cell (render in all 6 directions), and calculate the proportion of pixels (and average direction of those pixels) that are occluded by the model from each cell.
From this data, you can estimate an occlusion cone at each cell – in other words, how much the model occludes ambient light from this position. This data is then stored as a volume texture (in Kontkanen/Laine’s paper, they store the data in a cube-map instead).
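The per-cell calculation can be sketched roughly like this (a CPU-side Python illustration with made-up names – the real version reads back cubemap pixels on the GPU, and the exact maths isn’t spelled out above):

```python
import math

def estimate_occlusion_cone(directions, occluded):
    """Estimate an occlusion cone for one grid cell.

    `directions` is a list of unit-vector sample directions (one per cubemap
    pixel) and `occluded` a parallel list of booleans marking whether the
    model covered that pixel.  Returns (cone_direction, occluded_fraction).
    """
    hits = [d for d, o in zip(directions, occluded) if o]
    fraction = len(hits) / len(directions)
    if not hits:
        return (0.0, 0.0, 0.0), 0.0
    # Average the occluded directions and renormalise to get the cone axis.
    sx = sum(d[0] for d in hits)
    sy = sum(d[1] for d in hits)
    sz = sum(d[2] for d in hits)
    length = math.sqrt(sx * sx + sy * sy + sz * sz) or 1.0
    return (sx / length, sy / length, sz / length), fraction
```

The fraction becomes the cone size and the averaged direction becomes the cone axis, which is what gets packed into the volume texture.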
It’s a bit difficult to see, but this is my visualisation of the AO Field for the Stanford Bunny. Each line’s direction indicates the AO cone direction, and the line length is proportional to the cone size. The lines get smaller further away from the bunny because less of it is visible.
The volume texture can then be sampled in a shader, in order to calculate the amount of occlusion from this object at any point around it. Using bilinear filtering, the direction & cone size are smoothly interpolated between the voxels.
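A one-axis sketch of what that filtering gives you for free (the hardware does the same interpolation along all three axes at once; the cone direction would be interpolated component-wise and renormalised – function names here are mine):

```python
def lerp(a, b, t):
    return a + (b - a) * t

def sample_field_1d(cone_sizes, u):
    """One axis of the volume-texture lookup: linearly interpolate the
    cone size between neighbouring cells.  `u` is in cell units."""
    i = min(int(u), len(cone_sizes) - 2)
    t = u - i
    return lerp(cone_sizes[i], cone_sizes[i + 1], t)
```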
Mapping to textures
I needed a unique UV mapping for any mesh that I wanted to apply AO to, as any mesh which shared UV-space between multiple polygons would obviously result in the occlusion being the same over those surfaces. I wanted to make my art pipeline as easy as possible to use, so requiring every model to be exported with a unique UV mapping (in the second UV channel) seemed cumbersome. As I was already pre-processing the AO fields, I added automatic generation of unique UVs to my pre-processor.
In order to do this, I used D3DX’s UV atlas generator (see D3DXUVAtlasCreate). It’s quite a powerful tool – it allows you to specify things like gutter size (pixel space between charts in the UV atlas), adjacency information for triangles (to make better use of UV-space), etc.
Each time you place a new “block” in the Track Editor, any nearby object with an AmbientOcclusionComponent has its AO regenerated, taking into account the newly-placed block’s AO field. Using separate AO result textures for every block is obviously not ideal, so I decided to dynamically create a texture atlas for the blocks in the scene.
In the shader that generates the AO for a block, I pass in the scale & offset in order to render it to the correct position in the atlas. The texture atlas itself is a 2048×2048 texture, and each block’s occlusion-map takes up 256×256 of that. Because I’m only storing grey-scale results, I use each of the 4 texture channels to increase the number of blocks that fit into each atlas: 64 blocks per channel with 4 channels means that I can store 256 blocks per atlas. Each track should consist of several hundred blocks, so even though I haven’t built any full, final tracks yet, I’m fairly confident that only a few atlases will be needed.
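The scale & offset arithmetic works out like this (the packing order – row-major within a channel, channels filled in sequence – is my assumption; the sizes are the ones given above):

```python
def atlas_slot(block_index, atlas_size=2048, tile_size=256):
    """Map a block index to its channel and UV scale/offset in the atlas.

    256x256 tiles in a 2048x2048 atlas give an 8x8 grid, so 64 tiles per
    channel; with 4 grey-scale channels that's 256 blocks per atlas.
    """
    tiles_per_row = atlas_size // tile_size            # 8
    tiles_per_channel = tiles_per_row ** 2             # 64
    channel, slot = divmod(block_index, tiles_per_channel)
    row, col = divmod(slot, tiles_per_row)
    scale = tile_size / atlas_size                     # 0.125
    offset = (col * scale, row * scale)
    return channel, scale, offset
```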
In order to generate the combined AO for multiple fields, I render the mesh multiple times, using the (unique) UV coordinates as vertex positions. I start with a white texture and each pass is subtractive, resulting in the texture becoming darker as more nearby objects contribute their occlusion.
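A CPU-side model of that accumulation (the real thing is a subtractive blend state on the GPU; this greyscale per-texel version is just a sketch):

```python
def accumulate_occlusion(base, passes):
    """Start from white (1.0) and subtract each nearby field's
    contribution per texel, clamping at black (0.0)."""
    out = list(base)
    for contribution in passes:
        out = [max(0.0, t - o) for t, o in zip(out, contribution)]
    return out
```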
Unfortunately, at this point I realized that the textures that I was generating through my AO fields weren’t entirely filling the UV area for each triangle. Despite taking into account the half-pixel offset required when rendering to texture (see DirectX docs here), there were noticeable white seams at the edges:
I’d expected to need to apply gutters in order to use bilinear filtering and mip-maps, but these seams were appearing even when using point-filtering. In addition, when I applied gutters by using D3DX’s gutter helper (see D3DXCreateTextureGutterHelper & ApplyGuttersTex), the results looked wrong anyway!
After a lot of investigating, I realized that the rasterisation rules for rendering triangles (at least in D3D9) aren’t the same as the texture sampling rules. When D3D9 rasterises a triangle, only pixels that have their centres covered by the triangle are “on”. When sampling textures, any texel that is touched by the polygon will affect its colour. As a result, any texture that is produced by rasterising using UV coordinates has a problem like this: (screenshot of two triangles taken from SoftImage, which I was using to debug the problem)
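The pixel-centre rule can be demonstrated with a few lines (an illustrative model only – it ignores D3D9’s top-left fill rules):

```python
def edge(ax, ay, bx, by, px, py):
    # Signed-area test: which side of edge (a -> b) is point p on?
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def centre_covered(tri, x, y):
    """A pixel is rasterised only if its *centre* (x + 0.5, y + 0.5)
    lies inside the triangle."""
    px, py = x + 0.5, y + 0.5
    (ax, ay), (bx, by), (cx, cy) = tri
    e0 = edge(ax, ay, bx, by, px, py)
    e1 = edge(bx, by, cx, cy, px, py)
    e2 = edge(cx, cy, ax, ay, px, py)
    # Inside if all three edge tests agree in sign (either winding order).
    return (e0 >= 0 and e1 >= 0 and e2 >= 0) or (e0 <= 0 and e1 <= 0 and e2 <= 0)
```

A texel at the triangle’s edge can be touched by the UV triangle (so sampling reads it) while its centre falls outside (so rasterising never writes it) – which is exactly where the white seams come from.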
I then found that if you render in wireframe mode, the seam pixels do get rasterised:
If you render first in solid and then in wireframe, you end up with the correct result:
I then needed to use the stencil buffer to ensure that each pixel was only being written once per pass. This is because I’m using blending to accumulate the AO results, and without the stencil buffer I would end up with pixels being written to by both the solid and the wireframe renders.
I’m still not sure of the details behind this solid-then-wireframe approach, but my solution seems to work consistently. I couldn’t find any information on it on the internet, so hopefully this helps someone out there! After I’d solved this mystery, applying gutters worked too, as the correct pixels were now being “extruded” away from the actual areas of UV.
In-game shot of the baked textures correctly covering the surfaces:
I then realized that there was an issue with the AO fields’ cells that sat between the inside and the outside of a mesh. If the centre of the cell was inside (but the cell extended outside of the mesh), the size of the occlusion cone would be zero (as the cubemap would be inside the geometry and the AO field generation doesn’t render back-faces). This results in white “halo” edges when two cubes form a corner, for example:
Rendering back-faces when generating the field doesn’t help matters, as cells inside the geometry then get a “cone” which fills the entire hemisphere and which points in a meaningless direction (as every direction is occluded).
In order to combat this, I check in the field generation if a cell has zero occlusion, and then average up the results of any neighboring cells which do have a valid occlusion result, replacing the edge value with the average. The animated gif below shows before & after pictures for this solution:
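That neighbour-averaging fix looks roughly like this (a simplified, greyscale-only version – the real one averages the cone direction as well as its size):

```python
def fill_interior_cells(field, size):
    """Replace cells with zero occlusion (where the cubemap was rendered
    from inside the mesh) with the average of any face-adjacent cells
    that do have a valid result.  `field` is a flat list of cone sizes
    for a size**3 grid."""
    def idx(x, y, z):
        return (z * size + y) * size + x

    out = list(field)
    for z in range(size):
        for y in range(size):
            for x in range(size):
                if field[idx(x, y, z)] != 0.0:
                    continue
                valid = []
                for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                                   (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                    nx, ny, nz = x + dx, y + dy, z + dz
                    if 0 <= nx < size and 0 <= ny < size and 0 <= nz < size:
                        v = field[idx(nx, ny, nz)]
                        if v != 0.0:
                            valid.append(v)
                if valid:
                    out[idx(x, y, z)] = sum(valid) / len(valid)
    return out
```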
Internal Ambient Occlusion
AO fields are typically too low-resolution to apply to the object that they were generated for – they’re better for casting low-frequency occlusion on objects around them. One advantage of baking the AO fields to texture (instead of applying them in a post-process) is that you can easily avoid them affecting the object that they were generated for. On the other hand, when applying them as a deferred post-process you’re just working with depth & normal buffers, and every AO field affects every object – including the one it was generated from. Nathan Reed’s presentation talks about avoiding these self-occlusion artifacts by biasing the sample points away from surface normals, but baking the results avoids the problem entirely.
However, you do still want some internal occlusion, as non-convex objects will look wrong otherwise. Because of this, I added another phase to my mesh pre-processor which generates internal ambient occlusion. I do this by accumulating the results of many thousands of “shadow maps” into a 32-bit floating point texture.
Each iteration, a random point on a sphere outside of the mesh’s bounding box is chosen, and the mesh is rendered orthographically from that point, storing the depth for each pixel in a texture. I then set the render-target to my accumulation texture, and render with vertex positions at the UV coordinates of each vertex. If the projected depth of each pixel is more than the stored depth, that pixel is marked black; otherwise it’s white. When you divide the final accumulated value by the number of iterations, you get a reasonable approximation of the percentage of directions from which that point was visible, when “looking” from the outside of the model.
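The accumulation boils down to a Monte-Carlo visibility estimate, which can be sketched like this (the `visible(p, d)` predicate stands in for the depth-map comparison, since the real test happens on the GPU; all names here are mine):

```python
import random

def bake_internal_ao(points, visible, iterations=1000, seed=1):
    """For each surface point, average visibility over many random
    outside view directions.  `visible(point, direction)` returns True
    if the point can be seen from that direction."""
    rng = random.Random(seed)
    accum = [0.0] * len(points)
    for _ in range(iterations):
        # Random direction on the unit sphere, by rejection sampling.
        while True:
            d = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
            if 0.0 < d[0] ** 2 + d[1] ** 2 + d[2] ** 2 <= 1.0:
                break
        for i, p in enumerate(points):
            if visible(p, d):
                accum[i] += 1.0
    # Fraction of directions from which each point was visible.
    return [a / iterations for a in accum]
```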
This texture is stored with the model and is additively combined with the final AO field result whenever a new block is placed. You can see in the image below how the interior of the “tunnel” object is darkened by the surrounding geometry. The advantage of this approach is that it is very fast – you can calculate a decent AO map in a few seconds. Unfortunately, the results aren’t perfect (particularly at edges between polygons), and I’m considering switching over to using raytracing because of this.
There are some elements in the track which are dynamic – for example, I’m currently testing with rotating “rings” that you can pick up along the way. The AO fields for these can be applied just as they were in the original paper, as a post-process, much as dynamic lights are drawn in deferred rendering.
I’ve ended up with a system which produces decent AO results and has hardly any effect on run-time speed. It also generates ambient occlusion at interactive rates when placing objects in the Track Editor.
I probably wouldn’t have set off on this project if I’d known it would take so long, but I am fairly pleased with the final results.
Here’s a video showing me testing the ambient occlusion with lots of cubes:
And in case you’re interested, here’s a video showing some *very* early gameplay – jumping around the track: