By Kyle Wilson
Sunday, August 31, 2003
I was fortunate enough at the end of June to have the chance to attend this year's SIGGRAPH conference in San Diego. (Day 1, my current employer, is excellent about sending employees to conferences and otherwise giving them the chance to acquire new skills.)
Between weddings and conferences, this was the fourth time I've been out to California this year, though it's the first time I've been in southern CA since GDC was at Long Beach, way back in 1998 when they were still calling it CGDC. But San Diego was like California always seems to be, perfect and sunny and beautiful. There was a little more sun and a few more palm trees than up in the Bay area, but not much difference to someone from the East Coast who's used to places with real weather. I love visiting California, but I think the sameness of it all would get to me after a while.
I got in on Saturday night, with the show due to start, oddly, on Sunday morning. On the ride from the airport to my hotel, I realized something that explains every cab ride I've ever taken: Cab drivers get paid by the mile, and the more miles they can cover per minute, the more money they make.
I stayed in the Clarion Bay View Hotel or the Clarion Hotel Bay View -- the words seemed to reorder depending on where they were written. The place looked like it had seen a lot of traffic over the years, but my room was comfortable enough, and it was near the convention center. The rooftop hot tub was an especial perk, and served me well as my feet wore down throughout the week, hiking from talk to talk.
I started out Sunday morning attending the course on Light and Color in the Outdoors. First, Simon Premoze explained the physical theory behind light scattering and absorption as sunlight passes through the atmosphere. When a ray of sunlight encounters a particle, it can be transmitted, absorbed, or scattered off in another direction. If you model this behavior correctly, then blue skies, red sunsets, and other physically correct effects all arise straight out of your physical model.
For a practical demonstration of this, Premoze was followed by Naty Hoffman and A. J. Preetham talking about a real-time approximation of physically accurate light scattering that they've implemented. They wrote a great article on the subject back in August of last year for Game Developer. Variations on their work are available in a paper they have up at ATI and in the course notes for their SIGGRAPH talk. Before you read any of that, though, you really should download the demo that accompanied their Game Developer article. In this case a picture is worth significantly more than a thousand words.
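Their model is considerably more elaborate than this, but the core idea -- light from a distant surface is attenuated by extinction, while scattered skylight is added back in along the view ray -- fits in a few lines. This is a toy sketch in their spirit, not their actual math; the coefficient and sky color are placeholders of my own, and a real implementation uses separate, wavelength-dependent Rayleigh and Mie terms (that wavelength dependence is what makes skies blue and sunsets red):

```python
import numpy as np

def aerial_perspective(surface_color, distance, beta=0.00002,
                       sky_color=(0.5, 0.6, 0.9)):
    """Toy single-scattering haze: attenuate the surface color by
    extinction, and add in-scattered skylight along the view ray.
    beta and sky_color are illustrative placeholders."""
    extinction = np.exp(-beta * distance)          # fraction of light surviving the trip
    inscatter = np.asarray(sky_color) * (1.0 - extinction)
    return np.asarray(surface_color) * extinction + inscatter
```

At distance zero you get the surface color back untouched; at very large distances everything converges to the sky color, which is exactly the "fog that falls out of the physics" behavior the talk described.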
Light scattering for outdoor lighting really does look fantastic, but it probably doesn't make any sense for us to be using it in our current game. Terrain is by nature diffusely lit, and is always going to look the same for a given sun position and atmospheric state. So unless you have varying time of day or weather, you might as well just bake all that lighting information into your terrain lightmap. Furthermore, light scattering precludes the use of fog in traditional CG fashion, to hide everything in the distance. So unless you're working in a large virtual world with dynamic time-of-day and unlimited view distances, you're probably best off with other solutions.
Simon Premoze then returned to talk about night sky illumination. This is largely a tone mapping problem, tone mapping being one of the current hot topics in CG. Tone mapping springs out of high dynamic range (HDR) lighting research. The human eye can process a far wider range of visible light, in terms of color and especially in terms of luminance, than a normal computer monitor can represent. Tone mapping is the process of transforming an HDR image into renderable colors.
In the case of a night sky scene, the luminances involved are scotopic. That is, they're dim enough that you see with retinal rods instead of cones. You lose color information. You lose spatial detail.
Premoze approaches the problem like he's doing a day-for-night film shoot. He starts with a fully lit scene. To suggest darkness in hue, he blue shifts the whole scene like James Cameron is so fond of doing. Then, interestingly, he simulates the scotopic loss of detail by applying a spatial blur. I wouldn't have thought of it, but it looked pretty good.
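The recipe above -- darken, blue-shift, blur -- is simple enough to sketch. This is my own minimal version, not Premoze's; the exposure factor, channel scales, and blur radius are all made-up illustrative values:

```python
import numpy as np

def day_for_night(img, exposure=0.25, blue_shift=(0.6, 0.7, 1.1), blur_radius=2):
    """Crude day-for-night grade: darken, blue-shift, then blur.
    img is a float array of shape (H, W, 3), linear RGB in [0, 1].
    All constants are illustrative guesses, not Premoze's values."""
    out = img * exposure                      # suggest darkness by dropping exposure
    out = out * np.asarray(blue_shift)        # push hues toward blue
    # Box blur to mimic the loss of spatial acuity in scotopic vision.
    k = 2 * blur_radius + 1
    padded = np.pad(out, ((blur_radius, blur_radius),
                          (blur_radius, blur_radius), (0, 0)), mode="edge")
    blurred = np.zeros_like(out)
    for dy in range(k):
        for dx in range(k):
            blurred += padded[dy:dy + out.shape[0], dx:dx + out.shape[1]]
    return np.clip(blurred / (k * k), 0.0, 1.0)
```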
In the afternoon I skipped over to the Real-Time Shading course. Randi Rost from 3D Labs talked a bit about how graphics processors are becoming more and more like general-purpose CPUs specially optimized for vector processing. At some point he mentioned the Open Scene Graph project, which I'll mention again here since scene graph structure is a subject of interest to me. It looks sort of like an open-source Performer.
He was followed by Jason Mitchell from ATI, who was showing off a really-cool real-time implementation of the RenderMan uberlight shader. Unfortunately, the demo doesn't seem to be available anywhere on the web, though the PowerPoint slides for the talk have some pictures. But all of the pictures at the uberlight site were things he was doing in real-time. It was very impressive to behold.
Kurt Akeley from Nvidia came next plugging Nvidia's Cg graphics language. Cg is in the unfortunate position of competing with Direct3D's HLSL. As far as I can tell, Cg looks like Nvidia's attempt to make the next GLIDE, and we all know where that ended up. That said, the Cg toolkit being shown actually looked pretty interesting. ATI's RenderMonkey also seems to provide similar features, though I haven't worked with either. But there seems to be a big push among the graphics developers to provide tools that interface with shaders, so that programmers can write custom shader materials and automatically have interfaces generated for them that allow artists to set tunable shader parameters in Max or Maya.
Sunday ended with an evening special session of postmortems from senior developers on Neverwinter Nights, Splinter Cell and Sly Cooper. I'm a total postmortem whore, so I found it all great fun. John Bible from Bioware had interesting things to say about their parallel development of tools and game. He said that their data formats and interfaces were frozen relatively early on in Neverwinter's five-year development cycle, and that technology improvements after that point could only be made if they didn't require sweeping data changes, because they had a lot of data content. Also, interestingly, they wanted to use Borland C++ Builder for their tools, so they ended up using two different compilers and having compatibility issues between them even though they were developing for a single platform.
Dany Lepage, the 3D programmer for Splinter Cell, said that the game's greatest strength was that it shipped on time. At E3 of 2002, Doom 3 was voted best game. But Splinter Cell shipped that fall, looking pretty good, and Doom 3, though it'll probably look great, won't be out until sometime next year. He said some stuff about Splinter Cell's shadow technology that I've suspected for a while. They use a combination of shadow techniques -- depth buffer shadows, projected shadows, and lightmaps -- depending on the situation. Not all lights cast shadows, and all the levels have been carefully constructed by the designers to do what they can do where they can afford to do it, and hide all associated artifacts. He said, referring to the Neverwinter Nights guys, that there was no way the Splinter Cell team could have shipped tools with Splinter Cell and expected end users to make their own content.
Dev Madan from Sucker Punch talked about art direction. That's not my field, so I won't offer any commentary on what little I remember.
Monday, for me, was mostly about high dynamic range imaging. HDR lighting is generally about capturing an environment map describing a scene and using it to light a virtual model. This style of lighting is very popular with the movie effects crowd because it lets them integrate virtual actors with real ones and keep the lighting consistent. But to capture the environment, it's not enough just to take snapshots around the scene and scan them in. You need to capture the full range of luminance in the scene, and you need to capture lighting from all directions. You accomplish the former by anchoring your camera and snapping six or eight images at different shutter speeds to capture different ranges of brightness. You capture the entire scene by photographing a mirrored sphere placed into it. There are programs available to composite these images and correct for stretching, camera properties, imperfect reflectivity of the sphere, etc. Debevec plugged his own HDRShop for this purpose.
There are now a variety of HDR formats to store these images in, since the usual 32-bit RGBA format clearly isn't up to the task. At the high end, you store a full IEEE floating-point value for each color channel.
For the movie people, that's the end of the story. Once they've captured an environment map, they can plug it into their favorite ray tracer or global radiosity solution and use it to light their scenes. For those of us doing real-time work, though, it's not quite so simple.
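The exposure-bracketing step is easy to illustrate. This is a minimal sketch of merging bracketed shots into a radiance map, assuming the camera response is already linear (real cameras aren't, which is part of what tools like HDRShop handle); the hat-shaped weighting is a standard trick to ignore near-black and clipped pixels:

```python
import numpy as np

def merge_exposures(images, shutter_times):
    """Merge bracketed exposures into one HDR radiance map.
    images: float arrays in [0, 1], same shape, assumed linear response.
    shutter_times: exposure time in seconds for each image."""
    num = np.zeros_like(images[0], dtype=np.float64)
    den = np.zeros_like(images[0], dtype=np.float64)
    for img, t in zip(images, shutter_times):
        # Hat weighting: trust mid-range pixels, distrust near-black/clipped ones.
        w = 1.0 - np.abs(2.0 * img - 1.0)
        num += w * (img / t)   # each well-exposed pixel votes for a radiance value
        den += w
    return num / np.maximum(den, 1e-9)
```

A pixel that clips in the long exposure still gets a good estimate from the short one, and vice versa, which is the whole point of bracketing.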
That's where Peter-Pike Sloan's work on Precomputed Radiance Transfer (PRT) comes in. Sloan presented the original paper on PRT at SIGGRAPH last year. This year there was a whole session on the subject, and he was co-author of two of the papers.
The original PRT paper introduced the computer graphics community to spherical harmonics. Spherical harmonics (SH) are a set of basis functions, like a Fourier series, but over the surface of a sphere. You can think of a set of spherical harmonic coefficients as a cheap representation of a cubic environment map. If you have an environment map without much high-frequency information on it, you can represent it in 16 or 25 coefficients.
Precomputed radiance transfer uses spherical harmonics to light a model with an environment box in real time, modeling interreflections and occlusions from one part of the object to another. For each point being lit, Sloan computes a set of SH coefficients representing diffuse reflection from light in a given direction. He augments this with another set of SH coefficients representing self-occlusion and yet another representing reflected radiance from every direction. The last has to be expensively re-computed any time the light environment moves with respect to the object being lit.
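What makes all this fast at runtime is that once everything is projected into SH, shading a diffuse point collapses to a dot product between that point's precomputed transfer vector and the environment's coefficients. Here's a minimal sketch using the standard 9-coefficient (order-2) real SH basis; the transfer vectors themselves would come from the expensive offline pass with occlusion and interreflection baked in:

```python
import numpy as np

def sh_basis(d):
    """Order-2 real spherical harmonic basis evaluated at unit direction d."""
    x, y, z = d
    return np.array([
        0.282095,
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z, 0.546274 * (x * x - y * y),
    ])

def relight(transfer, env_coeffs):
    """PRT runtime step: per-point exit radiance is just a dot product
    between each row of precomputed transfer coefficients and the
    environment's SH coefficients."""
    return transfer @ env_coeffs
```

Moving the lights means only refreshing `env_coeffs`, nine numbers, which is why low-frequency environment lighting gets so cheap once the precomputation is paid for.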
Despite all the fanfare it's being given, precomputed radiance transfer may never be useful for games. It takes a great deal of memory to store all those coefficients and modern hardware isn't very well suited to accelerating SH lighting. These will eventually cease to be problems. A greater limitation is that lighting information needs to be low-frequency (small, crisp area lights can only be represented with large numbers of SH coefficients) and all lights are treated as being infinitely distant. For any scene in which you have large models surrounded by small lights, precomputed radiance transfer is not a useful lighting model. Finally, it's impossible to precompute radiance transfer for dynamic models like characters in games.
This year's SIGGRAPH introduced three PRT papers, none of which addressed these limitations. The first, Bi-Scale Radiance Transfer, basically extended PRT by adding something like bump-mapping. This was one of Peter-Pike Sloan's two papers, though it was presented by one of his co-authors. There were a great many pictures of a little rubber ducky model covered in tiny dents.
Ren Ng from Stanford then presented All-Frequency Shadows. He presented a variation on precomputed radiance transfer which doesn't use spherical harmonics. Instead, it models lighting with an environment map as a massive matrix operation, where one large column vector represents the luminance of the environment map (one value per texel) and an enormous transport matrix represents the contribution to final diffuse color of each texel in the environment map to emitted diffuse light at a point on the model. (So the matrix is number of pixels in the environment map * number of points being lit in size.) Ng then wavelet compresses the matrix and the light vector and multiplies the resulting sparse matrices.
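Stripped of the wavelet compression, the structure of the computation is just one enormous matrix-vector multiply. This dense sketch exists only to show that structure and why compression is mandatory; the sizes in the comment are my own back-of-envelope numbers, not figures from the talk:

```python
import numpy as np

def relight_dense(transport, env_map):
    """Naive form of the all-frequency relighting equation.
    transport: (n_points, n_texels) -- how much each environment-map
    texel contributes to each surface point, occlusion included.
    env_map: (n_texels,) flattened environment-map luminance.
    For, say, a 6 x 64 x 64 cube map (24,576 texels) and 100,000 surface
    points, the dense matrix holds ~2.5 billion floats -- roughly 10 GB --
    which is why Ng wavelet-compresses both the matrix and the light
    vector and multiplies them sparsely instead."""
    return transport @ env_map
```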
This is insane. Ng was a fantastic speaker, and I'm impressed by the sheer ballsiness of his technique, but there is no way that this technique can use graphics hardware now or in the foreseeable future. On top of that, it adds another limitation to those of basic PRT: either the light source or the viewer has to be fixed.
Finally, Peter-Pike Sloan presented another PRT enhancement, Clustered Principal Components. This is a data compression/speed-up for PRT that enables increased lighting frequency. I'd like to claim that I could follow all the math, but I'd be lying.
Also on Monday, before the PRT talks, I made it to the Game Developers Birds of a Feather meeting. Of which it can only be said, too many game developers, too little space. Call it 300 geeks in a room designed for a third that number. Maybe half actual game developers, half wannabes. I got accosted early on by a guy who's putting together a college game development curriculum and wanted to know what I thought. I was the wrong guy to ask, though, since I think those programs are nonsense, and think that you're better off with a good foundation in software engineering or computer art than trying to learn all about "game development". So I wasn't much help to him. I then proceeded to embarrass myself talking to Jason Della Rocca by confusing the IGDA and the IDSA. Finally I just retreated and spent the rest of the hour racking my brain trying to come up with interesting things to say to a cute British girl. "So, um, what's the time difference between California and England?"
On Tuesday I caught some of the character animation talks. Okan Arikan presented interesting work on Motion Synthesis from Annotations, piecing together a continuous sequence of animation from a large database of available animation samples to match specified user constraints. So the user would apply states like running, jumping, and waving to a timeline, and animations would be synthesized to satisfy them. I'm wary about the immediate usefulness of his work for what we're doing in the game industry, since it sounds like his sample animation database would have to be large and searching it would take time that we can't really spare for the task in a game setting. (Have a look at the sample video at Arikan's web page and see how long it takes for his animation searches to converge.) I suspect that the future of game animation is probably going to be something like this, though, that we'll want to specify behavior and constraints at a high level of abstraction and then rely on some kind of animation synthesis to produce final visible behavior.
After an afternoon trip to see the showings at the electronic theater, I attended the afternoon session on shadows. In Interactive Shadow Generation, Naga Govindaraju showed a scheme that involved using three PCs with Geforce 4s. Golly. He uses the first two machines to generate potentially visible sets of geometry from the eye point and light point, then feeds these into the third machine, which determines which geometry to cast shadow maps from and which to cast volume shadows from, and renders the final scene.
Like the work I remember going on in the department when I was at UNC, this all seems targeted at extraordinarily large, dense and complex data sets, high detail models of ships and nuclear reactors with lots of pipes, that sort of thing. Under those circumstances, it's worth devoting a lot of CPU power to occlusion culling, LOD processing, identifying fully-shadowed objects for cheap rendering, and so forth, because the scene detail is so great that it'll kill your GPU otherwise.
In the game industry, though, we make our own data, and we tend to build our scenes so that the GPU can handle them. We need to, because we don't have a lot of CPU time to spare for all that preprocessing. We have other things we need to do on the CPU, like AI, physics, trigger logic, gameplay. I think this difference in perspective is a big disconnect between academia and our community.
Ulf Assarsson then presented A Geometry-Based Soft Shadow Volume Algorithm. This offered a refreshing break from all the hard-edged stencil and depth-buffer shadows we're used to seeing in computer graphics. Unfortunately, it's still a little pricey on modern hardware. In its most accelerated form, the algorithm involves using a 32x32x32x32 texture for every light source. That 4D texture is used to map occlusion based on the endpoints of a clipped silhouette edge crossing the light. Look-ups into this texture are then projected onto the scene from each edge, at non-trivial cost. Further limitations are that the algorithm assumes that object silhouettes are constant when viewed from any point on an area light source and that it doesn't handle overlap between different occluders. Still, the results look great, and it's nice to finally see area light sources in real time. With luck future hardware will better support this kind of effect.
Finally, Pradeep Sen talked about Shadow Silhouette Maps. This one was really ingenious. The greatest compliment you can pay to any graphics algorithm is to say that it's a nifty trick, and shadow silhouette maps are a nifty trick. Sen augments normal depth-buffer shadows with silhouette maps, which store in each texel along the shadow boundary an intra-texel point along the silhouette, partitioning the texel into lit and unlit regions. With an additional lookup by the pixel shader applying the depth buffer shadow, this corrects the blocky texture projection artifacts common to depth buffers.
Unfortunately, both the generation of the silhouette map and the final render are prohibitively expensive right now, the former requiring a 60-instruction pixel shader and the latter a 46-instruction shader. As graphics hardware becomes complex enough to support conditionals in pixel shaders, shadow silhouette maps will no doubt become common, but for now they're still too much to afford.
Tuesday evening I went to the UNC Birds of a Feather meeting and saw the people who stayed on and got PhDs when I gave up academia for the seedy world of game development. The guys I ended up having the most to talk about with weren't ones I'd gone to school with at all. They were grads who'd preceded and followed me, but who'd ended up in the same line of work.
I spent Wednesday morning watching the Effects Omelette talks. By this point I was spending less time going to the talks that I felt like I was supposed to be in and enjoying myself more. The Omelette stuff wasn't anything game related, but was a fascinating look into the current state of visual effects in the movie business. I hadn't realized how adept the movie crowd was at integrating real and virtual scenes, and I hadn't realized how much was being done with virtual actors. The effects guys are doing a lot of capturing real light with mirrored spheres, 3D scanning real actors, and compositing the results. In the afternoon I went to The Matrix Revealed. I'll be honest, I was underwhelmed by the completely CG scenes in The Matrix Reloaded. They seemed a little flat in the theater, and I never had any trouble telling real Keanu from virtual Keanu.
But in their talk, they showed the first virtual actors I've ever seen who were completely indistinguishable from the real thing, placed in a real scene, lit with real light, animated with real facial expressions. I'm still baffled as to why their demos looked so stunningly real while the scenes in the movie looked so recognizably CG. I think part of it may be a matter of resolution -- we were watching a projected TV image in the talk, which comes nowhere close to film resolution -- and part of it may be a matter of animation. The motion captured actors may have been spot on, but I just don't think Keanu's trench coat swung like real cloth.
Back in talks applicable to game development, I watched the LOD tech sketches. J. Andreas Baerentzen combined CG and pointillism with a new LOD scheme that approximated an object with a random subset of a point cloud representation of itself. Kind of cool. Louis Borgeat showed A Fast Hybrid Geomorphing LOD Scheme, which was another of those academic LOD solutions that only seems really applicable for models that start out drastically overtessellated. Jon Cohen showed off GLOD, an OpenGL-based interface for level of detail. And Maryann Simmons from SGI talked about managing shader LOD for smooth transitions, which would have been more interesting to me if it hadn't been so tied to the use of SGI hardware.
I caught the "and Simplification" part of the Modeling and Simplification session. Andrew Wilson (who has the same last name as me and the same advisor at UNC, but no other relation) presented Simplifying Complex Environments, which is sort of the final evolution of Quicktime VR. The talk was all about choosing best points in a scene from which to capture new environment boxes for optimal coverage, and it was all very well reasoned and completely thought out, but lacked that intuitive leap that would make it a nifty trick.
Xavier Decoret of MIT presented Billboard Clouds for Extreme Model Simplification. Basically, he partitions a model based on how well parts of it can be fit to planes. He renders each subset to a texture, renders the normals of each subset to a corresponding normal map, and then renders billboards with those textures/normal maps. This all achieves drastic polygon reduction pretty robustly and elegantly. Again, though, I'm skeptical about its applicability to game graphics. For us, models tend to start out more modestly tessellated, and the state change costs of switching textures and drawing several billboards are likely to exceed the cost of drawing the original model they're generated from. It's a cool idea, though.
By Thursday I was really starting to feel burnt out on tech, and I'd just discovered the Animation Theater the day before, so I spent a lot of the day catching up on all the animated shorts that I'd missed. I did see two cool tech sketches worth mention, though.
The first was a presentation by Niniane Wang of Microsoft on cloud rendering in Microsoft Flight Simulator: A Century of Flight. Download the video and have a look. The approach the Flight Simulator team took was to have artists sculpt cloud volumes in 3D Studio Max by making cloud-like shapes out of boxes. A custom MaxScript would then populate the boxes with randomly oriented textured sprites. Each sprite would be textured with one of 16 available cloud wisp textures.
To reduce rendering costs to a manageable level, an octagonal "ring" of impostors is placed some kilometers out, far enough away that artifacts from flattening clouds (other planes "popping" from one side to the other, etc.) aren't very noticeable. Pretty white-to-gray cloud shading is added by having artists specify different colors for different altitudes. These can change with time of day for sunset effects, stormy weather, and so forth. Colors are smoothly interpolated with altitude and with time. To animate cloud formation and dissipation, the whole cloud is faded in starting at the center, or faded out starting at the outer edge.
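The altitude-and-time color scheme is simple to sketch. This is my own guess at a minimal version, not the Flight Simulator code; the key altitudes and colors are made-up examples of the kind of tables the artists would author:

```python
import numpy as np

def shade_cloud(altitude, t, altitudes, colors_now, colors_next):
    """Interpolate a cloud color first over altitude, then over time of day.
    altitudes: sorted 1-D array of artist-keyed altitudes (meters).
    colors_now/colors_next: (n_keys, 3) RGB tables for two adjacent
    times of day; t in [0, 1] blends between them."""
    def by_altitude(table):
        return np.array([np.interp(altitude, altitudes, table[:, c])
                         for c in range(3)])
    return (1.0 - t) * by_altitude(colors_now) + t * by_altitude(colors_next)
```

Because both interpolations are smooth, the clouds never pop as the sun sets or the camera climbs, which matters far more visually than physical correctness here.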
The other interesting sketch was a presentation by David Eberle (not to be confused with David Eberly) on A Procedural Approach to Modeling Impact Damage. Eberle "shatters" models by generating a tetrahedralization of the model and assigning some strength to the edge between any adjacent pair of tetrahedra. When the object receives an impact, he finds all edges within some radius of the impact point, basically reduces their hit points by some amount, then walks the graph to identify disconnected sets of tetrahedra. He applies appropriate angular and linear velocities to the disconnected pieces and voila!, he can shatter objects on impact in real time. It's not physically accurate at all -- for that you'd have to run a finite element analysis offline -- but it's real-time, and it looks just fine. Who knows, maybe we'll smash up buildings this way in some future sequel to MechAssault?
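The damage-and-walk step is a nice little graph algorithm in its own right. This is a minimal sketch of the idea as I understood it, with a data layout of my own invention, not Eberle's:

```python
from collections import defaultdict, deque

def shatter(centroids, edges, strengths, impact_point, radius, damage):
    """Weaken inter-tetrahedron edges near the impact, drop any edge whose
    strength falls to zero, then flood-fill the surviving adjacency graph
    to collect the disconnected chunks.
    centroids: {tet_id: (x, y, z)} tetrahedron centers.
    edges: list of (tet_a, tet_b) adjacency pairs.
    strengths: {(tet_a, tet_b): remaining hit points}."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    surviving = defaultdict(set)
    for a, b in edges:
        mid = tuple((p + q) / 2 for p, q in zip(centroids[a], centroids[b]))
        if dist2(mid, impact_point) <= radius ** 2:
            strengths[(a, b)] -= damage          # edge takes damage
        if strengths[(a, b)] > 0:                # edge still holds
            surviving[a].add(b)
            surviving[b].add(a)

    # Breadth-first flood fill to find the disconnected pieces.
    pieces, seen = [], set()
    for start in centroids:
        if start in seen:
            continue
        queue, piece = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            piece.add(node)
            for nxt in surviving[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        pieces.append(piece)
    return pieces
```

Each returned piece then gets its own linear and angular velocity and goes off to the physics system as a new rigid body.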
So that was my SIGGRAPH. If you're ever in downtown San Diego, I recommend PJ's Cafe and Coffee Co., where you can get a decent breakfast, a hot drink, and a comfortable chair within a couple of blocks of the convention center. If you're ever at SIGGRAPH, wear comfortable shoes, and carry water and snacks.
I'm Kyle Wilson. I've worked in the game industry since I got out of grad school in 1997. Any opinions expressed herein are in no way representative of those of my employers.