Shader Permutations

Over the past few years it’s been a lot of fun to watch programmable GPUs develop. The fixed-function pipeline that was so important to DX7 and earlier hardware definitely got the job done, but it also limited what we could accomplish when rendering. Shader Model 1.0 capable cards came out, and they were basically glorified fixed-function parts with too few instructions to really be valuable. Shader Model 2.0 cards really changed the equation and unlocked many of the effects that people take for granted today. As Uncle Ben says, with great power comes great responsibility. When it comes to shaders, that responsibility comes in a couple of forms:

  • You own vertex processing, pixel (fragment) processing, and primitive generation when you render the object. You’ve got to transform the vertex with the view-projection matrix. You have to perform skinning. You have to apply texture transforms. You have to implement the lighting equation. You can do anything you want, but you can’t lean on any existing processing pipeline.
  • Limitations in what a given shader program can accomplish. Current shader models allow for branching in the pixel and vertex shader, but to achieve high performance, you want to be careful about when and how these branches execute.

These issues have led to a variety of schemes for building up and authoring new shaders:

  • Re-use existing code. All the shader languages support including other files at this point. Break your shader code into a series of utility functions that get the job done, and write a main function for each programmable stage that calls those functions in the right order. There’s nothing wrong with this approach. Frankly, it’s how most of us write code anyway. You are stuck writing out the correct sequence of function calls and arguments by hand, but if you can limit your artists to a few workable permutations, you’re probably okay.
  • Use preprocessor defines. This is similar to the prior option, but it lets you write one mammoth uber-shader that has everything you want and then remove functionality based on which preprocessor symbols are set. This has an advantage over the prior form in that you can create a pretty complex effect and the permutations fall out of disabling the snippets of code that you don’t care about. Someone once called this “subtractive” shader authoring and I like the term. There are a few things that are harder to do in this approach. For one, the code is harder to read, which makes maintenance more difficult. When the shaders get complex, it is harder to know how the various preprocessor symbols will interact. Program inputs and outputs can be a little awkward as well, as some shader languages require that semantics be packed sequentially (i.e. POSITION0, POSITION1, POSITION2), and holes can lead to compiler crashes or incorrect shader code.
  • Shader stitchers or “additive” systems. This is my favorite option and appears to be in vogue. This is essentially what the shade-tree systems do. You have a series of “functions” or “operators” with inputs and outputs. You wire these inputs and outputs together, either via artist direction in a visual shader editor or programmatically. A graph traversal then gets you essentially a “main” function that calls out to the functional nodes to process data. I wrote a programmatic system for Gamebryo called NiStandardMaterial.
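To make the “additive” idea a bit more concrete, here is a minimal sketch of a stitcher in Python. It is not how NiStandardMaterial or any particular engine does it; the `Node` class, the snippet names (`SampleBaseMap`, `ComputeLighting`, `Modulate`), and the `PSInput` struct are all hypothetical, and every value is assumed to be a `float4` to keep things short. Each node names the function it calls and the values it consumes, and a depth-first traversal from the output node emits the calls in dependency order.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str                                    # variable this node writes
    func: str                                    # hypothetical snippet function it calls
    inputs: list = field(default_factory=list)   # upstream Nodes, or strings for raw shader inputs


def emit_main(output):
    """Walk the graph depth-first from the output node and emit a 'main' body."""
    lines, visited = [], set()

    def visit(node):
        # Strings are raw shader inputs (e.g. "IN.uv0"); they need no call.
        if isinstance(node, str) or node.name in visited:
            return
        visited.add(node.name)
        for dep in node.inputs:
            visit(dep)  # emit everything this node depends on first
        args = ", ".join(d if isinstance(d, str) else d.name for d in node.inputs)
        lines.append(f"    float4 {node.name} = {node.func}({args});")

    visit(output)
    lines.append(f"    return {output.name};")
    return "float4 main(PSInput IN) : COLOR\n{\n" + "\n".join(lines) + "\n}"


# Wire up a tiny graph: base map modulated by lighting.
base = Node("baseColor", "SampleBaseMap", ["IN.uv0"])
light = Node("lighting", "ComputeLighting", ["IN.normal"])
final = Node("finalColor", "Modulate", [base, light])

source = emit_main(final)
print(source)
```

A real system would also need typed connections, cycle detection, and per-stage inputs and outputs, but the core of it is the same traversal: the permutation is whatever graph you wired up.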

No matter what path you take, you’ll still need to have a defined set of features that you intend to support. Here are some of the questions that may help contribute to shader permutation hell:

  • Is my object skinned or unskinned? Do I just pass the vertex data as-is (screen-space quads may want this)?
  • What lights are affecting my object? How many of each type of light are there (point, spot, directional)? Which lights have associated shadow buffers?
  • What lighting model do you use?
  • Does the object need specular lighting?
  • Do you do per-vertex or per-pixel lighting? Are you doing a mix? Are you doing any fancy effects like PRT?
  • What textures can the artist add/remove? What UV sets are they using? Are there any texture transforms per vertex? Are any of the textures using projections? Are any of the textures cube-maps? Do any of the maps share a UV set?
  • Is there a normal map? Is it tangent or object space? Is it encoded as DXN or a fancy DXT-based encoding? Do you have a normal, binormal, tangent frame?
  • Are there vertex colors? How are they applied?
  • Are there any hardware instancing parameters?
  • Is there any fogging?
  • Is there any alpha testing (DX10 hardware makes you write this yourself)?
  • Do I need to write data to any other buffers? Depth? Normals?

Once you’ve built this decision tree, you’re still far from done. You’ll need to have a good mechanism for previewing this in the art pipeline. You’ll need to determine what hardware models you support. SM 2.0 in all its flavors can be a real challenge to develop for due to low instruction counts, branching limitations, and limited registers. When it breaks, you can’t easily programmatically determine why and react. The compilers are also really good at rearranging and deleting dead code, so it is often hard to predict precisely how many instructions will be used. Solving this may involve going different routes for different machine configurations, which makes testing that much harder. You may choose to go multi-pass and draw with different shaders for each pass, but then you’ll need to deal carefully with alpha blending and fog effects.
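One common way to “go different routes” is a fallback chain: try the richest variant first and step down until something compiles. The sketch below assumes a `try_compile` callback that returns true or false; the variant names, the instruction costs, and the budget of 64 are invented stand-ins for a real compiler, which is the only thing that can actually tell you whether a shader fits.

```python
def pick_variant(candidates, try_compile):
    """Return the first candidate that try_compile accepts, or None if none do."""
    for variant in candidates:
        if try_compile(variant):
            return variant
    return None


# Stand-in for a real compiler: made-up instruction counts checked against a made-up budget.
FAKE_COST = {"per_pixel_3_lights": 120, "per_pixel_1_light": 70, "per_vertex": 40}


def fake_compile(variant, budget=64):
    return FAKE_COST[variant] <= budget


# Candidates are ordered from most to least expensive.
chosen = pick_variant(
    ["per_pixel_3_lights", "per_pixel_1_light", "per_vertex"],
    fake_compile,
)
print(chosen)  # per_vertex
```

The catch is the one described above: every rung of the ladder is another set of shaders to preview and test on that class of hardware.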

There’s a lot that I’d change if I were to go back and re-implement the shade-tree system in Gamebryo today, but a few years ago it managed to take on all the complexity of the old Gamebryo fixed-function pipeline, add a few new features on top, and run cross-platform on PC, 360, and PS3.


~ by shaunkime on June 25, 2008.
