Z pre-pass
In a z pre-pass, the first pass renders the scene to the depth buffer only, which captures the front-most depth at every pixel. The second pass then tests against this depth layer, so everything behind it is culled and a lot of drawing work is skipped.
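The two passes can be sketched as a tiny software simulation. This is only an illustration under assumptions: the names Fragment and render_with_z_prepass are hypothetical, a "fragment" is reduced to a pixel index, a depth and a color string, and the second pass uses an equality test against the stored depth. A real renderer would do this on the GPU with a depth-only pass followed by a shading pass.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    pixel: int    # index of the pixel this fragment covers
    depth: float  # smaller means closer to the camera
    color: str    # stands in for the result of an expensive pixel shader


def render_with_z_prepass(fragments, pixel_count):
    """Return (framebuffer, number of fragments that were shaded)."""
    # Pass 1: depth only. Nothing is shaded; keep the nearest depth per pixel.
    depth_buffer = [float("inf")] * pixel_count
    for f in fragments:
        if f.depth < depth_buffer[f.pixel]:
            depth_buffer[f.pixel] = f.depth

    # Pass 2: shade only the fragments that match the stored front layer.
    framebuffer = [None] * pixel_count
    shaded = 0
    for f in fragments:
        if f.depth == depth_buffer[f.pixel]:
            framebuffer[f.pixel] = f.color
            shaded += 1
    return framebuffer, shaded
```

Even if the fragments arrive back to front, each pixel is shaded only once in the second pass, because the depth layer from the first pass already knows which fragment is in front.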
This technique works well when rendering transparent objects: because of the depth culling, the unordered internal structure of a transparent object does not show through.
In practice, however, the performance benefit of a z pre-pass does not look very promising.
http://casual-effects.blogspot.hk/2013/08/z-prepass-considered-irrelevant.html
http://www.gamedev.net/topic/641257-depth-pre-pass-worth-it/
The authors of these two posts tested performance with and without a z pre-pass. Their conclusion is that it brought no performance improvement.
The overshading cost saved in the second pass is offset by the cost of transformation, tessellation and rasterizer setup in the first pass.
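The trade-off can be written down as a rough cost model. This is a sketch only: frame_cost and all of its parameters are hypothetical, the units are arbitrary, and it assumes that the depth-only pass costs as much geometry work (transformation, tessellation, rasterizer setup) as the shading pass and that the pre-pass removes all overshading.

```python
def frame_cost(geometry_cost, shade_cost, visible_fragments, overdraw, use_prepass):
    """Rough per-frame cost in arbitrary units (hypothetical model).

    geometry_cost: transformation, tessellation and rasterizer setup for one pass
    shade_cost: pixel shader cost of a single fragment
    visible_fragments: number of pixels covered by the scene
    overdraw: average number of times a pixel is shaded without a pre-pass
    """
    if use_prepass:
        # Geometry is processed twice, but every visible pixel is shaded once.
        return 2 * geometry_cost + shade_cost * visible_fragments
    # Geometry is processed once, but pixels are shaded `overdraw` times.
    return geometry_cost + shade_cost * visible_fragments * overdraw
```

In this model the pre-pass only wins when shade_cost * visible_fragments * (overdraw - 1) exceeds geometry_cost. When overdraw is already close to 1, the second geometry pass is pure overhead.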
I think this may be the real reason AC2 dropped the z pre-pass, although dropping it leads to rendering-order issues for transparent objects.
Front-to-back
Render opaque objects from front to back, so that occluded objects fail the depth test against the surfaces already drawn in front of them and are culled. This improves efficiency.
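A small simulation shows the effect of draw order on the amount of shading. It is a sketch under assumptions: shaded_fragments is a hypothetical helper, each object is a (depth, covered pixels) pair with a single depth for the whole object, and a fragment counts as shaded whenever it passes a less-than depth test.

```python
def shaded_fragments(objects, pixel_count):
    """Count fragments that pass the depth test when objects are drawn in order."""
    depth_buffer = [float("inf")] * pixel_count
    shaded = 0
    for depth, pixels in objects:
        for p in pixels:
            if depth < depth_buffer[p]:
                depth_buffer[p] = depth
                shaded += 1  # this fragment would run the pixel shader
    return shaded


# Three objects covering the same four pixels, listed back to front.
back_to_front = [(0.9, range(4)), (0.5, range(4)), (0.1, range(4))]
# The same objects sorted by depth, nearest first.
front_to_back = sorted(back_to_front, key=lambda obj: obj[0])
```

Drawn back to front, every object passes the depth test and is shaded. Drawn front to back, only the nearest object is shaded and the two behind it are rejected.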
Pack multiple batches together
Submit draw calls that share the same render state together as one large batch, instead of as many small batches.
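As a minimal sketch of the idea, the draws can be grouped by render state before submission. The function pack_batches and the (render_state, mesh) tuple layout are assumptions made for illustration, not an API from any engine or from the Nvidia document.

```python
from collections import defaultdict


def pack_batches(draws):
    """Group (render_state, mesh) pairs so each render state is submitted once."""
    batches = defaultdict(list)
    for state, mesh in draws:
        batches[state].append(mesh)
    return dict(batches)


# Five small draws that alternate between two render states.
draws = [("grass", "g0"), ("rock", "r0"), ("grass", "g1"), ("rock", "r1"), ("grass", "g2")]
packed = pack_batches(draws)
```

Here five separate submissions collapse into two batches, one per render state.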
https://www.nvidia.com/docs/IO/8228/BatchBatchBatch.pdf
(This document from Nvidia explains batching in detail. Batching is a CPU bottleneck, not a GPU one. It showed me many new ideas, some of which conflict with what I believed before. I am now less confident in my own views about packing batches.)
However, front-to-back ordering conflicts with packing batches of the same render state. For example, suppose we want to render grass that is scattered all over the scene. If we render strictly from front to back, the render state is switched repeatedly and the batches cannot be merged.
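The conflict can be made concrete by counting render state switches for the two orderings. This is an illustrative sketch: count_state_switches and the (render_state, depth) tuples are hypothetical, and the scene is a made-up example in which grass and rocks alternate in depth.

```python
def count_state_switches(draws):
    """Count how often a draw needs a different render state than the previous one."""
    switches = 0
    previous = None
    for state, _depth in draws:
        if state != previous:
            switches += 1  # the first draw also counts, since it sets the state
            previous = state
    return switches


# Grass and rocks interleaved in depth across the scene.
scene = [("grass", 1.0), ("rock", 2.0), ("grass", 3.0),
         ("rock", 4.0), ("grass", 5.0), ("rock", 6.0)]

strict_front_to_back = sorted(scene, key=lambda d: d[1])  # order by depth
packed_by_state = sorted(scene, key=lambda d: d[0])       # order by render state
```

With strict depth ordering every draw changes the render state, while ordering by state needs only one change per state but gives up the global front-to-back order.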
In response to this problem, Zhangxiaoyu and Chenzhe suggested that if you do a z pre-pass, you no longer need front-to-back ordering, so you can pack batches.
We all agreed. But after reading the two articles above, I became aware of the following issues:
1. In the first pass of the z pre-pass, rendering front to back still improves efficiency. This was neglected during the discussion. Z pre-pass and front-to-back are not mutually exclusive.
2. The discussion ignored the cost of the first pass of the z pre-pass, from the vertex shader through rasterization. Although no pixel shader runs, getting as far as rasterization still costs a lot, according to the two tests above.
In summary, the combination of z pre-pass and packed batches does not look promising. I will test it myself once my deferred and forward demos are up and running, to gain further insight.
Quoted from Morgan McGuire (the author of G3D):
In other words, the z-prepass may be irrelevant in modern rendering systems that submit many draw calls for well-sorted objects,
and is potentially harmful as tessellation (and thus rasterizer setup) and skinning workloads increase.