From my understanding of Queue Ahead, it yields control back to the CPU before the compositor has completed its work and sent the frame to the display. This can shave off a few ms (how much are we talking about here?) from the frame budget, and that may make all the difference between making it to v-sync or not.
However, one of the first things usually done in the frame's lifetime is likely to be grabbing the predicted eye pose. Naturally, the later the sensor is read, the better the prediction. So, in the case of a game loop that has room to spare (and is stable), wouldn't it be better to disable Queue Ahead? I think it should give slightly better head-tracking fidelity.
I haven't tested this (I will tonight) and I don't expect it to be consciously noticeable anyway. The thing I'm currently working on takes between 1.8 and 2.9 ms per eye at 2x pixel density on a GTX 660, so I have good hope that the frame budget will be met on reference hardware. Am I right to think I should disable Queue Ahead by default and maybe leave it as a quality/speed trade-off option for users?
Hmm, in hindsight it shouldn't matter much, because timewarp is going to remap the frame to a post-render pose anyway…
Letting the CPU enqueue sooner does not force a larger delta between the sensor read and image presentation.
Here is how you could proceed, for example:
At frame start, evaluate the HMD pose. Perform CPU culling with an extended margin, say a 5-10% larger vision cone starting a little behind the obtained position. Start rendering things that are view independent: shadow maps, GPU simulation. Record the view rendering in a deferred context (DX11 deferred contexts are a little lame, DX12 is far better for that, but it's just an example). Then, when you are ready to kick the frame for real, re-evaluate the sensors, update the constant buffers that contain the view/projection information, and execute the deferred command buffer you filled, now with up-to-date positioning.
With DX12, we could fence the command queues and insert a wait so the information is updated right before a command list is about to kick, and when Oculus one day releases a DX12 SDK, I hope they will have tools to update a GPU buffer directly without having to include a sync point with the CPU.
The sensor read for timewarp is also impacted by Queue Ahead, so I think there can still be value in disabling it when it isn't necessary. Disabling it reduced the latency values by almost 3 ms according to the perf HUD monitoring.