
Free DX 12 Rift Engine Code

lamour42
Expert Protege
Hi,

If you want to write code for the Rift using DirectX 12, you might want to take
a look at the code I provided on GitHub: https://github.com/ClemensX/ShadedPath12.git

The sample engine is extremely limited in its drawing abilities: it can only draw lines!
But it may serve as a learning playground for DirectX 12 and Rift programming.

I find it fascinating how a bunch of simple lines suddenly becomes great when you wear the Rift and can walk around the lines and view them from any direction!

The current state of the code is a first step in porting my older DX 11 engine to DX 12.
Feel free to use any of the code in your own projects.

I want to express my gratitude to galopin, who came up with a detailed 8-step guide on how to combine
DirectX 12 with Oculus SDK rendering. See this thread: https://forums.oculus.com/viewtopic.php?f=20&t=25900
When I found out that calling the Oculus API ovr_CreateSwapTextureSetD3D11 on a
D3D11On12Device throws null-pointer exceptions, I would have given up had he not given this advice!
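
For anyone following the same route, here is a minimal hedged sketch of the D3D11On12 bridge that this workaround builds on. The names d3d12Device, commandQueue, d3d11Device, and d3d11Context are placeholders, and the swap-texture-set creation itself follows galopin's guide rather than anything shown here:

// Create the D3D12 device and command queue first, then layer an
// 11on12 device on top of them (using Microsoft::WRL::ComPtr).
ComPtr<ID3D11Device> d3d11Device;
ComPtr<ID3D11DeviceContext> d3d11Context;
IUnknown* queues[] = { commandQueue.Get() };   // the ID3D12CommandQueue
HRESULT hr = D3D11On12CreateDevice(
    d3d12Device.Get(),                 // the existing ID3D12Device
    D3D11_CREATE_DEVICE_BGRA_SUPPORT,  // device creation flags
    nullptr, 0,                        // default feature levels
    queues, 1,                         // share the D3D12 queue
    0,                                 // node mask
    &d3d11Device, &d3d11Context, nullptr);
// The ID3D11Device can then be handed to the Oculus SDK; the swap
// textures come back as D3D11 resources living on the 11on12 device.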

Some features of the code example:

  • Engine / Sample separation. Look at Sample1.cpp to see what you can currently do with this engine and see how it is done.

  • Oculus Rift support (head tracking and rendering). See vr.cpp

  • Post Effect Shader: Copy rendered frame to texture - Rift support is built on top of this feature

  • Use Threads to update GPU data. See LinesEffect::update()

  • Synchronize GPU and CPU via fences (see the sketch after this list)

  • Free float camera - use WASD or arrow keys to navigate. Or just walk/turn/duck if you wear the Rift
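
As a rough idea of what the fence synchronization looks like, here is a hedged sketch with placeholder names (device, commandQueue), not the engine's actual code:

// Signal a fence on the queue after submitting work, then block the
// CPU until the GPU has passed that point.
ComPtr<ID3D12Fence> fence;
device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
HANDLE fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);

UINT64 fenceValue = 1;
commandQueue->Signal(fence.Get(), fenceValue);     // GPU side
if (fence->GetCompletedValue() < fenceValue) {     // CPU side
    fence->SetEventOnCompletion(fenceValue, fenceEvent);
    WaitForSingleObject(fenceEvent, INFINITE);
}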


Any feedback welcome.
56 REPLIES

cybereality
Grand Champion
Wow!

galopin
Heroic Explorer
"cybereality" wrote:
Wow!


All this just means that the GPU is again likely to be the bottleneck of an application, and this is great. With a faster CPU side, sending two views instead of one is less of an issue, latency is also reduced, and the risk of failing to finish a frame in time is lower.

Even if GPUs are damn fast and offloading part of what was done on the CPU before can be a clear net win, it still means you take a little from your GPU for it, and if you are short on GPU, it can still be better to stay on the CPU side for things like culling and occlusion, or to use a hybrid approach.

But yes, it is possible to render an entire environment made of hundreds of different meshes with hundreds of materials in a single draw call. That is because you can make all the information you need (object positions, textures, material properties) visible to the GPU directly. You can have thousands of textures bound at the same time and pick whichever one you need based on a material id retrieved from a mesh instance's data, or whatever.
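
As a hedged illustration of that idea (all names are placeholders, and SM 5.1-style dynamic indexing is assumed): the CPU side binds one large SRV range, and the shader can then declare Texture2D textures[1024] : register(t0) and index it with a per-instance material id.

// One descriptor range covering many textures, exposed to the shader
// as a single table it can index dynamically.
D3D12_DESCRIPTOR_RANGE range = {};
range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV;
range.NumDescriptors = 1024;        // e.g. up to 1024 material textures
range.BaseShaderRegister = 0;       // t0 .. t1023
range.RegisterSpace = 0;
range.OffsetInDescriptorsFromTableStart = 0;

D3D12_ROOT_PARAMETER param = {};
param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
param.DescriptorTable.NumDescriptorRanges = 1;
param.DescriptorTable.pDescriptorRanges = &range;
param.ShaderVisibility = D3D12_SHADER_VISIBILITY_PIXEL;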

lamour42
Expert Protege
Changed the billboard vertex shader so that all one million billboards face the camera at all times. No change in FPS.
I think it's a nice Rift demo to fly through all the images and see them turn toward you.
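
For reference, the usual camera-facing billboard construction looks roughly like this. This is a CPU-side sketch in DirectXMath with placeholder names (camPos, billboardPos); the vertex shader would do the same math per billboard, and handedness conventions vary, so the cross-product order may need adjusting:

// Build an orthonormal basis whose 'look' axis points from the
// billboard to the camera; the quad is then expanded along right/up.
XMVECTOR look  = XMVector3Normalize(XMVectorSubtract(camPos, billboardPos));
XMVECTOR right = XMVector3Normalize(
    XMVector3Cross(XMVectorSet(0.f, 1.f, 0.f, 0.f), look));
XMVECTOR up    = XMVector3Cross(look, right);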

Also, I made a pre-release version on GitHub that includes all the textures needed to run the demo.

cybereality
Grand Champion
That's amazing!

How is framerate and on what machine?

This gives me some ideas. Maybe I will revive my engine project and update to DX12.

lamour42
Expert Protege
"cybereality" wrote:
That's amazing!

How is framerate and on what machine?

This gives me some ideas. Maybe I will revive my engine project and update to DX12.


My machine is this:

  • 6GB NVIDIA GeForce GTX 980 Ti

  • Intel Core i7-6700K

  • 16 GB RAM

  • Oculus Rift DK 2

In single window mode I get around 90 frames per second.

In VR mode with the DK2 I get a constant 75 frames per second. The framerate in VR stays at 75 up to around 2 million billboards; beyond that it slowly begins to drop. When I display 4 million billboards the framerate is at 45.

galopin
Heroic Explorer
FPS is the worst indicator of performance; you need to think in milliseconds, which is more natural for performance work, and to cut things up logically. A frame has different elements: some variable, like the scene geometry, and some with a fixed cost, like various blits and things like the Oculus Rift presentation warp.

With DX12, it is easier to measure this stuff. You can see bars in my screenshot in the previous post; they represent GPU timings in my application.

Here is how to do it:

1. Prevent the GPU from idling and keep timestamps consistent across command lists (they are always consistent within a single command list); also query the timestamp frequency. Optionally, get a pair of synchronized GPU/CPU values to obtain an accurate delta between them. (A C++ sketch of steps 1-5 follows at the end of this post.)
ID3D12Device::SetStablePowerState
ID3D12CommandQueue::GetTimestampFrequency
ID3D12CommandQueue::GetClockCalibration


2. Create a query heap of timestamps, plus a regular buffer to resolve them into so they can be used in a shader
ID3D12Device::CreateQueryHeap 


3. In your frame, generate timestamps. Create a class that keeps track of the hierarchy and remembers where the begin and end stamps are stored in the timestamp heap. Optionally, you can use BeginEvent(0, wstr, (wcslen(wstr)+1)*2) / EndEvent on the command list or queue; it will organize a VSGD or Nsight capture for debugging purposes.
ID3D12GraphicsCommandList::EndQuery


4. Copy the timestamps to the buffer
ID3D12GraphicsCommandList::ResolveQueryData


5. Optionally, copy the data back to a readback buffer for CPU access

6. Write a shader that reads the values and displays bars. Here is mine, free of charge: I use a per-instance vertex buffer (4 vertices per quad) containing begin, end, and depth plus a color; the final vertices are derived from the timestamps and the vertex id.

#define MyRS1 \
    "RootFlags( ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT | DENY_PIXEL_SHADER_ROOT_ACCESS ), " \
    "RootConstants(num32BitConstants=7, b0, visibility=SHADER_VISIBILITY_VERTEX)," \
    "SRV(t0)"

struct VSPS {
    float4 pos    : SV_Position;
    float4 color  : COLOR;
    float  ms     : MS;      // position in milliseconds, used for the stripe pattern
    uint   isMain : MAIN;    // 1 for the whole-frame bar (depth 0)
};

struct RootConstants
{
    float rtWidth;    // render target size in pixels
    float rtHeight;
    float barHeight;  // bar height in pixels
    uint  first;      // index of the frame-begin timestamp
    uint  last;       // index of the frame-end timestamp
    uint  freq;       // timestamp frequency (ticks per second)
    float fpsWidth;   // if > 0, fixed time window in seconds mapped to the full width
};

ConstantBuffer<RootConstants> root : register(b0);

// A 64-bit GPU timestamp split into two 32-bit words. Note: with
// little-endian timestamps, the first word ("high") actually holds the
// low half and "low" the high half; DiffStamp relies on that layout.
struct Stamp {
    uint high;
    uint low;
};
StructuredBuffer<Stamp> stamps : register(t0);

struct IA {
    uint3  location : BARPOS;   // x = begin index, y = end index, z = depth
    float4 color    : BARCOLOR;
    uint   subVert  : SV_VertexID;
    uint   instId   : SV_InstanceID;
};

// Difference between two timestamps in ticks, assuming at most one
// wrap of the low word between them.
uint DiffStamp(uint end, uint beg) {
    Stamp a = stamps[beg];
    Stamp b = stamps[end];
    if (a.low == b.low)
        return b.high - a.high;
    else
        return uint(0xffffffff) - a.high + b.high + 1u;
}

// Vertex shader entry point (compile with e.g. /T vs_5_1 /E VSMain).
[RootSignature(MyRS1)]
void VSMain(IA input, out VSPS output) {
    uint frameLen = DiffStamp(root.last, root.first);

    // Ticks-to-pixels scale: either a fixed time window or the whole frame.
    float width;
    if (root.fpsWidth > 0.f)
        width = root.rtWidth / (float(root.freq) / root.fpsWidth);
    else
        width = root.rtWidth / float(frameLen);
    float height = root.barHeight;

    float startx = float(DiffStamp(input.location.x, root.first));
    float endx   = float(DiffStamp(input.location.y, root.first));

    // Stack bars vertically by depth; expand the quad from the vertex id.
    float starty = (height + 2) * float(input.location.z + 1);
    float endy   = starty + height;
    float y = (input.subVert & 1) ? endy : starty;
    float x = input.subVert < 2 ? startx : endx;

    output.ms = x * 1000.f / float(root.freq);
    output.isMain = input.location.z == 0;
    x *= width;
    float2 pos = float2(x, y);

    // Pixels to clip space (origin top-left, y flipped).
    pos /= float2(root.rtWidth, root.rtHeight);
    pos.y = 1 - pos.y;
    pos *= 2;
    pos -= 1;

    output.pos = float4(pos, 0, 1);
    output.color = input.color;
}

// Pixel shader entry point (compiled separately, e.g. /T ps_5_1 /E PSMain).
// Darkens every other millisecond of the whole-frame bar to give a scale.
[RootSignature(MyRS1)]
float4 PSMain(in VSPS input) : SV_TARGET {
    float l = 1.f;
    if (input.isMain && (int(trunc(input.ms)) & 1))
        l = 0.2f;
    return float4(input.color.rgb * l, input.color.a);
}


Because the timestamps are stable, you can even collect the gap between before Present and after, if you want. I only display the current frame right now, but I could display the N past frames, or also display the same hierarchy from the CPU point of view and visualize the latency between a CPU operation and when it happens on the GPU.
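
For completeness, a minimal hedged sketch of the CPU side of steps 1-5 (placeholder names throughout, including MAX_STAMPS and stampBuffer; error handling omitted):

// 1. Stabilize clocks and query the timestamp frequency.
device->SetStablePowerState(TRUE);          // requires developer mode
UINT64 freq;
commandQueue->GetTimestampFrequency(&freq);

// 2. A heap of timestamp queries plus a buffer to resolve into.
D3D12_QUERY_HEAP_DESC qhd = {};
qhd.Type  = D3D12_QUERY_HEAP_TYPE_TIMESTAMP;
qhd.Count = MAX_STAMPS;
ComPtr<ID3D12QueryHeap> queryHeap;
device->CreateQueryHeap(&qhd, IID_PPV_ARGS(&queryHeap));

// 3. Surround the work you want to measure with timestamp queries.
commandList->EndQuery(queryHeap.Get(), D3D12_QUERY_TYPE_TIMESTAMP, beginIndex);
// ... record the draw calls to measure ...
commandList->EndQuery(queryHeap.Get(), D3D12_QUERY_TYPE_TIMESTAMP, endIndex);

// 4./5. Resolve the stamps into a buffer the shader (or CPU) can read.
commandList->ResolveQueryData(queryHeap.Get(), D3D12_QUERY_TYPE_TIMESTAMP,
                              0, MAX_STAMPS, stampBuffer.Get(), 0);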

lamour42
Expert Protege
Hi galopin,

I would disagree. Looking at milliseconds and at exactly how long each call took is of course very important for the developer, but for the end user FPS is the ultimate test. Anything below 75 FPS (for the DK2) just feels very bad, much worse than a missing object or some other glitch in the rendered world.

I like your innovative approach of displaying performance data inside the engine. I have also done my share of diagnostic tools that display something directly in VR. For exact timings, however, I would recommend just using the capabilities of Visual Studio 2015. There is not much about performance measurement that they don't provide, right down to how many nanoseconds each and every GPU call took. The CPU measurements can also help you find bottlenecks in your code very quickly. Simple things like iterating over a std::vector without using references are revealed very easily.
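
To illustrate that last point, a small hedged sketch (hypothetical data, not from the engine): iterating by value copies every element, which a CPU profile makes obvious, while iterating by const reference does not.

#include <string>
#include <vector>

int main() {
    std::vector<std::string> names(1000000, "billboard");

    size_t total = 0;
    for (auto name : names)          // copies each string on every iteration
        total += name.size();

    for (const auto& name : names)   // no copies; this is what you want
        total += name.size();

    return total > 0 ? 0 : 1;
}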

galopin
Heroic Explorer
I am not talking about the end user but the dev team. And no, Visual Studio is not enough, and no, the performance burden does not fall only on programmers with Visual Studio.

Without in-game perf tools, how can an artist or designer understand the impact of things and find what is important to optimise first? They can't. How big is the shadow map budget, how heavy is the deferred lighting here, why does my fps drop by 10 when I do that... oh, it is because a new postprocess went nuts…

The fps is a function of everything inside the frame, and you can never have enough perf tools (good ones, I mean) to analyse things. The right tool for the right problem. And interactive realtime feedback is a must for GPU frames.

glaze
Honored Guest
"cybereality" wrote:
That's amazing!

Maybe I will revive my engine project and update to DX12.


I liked your engine blog posts.

Nikolas32
Honored Guest
Cool