
Compute shader stops working in build when there are too many instructions compiled

evansolanis
Level 2

Hi friends!

I have a couple of compute shaders in my Unity application. Everything works well in the Unity editor over Oculus Link, but when I build and run on my headset, one of the compute shaders reliably fails. After toying around with the code a bit, I noticed that the failure seems related to the number of compiled lines of code in the shader.
The original kernel (that doesn't work in the build) was: 

    float nx = states[id.xy].x;
    float ny = states[id.xy].y;
    float nz = states[id.xy].z;
    if (prevStates[id.xy].x >= 1) {
        nx = 0;
    }
    if (prevStates[id.xy].y >= 1) {
        ny = 0;
    }
    if (prevStates[id.xy].z >= 1) {
        nz = 0;
    }
    states[id.xy] = float4(nx, ny, nz, 1);

 
which compiles to

Compiled code for kernel SetToZero:
#version 310 es
#extension GL_EXT_texture_buffer : require

layout(binding=0, rgba32f) highp uniform image2D states;
readonly layout(binding=1, rgba32f) highp uniform image2D prevStates;
ivec4 u_xlati0;
vec3 u_xlat1;
bvec3 u_xlatb1;
layout(local_size_x = 8, local_size_y = 4, local_size_z = 1) in;
void main()
{
    u_xlati0.xyz = floatBitsToInt(imageLoad(states, ivec2(gl_GlobalInvocationID.xy)).xyz);
    u_xlat1.xyz = imageLoad(prevStates, ivec2(gl_GlobalInvocationID.xy)).xyz;
    u_xlatb1.xyz = greaterThanEqual(u_xlat1.xyzx, vec4(1.0, 1.0, 1.0, 0.0)).xyz;
    {
        ivec4 hlslcc_movcTemp = u_xlati0;
        hlslcc_movcTemp.x = (u_xlatb1.x) ? int(0) : u_xlati0.x;
        hlslcc_movcTemp.y = (u_xlatb1.y) ? int(0) : u_xlati0.y;
        hlslcc_movcTemp.z = (u_xlatb1.z) ? int(0) : u_xlati0.z;
        u_xlati0 = hlslcc_movcTemp;
    }
    u_xlati0.w = 1065353216;
    imageStore(states, ivec2(gl_GlobalInvocationID.xy), intBitsToFloat(u_xlati0));
    return;
}

 

The modified kernel is

    if (prevStates[id.xy].x >= 1.0f) {
        states[id.xy] = float4(0, 0, 0, 1);
    }

which compiles to

Compiled code for kernel SetToZero:
#version 310 es
#extension GL_EXT_texture_buffer : require

writeonly layout(binding=0, rgba32f) highp uniform image2D states;
readonly layout(binding=1, r32f) highp uniform image2D prevStates;
float u_xlat0;
bool u_xlatb0;
layout(local_size_x = 8, local_size_y = 4, local_size_z = 1) in;
void main()
{
    u_xlat0 = imageLoad(prevStates, ivec2(gl_GlobalInvocationID.xy)).x;
    u_xlatb0 = u_xlat0>=1.0;
    if(u_xlatb0){
        imageStore(states, ivec2(gl_GlobalInvocationID.xy), vec4(0.0, 0.0, 0.0, 1.0));
    }
    return;
}

 

These should be functionally equivalent (for the x component), but only the second one runs in the build. My first thought is that there must be some limit on the number of compiled instructions. Is this true? If so, what is that limit? The scene I'm working with has two compute shaders (5 kernels total), a fairly simple surface shader, and a fairly complicated skybox shader.

Other things I've tried:
- using smaller input/output textures
- modifying the working kernel to preserve the y and z coordinates (this also fails)
- banging my head on my desk (also did not work)
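
One difference between the two compiled listings that is unrelated to instruction count: in the first, `states` is declared as an `rgba32f` image with neither `readonly` nor `writeonly`, because the kernel both reads and writes it, while in the second it is `writeonly`. As far as the GLSL ES 3.10 specification goes, images accessed for both reading and writing are only required to work with the `r32f`, `r32i` and `r32ui` formats, so this may be what differs on the headset. A minimal sketch for testing that hypothesis follows. It assumes a hypothetical extra texture `statesIn` (not in the original shader) holding a copy of the current state, and the resource declarations are guesses, since the original post does not show them.

```hlsl
// Sketch only: a variant of SetToZero that never reads from the texture it writes.
// "statesIn" is a hypothetical name; it would need to be bound from C# with a
// copy of the current state. The declarations below are assumptions.
#pragma kernel SetToZero

Texture2D<float4>   statesIn;    // assumption: read-only copy of the current state
Texture2D<float4>   prevStates;  // previous state, only read
RWTexture2D<float4> states;      // output, only written in this kernel

// Thread group size taken from local_size_x/y/z in the compiled listings.
[numthreads(8, 4, 1)]
void SetToZero(uint3 id : SV_DispatchThreadID)
{
    float3 current  = statesIn[id.xy].xyz;
    float3 previous = prevStates[id.xy].xyz;

    // Same per-channel logic as the original kernel:
    // zero a channel if its previous value reached 1, otherwise keep it.
    if (previous.x >= 1) { current.x = 0; }
    if (previous.y >= 1) { current.y = 0; }
    if (previous.z >= 1) { current.z = 0; }

    states[id.xy] = float4(current, 1);
}
```

If this variant runs in the build, the failure is more likely tied to read-write access on the `rgba32f` texture than to shader size. It would also be consistent with the second item in the list above, since preserving y and z requires reading `states` again.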

Running on an Oculus Quest 2. Hope this question makes sense - and honestly hope I'm wrong about the shader size limitation 😛
Let me know if I can provide any more useful info, thanks ❤️
