crocus misrendering of the week

 I've been chasing a crocus misrendering bug show in a qt trace.

The bottom image is crocus vs 965 on top. This only happened on Gen4->5, so Ironlake and GM45 were my test machines. I burned a lot of time trying to work this out. I trimmed the traces down, dumped a stupendous amount of batchbuffers, turned off UBO push constants, dump all the index and vertex buffers, tried some RGBx changes, but nothing was rushing to hit me, except that the vertex shaders produced were different.

However they were different for many reasons, due to the optimization pipelines the mesa state tracker runs vs the 965 driver. Inputs and UBO loads were in different places so there was a lot of noise in the shaders.

I ported the trace to a piglit GL application so I could easier hack on the shaders and GL, with that I trimmed it down even further (even if I did burn some time on a misplace */+ typo).

Using the ported app, I removed all uniform buffer loads and then split the vertex shader in half (it was quite large, but had two chunks). I finally then could spot the difference in the NIR shaders.

What stood out was the 965 shader had an if which the crocus shader has converted to a bcsel. This is part of peephole optimization and the mesa/st calls it, and sure enough removing that call fixed the rendering, but why? it is a valid optimization.

In a parallel thread on another part of the planet, Ian Romanick filed a MR to mesa fixing a bug in the gen4/5 fs backend with conditional selects. This was something he noticed while debugging elsewhere. However his fix was for the fragment shader backend, and my bug was in the vec4 vertex shader backend. I tracked down where the same changes were needed in the vec4 backend and tested a fix on top of his branch, and the misrendering disappeared.

It's a strange coincidence we both started hitting the same bug in different backends in the same week via different tests, but he's definitely saved me a lot of pain in working this out! Hopefully we can combine them and get it merged this week.

Also thanks to Angelo on the initial MR for testing crocus with some real workloads.


Popular posts from this blog

h264 video decoding: i-frames strike back

video decode: crossing the streams