DannyD2 Thanks! That description is very helpful. All my displays are 10-bit, so I can only guess how the test image behaves at a lower bit depth. Also, the banding divisions get twice as hard to see with each added bit of depth, because the brightness step between adjacent bands also halves with each bit. At 9-bit and above I can just barely make out the edges between bands, and only at full brightness, with the image fully zoomed, scrolling left and right to check all of the details.
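(As a quick back-of-the-envelope check on that, the step between adjacent gray levels roughly halves with each added bit, which is why the 9-bit and deeper rows are so hard to judge by eye:)

```python
# Relative brightness step between adjacent gray levels at each bit depth.
# Each added bit roughly halves the step, so each extra bit makes the band
# edges in the test image about twice as hard to see.
for bits in range(8, 13):
    levels = 2 ** bits
    step = 1.0 / (levels - 1)   # step as a fraction of full-scale brightness
    print(f"{bits}-bit: {levels} levels, step = {step:.4%} of full scale")
```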
Your description makes me think Apple may first be spatially dithering down to the native panel depth + 1, and then temporally dithering that result down the remaining bit to match the native display. This makes some sense: the spatial dither could produce a lot of static grain, so targeting one bit deeper than the panel helps hide it, and the temporal dither then only needs to modulate by a single level per frame. In total, they hide the grain of the spatial dither used to convert from the 32-bit source down to 9-bit, and hide the flicker of the temporal dither by only dropping a single bit temporally, from 9-bit to 8-bit. It's just a possibility, but it would also work well with ProMotion, since levels would flip on nearly every second frame.
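To make the guess concrete, here is a rough Python sketch of that two-stage scheme. None of this is confirmed behavior; the random dither kernel, the checkerboard phase, and the 9-bit intermediate are all my assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def spatial_dither(img_float, out_bits):
    """Quantize a [0, 1] float image to out_bits using simple random spatial
    dither: add +/- half an LSB of noise, then round (just one plausible kernel)."""
    levels = 2 ** out_bits - 1
    noise = rng.random(img_float.shape) - 0.5
    return np.clip(np.round(img_float * levels + noise), 0, levels).astype(np.uint16)

def temporal_dither_1bit(img_9bit, frame_index):
    """Drop the last bit temporally: pixels with the low bit set alternate
    between the two neighboring 8-bit codes on even/odd frames."""
    base = img_9bit >> 1        # floor to 8-bit
    low_bit = img_9bit & 1      # the bit that has to be represented over time
    # Checkerboard the toggle in space so neighboring pixels flip on opposite
    # frames (less visible flicker) -- again, just one plausible scheme.
    h, w = img_9bit.shape
    phase = (np.indices((h, w)).sum(axis=0) + frame_index) & 1
    return np.clip(base + (low_bit & phase), 0, 255).astype(np.uint8)

# Example: a smooth horizontal gradient rendered at high precision.
grad = np.tile(np.linspace(0.0, 1.0, 1024), (64, 1))
nine_bit = spatial_dither(grad, 9)            # hide the grain at depth + 1
frame_a = temporal_dither_1bit(nine_bit, 0)   # what the 8-bit panel shows
frame_b = temporal_dither_1bit(nine_bit, 1)   # next frame
print("pixels flipping between frames:", np.count_nonzero(frame_a != frame_b))
```

Averaged over two frames, every pixel lands back on its 9-bit value, which is what would make the flicker depth only one 8-bit level.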
I’m thinking we could learn something if someone ran my test image through a capture card and did a frame-diff on the recording. I’d expect very little temporal noise in the diff video for the gradients at or below the panel’s native bit depth, then a sharp increase in noise for the levels above that. I’d also expect to see the noise move in wave-like patterns in those higher bit-depth levels.
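Something like this is what I have in mind for the frame-diff step, assuming a lossless capture read back with OpenCV; the filename and the eight-strip layout are placeholders that would need to match the actual test image:

```python
import cv2
import numpy as np

# Read a lossless capture of the test image and measure how much each pixel
# changes frame to frame. Temporal dithering should show up as non-zero diffs
# concentrated in the gradient strips that exceed the panel's native depth.
cap = cv2.VideoCapture("capture.mkv")   # hypothetical filename
ok, prev = cap.read()
assert ok, "could not read first frame"
diff_sum = np.zeros(prev.shape[:2], dtype=np.float64)
frames = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    diff_sum += np.abs(frame.astype(np.int16) - prev.astype(np.int16)).sum(axis=2)
    prev = frame
    frames += 1

mean_diff = diff_sum / max(frames, 1)
# Rough summary: average temporal activity per horizontal strip, assuming
# each bit-depth gradient occupies its own band of rows (8 is a placeholder).
strips = np.array_split(mean_diff, 8, axis=0)
for i, s in enumerate(strips):
    print(f"strip {i}: mean per-pixel frame diff = {s.mean():.4f}")
```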