08-10-2024, 05:17 PM
Hi!
Sorry for the delay. I've been fixing some issues I found in other areas, so I haven't checked the forums in a while.
Your suggestion is welcome, as always.
However, extracting the color for each pixel is the innermost part of the loop, where most of the CPU time is spent, so it must be as tight as possible. Adding all these extra per-pixel operations would cause a major performance hit. There's also a comparison in your solution. The Intel x86 architecture has the CMOV opcode (conditional move). The compiler can use it when translating the conditional assignment, and in some cases it performs better than a branch. On other architectures like ARM, the equivalent depends on the instruction set and the compiler, and when no conditional instruction is emitted, the conditional assignment is translated to a branch.
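To make the cost concrete, here is a minimal sketch of the two loop shapes being compared. It is not the actual engine code: the function names, the 8-bit index and 32-bit color types, and the fallback to index 0 are all assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inner loop: one table lookup per pixel, no checks.
   Only safe if every index in src is a valid palette entry. */
void blit_unchecked(uint32_t *dst, const uint8_t *src, size_t n,
                    const uint32_t *palette)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = palette[src[i]];
}

/* Same loop with a per-pixel range check. The conditional assignment
   may compile to a conditional move or to a branch, depending on the
   target and the compiler. Falling back to index 0 is an assumption. */
void blit_checked(uint32_t *dst, const uint8_t *src, size_t n,
                  const uint32_t *palette, size_t num_colors)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t index = src[i];
        dst[i] = palette[index < num_colors ? index : 0];
    }
}
```

The checked version adds a comparison and a select to every pixel, which is the overhead I want to avoid.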
Sanitizing the tileset data based on palette size won't work either, as you can reassign palettes on the fly. You can even modify tileset data on the fly, so assuming everything will remain static after the initial load is not realistic.
Probably the best option is to always allocate 256 colors for each palette, so the indirection can never fail. It wastes a bit of space, but so little compared to the memory available today that I think this is the best tradeoff: it protects against badly authored assets while keeping performance.
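A rough sketch of that idea, assuming 8-bit pixel indices and 32-bit colors. The Palette struct and the palette_create and palette_lookup names are hypothetical, not the real API.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PALETTE_SLOTS 256  /* every value an 8-bit index can take */

/* Hypothetical palette type: storage is always 256 entries,
   regardless of how many colors the asset defines. */
typedef struct {
    int num_colors;                  /* entries the asset actually defines */
    uint32_t colors[PALETTE_SLOTS];  /* unused entries stay zeroed */
} Palette;

/* Copies the given colors into a zero-filled 256-entry palette.
   Returns NULL on a bad count or allocation failure. */
Palette *palette_create(const uint32_t *colors, int num_colors)
{
    if (num_colors < 0 || num_colors > PALETTE_SLOTS)
        return NULL;
    Palette *p = calloc(1, sizeof *p);
    if (p == NULL)
        return NULL;
    p->num_colors = num_colors;
    if (num_colors > 0)
        memcpy(p->colors, colors, (size_t)num_colors * sizeof *colors);
    return p;
}

/* Any uint8_t index is in bounds, so no per-pixel check is needed. */
uint32_t palette_lookup(const Palette *p, uint8_t index)
{
    return p->colors[index];
}
```

Under these assumptions each palette holds 1 KiB of color data, and an out-of-range index in a badly authored tileset reads a zeroed entry instead of memory outside the palette.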