| This provides some background on the design of the generated headers. We |
| started out trying to generate bit fields, but that evolved into the pack |
| functions because of a few limitations: |
| |
| 1) Bit fields still generate terrible code today. Even with modern |
| optimizing compilers you get multiple load+mask+store operations |
| to the same dword in memory as you set individual bits. The |
| compiler also has to generate code to mask out overflowing values |
| (for example, if you assign 200 to a 2 bit field). Our driver |
| never writes overflowing values so that's not needed. On the |
| other hand, most compilers recognize that the template struct we |
| use is a temporary variable and copy propagate the individual |
| fields and do amazing constant folding. You should take a look |
| at the code that gets generated when you compile in release mode |
| with optimizations. |
| |
| 2) For some types we need to have overlapping bit fields. For |
| example, some values are 64 byte aligned 32 bit offsets. The |
| lower 6 bits of the offset are always zero, so the hw packs a |
| few misc bits into those low bits. Other times a field can |
| be either a u32 or a float. I tried to do this with overlapping |
| anonymous unions and it became a big mess. Also, when using |
| initializers, you can only initialize one union member so this |
| just doesn't work with our approach. |
| |
| The pack functions, on the other hand, allow us a great deal of |
| flexibility in how we combine things. In the case of overlapping |
| fields (the u32 and float case), if we only set one of them in |
| the pack function, the compiler will recognize that the other is |
| initialized to 0 and optimize out the code to OR it in. |
| |
| 3) Bit fields (and certainly overlapping anonymous unions of bit |
| fields) aren't generally stable across compilers in how they're |
| laid out and aligned. Our pack functions let us control exactly |
| how things get packed, using only simple and unambiguous bitwise |
| shifting and or'ing that works on any compiler. |
| |
| Once we have the pack function, it allows us to hook in various |
| transformations and validation as we go from template struct to dwords |
| in memory: |
| |
| 1) Validation: As I said above, our driver isn't supposed to write |
| overflowing values to the fields, but we've of course had lots of |
| cases where we make mistakes and write overflowing values. With |
| the pack function, we can actually assert on that and catch it at |
| runtime. Bit fields would just silently truncate. |
| |
| 2) Type conversions: sometimes it's just a matter of writing a |
| float to a u32, but we also convert from bool to bits and from |
| floats to fixed point integers. |
| |
| 3) Relocations: whenever we have a pointer from one buffer to |
| another (for example a pointer from the metadata for a texture |
| to the raw texture data), we have to tell the kernel about it so |
| it can adjust the pointer to point to the final location. That |
| means we have to do extra work to record and annotate |
| the dword location that holds the pointer. With bit fields, we'd |
| have to call a function to do this, but with the pack function we |
| generate code in the pack function to do this for us. That's a |
| lot less error prone and less work. |
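
To make the points above concrete, here is a minimal sketch of what a template struct and its pack function could look like. Everything in it is an assumption made for illustration: the struct example_state, its field layout, the helpers pack_uint, pack_float, pack_sfixed and pack_offset, and the reloc_list type are hypothetical names and are not taken from the generated headers. It shows overlapping fields OR'ed into one dword, overflow asserts, a float to fixed point conversion, and a relocation recorded as a side effect of packing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical template struct: one plain member per hardware field. */
struct example_state {
   bool     enable;        /* dword 0, bit 0 */
   uint32_t mode;          /* dword 0, bits 2:1 */
   float    lod_bias;      /* dword 0, bits 15:8, signed 4.4 fixed point */
   uint32_t clear_u32;     /* dword 1, overlaps clear_f32 */
   float    clear_f32;     /* dword 1, overlaps clear_u32 */
   uint32_t flags;         /* dword 2, bits 5:0, shares the dword with the offset */
   uint32_t buffer_offset; /* dword 2, bits 31:6, 64 byte aligned */
};

/* Hypothetical relocation list: remembers which dwords hold pointers. */
struct reloc_list {
   unsigned count;
   unsigned dword_index[8];
};

/* Place v in bits [end:start], asserting that it fits instead of truncating. */
static inline uint32_t
pack_uint(uint32_t v, unsigned start, unsigned end)
{
   const unsigned width = end - start + 1;
   if (width < 32)
      assert(v <= (UINT32_C(1) << width) - 1);
   return v << start;
}

/* Reinterpret the bits of a float as a u32. */
static inline uint32_t
pack_float(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}

/* Convert a float to signed fixed point with frac_bits fractional bits.
 * Assumes the field is narrower than 32 bits. */
static inline uint32_t
pack_sfixed(float f, unsigned start, unsigned end, unsigned frac_bits)
{
   const unsigned width = end - start + 1;
   const int32_t max = (INT32_C(1) << (width - 1)) - 1;
   const int32_t min = -max - 1;
   const float scaled = f * (float)(UINT32_C(1) << frac_bits);
   /* Round to nearest, halfway cases away from zero. */
   const int32_t fixed = (int32_t)(scaled + (scaled < 0.0f ? -0.5f : 0.5f));
   assert(fixed >= min && fixed <= max);
   return ((uint32_t)fixed & ((UINT32_C(1) << width) - 1)) << start;
}

/* Check alignment and record that this dword holds a pointer. */
static inline uint32_t
pack_offset(struct reloc_list *relocs, unsigned dword,
            uint32_t offset, uint32_t low_bits_mask)
{
   assert((offset & low_bits_mask) == 0);
   assert(relocs->count < 8);
   relocs->dword_index[relocs->count++] = dword;
   return offset;
}

static inline void
example_state_pack(struct reloc_list *relocs, uint32_t *dw,
                   const struct example_state *v)
{
   dw[0] = pack_uint(v->enable, 0, 0) |
           pack_uint(v->mode, 1, 2) |
           pack_sfixed(v->lod_bias, 8, 15, 4);

   /* Overlapping fields: the caller sets one and leaves the other 0. */
   dw[1] = v->clear_u32 | pack_float(v->clear_f32);

   /* Misc bits share the dword with an aligned offset. */
   dw[2] = pack_uint(v->flags, 0, 5) |
           pack_offset(relocs, 2, v->buffer_offset, 0x3f);
}
```

A caller would fill the struct with a designated initializer, leaving unused members at zero, and then call example_state_pack() once. In this sketch, setting mode to 4 trips the assert in pack_uint() rather than being silently truncated the way a 2 bit field would.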