The Trees The Fork Maple Day7 - Shader Struct Alignment

Broke ground for glyph rendering

2023-04-01

Today I pushed more on glyph rendering. I don't have it working quite yet, but I made significant progress.

Textures on the GPU

I started the work by copying the quad renderer. From there I initialized a sampler and texture that I added to the bind_group. That texture is then taken as an argument to the fragment portion of the glyph shader and sampled using the quad positioning logic from the existing shader.

#[spirv(fragment)]
pub fn glyph_fragment(
    #[spirv(storage_buffer, descriptor_set = 0, binding = 0)] glyphs: &[InstancedGlyph],
    #[spirv(descriptor_set = 0, binding = 1)] atlas: &Image2d,
    #[spirv(descriptor_set = 0, binding = 2)] atlas_sampler: &Sampler,
    #[spirv(flat)] instance_index: i32,
    atlas_position: Vec2,
    out_color: &mut Vec4,
) {
    let glyph = glyphs[instance_index as usize];
    // Here we have to sample specifically the 0 LOD. I don't
    // fully understand why, but I think it has to do with how
    // the spirv is generated.
    // More details here: https://github.com/gfx-rs/wgpu-rs/issues/912
    let atlas_color = atlas.sample_by_lod(*atlas_sampler, atlas_position, 0.0);
    *out_color = atlas_color * glyph.color;
}

Most of this was smooth sailing other than a very strange error: Required uniformity of control flow for IMPLICIT_LEVEL in [20] is not fulfilled because of Expression([1]). I took this to Google and found https://github.com/gfx-rs/wgpu-rs/issues/912 which explained the issue pretty clearly. Turns out for whatever reason the bytecode generated by rust-gpu introduces control flow in the above fragment shader. For other reasons I don't fully understand, you also can't sample a texture with an implicit level of detail once the control flow is no longer uniform. So sample_by_lod is required instead of sample so that we can explicitly ask for level of detail 0.

Struct Alignment

Once that was figured out, I modified the InstancedGlyph struct to include glyph-specific information such as where in the atlas the glyph is located.

#[derive(Copy, Clone, Default)]
#[cfg_attr(not(target_arch = "spirv"), derive(bytemuck::Pod, bytemuck::Zeroable))]
#[repr(C)]
pub struct InstancedGlyph {
    pub top_left: Vec2,
    pub atlas_top_left: Vec2,
    pub atlas_size: Vec2,
    pub color: Vec4,
}

Oddly, when I added these extra fields, I started getting a strange error in the derived transmutation code: a function added by the derive was trying to transmute the struct between two differently sized types.

I still don't understand why this is related, but in my searching I learned about the strange way struct fields are aligned on the GPU. Turns out each field has to start at an offset that is a multiple of its alignment, which for vector types is their size rounded up to a power of two. So in the above struct, we can't access this data properly on the GPU because color is 16 bytes but would start at byte 24, which is not a multiple of 16.

So to address the alignment issue, an unused _padding field needs to be introduced so that color can start on byte 32 matching the expected alignment... or at least I think.

#[derive(Copy, Clone, Default)]
#[cfg_attr(not(target_arch = "spirv"), derive(bytemuck::Pod, bytemuck::Zeroable))]
#[repr(C)]
pub struct InstancedGlyph {
    pub top_left: Vec2,
    pub atlas_top_left: Vec2,
    pub atlas_size: Vec2,
    // Need a Vec2 of padding here so that the first 4 fields
    // are some multiple of 16 bytes in size.
    // Vec2s are 8 bytes, Vec4s are 16 bytes.
    pub _padding: Vec2,
    pub color: Vec4,
}
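
To see what the padding changes, here is a minimal sketch that measures both layouts on the CPU. The Vec2 and Vec4 types are hypothetical stand-ins rather than glam's real types, and the 16-byte alignment on Vec4 is an assumption that mirrors a SIMD-backed Vec4. Unpadded, Padded, and the helper functions are names made up for this example.

```rust
use std::mem::size_of;

// Hypothetical stand-ins for glam's Vec2 and Vec4 so the example has no
// dependencies. The align(16) on Vec4 is an assumption, not glam's source.
#[allow(dead_code)]
#[derive(Copy, Clone, Default)]
#[repr(C)]
struct Vec2 {
    x: f32,
    y: f32,
}

#[allow(dead_code)]
#[derive(Copy, Clone, Default)]
#[repr(C, align(16))]
struct Vec4 {
    x: f32,
    y: f32,
    z: f32,
    w: f32,
}

// Same field order as the first version of InstancedGlyph.
#[allow(dead_code)]
#[derive(Copy, Clone, Default)]
#[repr(C)]
struct Unpadded {
    top_left: Vec2,
    atlas_top_left: Vec2,
    atlas_size: Vec2,
    color: Vec4,
}

// Same field order as the version with the explicit padding field.
#[allow(dead_code)]
#[derive(Copy, Clone, Default)]
#[repr(C)]
struct Padded {
    top_left: Vec2,
    atlas_top_left: Vec2,
    atlas_size: Vec2,
    _padding: Vec2,
    color: Vec4,
}

/// Byte offset of `field` from the start of `base`.
fn offset_between<T, F>(base: &T, field: &F) -> usize {
    field as *const F as usize - base as *const T as usize
}

/// Returns (offset of color, size of struct, sum of field sizes).
fn unpadded_layout() -> (usize, usize, usize) {
    let glyph = Unpadded::default();
    let field_bytes = 3 * size_of::<Vec2>() + size_of::<Vec4>();
    (offset_between(&glyph, &glyph.color), size_of::<Unpadded>(), field_bytes)
}

/// Returns (offset of color, size of struct, sum of field sizes).
fn padded_layout() -> (usize, usize, usize) {
    let glyph = Padded::default();
    let field_bytes = 4 * size_of::<Vec2>() + size_of::<Vec4>();
    (offset_between(&glyph, &glyph.color), size_of::<Padded>(), field_bytes)
}

fn main() {
    println!("unpadded: {:?}", unpadded_layout()); // (32, 48, 40)
    println!("padded:   {:?}", padded_layout()); // (32, 48, 48)
}
```

Under these assumptions color lands on byte 32 either way. The difference is that the unpadded struct gets there with 8 hidden bytes the compiler inserts, so the struct is 48 bytes while its fields only add up to 40. That mismatch is a plausible source of the transmute error, since bytemuck's Pod derive rejects structs that contain padding bytes.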

This compiles just fine so I believe the issue is resolved. I may work on a proc macro for automatically inserting this padding so that we don't have to worry about it, but that may be a job for another day. Next up tech-wise is to add swash and shape individual glyphs into image binary data. Once that's in place I can then use etagere to keep track of atlas locations and write the data into the atlas itself.
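
The arithmetic a padding macro would have to do is small. Below is a rough sketch of it as plain functions rather than an actual proc macro. Field, padding_before, plan_padding, and instanced_glyph_fields are hypothetical names, and the sizes and alignments listed for the glyph fields are assumptions (8-byte Vec2 with 4-byte alignment, 16-byte Vec4 with 16-byte alignment).

```rust
/// Size and alignment of one struct field, in bytes.
#[derive(Copy, Clone, Debug, PartialEq)]
struct Field {
    name: &'static str,
    size: usize,
    align: usize,
}

/// Bytes needed before a field at `offset` so that it starts on a
/// multiple of `align`.
fn padding_before(offset: usize, align: usize) -> usize {
    (align - offset % align) % align
}

/// Walks the fields in declaration order and returns, for every gap,
/// the name of the field that needs padding in front of it and how
/// many bytes are needed.
fn plan_padding(fields: &[Field]) -> Vec<(&'static str, usize)> {
    let mut offset = 0;
    let mut gaps = Vec::new();
    for field in fields {
        let pad = padding_before(offset, field.align);
        if pad > 0 {
            gaps.push((field.name, pad));
        }
        offset += pad + field.size;
    }
    gaps
}

/// The unpadded InstancedGlyph fields, with assumed sizes and alignments.
fn instanced_glyph_fields() -> Vec<Field> {
    vec![
        Field { name: "top_left", size: 8, align: 4 },
        Field { name: "atlas_top_left", size: 8, align: 4 },
        Field { name: "atlas_size", size: 8, align: 4 },
        Field { name: "color", size: 16, align: 16 },
    ]
}

fn main() {
    for (name, pad) in plan_padding(&instanced_glyph_fields()) {
        println!("insert {} bytes of padding before `{}`", pad, name);
    }
}
```

For the glyph struct this reports a single 8-byte gap before color, which is exactly the Vec2 added by hand above. The sketch ignores trailing padding at the end of the struct, which a real macro would also need to handle.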

Till tomorrow,
Kaylee