Water Grid And Dragon Curve ON GPU!


Before, I used to just sample some small region, e.g. 1024 x 1024, out of a premade dragon curve texture that was larger, e.g. 2048 x 2048. So I had written code that was "ok enough" to pull texel data from the big 2048 x 2048 texture, throw it into my smaller one, and simply push that to render. Two things killed my CPU on this: A, I didn't use any SIMD, and B, I was adding some random amount to each texel when writing it to the dest buffer. So it looked pretty much like Dest = Source + ColorAdd.
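For context, the old CPU path boiled down to something like the sketch below. It's just an illustration of the description above, not my literal code: the names, the RGBA8 texel layout, and the clamping are placeholders.

    #include <cstdint>

    // Rough sketch of the old CPU-side copy: pull a 1024x1024 window out of the
    // 2048x2048 dragon curve texture and add a color offset to every texel
    // (in the real thing the added amount was random). RGBA8 layout and the
    // clamping are assumptions for illustration.
    void CopyRegionWithColorAdd(const uint8_t *src, int srcWidth,
                                uint8_t *dst, int dstWidth, int dstHeight,
                                int srcX, int srcY, const uint8_t colorAdd[4])
    {
        for (int y = 0; y < dstHeight; ++y)
        {
            for (int x = 0; x < dstWidth; ++x)
            {
                const uint8_t *s = &src[((srcY + y) * srcWidth + (srcX + x)) * 4];
                uint8_t *d = &dst[(y * dstWidth + x) * 4];
                for (int c = 0; c < 4; ++c)
                {
                    // Dest = Source + ColorAdd, clamped so the channel doesn't wrap.
                    int v = s[c] + colorAdd[c];
                    d[c] = (uint8_t)(v > 255 ? 255 : v);
                }
            }
        }
    }

Doing something like this scalar, texel by texel, is exactly the kind of work the GPU is built for.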

Putting this on the GPU, at least with DirectX 12, was a little involved. Mostly it was just remembering how I'd written all my code to make DX12 easy to use, i.e. more the DX12 setup side, but I still ended up having to write an entirely new DX12 pipeline state object thingy, since I needed all-new shader code to draw to just a background quad and add some color to it (if color was available).
I wrote about getting this part working about two days ago, but a bug popped up yesterday. It turned out I wasn't actually adding the random color to the output texel in the pixel shader stage, so the API was (fairly) confused about why I said I was going to upload that data and then never used it. All's fixed and good now though :) This alone got me back something like 10-20 frames on average, so even in a debug build I went from about 20-30 to 40-50.

The other thing I did to get a speedup was to move the water grid's drawing code to the GPU. Before, the code would simply look up where a graphics vertex lay on the Eulerian water simulation grid, take the density found at that X/Z position in the water sim grid, and use it to determine the vertex's height and color. So when getting a vertex ready for drawing, you'd check the density to select a color and then raw-write the density straight into the vertex's Y value.
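Roughly, the old CPU-side version looked like the sketch below (names, the world-to-grid mapping, and the color thresholds are made up just to show the shape of it):

    // Rough sketch of the old CPU path: for each graphics vertex, sample the
    // density at its X/Z cell in the Eulerian sim grid, raw-write it as the Y
    // value, and pick a color from it. Names and thresholds are placeholders.
    struct WaterVertex { float x, y, z; float r, g, b, a; };

    void PrepareWaterVertices(WaterVertex *verts, int vertCount,
                              const float *density, int gridWidth, int gridDepth,
                              float cellSize)
    {
        for (int i = 0; i < vertCount; ++i)
        {
            // Map the vertex's world X/Z onto a sim grid cell.
            int gx = (int)(verts[i].x / cellSize);
            int gz = (int)(verts[i].z / cellSize);
            if (gx < 0) gx = 0; if (gx >= gridWidth) gx = gridWidth - 1;
            if (gz < 0) gz = 0; if (gz >= gridDepth) gz = gridDepth - 1;

            float d = density[gz * gridWidth + gx];

            // The density directly becomes the vertex height.
            verts[i].y = d;

            // Select a color based on the density (placeholder thresholds).
            if (d > 0.5f) { verts[i].r = 0.1f; verts[i].g = 0.3f; verts[i].b = 0.9f; }
            else          { verts[i].r = 0.4f; verts[i].g = 0.7f; verts[i].b = 1.0f; }
            verts[i].a = 1.0f;
        }
    }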
So that's quite simple and should be easy to move to the GPU, right? Nope, not at all!
Or I guess not at all if you forgot that the GPU, or at least DX12, has various limitations in place (for good reasons on most occasions from what I can tell, though I can't really tell much since GPUs are pretty opaque / don't expose a public ISA, so you never really know wtf is going on, at least to my current extent of knowledge on the matter). The major limitation that messed with me was that constant buffers (CBVs), aka just some generic memory that lets you send whatever you want to the GPU, ended up requiring two things from me:
A: I needed to split the 10,000-float buffer up into at least 3 buffers, since per CBV you can only have 4096 elements of whatever type you're sending (a constant buffer tops out at 64 KB). And
B: I didn't realize / had forgotten (even though when I wrote a material upload thingy before, I had to pad each data type out by 4s, so I "knew") that if you have an array of floats in a constant buffer, the GPU treats every array index as spanning 4 floats' worth of space. So index 1 into the array actually starts at the float data's [4]th value, index 2 at the [8]th, and so on.
My solution to A was to just have 3 buffers that were all maxed out to the [4096] count, and the type I used was float4 in HLSL, aka a vec4 of 32-bit floats, which solved B. Unfortunately, I solved A quite early on and kept thinking my bug was with how I'd partitioned my huge float array up into three buffers, so I debugged that quite a bit until I realized that B was the culprit. The solution I just described took quite a bit of time to arrive at.
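Here's a rough sketch of how the fixed data layout can look on the CPU side (not my literal code: the names, the one-density-per-float4 packing, and the counts are just for illustration). The matching HLSL declaration is in the comment at the top; the key point is that every array element in a constant buffer spans 16 bytes.

    // The HLSL side declares something along these lines (one cbuffer shown, the
    // other two are the same with different registers):
    //
    //     cbuffer DensityBlock0 : register(b1) { float4 Densities0[4096]; };
    //
    // Every element of that array spans 16 bytes, which is the packing rule from B.

    struct Float4 { float x, y, z, w; };            // 16 bytes, matches HLSL float4

    static const int kElemsPerCbv  = 4096;          // per-CBV element limit
    static const int kDensityCount = 10000;         // floats coming from the water sim

    // Three CPU-side staging arrays, one per constant buffer, maxed at 4096 elements.
    static Float4 gStaging[3][kElemsPerCbv];

    void PackDensitiesForUpload(const float *density /* kDensityCount values */)
    {
        // Assumed packing: one density per float4 (stored in .x), so shader index i
        // and sim index i line up and the 16-byte stride is explicit on both sides.
        for (int i = 0; i < kDensityCount; ++i)
        {
            int buf  = i / kElemsPerCbv;            // which of the three CBVs
            int elem = i % kElemsPerCbv;            // element within that CBV
            gStaging[buf][elem].x = density[i];
            gStaging[buf][elem].y = 0.0f;
            gStaging[buf][elem].z = 0.0f;
            gStaging[buf][elem].w = 0.0f;
        }
        // Each gStaging[buf] then gets memcpy'd into its mapped constant buffer
        // resource (upload heap) before drawing, like any other CBV update.
    }

Packing one value per float4 wastes the .yzw lanes, but it keeps the shader-side index identical to the sim-grid index, which is the simplest way to make the 16-byte stride obvious on both sides.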

Anyways, it all works now! I also started writing better audio mixing, i.e. actual audio mixing, since I've never had such a thing and its absence has been biting me pretty badly. The plan is to have it working in some form tomorrow!!!

Thank you for reading, and please let me know your thoughts on the game thus far! Have an awesome day :]
