What makes particle systems such a great candidate for compute shading is how well they parallelize. GPUs tend to have significantly more cores/threads than CPUs, which in theory should lead to performance improvements. I struggled for a long time to find good sources on this subject, and I really had to dig deep to find explanations, but at last, it is complete!
Here is a demo of what my particle system can handle performance-wise.
The demo displays a total of 100,000 particles being simulated at once. By my rather primitive performance testing, I can happily report a 200% performance increase compared to the start of the development process!
(The number of particles crushes the video quality, but I see that as a victory!)
All the Buffers!
I only had the chance to look through one example of setting up the buffers needed to transfer data to and from the GPU. The example required three ID3D11Buffer objects, one ID3D11UnorderedAccessView and one ID3D11ShaderResourceView, all connected to each other. This led to heavy memory requirements, and I'm guessing there are several other ways to go about this step.
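To get a feel for what three buffers cost, here is a rough back-of-the-envelope sketch in plain C++. The Particle layout, and the idea that each of the three buffers holds the full particle array, are assumptions made for illustration; the real layout and buffer roles are not shown here.

```cpp
#include <cstddef>

// Hypothetical particle layout (assumption, not the editor's real struct).
struct Particle {
    float position[3];
    float velocity[3];
    float lifetime;
    float size;
};
static_assert(sizeof(Particle) == 32, "sketch assumes a tightly packed 32-byte particle");

constexpr std::size_t kParticleCount = 100000; // particle count from the demo
constexpr std::size_t kBufferCount   = 3;      // assumes all three buffers are particle-sized

// Bytes needed for one buffer holding `count` particles.
constexpr std::size_t BufferBytes(std::size_t count) {
    return count * sizeof(Particle);
}

// Bytes needed when `buffers` full-size copies are kept around.
constexpr std::size_t TotalBytes(std::size_t count, std::size_t buffers) {
    return buffers * BufferBytes(count);
}
```

With these assumed values, BufferBytes(kParticleCount) is 3,200,000 bytes and TotalBytes(kParticleCount, kBufferCount) is 9,600,000 bytes. A larger particle struct scales both numbers linearly.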
The last and probably easiest step was to move the update function to a compute shader. This wasn't that difficult to write, since the basic syntax of HLSL is quite similar to C++.
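As a rough illustration of what moving the update involves, here is a CPU-side C++ sketch of the pattern, not the actual shader. The Particle layout, the group size of 256 and all function names are hypothetical. The point is that each "thread" only touches its own particle, and that the number of thread groups has to be rounded up so every particle is covered.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical particle layout (assumption, not the editor's real struct).
struct Particle {
    float position[3];
    float velocity[3];
    float lifetime; // seconds remaining
};

// Assumed group size, as if the shader declared [numthreads(256, 1, 1)].
constexpr std::uint32_t kThreadsPerGroup = 256;

// Thread groups needed so that every particle gets a thread (rounds up).
constexpr std::uint32_t GroupCount(std::uint32_t particleCount) {
    return (particleCount + kThreadsPerGroup - 1) / kThreadsPerGroup;
}

// Body of one "thread": reads and writes only particles[index].
void UpdateParticle(std::vector<Particle>& particles, std::uint32_t index, float dt) {
    if (index >= particles.size()) return; // surplus threads in the last group do nothing
    Particle& p = particles[index];
    if (p.lifetime <= 0.0f) return;        // dead particles are left untouched
    for (int i = 0; i < 3; ++i) {
        p.position[i] += p.velocity[i] * dt;
    }
    p.lifetime -= dt;
}

// CPU stand-in for a dispatch: runs every thread of every group in sequence.
void DispatchUpdate(std::vector<Particle>& particles, float dt) {
    const std::uint32_t groups = GroupCount(static_cast<std::uint32_t>(particles.size()));
    for (std::uint32_t g = 0; g < groups; ++g) {
        for (std::uint32_t t = 0; t < kThreadsPerGroup; ++t) {
            UpdateParticle(particles, g * kThreadsPerGroup + t, dt);
        }
    }
}
```

For 100,000 particles and the assumed group size, GroupCount gives 391 groups, so the last group has a few threads with nothing to do. That is why the bounds check at the top of UpdateParticle matters.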
Transferring the update of particles to a compute shader was the part that took the most time during the entire specialization. It took an immense amount of time and work for me to find a way of setting up all the buffers correctly. This turned out to be the biggest lesson I've learnt while improving the particle editor, and I will make sure to keep it in mind in the future.