Lab 8: a CUDA particle simulator
by Fuzziqer Software



This project was created as part of Caltech's CS 101C class (graphics programming) in spring 2010. The original idea (from Tamas Szalay) was a tornado simulation, which involved a Navier-Stokes solver using a few overlapping fields (pressure, velocity, density, etc.) and some special boundary conditions. This proved to be fairly difficult; I instead ended up implementing a fluid solver based on Keenan Crane's chapter in GPU Gems 3 (http://http.developer.nvidia.com/GPUGems3/gpugems3_ch30.html), which was considerably easier.

I put the fluid's velocity and pressure in 3D textures (one is a vector field, the other is a scalar field), and used these to compute new velocities and pressures for each time step, storing the results in a raw data array, which is then cudaMemcpy'd to texture memory. All velocities are initially 0, and the initial pressure is the sum of 4 gaussians around 4 random points in the cube.
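
As a rough illustration of the initial pressure described above, here is a minimal host-side sketch that fills a cubic scalar field with the sum of four Gaussians around random points. The function name makeInitialPressure, the sigma parameter, and the seed are assumptions made for this example; they are not taken from lab8.cu.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns a size^3 scalar field holding the sum of four
// Gaussians centered on random points in the cube. sigma (the Gaussian width,
// in cells) and seed are illustrative choices, not values from lab8.cu.
std::vector<float> makeInitialPressure(int size, float sigma, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> coord(0.0f, static_cast<float>(size));

    // Pick the four random centers.
    float centers[4][3];
    for (auto& c : centers)
        for (float& v : c) v = coord(rng);

    std::vector<float> pressure(static_cast<std::size_t>(size) * size * size);
    const float inv2s2 = 1.0f / (2.0f * sigma * sigma);
    for (int z = 0; z < size; ++z)
        for (int y = 0; y < size; ++y)
            for (int x = 0; x < size; ++x) {
                float sum = 0.0f;
                for (const auto& c : centers) {
                    const float dx = x - c[0], dy = y - c[1], dz = z - c[2];
                    sum += std::exp(-(dx * dx + dy * dy + dz * dz) * inv2s2);
                }
                // x varies fastest, then y, then z.
                pressure[(static_cast<std::size_t>(z) * size + y) * size + x] = sum;
            }
    return pressure;
}
```

Each Gaussian peaks at 1, so every value in the field lies between 0 and 4. A buffer like this could then be copied into the pressure texture before the first time step.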

The original idea was to render an isosurface of the pressure field, to see how the fluid evolved. This proved to be both problematic and not very interesting, so I scrapped that idea and rendered fog instead. Red fog represents areas of high pressure, and blue fog represents areas of high velocity (purple is both). The accumulation constants may be too high or too low (I haven't touched the fog rendering code in a while); they can be changed in the constants section of lab8_kernel.cu. Higher values for the constants mean denser fog.
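
To make the role of the accumulation constants concrete, here is a small sketch of how fog color might be accumulated along one view ray. It is a guess at the general shape of the technique, not the code in lab8_kernel.cu: the names kPressureAccum, kVelocityAccum, FogSample, and accumulateFog are all hypothetical, and the constant values are placeholders.

```cpp
#include <algorithm>
#include <cstddef>

struct FogSample { float pressure; float speed; };  // one sample along a ray
struct Rgb { float r, g, b; };

// Hypothetical accumulation constants (the real ones are in the constants
// section of lab8_kernel.cu). Larger values give denser fog.
constexpr float kPressureAccum = 0.02f;
constexpr float kVelocityAccum = 0.02f;

// Sums the samples along a ray: pressure feeds the red channel and speed
// feeds the blue channel, so regions with both appear purple.
Rgb accumulateFog(const FogSample* samples, std::size_t count) {
    Rgb color{0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < count; ++i) {
        color.r += kPressureAccum * samples[i].pressure;
        color.b += kVelocityAccum * samples[i].speed;
    }
    // Clamp so a long ray through dense fog saturates instead of overflowing.
    color.r = std::min(color.r, 1.0f);
    color.b = std::min(color.b, 1.0f);
    return color;
}
```

In this sketch, doubling a constant doubles that channel's contribution per sample, which is one way "higher constants, denser fog" could play out.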

Once the fluid seemed to diffuse in a reasonable manner, I added the necessary boundary conditions for the tornado simulation (fixed velocities on the four sides of the cube in circular directions, high pressures on the sides, low pressure and high upward velocity in a circular hole in the top), and added a bunch of massless, chargeless particles. The particles do seem to swirl and rise in the fluid under these boundary conditions, though perhaps not like a real tornado.
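
Since the particles are massless, the simplest way to move them is to let each one follow the local fluid velocity. The sketch below shows one forward-Euler step under that assumption; the names Vec3 and advectParticle are made up for this example, and the integrator actually used in the kernels may differ.

```cpp
struct Vec3 { float x, y, z; };

// Advances one massless particle by a single forward-Euler step.
// velocityAt is any callable that returns the fluid velocity at a position
// (in the real program this would be a lookup into the velocity field).
template <typename VelocityField>
Vec3 advectParticle(Vec3 p, const VelocityField& velocityAt, float dt) {
    const Vec3 v = velocityAt(p);
    return {p.x + v.x * dt, p.y + v.y * dt, p.z + v.z * dt};
}
```

With a velocity field that circles the vertical axis and points upward, repeated calls make a particle swirl and rise, which matches the behavior described above.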

It should be pretty obvious where most of the code is. The kernels (GPU functions) are in lab8_kernel.cu (rendering kernels first, then computation kernels), and the rest of the stuff (initialization, command line/hotkey handlers, OpenGL and kernel calls, etc.) is in lab8.cu.

Now for the fun stuff: how to actually build and run this thing.



You can build it by just putting it in its own folder in your CUDA SDK's src or projects folder, then running make. The binary (called lab8) should end up wherever binaries usually end up for CUDA SDK programs - for me (on Mac OS X), it's <CUDA SDK>/bin/darwin/release.



Don't just run it, though: you'll probably want to change some of the options via the command line. Here are all the parameters:

-b: benchmark mode: after each update, prints the time that the update took.

-s<size>: sets the pressure and velocity fields to cubes with <size> cells on each side. <size> must be a multiple of 16, though it's probably safer to use multiples of 32. The default is 128. This setting greatly affects the amount of GPU memory required by this program (memory use grows with the cube of <size>), so if you get out-of-memory errors, try a smaller field dimension.

-p<numParticles>: sets the number of particles, and disables fog rendering. (You can re-enable fog rendering by pressing f.) <numParticles> must be 0 or a multiple of 128; the default is 0 (which means particles are disabled). Each particle has a very small memory and time footprint, so you can easily have 2097152 of them with little performance impact.
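
The constraints on -s and -p can be checked up front, and the field size gives a rough idea of memory use. This is a sketch only: the function names are hypothetical, and the memory estimate assumes a 4-float velocity and a 1-float pressure per cell, which is a guess rather than something read from the source.

```cpp
#include <cstddef>

// -s<size>: must be a positive multiple of 16.
bool isValidFieldSize(int size) { return size > 0 && size % 16 == 0; }

// -p<numParticles>: must be 0 or a multiple of 128.
bool isValidParticleCount(long count) { return count >= 0 && count % 128 == 0; }

// Rough byte count for the fields. Assumption: each cell stores a 4-float
// velocity and a 1-float pressure; `copies` is how many copies of each field
// are kept (for example a texture plus a raw output array).
std::size_t estimateFieldBytes(int size, int copies) {
    const std::size_t cells = static_cast<std::size_t>(size) * size * size;
    const std::size_t bytesPerCell = 4 * sizeof(float) + sizeof(float);
    return cells * bytesPerCell * static_cast<std::size_t>(copies);
}
```

Under these assumptions the default size of 128 with two copies comes to 80 MiB, and doubling the size to 256 multiplies that by eight.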

-d<deviceID>: tells the program to use a certain CUDA device.

-h<height> / -w<width>: sets window height and width (default is 512x512). Larger windows mean more time spent in rendering, but also mean prettier pictures.

-t<timeStep>: sets the length of the time-step. 0.001 (the default) works well; for more detail and slower progression, use a smaller step.

-z<transparency>: sets the transparency (a floating-point value) for rendering particles. The default is 17. For fewer particles, you should increase this value -- 17 is about right for an 896x896 window and 1048576 particles. A higher transparency value means dimmer particles.

-v<videoPrefix>: saves every successive frame as <videoPrefix><frameNumber>.ppm. You can then use a program like ffmpeg to combine this sequence of images into a video file. Be careful when recording video: what you see in the window is exactly what the video will look like! Don't resize the window, or some images will have different sizes. If you press any hotkeys, the change will be reflected in the video.
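
For reference, a binary PPM file is just a short text header followed by raw RGB bytes. The sketch below shows the naming scheme and file layout under a few assumptions: frameFileName and writePpm are hypothetical names, the frame number is written without zero padding (the program may pad it), and the binary "P6" variant is assumed.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Builds <videoPrefix><frameNumber>.ppm. Assumption: no zero padding.
std::string frameFileName(const std::string& prefix, unsigned frame) {
    return prefix + std::to_string(frame) + ".ppm";
}

// Writes a binary (P6) PPM image: a text header, then width*height RGB
// triples, one byte per channel, row by row from the top.
void writePpm(std::ostream& out, int width, int height,
              const std::vector<std::uint8_t>& rgb) {
    out << "P6\n" << width << ' ' << height << "\n255\n";
    out.write(reinterpret_cast<const char*>(rgb.data()),
              static_cast<std::streamsize>(rgb.size()));
}
```

Because the header records the image size, frames saved after a window resize would have different dimensions, which is why resizing during recording causes trouble.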

-r<renderFlags>: sets the initial renderFlags (useful when recording video, so you can start with a custom setup). renderFlags is a 32-bit hexadecimal number. Choose from the following constants, and OR them together to get your desired renderFlags:
RENDER_FOG               0x00000020 // Renders pressure/velocity fog
RENDER_ISOMETRIC         0x00000040 // Uses isometric projection instead of frustum projection. (Note that frustum projection is currently not working properly.)
RENDER_RAND_LOC          0x00000080 // Particles that leave the field are re-created at a random location, rather than at the edges
Particle color constants:
RENDER_COLOR_VELOCITY    0x00000000 // Red = slow particles, blue = fast particles
RENDER_COLOR_VEL_SPECT   0x00000008 // Particle's (r, g, b) color is the (x, y, z) of its velocity
RENDER_COLOR_POS_SPECT   0x00000010 // Particle's (r, g, b) color is the (x, y, z) of its position
RENDER_COLOR_GREEN       0x00000018 // Particles are green
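
Here is a short sketch of how these constants combine. The flag values are the ones listed above; RENDER_COLOR_MASK and colorMode are hypothetical names added for this example, based on the observation that the four color constants occupy bits 0x08 and 0x10.

```cpp
#include <cstdint>

// Values copied from the list above.
constexpr std::uint32_t RENDER_FOG             = 0x00000020;
constexpr std::uint32_t RENDER_ISOMETRIC       = 0x00000040;
constexpr std::uint32_t RENDER_RAND_LOC        = 0x00000080;
constexpr std::uint32_t RENDER_COLOR_VELOCITY  = 0x00000000;
constexpr std::uint32_t RENDER_COLOR_VEL_SPECT = 0x00000008;
constexpr std::uint32_t RENDER_COLOR_POS_SPECT = 0x00000010;
constexpr std::uint32_t RENDER_COLOR_GREEN     = 0x00000018;

// Hypothetical helper: the color constants share a two-bit field, so masking
// with 0x18 recovers which color function a flags value selects.
constexpr std::uint32_t RENDER_COLOR_MASK = 0x00000018;
constexpr std::uint32_t colorMode(std::uint32_t flags) {
    return flags & RENDER_COLOR_MASK;
}

// Example: fog on, isometric view, green particles.
constexpr std::uint32_t kExampleFlags =
    RENDER_FOG | RENDER_ISOMETRIC | RENDER_COLOR_GREEN;  // 0x00000078
```

Only one color constant should be chosen at a time, since ORing two of them together selects a different color function (0x08 | 0x10 is the same as RENDER_COLOR_GREEN).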



And of course, you can press some keys while it's running and move it with the mouse. The keys that do things are:

Esc: exit the program

C: change the color function used to color the particles. It can be a function of the magnitude of the velocity (fast particles are blue, slow particles are red); a function of the velocity itself (x component is red, y component is green, z component is blue; in effect, the color of the particle depends on which direction it's moving); a function of the position (along similar lines); or just green.
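
The four color functions can be sketched roughly as follows. This is an illustration of the idea, not the kernel's code: the names ColorMode and particleColor, the maxSpeed and fieldSize normalization parameters, and the use of absolute values for the velocity components are all assumptions.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Rgb { float r, g, b; };
enum class ColorMode { Velocity, VelocitySpectrum, PositionSpectrum, Green };

static float clamp01(float v) { return std::min(std::max(v, 0.0f), 1.0f); }

// maxSpeed and fieldSize are assumed normalization factors that map speeds
// and positions into the 0..1 color range.
Rgb particleColor(ColorMode mode, Vec3 pos, Vec3 vel,
                  float maxSpeed, float fieldSize) {
    switch (mode) {
    case ColorMode::Velocity: {
        // Slow particles are red, fast particles are blue.
        const float speed =
            std::sqrt(vel.x * vel.x + vel.y * vel.y + vel.z * vel.z);
        const float t = clamp01(speed / maxSpeed);
        return {1.0f - t, 0.0f, t};
    }
    case ColorMode::VelocitySpectrum:
        // (r, g, b) follow the (x, y, z) components of the velocity.
        return {clamp01(std::fabs(vel.x) / maxSpeed),
                clamp01(std::fabs(vel.y) / maxSpeed),
                clamp01(std::fabs(vel.z) / maxSpeed)};
    case ColorMode::PositionSpectrum:
        // (r, g, b) follow the (x, y, z) position inside the cube.
        return {clamp01(pos.x / fieldSize), clamp01(pos.y / fieldSize),
                clamp01(pos.z / fieldSize)};
    case ColorMode::Green:
        break;
    }
    return {0.0f, 1.0f, 0.0f};
}
```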

F: enable/disable rendering of pressure and velocity fog. Honestly, rendering this fog takes forever, so it's disabled by default if you've put some particles in the system via the command line.

I: change the view mode between isometric and perspective. The fog renders correctly in perspective mode, but the particles don't - something about the view/projection transformation (done in RenderParticlesK) is wrong, but I couldn't figure out what it was. For this reason, isometric view is the default.

L: change where particles are created. When a particle leaves the cube, it's "destroyed" and "recreated" somewhere else - either at a random position inside the cube, or a random position on the edge of the cube. Use this option to switch between the two.
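
The two respawn modes might look something like the sketch below. The name respawnParticle is hypothetical, and "edge of the cube" is interpreted here as a random point on one of the six faces, which is an assumption about what the program does.

```cpp
#include <random>

struct Vec3 { float x, y, z; };

// Picks a new position for a particle that left the cube [0, size]^3.
// randomLocation == true : anywhere inside the cube.
// randomLocation == false: on a randomly chosen face of the cube
//                          (an assumed reading of "edge of the cube").
Vec3 respawnParticle(bool randomLocation, float size, std::mt19937& rng) {
    std::uniform_real_distribution<float> coord(0.0f, size);
    Vec3 p{coord(rng), coord(rng), coord(rng)};
    if (!randomLocation) {
        std::uniform_int_distribution<int> pickFace(0, 5);
        const int face = pickFace(rng);
        const float value = (face % 2 == 0) ? 0.0f : size;
        // Pin one coordinate to the chosen face; the other two stay random.
        switch (face / 2) {
        case 0: p.x = value; break;
        case 1: p.y = value; break;
        default: p.z = value; break;
        }
    }
    return p;
}
```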

I didn't really have time to optimize this... fluids can be difficult to get right, so I spent most of my time on that. I'm sure this program could be a lot faster than it is; one thing I wanted to try was to not use textures to store the previous pressure/velocity data, and instead just use two raw buffers for each field and flip between them. This would eliminate the texture pipeline from the program entirely, and may make computation faster, but would almost certainly make fog rendering slower (since the rendering kernel would have to manually interpolate values from the velocity and pressure data). Fog rendering is by far the slowest part of the process already (you can clearly see the difference when it's enabled or disabled).
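
To show what that untried alternative would involve, here is a host-side sketch of the two pieces: a pair of raw buffers that swap roles each time step, and the manual trilinear interpolation the fog renderer would need in place of the texture unit's filtering. The names PingPong and sampleTrilinear are hypothetical, and the sketch samples at integer cell coordinates rather than reproducing the texture hardware's exact addressing.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Two raw buffers per field: read from front, write to back, then flip.
struct PingPong {
    std::vector<float> front, back;
    explicit PingPong(std::size_t n) : front(n, 0.0f), back(n, 0.0f) {}
    void flip() { std::swap(front, back); }
};

// Manual trilinear interpolation in a size^3 scalar field (x varies fastest).
float sampleTrilinear(const std::vector<float>& f, int size,
                      float x, float y, float z) {
    auto clampf = [size](float v) {
        return std::min(std::max(v, 0.0f), static_cast<float>(size - 1));
    };
    x = clampf(x); y = clampf(y); z = clampf(z);
    const int x0 = static_cast<int>(x), y0 = static_cast<int>(y),
              z0 = static_cast<int>(z);
    const int x1 = std::min(x0 + 1, size - 1), y1 = std::min(y0 + 1, size - 1),
              z1 = std::min(z0 + 1, size - 1);
    const float fx = x - x0, fy = y - y0, fz = z - z0;

    auto at = [&](int i, int j, int k) {
        return f[(static_cast<std::size_t>(k) * size + j) * size + i];
    };
    auto lerp = [](float a, float b, float t) { return a + (b - a) * t; };

    // Interpolate along x on four edges, then along y, then along z.
    const float c00 = lerp(at(x0, y0, z0), at(x1, y0, z0), fx);
    const float c10 = lerp(at(x0, y1, z0), at(x1, y1, z0), fx);
    const float c01 = lerp(at(x0, y0, z1), at(x1, y0, z1), fx);
    const float c11 = lerp(at(x0, y1, z1), at(x1, y1, z1), fx);
    return lerp(lerp(c00, c10, fy), lerp(c01, c11, fy), fz);
}
```

Each interpolated sample reads eight neighboring cells and does seven lerps, which gives a sense of why fog rendering would likely get slower without the texture pipeline.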



A little goodie: back when I was still rendering the pressure isosurface, I made a mistake once when I forgot to write anything to the pressure texture before rendering it - so I was rendering an isosurface of a bunch of uninitialized data, left over from other programs. It looked kind of cool, though, so I saved the source then and included it here (lab8_uninit.cu and lab8_uninitkernel.cu). You can build it by running make -f Makefile.uninit, and the resulting binary is called lab8_uninit. It doesn't accept any of the command line parameters (except -d, to choose a CUDA device, and -s, telling it how much uninitialized data to render), and the hotkeys are different:

Esc: exit the program

[ and ]: move the light along the z axis
arrow keys: move the light along the x/y axes

+ and -: change the value of the isosurface being rendered by 0.01 (default is 0.5)

c: change the color function (there are 4 color functions; just play with them a little)

r, t, and y: enable/disable ambient, diffuse, and specular lighting

Try running it after running a few different CUDA programs - the results could be interesting, depending on what data the other programs leave in the graphics card's memory.