Posts tagged ‘Stateless’

[Week 4] Behind on schedule, but puffing on.

Since I spent most of the previous week bedridden, I have fallen a week behind and will have to rethink my planning. Yesterday went pretty decently, however, and my billboards are now in place. Code cleanup has been pushed back, and the focus will instead be on functionality. I’ll make it less messy when I feel I have the time.

This is the shader program for my first billboarded particles:

  1. [VS] Pass unprojected points directly down to the Geometry Shader
  2. [GS] GL_POINTS as input, output GL_TRIANGLE_STRIP
  3. [GS] Get the non-projected point. (modelView, no projection)
  4. [GS] Add/Subtract extents from the points for each corner of the billboard, then multiply with the projection matrix.
    projection * vec4(pos.x - extent.x, pos.y - extent.y, pos.z, pos.w), // Lower left
    projection * vec4(pos.x + extent.x, pos.y - extent.y, pos.z, pos.w), // Lower right
  5. [GS] Emit the vertices, each accompanied by an out parameter for its UV coordinates.
  6. [FS] Texture the billboard from a sampled texture!

This is mostly done; I just need to put a texture on the billboard now. One interesting detail for anyone new to geometry shaders (like me!): generate the corner points around a point that has only been transformed by the modelview matrix, and apply the projection afterwards. Otherwise the billboard will be skewed depending on the screen width and height!
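The corner step from the list above can be tried out on the CPU. This is only a sketch in plain C++, not the actual shader: `Vec4`, `Mat4`, `mul`, `identity` and `billboardCorners` are hypothetical names, and the two upper corners are my assumption by symmetry with the two lower ones shown in the list.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper types: a 4D column vector and a row-major 4x4 matrix.
struct Vec4 { float x, y, z, w; };
using Mat4 = std::array<std::array<float, 4>, 4>;

// Build an identity matrix (stands in for a real projection matrix in tests).
Mat4 identity() {
    Mat4 m{};  // value-initialised to all zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

// Multiply a 4x4 matrix by a column vector.
Vec4 mul(const Mat4& m, const Vec4& v) {
    return {
        m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z + m[0][3] * v.w,
        m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z + m[1][3] * v.w,
        m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z + m[2][3] * v.w,
        m[3][0] * v.x + m[3][1] * v.y + m[3][2] * v.z + m[3][3] * v.w,
    };
}

// Expand one view-space point (modelview applied, projection NOT applied)
// into four billboard corners in triangle-strip order:
// lower left, lower right, upper left, upper right.
// The extents are added in view space first; projection happens last.
std::array<Vec4, 4> billboardCorners(const Mat4& projection, const Vec4& pos,
                                     float extentX, float extentY) {
    assert(extentX >= 0.0f && extentY >= 0.0f);
    return {{
        mul(projection, {pos.x - extentX, pos.y - extentY, pos.z, pos.w}),
        mul(projection, {pos.x + extentX, pos.y - extentY, pos.z, pos.w}),
        mul(projection, {pos.x - extentX, pos.y + extentY, pos.z, pos.w}),
        mul(projection, {pos.x + extentX, pos.y + extentY, pos.z, pos.w}),
    }};
}
```

Because the extents are added before the projection is applied, the projection matrix scales the quad the same way it scales everything else in the scene, which is what keeps the billboard from being stretched by the viewport's aspect ratio.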

Hopefully things will work out alright and I won’t have to overthink it. I am slightly curious about the difference in resource consumption between these geometry shader billboards and normal billboarded particles. Maybe I’ll measure it when I have time.

[Week 2] Slightly ahead, slightly not! Preparing milestone!

Here’s a video of my work so far. I’ve made a very simple particle engine, capable of taking a “gravity” force, an initial impulse and an initial position, and working out where a particle is at a given time. Everything so far is calculated on the GPU alone, apart from the original values. I intend to add functionality over the coming weeks, as well as start working on some more proper visualisations. There is only one thing I’m behind on, and that is spawning the particles at different times! My solution will be to spawn the unused particles under the world until they have passed their first ‘particle death’, at which point they’ll be moved back into the particle simulation. This is all that is stopping me from having a particle visualisation that isn’t.. very.. pattern-like?
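For anyone wondering what "position of a particle in time" looks like as code, here is a rough CPU-side sketch in plain C++ rather than the real OpenCL kernel. It assumes unit mass (so the initial impulse equals the initial velocity) and constant gravity. `positionAt` and `particleAge` are hypothetical names, and the per-particle `spawnDelay` is a stand-in for staggered spawning, not the "hide it under the world" trick described above.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Closed-form position under constant acceleration:
//   p(t) = p0 + v0 * t + 0.5 * g * t^2
// No per-frame state is kept; only the initial values and the time are needed.
Vec3 positionAt(const Vec3& p0, const Vec3& v0, const Vec3& gravity, float t) {
    const float h = 0.5f * t * t;
    return {
        p0.x + v0.x * t + gravity.x * h,
        p0.y + v0.y * t + gravity.y * h,
        p0.z + v0.z * t + gravity.z * h,
    };
}

// Hypothetical staggering helper: a particle stays inactive until its
// spawnDelay has passed, then its age wraps around every `lifetime` seconds.
// Returns a negative value while the particle has not spawned yet.
float particleAge(float time, float spawnDelay, float lifetime) {
    assert(lifetime > 0.0f);
    if (time < spawnDelay) return -1.0f;
    return std::fmod(time - spawnDelay, lifetime);
}
```

A caller would feed `particleAge(...)` into `positionAt(...)` as `t` and skip (or hide) any particle whose age is negative. Giving each particle a different delay is one way to break up the pattern-like look.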

It’s running 11.2 million particles on the GPU at once in real time. The initial transfer sends three float4 arrays to the GPU (as VBOs). How much is that to transfer? 16 bytes * 11,200,000 * 3 / 1024^2 = (roughly) 512 MB of data. I think? At around 170 MB, even the position information by itself would be nasty to send back and forth between the CPU and the GPU all the time, especially considering we want to update the particle field in real time. Thirty or more updates per second is desirable, and at that rate shuttling data between the CPU and the GPU would be a major bottleneck. Which is why altering the VBO directly on the GPU is so handy.
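To double-check the arithmetic above, here is a small sketch in plain C++. The function names are made up for illustration, and it assumes a float4 is 16 bytes (four 32-bit floats).

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one float4 is four 32-bit floats, i.e. 16 bytes.
constexpr std::uint64_t kBytesPerFloat4 = 16;

// Total size in bytes of `arrays` float4 arrays with one entry per particle.
std::uint64_t bufferBytes(std::uint64_t particleCount, std::uint64_t arrays) {
    return kBytesPerFloat4 * particleCount * arrays;
}

// Bytes moved per second if one buffer were copied to the CPU and back
// again on every update (two transfers per update).
std::uint64_t roundTripBytesPerSecond(std::uint64_t bytes,
                                      std::uint64_t updatesPerSecond) {
    return bytes * 2 * updatesPerSecond;
}

// Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes).
double toMiB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

With 11,200,000 particles this gives 537,600,000 bytes (about 512.7 MiB) for all three arrays and 179,200,000 bytes (about 170.9 MiB) for positions alone. Copying just the positions out and back 30 times per second would come to roughly 10 GiB per second, which gives a feel for why keeping the VBO on the GPU matters.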


I never thought I would say it, but it seems I’m one step ahead of my planning. I was very lenient with my time scheduling, as I had been rather paranoid about the troubles of drawing on the GPU, and drawing on the GPU turned out to be surprisingly simple. I have wasted a lot of time being sick and bothered, so I got less done than I had imagined, but actually sending a VBO to OpenCL was really easy, especially with the Khronos C++ OpenCL bindings. That was work meant for next week.

[Week 2] (slow) progress

I can draw particles! Yay! No pictures yet, as the particles emitted are far from.. interesting. I will try to set up some simple patterns that would be more interesting to show off.

Particles are currently drawn as a point cloud, where one point is one pixel. I later plan on implementing a shader to replace this with either textures or models.


Also, I’ve got some strange lighting behaviour in my OpenGL context. The particles look odd when drawn, but this is currently treated as a minor issue. The actual positioning and behaviour of the particles is right.

So for week 2 I’ve had a little success experimenting with examples and even implementing a simple particle emitter. I can now easily render and calculate five million points from home (where I have got a GTX 560 card). The particle emissions are far from.. interesting at this point in time, but it turned out that calculating and drawing them directly from the GPU was relatively simple. Issues have been few apart from depression.

I ultimately gave up on tending to the OclManager class as I came to my senses. It would be a complete waste of time unless I want to tidy my code, and tidying is, in a sense, similar to optimization. We all know the saying “premature optimization is the root of all evil”. I may get back to it later, but it isn’t a key detail. It did serve a purpose, though, as it gave me room to experiment with setting up an OpenCL context.

[MS1 Concluded] Project Goals

But enough about private issues.

GPU Accelerated Particle System (from here on referred to as GAPS) is a 2013 specialization project at the Luleå University of Technology carried out by me, Klas Linde. My goal with GAPS is mainly to learn GPGPU programming practices and to get used to a massively parallel environment: in this case OpenCL, maybe CUDA.

Particle systems are ideal for such optimizations and have seen massive performance increases from parallelization in similar projects, so I will be making a parallel particle system. My goal is to offload all particle logic, including depth sorting and drawing, to the Graphics Processing Unit, which should, with some work, allow me to simulate a very large number of particles at once. If time allows, I will make the particle system state-preserving, which will allow me to apply forces even after system initialization. This will provide me with plenty of challenges as it is.
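To make "state-preserving" a bit more concrete: each particle would carry its own position and velocity from one update to the next, so a force can be applied at any time. The sketch below is plain C++ with hypothetical names (`Particle`, `integrate`), and it uses semi-implicit Euler, which is simply one common choice on my part and not something the project has settled on. On the GPU the same body would run once per particle in a kernel.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Per-particle state that survives between updates.
struct Particle {
    Vec3 position;
    Vec3 velocity;
};

// One semi-implicit Euler step: update the velocity from the current
// acceleration first, then move the particle with the NEW velocity.
// `acceleration` may differ on every call, which is what allows forces
// to be applied after initialization.
void integrate(Particle& p, const Vec3& acceleration, float dt) {
    assert(dt >= 0.0f);
    p.velocity.x += acceleration.x * dt;
    p.velocity.y += acceleration.y * dt;
    p.velocity.z += acceleration.z * dt;
    p.position.x += p.velocity.x * dt;
    p.position.y += p.velocity.y * dt;
    p.position.z += p.velocity.z * dt;
}
```

The trade-off is that the state now has to be stored and updated every step, and the result depends on the step size `dt`.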

I expect to run into at least a few issues regarding parallel computing along the way, as parallel programming is somewhat different from regular programming for the CPU. Work will have to be divided into small chunks, each with as little workload as possible, as GPUs are well suited to many small workloads rather than one large problem.