Archive for March, 2014

Project conclusion and preparation for the final presentation as well as post mortem results

So I’ve been a bit busy lately optimizing my code. I’ve also added some data output and built a very basic scene to walk around in, with 2048 point lights for the final presentation. One of the things I’ve done is improve my forward renderer. Before, I filled a depth buffer in a pre-z pass and used it to calculate the min/max Z value per tile, which is what you should be doing. During the forward shading pass, however, I then used the screen’s depth buffer, which made the fps drop whenever I looked down the scene where a row of houses was placed. Generally I had 60 fps except in those cases. Since I wanted 60 fps for both techniques at all times during the final presentation, I solved it by reusing the depth buffer from the pre-z pass in the forward pass, which means only the closest fragments get shaded. This improves performance but makes other things harder, such as transparency: to support it you would need to rework the pre-z pass a bit so that it accounts for transparent objects. I’m happy with these results, and since I don’t have time to test MSAA and transparency I will leave that for the future. Here are both techniques. The lights still look ugly, I guess I just always forgot to change the attenuation, but in any case here’s how it looks:

bothworking

 

And here are a few post mortem thoughts:

Post Mortem

The Good

During this project I’ve learned a lot more about OpenGL in general and increased my knowledge of shading, which was one of the goals I set for myself at the start. Since our graphics course only touched on a few of the more interesting subjects, I found it very fun to relearn even some of the basic things I didn’t fully grasp back then. I’ve also learned that the OpenGL wiki and the Khronos function specification pages are worth reading at least three times every time you do something new, as it saves you a lot of silly and sometimes time-consuming mistakes. I’ve become more familiar with switching between view spaces and transforms too. Where I previously only knew the basics, I now feel comfortable moving positions into the right coordinate space in order to build, say, per-tile frustums, or going back and forth between spaces. Learning how to use compute shaders was probably the single most important factor here, as I had to learn not just how to use them but also how memory is synchronized and when and how you can read and write uniforms and textures. In general I feel like I’ve become a lot better, but I also know precisely which things I need to improve on and how much I still have to learn.

The Bad

The most time-consuming and most frustrating things about the project always happened because I hadn’t read thoroughly enough. For example, I had written code that “worked” in theory because everything was set up correctly, but I may have used, say, texture 14 when my graphics card could only support a few render targets. It could have been fixed easily by just using texture 0, yet it caused hours of frustration simply because I focused on how it was supposed to work rather than going back and reading the documentation thoroughly. In fact most of my mistakes were made this way, but in a way that’s a good thing, because I’ve learned that until you do it right, you haven’t done it right, and if something works the first time without you really having read the specifications, errors might still show up later on. The general lesson I’ve learned is that you should always be 100% sure of what you’re doing, since it saves you trouble later. I probably also spent more time on the culling methods than I should have: I first tried my own solution, then a clip space frustum culling, before settling on a geometry based approach. While I did this and compared their performance, I could have spent that time on more interesting things like MSAA and transparency.

 

Overall it’s been a pretty fun experience and I’ve really increased my knowledge a lot. I’m going to continue working on this and improving it, as well as adding new features and learning even more things.

Normal mapping and a new scene

The last two days I’ve been changing up my scene a bit and adding normal maps. I’m using some free models I found and just put 50 or so houses in a scene, which I might use for the final delivery. If only I was a graphical artist :C. Here’s how it looks:

normal mapping

 

Now I’ve mostly got some fixing up to do: add some output showing the number of lights, the fps and which technique is active, and then clean up the code so I can use it in the future.

An update

So I forgot to post about my progress yesterday in my rush to get home and watch the finale of True Detective (which is an amazing show, for the record). I’ve loaded in some models, placed a few lights and pondered a few things. Namely, without anti-aliasing my deferred shader (to the right) looks horrible while the forward one (left) looks decent enough. I’ve also come to realise that everything will look like shit if I don’t load in some specular/normal maps, so I have those three things to look at. I may even resort to using assimp just so I can load bigger levels with all their textures and whatnot, since I currently just triangulate anything I want to load with my primitive obj loader. So yeah, fixing AA and normal/specular maps is my priority. After that I guess I will optimize the code and restructure everything, since the code is riddled with test things and everything that isn’t the main paintGL loop is a steaming mess. Anyway, here’s a picture:

forward and deferred

Tiled Forward Shading

I finally finished implementing both the tiled techniques, here it is running on 2048 and 4096 point lights respectively:

tiled forward 2048

tiled forward 4096 lights

So how about that implementation huh? Well, let’s go through it shall we?

Well, first of all I do a pre-z pass which fills the depth buffer. Then I run the same compute shader as before, where I construct a clip space frustum for each tile and check every light to see whether it intersects the tile or not. Here’s where the main changes from tiled deferred happen. In the deferred technique I just looped through the lights, calculated the lighting and stored the final image in a texture, which I blitted to the screen afterwards. Now I store the lights in a simple list that I then access from the forward shader. The setup looks a little something like this:

explanation

In this example there are 6 work groups in the x dimension and 2 in the y dimension, and there is a maximum of 4 lights per tile. The entire grid is stored in two arrays: one that stores the indices of the lights that affect each tile, and one that stores the size (the number of lights) of each tile’s list. These two arrays, along with the original point light array, are then accessed in the forward shader, where you work out which tile the current pixel is in by dividing the fragment’s position by the work group size. With this setup you can choose how many lights to calculate per tile by increasing the array size. I should also mention that you could easily store every light in a tile: say you have 1024 lights, you just allocate that much memory per tile. Since each entry is only one uint, the size stays relatively small, which is also the reason for keeping the point lights themselves in a separate array, as otherwise the grid would become quite big. I’m not sure if this is the best solution, but it works rather well for lots of lights. I’ve since looked into other techniques where a linked list is stored, which I might implement or at least try to understand. For now I believe this is one of the simplest ways to implement this technique. From now on I will just make sure everything looks good for the final presentation, and possibly implement MSAA and run some tests on both techniques to see how they perform.
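The two-array layout described above can be sketched on the CPU side. This is a minimal Python illustration rather than the actual shader code: the names (`light_indices`, `light_counts`, `tile_of_pixel` and so on) and the 16 pixel tile size are assumptions made for the example, and the grid dimensions are the ones from the picture.

```python
# Sizes from the example above: 6 x 2 tiles, at most 4 lights per tile.
TILES_X, TILES_Y = 6, 2
MAX_LIGHTS_PER_TILE = 4
TILE_SIZE = 16  # assumed work group size in pixels (16x16)

# Flat array of light indices: each tile owns MAX_LIGHTS_PER_TILE consecutive slots.
light_indices = [0] * (TILES_X * TILES_Y * MAX_LIGHTS_PER_TILE)
# Number of valid entries in each tile's slice.
light_counts = [0] * (TILES_X * TILES_Y)


def tile_of_pixel(px, py):
    """Map a fragment's pixel position to its flat tile index."""
    return (py // TILE_SIZE) * TILES_X + (px // TILE_SIZE)


def add_light(tile, light_index):
    """Append a light to a tile's list; returns False when the tile is full."""
    count = light_counts[tile]
    if count >= MAX_LIGHTS_PER_TILE:
        return False
    light_indices[tile * MAX_LIGHTS_PER_TILE + count] = light_index
    light_counts[tile] = count + 1
    return True


def lights_for_pixel(px, py):
    """Return the indices into the point light array that affect this pixel."""
    tile = tile_of_pixel(px, py)
    start = tile * MAX_LIGHTS_PER_TILE
    return light_indices[start:start + light_counts[tile]]


if __name__ == "__main__":
    add_light(tile_of_pixel(20, 3), 7)
    add_light(tile_of_pixel(20, 3), 2)
    print(lights_for_pixel(17, 15))
```

In a compute shader many threads append to the same tile at once, so the count increment would presumably have to be an atomic add on a shared variable instead of the plain read and write shown here.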

Some adjustments

After I revised my light calculation I was able to get a smaller radius on the lights, which let me fully utilize the tiled shading technique. I was able to render 4096 lights without any per-tile limit on the lights, although I did spread the lights out over a 300×300 area. I was even able to do 8196 with mild lag, but I wasn’t able to spread the lights over a large enough area, so when you looked down the scene some tiles flickered due to the imposed limit of 40 light calculations per tile. I also spent some time tinkering with the lights to get them to look good, but in the end I realised I was wasting time and I’ll do it later. You could probably optimize the code even further, but that comes later too. Here’s a fancy picture of it running on 4096 lights.

4096 lights

Tiled Deferred Shading

I’ve almost completely forgotten about posting, but fortunately I have something to show: I’ve implemented tiled deferred shading using compute shaders. I decided to use a single compute shader since that seemed far better for performance than constructing the light grid on the CPU. I implemented it as suggested by http://dice.se/wp-content/uploads/GDC11_DX11inBF3_Public.pdf .

First I calculate the min/max z value for each tile, illustrated below. The differences between the values are mostly visible around the edges.

minmax
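As a rough illustration of the per-tile min/max step, here is a small CPU-side sketch in Python. It assumes a row-major depth buffer stored as a flat list, and the function name and the default 16 pixel tile size are made up for the example. It only shows what is being computed, not how the shader does it.

```python
TILE_SIZE = 16  # assumed tile width/height in pixels


def tile_min_max_depth(depth, width, height, tile_size=TILE_SIZE):
    """Return {(tile_x, tile_y): (min_z, max_z)} for a row-major depth buffer."""
    tiles = {}
    for y in range(height):
        for x in range(width):
            z = depth[y * width + x]
            key = (x // tile_size, y // tile_size)
            # Start from (z, z) the first time a tile is seen.
            lo, hi = tiles.get(key, (z, z))
            tiles[key] = (min(lo, z), max(hi, z))
    return tiles


if __name__ == "__main__":
    # A 4x2 depth buffer split into two 2x2 tiles.
    depth = [0.1, 0.2, 0.9, 0.9,
             0.3, 0.4, 0.9, 0.8]
    print(tile_min_max_depth(depth, 4, 2, tile_size=2))
```

On the GPU each thread of a work group would typically contribute its own pixel’s depth through atomic min/max operations on shared variables, so the loop above becomes a parallel reduction per tile.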

 

Then a view frustum is calculated for each tile and each light is checked against that frustum as normal (the details are probably best explained in the aforementioned paper). After that each work group switches to processing lights, meaning that a 16×16 work group can process 256 lights in parallel, which makes for some quite fast computations. My visualisation of the tiles affected by a light’s radius below is probably an eyesore, as I didn’t care much for how it looked, but the yellow tiles show which tiles are affected by a light after the frustum culling:

tiled deferred light volumes
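The light-versus-tile check can be sketched as a sphere-against-planes test. This is one common way to do it, in view space with the camera looking down -z, and not necessarily the exact variant used in this project. All names here are hypothetical, and the tile extents are assumed to be given on the plane z = -1.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def tile_side_planes(x0, x1, y0, y1):
    """Inward-facing side planes for a tile, all passing through the eye.

    x0..x1 and y0..y1 are the tile's extents on the plane z = -1 in view
    space (camera looking down -z).
    """
    return [
        normalize((1.0, 0.0, x0)),    # left
        normalize((-1.0, 0.0, -x1)),  # right
        normalize((0.0, 1.0, y0)),    # bottom
        normalize((0.0, -1.0, -y1)),  # top
    ]


def light_touches_tile(center, radius, planes, min_dist, max_dist):
    """Conservative sphere-vs-frustum test in view space.

    min_dist and max_dist are the tile's min/max depth as positive
    distances from the eye.
    """
    depth = -center[2]  # view-space z is negative in front of the camera
    if depth + radius < min_dist or depth - radius > max_dist:
        return False
    for n in planes:
        signed_distance = sum(a * b for a, b in zip(n, center))
        if signed_distance < -radius:
            return False  # the sphere lies completely outside this plane
    return True


if __name__ == "__main__":
    planes = tile_side_planes(-0.1, 0.1, -0.1, 0.1)
    print(light_touches_tile((0.0, 0.0, -5.0), 1.0, planes, 1.0, 10.0))
    print(light_touches_tile((5.0, 0.0, -5.0), 1.0, planes, 1.0, 10.0))
```

The test is conservative: a sphere near a frustum corner can pass every plane check without really overlapping the tile, so a few lights may get shaded unnecessarily, but none are wrongly dropped.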

 

And here’s the final result showing the grids and running 1024 lights:

showing tiles

The only problem I’ve encountered, aside from learning compute shaders and setting everything up correctly, is that the radius isn’t calculated very well with my current setup. One of the key things you can do with tiled shading is set a maximum number of lights to compute per tile. I tried setting a limit of 40, which improves performance by a ton with lots of lights in a small area. The only problem is that the radius of the lights is so big that the max-lights-per-tile calculation stops working properly, as illustrated below:

troubles

 

If I set the filter to 40 lights the above happens, and if I remove it everything runs fine. I will probably look into a better calculation for the radius, as I’d like to have a lot of lights in a small area at least for testing, even though you might not usually cram more than 40 lights into a tiny area.