OpenCL Slisesix

 

The goal when I started this demo was to let the user navigate the world originally created by Inigo Quilez for his 4k SliseSix demo (see it here). I then decided to add the possibility to render high-quality pictures of static views. Here are some important points about this demo:

  • The SliseSix demo is not my work! I just want to make this clear in case people think I stole someone else's work. The SliseSix demo is by Inigo Quilez. The world is generated by ray marching a distance field (a minimal illustration of that idea appears after this list). I just wanted to render the same world in real time with a controllable camera. To this aim, I used OpenCL 1.0. I simply took the simplified SliseSix code from the Shader Toy web application.
  • I was very frustrated that the demo ran barely interactively at 640×480 and only reached real time at 320×240 (GeForce GTX 275), so I decided to add a high-quality version.
  • I first started developing on a Radeon HD 4650, and the AMD Stream SDK supports float3 (OpenCL 1.1). Then I moved to the GTX 275, so I had to change every float3 to float4. But in this case the kernel sometimes ended up killed by the driver because it was taking too much time. Now I know that if you only have one graphics card, a kernel can only run for about 2 seconds. Without knowing this, I assumed the kernel was using too much local memory (thus fewer parallel work-items, etc.), so I ended up replacing every float4 with single floats. The final demo is released in this state.
  • At the time I was working on this demo, I was previewing the FXAA3 method thanks to Timothy Lottes: it is a post-process, screen-space anti-aliasing algorithm.
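To illustrate the ray-marching idea mentioned above, here is a minimal sphere-tracing sketch in OpenCL C. It is not the actual SliseSix code: sceneDistance(), the step budget and the epsilon are placeholders, and the demo itself avoids float3 for the reasons given above.

    /* Minimal sphere-tracing sketch (illustrative only, not the SliseSix code).
       sceneDistance() returns the signed distance to the closest surface.      */
    float sceneDistance(float3 p);   /* placeholder: the scene's distance field */

    float rayMarch(float3 origin, float3 dir, float maxDist)
    {
        float t = 0.0f;
        for (int i = 0; i < 128; ++i)            /* step budget (placeholder) */
        {
            float d = sceneDistance(origin + t * dir);
            if (d < 0.001f)
                return t;                        /* close enough: we hit a surface */
            t += d;                              /* safe step: nothing is closer than d */
            if (t > maxDist)
                break;
        }
        return -1.0f;                            /* no intersection found */
    }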

The source code, together with Win32 executables of this demo, can be downloaded here (last updated 09/07/2011, including bug fixes).
Many thanks to Eyebex for his bug fixes.

 

The interactive demo

The interactive demo features the original shader from the Shader Toy application plus:

  • interactive camera
  • volumetric fog on the ground
  • animated, glowing creature and color correction. I wanted the creature to "suck the light around her" periodically.
  • tentacles that go into the ground so they do not terminate abruptly

I have also compiled some simpler versions of the scene in order to make the demo run faster.

 

The high quality demo

The high-quality version of the demo allows you to generate high-resolution screenshots of the scene with two additional effects: depth-of-field and single light scattering.

The depth-of-field blur effect is achieved by averaging several views of the scene. For each of these views, the camera center of projection is slightly moved. At the same time, rays are forced to pass through the same position in space on the plane located at the focal distance. The offsets applied to the camera position follow a Poisson-disk distribution. There is no bokeh effect.
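As a rough sketch (with hypothetical names) of how such a depth-of-field ray can be built: the jittered origin and the fixed point on the focal plane define the new ray direction.

    /* Depth-of-field ray construction sketch (hypothetical names).
       The camera origin is jittered inside the lens disk, and the ray is re-aimed
       at the point where the un-jittered ray crosses the focal plane, so that
       plane stays sharp while everything in front of or behind it gets blurred.  */
    float3 dofRayDir(float3 camPos, float3 centerDir, float focalDist,
                     float2 poissonOffset,   /* Poisson-disk sample in [-1,1]^2       */
                     float disparity,        /* aperture size: higher = stronger blur */
                     float3 camRight, float3 camUp)
    {
        /* The point that must remain in focus, on the focal plane. */
        float3 focusPoint = camPos + focalDist * centerDir;

        /* Slightly moved center of projection. */
        float3 jitteredPos = camPos + disparity * (poissonOffset.x * camRight +
                                                   poissonOffset.y * camUp);

        return normalize(focusPoint - jitteredPos);
    }

Each offset produces one render of the scene, and the final picture is simply the average of all of them.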

Here are some screenshots of pictures with depth-of-field applied using several disparity values (the higher the disparity, the more it looks like a macro lens was used to take the shot):

 

The high quality version also features single light scattering. In fact, I wanted to get some kind of god rays coming from a light source hidden behind a column. Single light scattering is a good and easy way to add complex light behavior in participating media. A good description of single light scattering can be found in this paper (but I am not using the closed-form solution here).

Computing single light scattering requires integrating the incoming light along the primary ray cast from the camera. The amount of light reaching the camera photo-receptors depends on the attenuation from the source due to the participating media, the phase function describing how light bounces off particles, and the scattering properties of the participating media. In my case, I have decided to use the Schlick approximation to the Henyey-Greenstein phase function. Also, I ignore light attenuation from the sampled point on the primary ray to the camera (this is deliberate, to get the overall look I desired).
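For reference, the standard form of Schlick's approximation looks like the following (this is the textbook formula, not necessarily the exact constants used in the demo; sign conventions for the scattering angle vary between papers):

    /* Schlick's approximation to the Henyey-Greenstein phase function
       (textbook form; the constants used in the demo may differ).
       cosTheta: cosine of the scattering angle (1 = scattered straight forward),
       g: anisotropy in [-1,1] (g > 0 means forward scattering).                 */
    float phaseSchlick(float cosTheta, float g)
    {
        float k = 1.55f * g - 0.55f * g * g * g;   /* common remapping of g to k */
        float denom = 1.0f - k * cosTheta;
        return (1.0f - k * k) / (4.0f * M_PI_F * denom * denom);
    }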

The problem with computing this integral is that it is very costly, especially when computing the light-source visibility using ray marching + distance field. To accelerate the visibility test, I ended up using a shadow map generated with a latitude/longitude mapping from the light's point of view. I know the ray distribution is not uniform with this mapping, but it was simple enough to avoid dealing with cube-map face selection, and the result was visually flawless.
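A sketch of what such a latitude/longitude lookup can look like (hypothetical names and conventions; the demo's exact mapping may differ):

    /* Latitude/longitude shadow-map lookup sketch (hypothetical names and
       conventions). The direction from the light to the sampled point is
       converted to spherical angles, which index the depth map; the stored
       distance is compared to the actual one to decide visibility.          */
    float lightVisibility(float3 samplePos, float3 lightPos,
                          __global const float *shadowMap,
                          int mapW, int mapH)          /* e.g. 512 x 256 */
    {
        float3 toSample = samplePos - lightPos;
        float dist = length(toSample);
        float3 d = toSample / dist;

        float lon = atan2(d.z, d.x);                   /* longitude in [-pi, pi] */
        float lat = acos(clamp(d.y, -1.0f, 1.0f));     /* latitude  in [0, pi]   */

        int u = clamp((int)((lon / (2.0f * M_PI_F) + 0.5f) * mapW), 0, mapW - 1);
        int v = clamp((int)((lat / M_PI_F) * mapH), 0, mapH - 1);

        /* Visible if nothing stored in the map is closer to the light. */
        return (dist <= shadowMap[v * mapW + u] + 0.01f) ? 1.0f : 0.0f;
    }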

The problem was that, even under these conditions, the process was too slow to run on the GPU, so I decided to run the single light scattering simulation on the CPU. This was really easy given that OpenCL syntax is almost the same as C. I also enabled OpenMP to get more speed (I have 4 cores on my i5); a sketch of this loop follows the table below. Using such macro-optimizations together with a shadow map resolution of 512×256 really accelerated the process, as you can see in the following table:


                          320×240       640×480
    2nd ray cast          2 min 37 s    10 min 44 s
    Shadow map            4 s           9.4 s
    Shadow map + OpenMP   1.3 s         3.1 s
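For the curious, here is a minimal sketch of what the OpenMP-parallelized CPU loop can look like (hypothetical names; scatteringAtPixel() stands for the ported kernel body):

    /* CPU-side parallelization sketch (hypothetical names, error handling omitted).
       Since OpenCL C is almost plain C, the kernel body is reused here as
       scatteringAtPixel(), and OpenMP spreads the scanlines over the cores.        */
    #include <omp.h>

    float scatteringAtPixel(int x, int y);   /* placeholder: the ported kernel body */

    void computeScattering(float *out, int width, int height)
    {
        #pragma omp parallel for schedule(dynamic)
        for (int y = 0; y < height; ++y)
            for (int x = 0; x < width; ++x)
                out[y * width + x] = scatteringAtPixel(x, y);   /* per-pixel integral */
    }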

 

Here are two screenshots of the single light scattering effect:


The shadow map generated from the light's point of view

 

Finally, here is a screenshot with all effects combined: scene + depth-of-field + single light scattering.

 

Screen-Space Anti-Aliasing: FXAA3

After SSAO, SSAA is THE new buzzword! Overall, screen-space-something has been the buzzword of the last few years! That makes me laugh, but we have to face it: despite the flaws inherent to the screen-space approach, many effects can benefit from it. Anti-aliasing is one of them.

SSAA allows you to filter the detected edges of a single image in order to remove jaggies. I found FXAA3 really good as it only requires per-pixel color and luma (no depth buffer). All FXAA3 credit goes to Timothy Lottes: his blog and Twitter. More information about SSAA will appear in an upcoming SIGGRAPH course. The demo here features only a preview of the algorithm; I will update the whole file later. Please visit Timothy's blog to get the latest version of the shader.
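A common way to feed FXAA is to compute a luma estimate per pixel and store it in the alpha channel before the anti-aliasing pass. Here is a minimal sketch of such a pre-pass (this is not FXAA3 itself; see Timothy's shader for the real algorithm, and the weights below are just the usual Rec. 601 ones):

    /* Luma pre-pass sketch (not FXAA3 itself). FXAA works from a per-pixel luma
       estimate, commonly stored in the alpha channel so the anti-aliasing pass
       only needs the color buffer.                                               */
    __kernel void lumaPass(__global const float4 *color,
                           __global float4 *out, int width)
    {
        int x = get_global_id(0);
        int y = get_global_id(1);
        float4 c = color[y * width + x];
        float luma = 0.299f * c.x + 0.587f * c.y + 0.114f * c.z;   /* Rec. 601 weights */
        out[y * width + x] = (float4)(c.x, c.y, c.z, luma);
    }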

 

Improving the demo

If you want to, you can get the source code and improve it as you wish. Here are some features that could be added:

  • micro-optimizations (SSE for the CPU version)
  • render only a sub-space of the screen for each kernel call, distributing the computation over several frames (see the sketch after this list); this should stop Windows from aborting the kernel execution
  • real volumetric fog with varying density
  • add more animated stuff
  • build your own scene!
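As a sketch of the second point above, the host could pass a starting row to the kernel and launch only one band of rows per frame (hypothetical names and argument indices; error checks omitted):

    /* Host-side sketch of rendering one horizontal band per frame. Each launch
       only covers a fraction of the rows, so it finishes well before the
       watchdog limit mentioned earlier.                                        */
    #include <CL/cl.h>

    #define NUM_BANDS 8   /* assumed: tune so each band stays far below ~2 seconds */

    void renderBand(cl_command_queue queue, cl_kernel kernel,
                    size_t width, size_t height, int frame)
    {
        size_t bandHeight = height / NUM_BANDS;
        cl_int startRow   = (cl_int)((frame % NUM_BANDS) * bandHeight);

        /* OpenCL 1.0 has no global work offset, so the starting row is passed
           as a kernel argument instead (argument index 2 is an assumption).   */
        clSetKernelArg(kernel, 2, sizeof(cl_int), &startRow);

        size_t global[2] = { width, bandHeight };
        clEnqueueNDRangeKernel(queue, kernel, 2, NULL, global, NULL, 0, NULL, NULL);
    }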

I think I will also come up with my own scene soon, with some of the points above implemented. Feel free to modify the demo and improve it with your own artistic view! If you tell me, I will add a link to your modifications and/or new scenes produced with my framework on this web page!
