Raytracer Optimization, and a dabble of multithreading

We were given a raytracer program and tasked with optimizing it, since an i7-6700 clocked at 3.4 GHz (4 cores, 8 threads) was taking a consistent 47 seconds to render the scene (not that it was using multithreading to begin with).
Taking the same project and establishing a baseline on my own machine brought this number to around 100.12 seconds on average (yay for 2013 AMD technology, the FX-6300 slightly blowing its own dust off with its strong 3 cores and 6 threads). Mind you, this is with a moderate load of Chrome tabs and Visual Studio open. Nothing major like gaming was happening in the background on either machine...

Since the framework supported TIMING_PER_PIXEL, I thought I would take advantage of that to see which parts of the scene the CPU struggled to render, and maybe use multithreading for those parts...
A brighter colour indicates that more time was spent rendering that row/column compared to the others, so rendering these parts in separate threads while other threads work around them might be beneficial...
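To make the brightness idea concrete, here is a minimal sketch of how a per-pixel timing map could be built. It is not the framework's TIMING_PER_PIXEL implementation, which I have not reproduced here; `timingHeatmap` and the `renderPixel` callback are hypothetical names standing in for whatever the raytracer does per pixel.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the framework): times a per-pixel callback
// and returns one brightness byte per pixel, where the slowest pixel maps to 255.
template <typename RenderPixel>
std::vector<std::uint8_t> timingHeatmap(int width, int height, RenderPixel renderPixel) {
    assert(width > 0 && height > 0);
    using Clock = std::chrono::steady_clock;
    std::vector<double> seconds(static_cast<std::size_t>(width) * height);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const auto start = Clock::now();
            renderPixel(x, y);  // Stand-in for the raytracer's per-pixel work.
            const std::chrono::duration<double> elapsed = Clock::now() - start;
            seconds[static_cast<std::size_t>(y) * width + x] = elapsed.count();
        }
    }

    // Normalise against the slowest pixel; if every timing is zero, stay black.
    const double slowest = *std::max_element(seconds.begin(), seconds.end());
    std::vector<std::uint8_t> brightness(seconds.size(), 0);
    if (slowest > 0.0) {
        for (std::size_t i = 0; i < seconds.size(); ++i) {
            brightness[i] = static_cast<std::uint8_t>(seconds[i] / slowest * 255.0);
        }
    }
    return brightness;
}
```

Writing the returned bytes out as a greyscale image would give a picture where the expensive regions show up brightest, which is the kind of view described above.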

Below is an outline of how each rendering thread is set up:

std::thread thread([&] {  //[&] captures the surrounding variables by reference.
    //Loop over all the pixels in a specific area...
        //Retrieve the colour of the specified pixel. The math below converts pixel coordinates (0 to width) into camera coordinates (-1 to 1).
        //Clamp the output colour to 0-1 range before conversion.
        //Convert from linear space to sRGB.
        //Write the colour to the image (scaling up by 255).

    //Perform progressive display if enabled.
});
//Once every thread has been started, call join() on each one before using the image.
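For anyone who wants something compilable, here is a rough sketch of the same idea with the image split down the middle between two threads. The names `renderColumns`, `renderInTwoHalves`, and `renderPixel` are made up for illustration and are not from the framework; the sketch also assumes the per-pixel work is safe to run concurrently for different pixels.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: runs the per-pixel work for columns [xBegin, xEnd) of every row.
template <typename RenderPixel>
void renderColumns(int xBegin, int xEnd, int height, RenderPixel& renderPixel) {
    for (int y = 0; y < height; ++y) {
        for (int x = xBegin; x < xEnd; ++x) {
            renderPixel(x, y);
        }
    }
}

// Splits the image in half down the middle and renders each half in its own thread.
template <typename RenderPixel>
void renderInTwoHalves(int width, int height, RenderPixel renderPixel) {
    assert(width > 0 && height > 0);
    const int middle = width / 2;

    // [&] captures by reference, so both threads share the same variables.
    // Each thread only touches its own columns, so no pixel is handled twice.
    std::thread left([&] { renderColumns(0, middle, height, renderPixel); });
    std::thread right([&] { renderColumns(middle, width, height, renderPixel); });

    // A std::thread must be joined (or detached) before it is destroyed.
    left.join();
    right.join();
}
```

Both threads are started before either is joined, otherwise the second half would only begin once the first had finished.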

And here are the results of the test runs, only 5 runs because I don't have that much time on hand...

All times are in seconds.

Type        Run #1     Run #2     Run #3     Run #4     Run #5     Average
Dry Run     95.5108    99.0494    98.1093    104.7610   103.1743   100.1210
2 Threads   73.9766    73.2554    72.8245    73.5884    71.8390    73.0968

All that changed was that the rendering now ran in two dedicated threads, each running the same code with references to all of its variables. That cut the average from 100.1210 to 73.0968 seconds, roughly 27% less time (about a 1.37x speedup), which is well short of the 2x that two threads could ideally give. So perhaps some more optimization is needed, or a change to how the rendering goes about its work in the first place.
The image above shows two threads rendering simultaneously, with the image split in half down the middle. thread1 lags behind thread2 because of the complexity of the highlighted area of the image.
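When one half of the image is more expensive than the other, the thread that owns the cheap half sits idle while the other one finishes. One possible way around that, sketched here purely as an assumption and not something I have measured on this raytracer, is to let threads claim rows one at a time from a shared counter so that nobody is stuck with all the hard pixels. `renderRowsDynamically` and `renderPixel` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical sketch: each thread repeatedly claims the next unrendered row,
// so a thread that finishes an easy row simply moves on to another one.
template <typename RenderPixel>
void renderRowsDynamically(int width, int height, unsigned threadCount, RenderPixel renderPixel) {
    assert(width > 0 && height > 0 && threadCount > 0);
    std::atomic<int> nextRow{0};

    auto worker = [&] {
        // fetch_add returns the previous value, so every row index is claimed exactly once.
        for (int y = nextRow.fetch_add(1); y < height; y = nextRow.fetch_add(1)) {
            for (int x = 0; x < width; ++x) {
                renderPixel(x, y);
            }
        }
    };

    std::vector<std::thread> threads;
    threads.reserve(threadCount);
    for (unsigned i = 0; i < threadCount; ++i) {
        threads.emplace_back(worker);
    }
    for (auto& t : threads) {
        t.join();  // Wait for every worker before the image is used.
    }
}
```

The trade-off is a small amount of contention on the shared counter in exchange for more even work per thread; whether that wins on this scene would need the same kind of timing runs as the table above.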

Tom Lynn

Australia http://rubbix.net