I took it upon myself to implement a better multithreading system. Before, each thread rendered one quadrant of the scene, but when a thread finished its quadrant it would idle until the other threads finished theirs.
The new implementation consists of a queue of tasks shared by all the threads: a list of lambda functions that each thread interacts with in this order...
1. The thread grabs the first function in the list and temporarily stores it.
2. The thread removes that function from the list by removing the first element (since that's what it grabbed).
3. The thread sets itself to busy and executes the function. At this point, any other thread can grab the new first task in the list and go through the same steps in its own context.
4. After it finishes the function, the thread checks for another task that needs completing; if there is none, it idles until another task is scheduled.
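The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual ThreadPool class from the project: the names `SimplePool`, `enqueue`, `waitIdle` and `runDemo` are made up for this example, and the busy flag is modelled as a simple counter.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool; illustrative only, not the project's ThreadPool.
class SimplePool {
public:
    explicit SimplePool(std::size_t threadCount) {
        for (std::size_t i = 0; i < threadCount; ++i)
            workers_.emplace_back([this] { workerLoop(); });
    }

    // Lets the workers drain any remaining tasks, then joins them.
    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : workers_) t.join();
    }

    SimplePool(const SimplePool&) = delete;
    SimplePool& operator=(const SimplePool&) = delete;

    // Schedule a task and wake one idle worker to pick it up.
    void enqueue(std::function<void()> task) {
        assert(task && "cannot enqueue an empty task");
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        wake_.notify_one();
    }

    // Block until the queue is empty and no worker is busy.
    void waitIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return tasks_.empty() && busy_ == 0; });
    }

private:
    void workerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Last step: idle until a task is scheduled (or shutdown).
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;        // only reachable when stopping
                task = std::move(tasks_.front());  // grab the first function
                tasks_.pop();                      // remove it from the list
                ++busy_;                           // mark this thread as busy
            }
            task();  // runs outside the lock, so other threads can grab tasks
            {
                std::lock_guard<std::mutex> lock(mutex_);
                --busy_;
                if (tasks_.empty() && busy_ == 0) idle_.notify_all();
            }
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;  // signalled when work arrives or on shutdown
    std::condition_variable idle_;  // signalled when all work is finished
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    std::size_t busy_ = 0;
    bool stopping_ = false;
};

// Usage sketch: 16 tiny tasks shared between 4 workers.
int runDemo() {
    std::atomic<int> done{0};
    SimplePool pool(4);
    for (int i = 0; i < 16; ++i)
        pool.enqueue([&done] { ++done; });
    pool.waitIdle();
    return done.load();
}
```

The key detail is that the task runs outside the lock: the mutex only protects the list itself, so a thread that is busy rendering never stops another thread from taking the next task.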
The class for this is called ThreadPool. I took inspiration from, and heavily based the code on, an answer to a StackOverflow question, here. Once I got my head around how it works, I spliced it into the rendering function, then implemented spatial partitioning with a 3x3 grid.
It's safe to say that the spatial partitioning doesn't contribute much optimization to the program; if anything, it possibly slows it down ever so slightly compared to running without it.
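To give an idea of what a 3x3 split could look like, here is a small sketch that carves an image into a grid of tiles, each of which could then be handed to the pool as its own task. It assumes the grid is applied to the image in pixel space, which may not be exactly how the project does it; `Tile`, `makeTiles` and `coveredPixels` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile: half-open pixel ranges [x0, x1) x [y0, y1).
struct Tile {
    int x0, y0, x1, y1;
};

// Split a width x height image into cols x rows tiles.
std::vector<Tile> makeTiles(int width, int height, int cols, int rows) {
    assert(width > 0 && height > 0 && cols > 0 && rows > 0);
    std::vector<Tile> tiles;
    tiles.reserve(static_cast<std::size_t>(cols) * static_cast<std::size_t>(rows));
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            // Boundaries come from integer division, so neighbouring tiles
            // share exact edges and leftover pixels are spread across tiles.
            tiles.push_back({c * width / cols, r * height / rows,
                             (c + 1) * width / cols, (r + 1) * height / rows});
        }
    }
    return tiles;
}

// Sanity helper: total number of pixels covered by all tiles.
long long coveredPixels(const std::vector<Tile>& tiles) {
    long long total = 0;
    for (const Tile& t : tiles)
        total += static_cast<long long>(t.x1 - t.x0) * (t.y1 - t.y0);
    return total;
}
```

For a 640x480 image, `makeTiles(640, 480, 3, 3)` gives nine tiles that together cover every pixel exactly once, even though 640 is not divisible by 3.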
Code can be found here on GitHub!
Below are the results of the runs I ran to somewhat benchmark the implementation(s):
| Type | Run #1 | Run #2 | Run #3 | Run #4 | Run #5 | Average |
|---|---|---|---|---|---|---|
| 3 Threads (FX-6300 @ 3.5 GHz) | 38.863160 | 40.543365 | 38.157259 | 37.682247 | 37.394116 | 38.5280294 |
| 4 Threads (FX-6300 @ 3.5 GHz) | 34.056747 | 31.281899 | 32.647582 | 32.290473 | 33.405545 | 32.7364492 |
| 5 Threads (FX-6300 @ 3.5 GHz) | 29.689700 | 30.674843 | 27.929297 | 28.853019 | 30.538307 | 29.5370332 |
| 6 Threads (FX-6300 @ 3.5 GHz) | 28.765418 | 26.459776 | 26.581819 | 26.335517 | 26.206820 | 26.86987 |
| 6 Threads (i7 - 4770 @ 3.4 GHz)* | 14.189984 | 13.918125 | 13.928597 | - | - | 14.012235 |
| 7 Threads (i7 - 4770 @ 3.4 GHz)* | 13.477395 | 13.493861 | 13.462990 | - | - | 13.478082 |
| 8 Threads (i7 - 4770 @ 3.4 GHz)* | 13.345843 | 13.267355 | 13.215825 | - | - | 13.276341 |
* includes spatial partitioning using a 3x3 grid.
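As a quick way to read the table, the averages and the relative speedup between two configurations can be worked out like this. It's only a small sketch, with `mean` and `speedup` being helper names made up for this example; the numbers fed into it are the FX-6300 runs from the table.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a set of run times.
double mean(const std::vector<double>& runs) {
    assert(!runs.empty());
    return std::accumulate(runs.begin(), runs.end(), 0.0) /
           static_cast<double>(runs.size());
}

// How many times faster the candidate is than the baseline.
double speedup(double baselineTime, double candidateTime) {
    assert(candidateTime > 0.0);
    return baselineTime / candidateTime;
}

// FX-6300 runs copied from the table.
const std::vector<double> fx3Threads = {38.863160, 40.543365, 38.157259,
                                        37.682247, 37.394116};
const std::vector<double> fx6Threads = {28.765418, 26.459776, 26.581819,
                                        26.335517, 26.206820};
```

Here `speedup(mean(fx3Threads), mean(fx6Threads))` comes out at roughly 1.43, so doubling the thread count from 3 to 6 on the FX-6300 made the render about 1.43 times faster rather than twice as fast.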
Here's an example of the quadrants being rendered...
And here's a failed example from when I tried to call the trace function with a different width, one that referred to the quadrant itself...
Not quite what the result needed to be...
And here's also my friend using his i7 - 4770 quad core destroying the times of my FX CPU :(