I made several benchmarks to stress the tasking system (yaTS) I recently developed and to see if it handles continuation / completion correctly.
Two of the tests spawn a binary tree of tasks.
You can see them in utests.cpp (CascadeNodeTask and NodeTask).
There is only one difference between the two:
- NodeTask. Here each node completes the root. This means that when a task finishes running, it decrements an atomic counter in the root task. When this counter reaches zero, the root is done.
- CascadeNodeTask. Here each node completes its parent, so the tasks finish in a cascade. (This is the classical and efficient approach with work stealing, where the tasks are processed in depth-first order.)
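To make the two completion schemes concrete, here is a minimal single-threaded sketch. It is not the yaTS code: Node, Tree, Result, makeTree, runRootCompletion and runCascadeCompletion are hypothetical names made up for this illustration, and a plain loop stands in for the scheduler. It only shows how the counters are wired and how many decrements land on the root counter in each scheme.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical node type for illustration; this is NOT the yaTS Task class.
struct Node {
    Node* parent = nullptr;
    std::atomic<int> pending{1};  // outstanding completions before this node is done
};

using Tree = std::vector<std::unique_ptr<Node>>;

struct Result {
    bool rootDone;       // did the root counter reach zero?
    int rootDecrements;  // how many decrements hit the root counter
};

// Complete binary tree in heap order: the children of node i are 2i+1 and 2i+2.
Tree makeTree(int depth) {
    assert(depth >= 0);
    const std::size_t count = (std::size_t(1) << (depth + 1)) - 1;
    Tree nodes;
    for (std::size_t i = 0; i < count; ++i) nodes.push_back(std::make_unique<Node>());
    for (std::size_t i = 1; i < count; ++i) nodes[i]->parent = nodes[(i - 1) / 2].get();
    return nodes;
}

// NodeTask style: every node decrements the single counter held by the root.
Result runRootCompletion(Tree& nodes) {
    Node& root = *nodes[0];
    root.pending.store(static_cast<int>(nodes.size()));
    int rootDecrements = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        // ... the body of task i would run here ...
        root.pending.fetch_sub(1);
        ++rootDecrements;
    }
    return {root.pending.load() == 0, rootDecrements};
}

// CascadeNodeTask style: a node waits for its own run plus its direct children,
// and notifies its parent only once its own counter reaches zero.
Result runCascadeCompletion(Tree& nodes) {
    const std::size_t n = nodes.size();
    for (std::size_t i = 0; i < n; ++i) {
        const int children = (2 * i + 1 < n ? 1 : 0) + (2 * i + 2 < n ? 1 : 0);
        nodes[i]->pending.store(1 + children);
    }
    int rootDecrements = 0;
    // Reverse heap order runs children before parents (a stand-in for depth-first order).
    for (std::size_t i = n; i-- > 0;) {
        Node* node = nodes[i].get();
        while (node != nullptr) {
            if (node == nodes[0].get()) ++rootDecrements;
            // fetch_sub returns the previous value: 1 means the counter is now zero.
            if (node->pending.fetch_sub(1) != 1) break;  // still waiting on something
            node = node->parent;                         // node is done: complete its parent
        }
    }
    return {nodes[0]->pending.load() == 0, rootDecrements};
}
```

In this sketch, calling runRootCompletion on makeTree(3) (15 nodes) applies 15 decrements to the root counter, while runCascadeCompletion applies 3 (the root's own run plus its two children). Whether that is what explains the timings below is an assumption on my part, not something the tests measure.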
This leads to interesting results on my i7 machine (4 cores / 8 threads):
- NodeTask
1 thread == 237 ms
8 threads == 213 ms
Speed up == x1.1
- CascadeNodeTask
1 thread == 237 ms
8 threads == 54 ms
Speed up == x4.4 (> 4 => Congratulations, hyper-threading!)
(EDIT: please do not read this as a performance "study". It is just a random but interesting performance difference I noticed while writing functional tests for the yaTS code.)

2 comments:
Cool. Any chance of testing this code on a 6-core AMD CPU?
Take the code from here:
http://code.google.com/p/yats/