Sunday, December 18, 2011

Sauerbraten: a game and an engine that rock

Quick post about a fantastic game and engine: Sauerbraten (you may find it here).
If you are tired of design pattern BS, the joke of static type introspection with SFINAE, the over-design of everything just to print a string... Sauerbraten is for you.

This is one of the code bases that has impressed me the most recently. Just straightforward and fantastic code. Everything is packed into 70,000 lines of code: very nice custom (and really fast) UDP network code (ENet), an insanely small scripting engine, straight-to-the-point rendering code, very cool AI, a superb UI (3D is even supported), a simple but powerful shader system...

Really, look at the code and you will see what a brutal and brilliant coding style means.

Monday, December 5, 2011

Global variables and cvar in tasking systems...

Usually it is just better to avoid global variables, but one thing I really want in my small video game engine is Quake-style console variables (cvars). They are handy and let you tweak your system very easily.

The problem is that, in a fully distributed system, changing a cvar from the console is just terrible. If you do some frame overlapping (for example), you may completely screw up your system (say you decrease or increase the maximum number of particles on screen...). Well, cvars simply become nasty race conditions.

Initially, I wanted to do something complicated, i.e. some copy-on-write stuff. You keep a fully reference counted copy of your cvars and, when you modify them, you copy them into a new reference counted state. Then, instead of reading your variables from the global state, you fetch them from your copy. As long as the copy is valid (i.e. no modification has been made), you can keep it. So it is not that expensive.
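
For reference, here is a minimal sketch of what that copy-on-write idea could look like in C++11. Everything in it (CvarState, CvarStore, the fields, cowDemo) is a hypothetical name made up for illustration, and a plain mutex protects the pointer swap to keep the sketch short:

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical set of cvars; the fields are only examples.
struct CvarState {
  int particleNum = 1024;
  float gravity = 9.81f;
};

class CvarStore {
public:
  CvarStore() : current(std::make_shared<CvarState>()) {}

  // Readers take a reference counted snapshot and read from it, not from a global.
  std::shared_ptr<const CvarState> snapshot() const {
    std::lock_guard<std::mutex> guard(mutex);
    return current;
  }

  // Writers copy the current state, change the copy and publish it.
  // Snapshots taken earlier stay alive and untouched until their owners drop them.
  template <typename F> void modify(F change) {
    std::lock_guard<std::mutex> guard(mutex);
    std::shared_ptr<CvarState> next = std::make_shared<CvarState>(*current);
    change(*next);
    current = next;
  }

private:
  mutable std::mutex mutex;
  std::shared_ptr<const CvarState> current;
};

// Usage: a snapshot taken before a write keeps the old value.
inline void cowDemo() {
  CvarStore store;
  std::shared_ptr<const CvarState> before = store.snapshot();
  store.modify([](CvarState &s) { s.particleNum = 2048; });
  assert(before->particleNum == 1024);
  assert(store.snapshot()->particleNum == 2048);
}
```

Even in this tiny form, every reader has to carry a snapshot around, which is exactly what makes the cvars stop feeling global.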

The problem is that it is just uber-complicated, and your cvars are not global anymore and therefore not that handy. Too bad. The idea I finally decided to implement (not done yet) is to lock the tasking system when you change a global cvar. Basically, you stop all the other threads: you simply make each of them go to sleep once it finishes its current task and, once everyone except the locking thread is sleeping, you modify the variable.

So it is something like:
TaskingSystemLock();
cvar = ...
TaskingSystemUnlock();

It is brutal and super expensive, but you don't care since this is really rare anyway. From the perspective of the other threads (the ones that go to sleep), you absolutely know that nothing is going to change while you run a task, since the lock never crosses a task boundary.

So, for the usual path (i.e. you just read the cvar), you do not need any mutual exclusion since you know nothing is going to change inside the task itself.
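
To make the idea concrete, here is a rough C++11 sketch of such a stop-the-world lock. It is not the yaTS code: TaskGate, taskBoundary, workerDone, Cvars and runCvarDemo are all hypothetical names, and the sketch assumes the locking thread is not itself a worker (think of a console thread). Workers only ever park between two tasks, so inside a task the cvars are read with no mutual exclusion at all.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stop-the-world gate (illustration only, not the yaTS API).
class TaskGate {
public:
  explicit TaskGate(int workerNum) : workers(workerNum) {}

  // Called by a worker between two tasks, never inside one.
  void taskBoundary() {
    std::unique_lock<std::mutex> guard(mutex);
    if (!lockRequested) return;
    ++parked;
    cond.notify_all();  // the locker may now see everybody asleep
    cond.wait(guard, [this] { return !lockRequested; });
    --parked;
  }

  // Called by a worker that has no more tasks to run.
  void workerDone() {
    std::lock_guard<std::mutex> guard(mutex);
    --workers;
    cond.notify_all();
  }

  // Called by a non-worker thread before it writes a cvar.
  void lock() {
    writer.lock();  // serializes concurrent lock requests
    std::unique_lock<std::mutex> guard(mutex);
    lockRequested = true;
    cond.wait(guard, [this] { return parked == workers; });
  }

  void unlock() {
    {
      std::lock_guard<std::mutex> guard(mutex);
      assert(lockRequested);
      lockRequested = false;
    }
    cond.notify_all();
    writer.unlock();
  }

private:
  std::mutex writer, mutex;
  std::condition_variable cond;
  int workers;
  int parked = 0;
  bool lockRequested = false;
};

// Two hypothetical cvars that must always change together.
struct Cvars { int particleNum = 1024; int particleBytes = 1024 * 16; };
static Cvars cvars;

// Runs workerNum workers for taskNum tasks each while the calling thread rewrites
// the cvars writeNum times. Returns how many tasks saw a half-updated pair.
int runCvarDemo(int workerNum, int taskNum, int writeNum) {
  TaskGate gate(workerNum);
  std::atomic<int> tornReads(0);
  std::vector<std::thread> threads;
  for (int i = 0; i < workerNum; ++i)
    threads.emplace_back([&] {
      for (int t = 0; t < taskNum; ++t) {
        // One task: plain reads of the cvars, no mutex needed.
        const int num = cvars.particleNum;
        std::this_thread::yield();
        if (cvars.particleBytes != num * 16) ++tornReads;
        gate.taskBoundary();
      }
      gate.workerDone();
    });
  for (int w = 1; w <= writeNum; ++w) {
    gate.lock();  // every worker is now parked or finished
    cvars.particleNum = 1024 * w;
    cvars.particleBytes = cvars.particleNum * 16;
    gate.unlock();
  }
  for (auto &th : threads) th.join();
  return tornReads.load();
}
```

A call such as runCvarDemo(4, 2000, 50) should return 0: a task never observes the two cvars out of sync, because the writer only touches them while every worker is parked at a task boundary or already finished. Letting a worker take the lock from inside a task would need extra bookkeeping (that worker must not wait for itself) which this sketch leaves out.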

It is going to take some time to have a real-life test case in point-frag, but the ongoing implementation in yaTS seems to be straightforward.

EDIT: Note also that several threads can request a lock simultaneously. This is not a problem: the requests are properly serialized (like with any other lock). The only thing is that all variables that may be modified globally have to be reloaded after the lock.

Friday, December 2, 2011

Quick and dirty compiler tests

With point-frag, I am trying to support as many compilers and as many platforms (well, Windows, Linux and soon Mac...) as I can. I hope to be able to make a benchmark out of it at some point. Anyway, I quickly compared GCC, ICC and Clang on Linux with my small ray tracing test.
Here are the numbers for my Nehalem i7 (4 cores / 8 threads, triple-channel memory, basically the high-end chip of 3 years ago). The code is compiled for SSE2 and does not use anything more recent.

GCC (4.6.2)
ray packet: 90 million rays/s
single rays: 9.9 million rays/s

ICC (12.1.0)
ray packet: 95.6 million rays/s
single rays: 10.6 million rays/s

CLANG (3.1 compiled from svn)
ray packet: 94.2 million rays/s
single rays: 9.9 million rays/s

Unsurprisingly, the Intel compiler is slightly faster than the other compilers on an Intel processor. More surprisingly, the speed difference is quite small: ICC is only about 6 to 7% ahead of GCC here. Two years ago, the GCC / ICC difference was about 15% on the same machine. The latest Clang also outputs some fast code.

Good job open source compiler guys!