Thursday, December 27, 2012

Playing with oprofile on Linux

I just spent some time using oprofile on Linux. oprofile lets you profile basically everything running on your system (kernel, libraries and applications) with fairly low overhead.
Lots of details here: http://oprofile.sourceforge.net/about/

A quick overview:

1. make oprofile use your kernel image (root). Skip this step if you do not care about kernel symbols
$ opcontrol --vmlinux=/usr/src/linux-3.2.13-1-ARCH/vmlinux


2. make oprofile attribute time spent in libraries to the application that uses them (root)
$ opcontrol --separate=lib

3. start oprofile (root) 
$ opcontrol --start

4. run your program for a while, then report the time spent in each function of "cube_client" :-)
$ opreport --demangle=smart  --symbols ~/src/cube/src/cube_client


5. You get this:
CPU: AMD64 family12h, speed 1497.22 MHz (estimated)
Counted CPU_CLK_UNHALTED events (CPU Clocks not Halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name               symbol name
68078004 72.8798  fglrx_dri.so             /usr/lib/dri/fglrx_dri.so
3984600   4.2657  cube_client              world::render_seg_new(float, float, float, int, int, int, int, int)
3060858   3.2768  cube_client              world::isoccluded(float, float, float, float, float)
2838442   3.0386  cube_client              rdr::render_flat(int, int, int, int, int, sqr*, sqr*, sqr*, sqr*, bool)
2696379   2.8866  libc-2.15.so             __mcount_internal
1777893   1.9033  cube_client              world::render_wall(sqr*, sqr*, int, int, int, int, int, sqr*, sqr*, bool)
1664943   1.7824  libc-2.15.so             mcount
1450401   1.5527  libm-2.15.so             /lib/libm-2.15.so
794027    0.8500  libc-2.15.so             _wordcopy_fwd_aligned
787522    0.8431  cube_client              world::computeraytable(float, float)
687461    0.7360  cube_client              rdr::render_square(int, float, float, float, float, int, int, int, int, int, sqr*, sqr*, bool)
669011    0.7162  cube_client              rdr::ogl::lookuptex(int, int&, int&)
640268    0.6854  fglrx-libGL.so.1.2       /usr/lib/fglrx/fglrx-libGL.so.1.2
603660    0.6462  cube_client              rdr::render_flatdelta(int, int, int, int, float, float, float, float, sqr*, sqr*, sqr*, sqr*, bool)
486056    0.5203  cube_client              rdr::ogl::drawframe(int, int, float)
441795    0.4730  cube_client              rdr::ogl::addstrip(int, int, int)
164852    0.1765  libc-2.15.so             __memmove_sse2
160559    0.1719  cube_client              _ZN7physics7collideEP6dynentbff.constprop.6
 

.... 
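The report is plain columnar text, so it is easy to post-process. Here is a minimal sketch in Python that sums the percentage per image name. It assumes the layout shown above (samples, percent, image name, symbol name, separated by whitespace) and image names without spaces. The function names `parse_opreport` and `percent_by_image` are my own and not part of oprofile.

```python
from collections import defaultdict


def parse_opreport(text):
    """Return (samples, percent, image, symbol) tuples from `opreport --symbols` text.

    Header lines (CPU info, column titles) are skipped because their
    first field is not an integer sample count.
    """
    rows = []
    for line in text.splitlines():
        # Split into at most 4 fields so symbols containing spaces stay intact.
        parts = line.split(None, 3)
        if len(parts) < 4 or not parts[0].isdigit():
            continue
        try:
            percent = float(parts[1])
        except ValueError:
            continue
        rows.append((int(parts[0]), percent, parts[2], parts[3].strip()))
    return rows


def percent_by_image(rows):
    """Sum the percentage per image name, largest first."""
    totals = defaultdict(float)
    for _samples, percent, image, _symbol in rows:
        totals[image] += percent
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# A few lines copied from the report above, used here as sample input.
SAMPLE = """\
samples  %        image name               symbol name
68078004 72.8798  fglrx_dri.so             /usr/lib/dri/fglrx_dri.so
3984600   4.2657  cube_client              world::render_seg_new(float, float, float, int, int, int, int, int)
2696379   2.8866  libc-2.15.so             __mcount_internal
1664943   1.7824  libc-2.15.so             mcount
"""

if __name__ == "__main__":
    for image, pct in percent_by_image(parse_opreport(SAMPLE)):
        print(f"{pct:8.4f}  {image}")
```

To use it on a real report, save the opreport output to a file and pass the file contents to `parse_opreport` instead of `SAMPLE`.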

You will find lots of information on the net, for example on how to sample other performance counters. To see which events your CPU supports, look at:
$ opcontrol --list-events