Dr. Lawlor's Code, Robots, & Things

January 11, 2014

CPU Performance Counters with “likwid”

Filed under: C++11, Programming — Dr. Lawlor @ 12:01 am

I’m having fun playing with a little command-line performance counter toolkit with the silly name of “likwid”, which stands for “Like I know what I’m doing!” If you haven’t used performance counters while programming, they can very quickly tell you surprising things about your code’s performance, like the fact that you’re getting clobbered by cache misses, not floating point arithmetic.

To get the toolkit running on a recent Ubuntu Linux machine (tested on 13.10):

sudo apt-get install make mercurial gcc
hg clone https://code.google.com/p/likwid/
cd likwid
sudo make install
sudo setcap cap_sys_rawio+ep /usr/local/bin/likwid-perfctr
sudo setcap cap_sys_rawio+ep /usr/local/bin/likwid-features
sudo setcap cap_sys_rawio+ep /usr/local/bin/likwid-powermeter

The setcap business is to allow access to the model-specific registers (MSRs), exposed as device files like /dev/cpu/0/msr, which are how performance counters are read. To run, you need (1) the capability bit above, (2) to run the tool as root, and (3) the standard kernel module that provides those device files, loaded with:

sudo modprobe msr
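
Before benchmarking, it can help to confirm that the module actually created the device file. Here is a minimal sketch, assuming a POSIX shell; msr_ready is a made-up helper name, not part of likwid, and it only tests that the path exists, not that you have permission to open it:

```shell
# msr_ready: hypothetical helper, not part of likwid.
# Reports whether the MSR device file exists (default: /dev/cpu/0/msr).
msr_ready() {
    dev="${1:-/dev/cpu/0/msr}"
    if [ -e "$dev" ]; then
        echo "ok: $dev exists"
        return 0
    fi
    echo "missing: $dev (try: sudo modprobe msr)"
    return 1
}
```

Run msr_ready with no arguments to check core 0, or pass another path such as /dev/cpu/1/msr to check a different core.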

OK, now you’re ready to do some benchmarking! For example, likwid-powermeter will dump the energy usage of any command line program. For me, this /bin/ls command consumed 0.04 joules:

sudo likwid-powermeter -M 0 /bin/ls

You can get more detail using likwid-perfctr, here dumping the MEM group of performance counters on the first two cores:

sudo likwid-perfctr -M 0  -C 0-1  -g MEM  /bin/ls

Some pinning magic is performed to make core 0 do your single-threaded work.  Use “-C 0-7” if you have 8 threads.
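
If you would rather not hard-code the core range, you can build it from the machine’s thread count. A small sketch, assuming the coreutils nproc command is available; core_list is a hypothetical helper name and ./your_program is a placeholder:

```shell
# core_list: hypothetical helper that turns a thread count into a -C argument.
# 8 threads become "0-7"; a single thread is just "0".
core_list() {
    n="$1"
    if [ "$n" -le 1 ]; then
        echo "0"
    else
        echo "0-$((n - 1))"
    fi
}

# Example use (placeholder program name):
#   sudo likwid-perfctr -M 0 -C "$(core_list "$(nproc)")" -g MEM ./your_program
```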

You can dump the whole list of performance counter groups with:

sudo likwid-perfctr -M 0 -a

I get:

Available groups on Intel Core IvyBridge processor:
BRANCH: Branch prediction miss rate/ratio
DATA: Load to store ratio
ENERGY: Power and Energy consumption
FLOPS_AVX: Packed AVX MFlops/s
FLOPS_DP: Double Precision MFlops/s
FLOPS_SP: Single Precision MFlops/s
L2: L2 cache bandwidth in MBytes/s
L2CACHE: L2 cache miss rate/ratio
L3: L3 cache bandwidth in MBytes/s
MEM: Main memory bandwidth in MBytes/s
MEM_DP: Power and Energy consumption
MEM_SP: Power and Energy consumption
TLB: TLB miss rate/ratio

Typically, you want to look at BRANCH and MEM.  Some of them, like FLOPS_AVX, are unlikely to be of general use.
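
To collect several groups for one program, you can simply run it once per group. Here is a sketch that only prints the commands so you can inspect them first; print_group_runs is a hypothetical helper, and the flags are the same ones used in the examples above:

```shell
# print_group_runs: hypothetical helper. Prints one likwid-perfctr
# command per counter group; it does not execute anything itself.
print_group_runs() {
    prog="$1"
    shift
    for group in "$@"; do
        echo "sudo likwid-perfctr -M 0 -C 0-1 -g $group $prog"
    done
}

print_group_runs /bin/ls BRANCH MEM
```

Copy and paste the printed lines to run them once they look right.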

If you need to benchmark one specific part of your program, there is a library interface too, but I like benchmarking everything: that way I’m not assuming the bottleneck is in one place and skipping the real problem somewhere else. The command line wrapper also makes it easy to use with any program. Try it!

Debugging help:

ERROR - [./src/accessClient.c:103] No such file or directory
Failed to execute the daemon '/usr/local/bin/likwid-accessD' (see error above)

This means it’s trying to access the daemon (which I could never get to work) instead of reading the /dev/cpu/0/msr file directly.  Be sure to pass the “-M 0” flag to use direct access.

rdmsr: failed to open '/dev/cpu/0/msr': Permission denied!

You need to run the likwid tool as root, or else chmod that device file so you can read/write it.

ERROR - [./src/msr.c:206] "cpu 8 reg 38d"

There are several possible causes for a bad MSR write. Maybe the likwid tool is missing the “setcap cap_sys_rawio+ep” capability from the installation instructions above (use “getcap /usr/local/bin/likwid*” to verify the capability is there). Or maybe you passed -C more cores than the machine actually has.
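
Both causes can be checked from the shell. The getcap line is the same one mentioned above; range_fits is a hypothetical helper that compares the top of a -C range against a thread count (for example from nproc). This is only a sketch, assuming simple “first-last” ranges or a single core number:

```shell
# range_fits: hypothetical helper. Succeeds if the last core in a
# "first-last" range (or a single core number) is below the thread count.
range_fits() {
    range="$1"
    ncpu="$2"
    last="${range##*-}"   # strip everything up to the final '-'
    [ "$last" -lt "$ncpu" ]
}

# Example use:
#   getcap /usr/local/bin/likwid*
#   range_fits 0-7 "$(nproc)" || echo "-C asks for more cores than exist"
```

On an 8-thread machine the valid cores are 0-7, so a “cpu 8” in the error would point to the second cause.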

