Dr. Lawlor's Code, Robots, & Things

December 30, 2012

C++11 Performance Analysis

Filed under: C++11, Programming — Dr. Lawlor @ 1:29 pm

My favorite new language is C++11, also known as C++0x.  Contrary to popular belief, compiled languages aren’t dead, and in fact they’re more important than ever: the two big problems with the popular runtime-typed, interpreted languages (JavaScript, Perl, Python, PHP, Ruby, Scheme, etc.) are (1) speed and (2) robustness.  The speed thing has been getting better as people build dynamic translation engines, but the robustness problem is harder to fix.  In interpreted languages, developers don’t get compile errors; instead your *users* get runtime errors.  This is really the wrong time and place to find out you’ve passed two arguments to a function taking three arguments, or a string where you should have a number or object.

Anyway, C++11 has lots of cool new features, mostly aiming for more expressive code.  As a performance-oriented programmer, I always feel the need to understand the cross-platform delivered performance of any new feature before I use it very much.  So I worked my way through Alex Sinyakov’s excellent slideshow summarizing many of the new features, and built quite a few performance tests.  The platforms I tested are:

  • g++ 4.7.2, on Linux.  Generally, g++ had faster allocations, and has a little more of the language implemented.  Note that 4.7.0, from this March, is missing support for delegating constructors and literal operators (user-defined literals), but these are supported in the most recent 4.7.2.
  • clang 3.3, on Linux. Generally this performed almost identically to g++, possibly since it’s using the same libstdc++ headers.  You want the latest version from SVN, since the 3.0 that ships with Ubuntu 12.10 doesn’t support initializer lists, seems to segfault on various simple lambdas, and doesn’t link with libstdc++.
  • Microsoft Visual Studio 2012, Release Mode, on Windows 8.  Generally, VS was better at automatically inlining functions.  But it’s missing initializer lists, delegating constructors, raw string literals, variadic templates, and a few other pieces.  Error messages for templated code are totally useless, not even giving you the types being instantiated.
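For readers who haven't met them yet, here is a minimal sketch of two of the features mentioned above: a delegating constructor and a literal operator.  The `Buffer` class and the `_kib` suffix are made up for illustration and are not from my benchmark code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example type, not part of the benchmark source.
class Buffer {
public:
    // Delegating constructor: forwards to the two-argument constructor below.
    Buffer() : Buffer(16, '\0') {}
    Buffer(std::size_t n, char fill) : data_(n, fill) {}
    std::size_t size() const { return data_.size(); }
private:
    std::string data_;
};

// Literal operator (user-defined literal): lets you write 2_kib.
// User suffixes must start with an underscore; the rest are reserved.
constexpr unsigned long long operator""_kib(unsigned long long n) {
    return n * 1024ULL;
}

// Evaluated entirely at compile time, so it has no runtime cost.
static_assert(2_kib == 2048, "2 KiB should be 2048 bytes");
```

With these definitions, `Buffer()` builds a 16-byte buffer through the delegated constructor, and `Buffer(4_kib, 'x')` reads more clearly than `Buffer(4096, 'x')`.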

I am really impressed at how much of C++11 is already working, and working the same way on these three modern compilers.  It’s also amazing how many cool language features have *zero* runtime performance cost: they’re inlined and optimized away at compile time.  The performance of most pieces of code seems to boil down to one question: is there memory allocation?  If so, the function is really slow, taking tens of nanoseconds on either platform.  If not, the function is fast, taking only a couple of nanoseconds.  This means std::string and std::vector are *much* slower to create than character buffers and arrays, although accessing them is the same speed.
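The allocation question can be probed with a sketch like the one below.  This is not my actual test harness: `ns_per_call`, `sink`, and the two `make_*` functions are hypothetical names, and I'm assuming the test string is long enough that std::string has to go to the heap, which depends on the library implementation.

```cpp
#include <chrono>
#include <string>

// Hypothetical helper: average nanoseconds per call of f over `reps` calls.
template <typename F>
double ns_per_call(F f, int reps = 1000000) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / reps;
}

// Volatile store gives each call an observable side effect.
volatile char sink;

// No heap allocation: a fixed-size character buffer on the stack.
void make_char_buffer() {
    char buf[64] = "a string long enough to defeat small-string optimization";
    sink = buf[0];
}

// Assumed to allocate: the text is longer than typical small-string buffers.
void make_std_string() {
    std::string s("a string long enough to defeat small-string optimization");
    sink = s[0];
}
```

Printing `ns_per_call(make_char_buffer)` next to `ns_per_call(make_std_string)` gives a rough feel for the gap.  An aggressive optimizer may still remove part of the work, so treat the numbers as indicative rather than exact.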

  • auto, nullptr, <type_traits>, static_assert, and even <tuple> work on both platforms, and have zero runtime performance cost.  Range-based for is exactly as fast as a loop across integer indices.  for_each is exactly as fast as a loop between iterators.  Returning a tuple from an actual (non-inlined) function call is only a fraction of a nanosecond more expensive than returning a native type, and exactly the same cost as returning a custom struct.
  • std::function is weirdly slow to create (10-20ns), and costs 1-2 extra function call overheads (4-6ns) to execute.  This is odd, because lambdas and std::bind results are entirely inlined and cost-free if you store them in an “auto”; they only become slow when stored in a std::function.
  • unique_ptr is nearly the same cost as a bare pointer (30-60ns), but much easier to make correct.  shared_ptr costs over twice as much (60-140ns) if you use new, probably because there’s a separate memory allocation for the reference count.  Chris Hartman pointed out “make_shared” not only reduces code duplication, it can merge the refcount allocation, and it is indeed almost as fast as unique_ptr.
  • <random> makes it cheap to create random number engines and distributions, and accessing them is about as fast (10-20ns) as calling the old rand.
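To make the std::function and smart pointer comparisons concrete, here is a small sketch of the code shapes being compared.  The function names are mine, chosen for illustration; this is not the benchmark source.

```cpp
#include <functional>
#include <memory>

// Stored in `auto`, the lambda keeps its own unique type,
// so the compiler can see through the call and inline it.
inline int call_via_auto(int x) {
    auto add_one = [](int v) { return v + 1; };
    return add_one(x);
}

// Stored in std::function, the same lambda is type-erased,
// so the call goes through at least one extra indirection.
inline int call_via_function(int x) {
    std::function<int(int)> add_one = [](int v) { return v + 1; };
    return add_one(x);
}

// Three ways to own a heap-allocated int.
inline int sum_of_owned_ints() {
    std::unique_ptr<int> u(new int(1));                 // one allocation, no refcount
    std::shared_ptr<int> s(new int(2));                 // object plus a separate control block
    std::shared_ptr<int> m = std::make_shared<int>(3);  // typically one combined allocation
    return *u + *s + *m;
}
```

All three owners free their int automatically when they go out of scope.  The difference is only in how many allocations happen up front, which matches the timings above.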

I’ve only begun to benchmark the containers, but there are a few interesting results.  Ordered from fastest to slowest access times, reported as write time per integer for a new container with 100 integers:

  •  Bare [] arrays are the fastest of all, 0.3-0.4ns
  • <array> is pretty good, 0.4-0.5ns
  • <vector> is OK, 2.7ns if you’ve pre-reserved, 8-13ns if you just push_back and let it reallocate and copy.  In either case writes are far slower than any type of array, which I blame mostly on the dynamic memory allocation required by <vector> compared to the fixed-size stack allocation of <array>.
  • <deque> is decent on gcc at 4ns, slow on Visual at 30ns(!) per element.  I’ve heard gcc does block allocation to make deque faster.
  • <list> and <forward_list> are 30-90ns per element, and clearly do one allocation per element.
  • <map> and <unordered_map> are 60-130ns per element.  Currently <unordered_map> isn’t very compelling, at least for small maps up to a few thousand elements.  In my cursory tests, writes are about the same speed as the old <map>.  Unordered reads are 3x faster than <map> reads under gcc, but nearly 2x slower under Visual.
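The write patterns being compared look roughly like the sketch below.  The function names and the constant `N` are placeholders of mine, and the timing loop is left out; wrap each call in whatever timer you prefer.

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <vector>

const std::size_t N = 100;  // container size used for the numbers above

// Fixed-size stack storage: no dynamic allocation at all.
inline int fill_array() {
    std::array<int, N> a;
    for (std::size_t i = 0; i < N; ++i) a[i] = static_cast<int>(i);
    return a[N - 1];
}

// One allocation up front, then plain appends.
inline std::vector<int> fill_vector_reserved() {
    std::vector<int> v;
    v.reserve(N);
    for (std::size_t i = 0; i < N; ++i) v.push_back(static_cast<int>(i));
    return v;
}

// No reserve: the vector reallocates and copies as it grows.
inline std::vector<int> fill_vector_growing() {
    std::vector<int> v;
    for (std::size_t i = 0; i < N; ++i) v.push_back(static_cast<int>(i));
    return v;
}

// Node-based container: typically one allocation per inserted element.
inline std::map<int, int> fill_map() {
    std::map<int, int> m;
    for (std::size_t i = 0; i < N; ++i) m[static_cast<int>(i)] = static_cast<int>(i);
    return m;
}
```

Dividing the measured time of each function by `N` gives a per-element figure comparable to the list above.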

The fact that storing an integer to a <map> is 200-fold slower (20000% slower!) than storing the same integer to an array[] seems pretty surprising to me.  Too bad the STL allocator interface isn’t easy to improve. Maybe this is decade-old news to everybody else!  Note that reads aren’t nearly as bad, probably because a read doesn’t need to allocate anything.

See the detailed numerical results and source code here.  All of these were run on my Ivy Bridge i7-3632QM laptop (quad core, 2.2-3.1GHz), though spot tests on my Sandy Bridge i5-2400 desktop run at roughly the same speed.


December 19, 2012

Microsoft “Window” 8: cringe-worthy, mostly

Filed under: Random Thoughts — Dr. Lawlor @ 5:57 am

My brand new laptop arrived today, and I’m blogging to you live from inside Windows 8.  Granted, I’m mostly a Linux user, but my favorite distro Ubuntu recently torched their traditional desktop UI in favor of a more touch-and-tablet friendly system named Unity, which I mostly hate, so I’m willing to experiment.  There are bright spots in 8, but overall I can’t say I’m very impressed.

In Windows 8, the regular desktop lives inside a walled-off ghetto, where the few surviving windows (plural) scrape out a meager existence, with the usual title bars and task bars and such, except the start menu is gone.  Outside the ghetto, such as when you drag into the lower left corner of the screen, you get the new “Start Screen”, a tablet style UI where apps live in fullscreen freedom,  but more “Window” than Windows.

The desktop and tablet sides share basically nothing: the background patterns are different, the fonts are way different, they even each have their own separate control panels.  To switch between windows in the desktop, you use the taskbar at the bottom of the screen; to switch between apps in the tablet side, drag into the upper left corner and then down.  There’s zero relationship, either visual or semantic, between these two methods of switching.  Yet alt-tab cycles between *everything*, tablet or desktop, so why not the task bar?

To switch from tablet to desktop mode, you drag into the upper left corner, or click a “Desktop” link in the start screen, but grandma?  She’s now lost, because there are two totally unrelated modes (desktop and tablet) colliding here.  Most modern virtual machine systems are much better integrated than the desktop and tablet modes of Windows 8: I can mix and match individual Windows 7 VM and native Linux applications on my Ubuntu taskbar, but Microsoft can’t even share the taskbar with *themselves*.

The whole point of the app side is to get the new UI styles, and the start screen does have “just start typing” search, which is what all the cool kids are doing nowadays (e.g., Unity dashboard).  You start typing, it starts spitting out suggestions.  When you see what you want, you click it or hit enter.  This is a good idea, and Windows-cmd-enter instantly brings up a DOS prompt as it should.  Yet typing “mouse”, “keyboard”, or “network” shows zero hits unless… you click the “Settings” category instead of the default “Apps”.  And you still get two totally separate groups of hits, one for apps, the other for the desktop ghetto.

Windows 8 start screen network search

And seriously, categories?  Clicking?  The ENTIRE POINT of typing is to do away with categories, so you don’t have to hunt through a dozen levels of menu to find what you need. You can read the now-fired Steven Sinofsky describe the Start menu search design here.  Does this actually sound simpler?  Than anything?

Steve Jobs, were he only alive, would have nuked that entire UI department from orbit, just to be sure.  Linux, here I come again…

December 16, 2012


Filed under: Random Thoughts — Dr. Lawlor @ 8:15 pm

This revealing little bit of machine readable text was appended onto the electronic response to my customer service query for my Chase credit card:


The fact that there are dedicated database fields to indicate “VIP” and “HIGH VALUE” customers tweaks my innate populist tendencies.  Maybe it’s just that I’m officially marked as a not very important and low value customer.

This immediately makes me wonder who gets their “VIP INDICATOR” set to “Y”?  Certainly folks like Bono or the CEO, but what about the CEO’s wife?  Or his girlfriend?

And how much special treatment do you get with “VIP INDICATOR”?  Do they auto-route your customer service calls to Kansas instead of Mumbai?  Give prompt, useful responses instead of the usual machine-generated crap like “You are very important to us, which is why it took us three days to respond.”  If they auctioned off VIP INDICATOR, how much would it be worthwhile to pay?

December 15, 2012

Cat in a box

Filed under: Random Thoughts — Dr. Lawlor @ 9:01 am

Why do cats love cardboard boxes so much?  Is it because every warehouse also contains mice?


December 13, 2012

Search and Modern Problem-Solving

Filed under: Random Thoughts — Dr. Lawlor @ 6:02 pm

I’m seeing this answer to my final exam questions of the form “How would you do XYZ?”:

“I’d google XYZ, and do what the first result tells me.”

It’s a silly answer, but it’s actually a legitimate problem-solving algorithm.  I’m not sure I can bring myself to mark it as entirely wrong!

December 9, 2012

Printrbot Assembly: 2012 Howto

Filed under: 3D Printing, Printrbot — Dr. Lawlor @ 8:34 am

Last month I built up a Printrbot Plus kit, which was a lot of fun and seems to be printing well.  But both the hardware and software for this do-it-yourself 3D printing technology are still pretty immature, so here’s my guide to the current rev of that particular hardware.

The definitive place to start is Tim Stark’s official assembly instructions, a solid photo guide.  It leaves out a few problems with the current kits, though, including:

  1. At step 55, my Y stepper motor’s shaft was too short to reach the tiny setscrew.  I drilled and tapped the plastic pulley for a new lower #6-32 set screw, which still just barely reaches the motor shaft.
    Printrbot Y setscrew
  2. At step 68, while attaching the heated bed to the carriage, you need to insert *something* at least 2-3mm high underneath, or you’ll find the carriage hits the deck before the printhead even gets close to the bed.  I put an aluminum sheet under the bed to even out the heat distribution, which is pretty nonuniform by default.  My under-bed sheet is a little over 2mm thick, cut in a “U” shape to clear the thermistor.  I used the long M3 screws instead of drilling out the PCB for #6-32, but either would work.
    Printrbot bed
  3. At step 104, the 5/16″ black hex head bolt needs the whole stack listed at step 136.  If you wait any longer, you won’t be able to use a hex key to keep the bolt from rotating as you tighten the nylon nut on top.
  4. At step 142, my X carriage belt pulley was just a little too tall to clear the zip ties holding the Z linear bearings.  I sanded and filed the pulley down, and mounted it *very* close to the stepper surface.
    Printrbot X pulley
  5. Many people, including me, got badly crooked gears for the extruder at step 160.  I couldn’t even fit the hex head of the hobbed bolt inside, and cracked the brittle castable material while trying to chisel some clearance.  It’s a chicken-and-egg problem if this is your only printer, but if you can get something working you can print new extruder gears.

Optional improvements:

  1. I dunked all the wood parts in urethane, which makes them look better, keeps screws from backing out, and reduces dimensional changes when humidity varies.
  2. I wanted to protect the extruder and bed wiring, since flexing back and forth repeatedly across sharp zipties will eventually cause wiring faults.  So I spiral-wound grip tape around all the exposed wiring, and ran a little steel wire out from the extruder to guide the extruder wiring into a gentle curve in *front* of the carriage. The sharp corner and stretched wires of step 168 make me wince.
  3. My Y axis belt stretched after installation, resulting in blobby prints.  I tensioned it using a rubber band hooked on a binder clip, pulling sideways to take up belt tension.  Keep it far enough back that it won’t get sucked into the stepper even when the bed is fully forward.

Lots of folks seem to get discouraged when things don’t work straight out of the box, but this is brand-new technology: you’ll need some tools, talent, patience, and creativity.  But the Printrbot is a solid kit, and with a few tweaks makes a reliable printer!

Coating a Printrbot 3D printer in Polyurethane

Filed under: 3D Printing, Printrbot — Dr. Lawlor @ 3:35 am

One whole generation of do-it-yourself 3D printers (from 2009’s Makerbot Cupcake to 2012’s Printrbot Plus) is made from laser-cut wood, held together with small screws and bolts.  There are good things about this: wood is cheap, light, stiff, and environmentally friendly, and laser cutting is fast and precise.  But wood is fairly weak and splinter-prone, wood structures warp with humidity changes, and the bolts holding everything together tend to vibrate loose when printing.

You can mitigate each of these drawbacks by dunking your wooden printer assemblies into polyurethane.  Dunking lets the polyurethane soak into the fibers, which strengthens them somewhat.  It reduces the rate at which moisture can diffuse in, reducing warping with humidity.  And polyurethane soaked onto the threads of the screws keeps them from vibrating loose, but is still removable for servicing.

Last month I built up a Printrbot Plus from a kit, and tried coating the wood.  I dunked each of the major assemblies (the base, bridge, extruder, and printbed) after adding screws and nuts, but before adding any electronics or linear bearings.  Here are the parts ready to go in: assemblies are on the right; loose parts on the left.


Here they are after coating in polyurethane.  I had to brush the poly onto the top deck, since it’s too big to fit in the 1 gallon can, but everything else got dunked on both ends, and then brushed into the middle.  This process was pretty messy, so I wore rubber gloves.


This “quick drying” poly still took about a day to stop being tacky and smelly; the solvent stench is so strong that I wore an activated carbon respirator while dunking and painting.  Total poly consumption was tiny, 50 mL or less, although you need a much bigger container to allow dunking. There doesn’t seem to be any substantial dimensional change, and everything assembled fine.  The poly brought out some beautiful chatoyancy glinting within the wood, an unexpected benefit, and added a nice warm glow to the wood.

This printer has been run hard for several dozen prints and a few cumulative days of continuous printing since then, and I haven’t had a screw back out yet!

December 6, 2012


Filed under: Random Thoughts — Dr. Lawlor @ 3:49 am

I’ve decided to stop fighting the future and just blog about my experiences in programming, building and configuring robots, CNC machining, and 3D printing.  I’m Dr. Orion Lawlor, an associate professor in the Computer Science department at the University of Alaska Fairbanks.

See my main homepage for my publications, courses, and software.  Our high school robotics and cyber-physical systems outreach project CyberAlaska has its own blog, as does our LunaBotics automated mining robot project LunAlaska.
