Dr. Lawlor's Code, Robots, & Things

June 24, 2018

Legit Websites that Track Everything You Do

Filed under: Mass Surveillance, Random Thoughts — Dr. Lawlor @ 9:44 pm

TL;DR Version: many legitimate websites record extremely detailed data about everything you do on the site, including your mouse position and where you scroll, using a tool like ForeSee Replay.  Since there is no global opt-out, you need to install an ad blocker to prevent this kind of bandwidth-wasting privacy intrusion.

This afternoon I opened a few eBay tabs, and I have a vague recollection of seeing one of those standard ForeSee popups asking if I’d like to participate in a survey.  I almost always say “No Thanks”, and I’m 99% sure I did not agree to a survey today.

This evening, I noticed my laptop’s outbound network traffic was heavy, which seriously slows down our rural DSL (4 Mbit down, 1 Mbit up).  The more I looked at this, the less I liked it: Chrome was sending piles of HTTPS data off to four different AWS-hosted servers that list themselves as “ForeSee Record Status (cxReplayRecorder) v2.4.11” (the exact IP addresses vary, since they’re load balanced).  I grabbed some of the traffic with Wireshark, but it’s encrypted, so I still have no way of telling exactly what was sent.

I knew the traffic was coming from Chrome, but I have a bad habit of keeping about a hundred tabs open (!), so I carefully watched the bandwidth usage as I incrementally closed tabs.  As soon as I closed the eBay tabs, the traffic stopped.  Reopening the tabs didn’t bring the traffic back: it only happens when the tracker code decides it has enough data to be worth sending back out.

ForeSee seems particularly evasive in the way it phrases the ‘survey’: not only will you (maybe) answer questions, the ForeSee® Replay code also tracks every mouse click, mouse *hover*, and scroll that you do on the site.  Their Replay viewer lets them see heat maps showing where people click, or even hover the mouse.  (This is useful data for building the site, but it’s not disclosed that it is part of the ‘survey’.)

The tracking servers dump a pile of metadata, from which I can see:

  • There are about 10 active replay recorder servers right now.
  • Each server is receiving about 1,000 ‘transmits’ per minute.
  • Each ‘transmit’ occupies about 100 kilobytes (on average), which is a lot of bandwidth, and a lot of information captured.
  • A typical server seems to capture over a terabyte per week, from over 10 million users.

To stop this, install an ad blocker like the free Adblock Plus, and add “/foresee/*” to the filter list.


November 15, 2017

“for int” considered harmful

Filed under: C++11, Programming — Dr. Lawlor @ 2:26 pm

In C or C++, the traditional variable to use in a loop is “int i”, typically something like:

for (int i=0;i<length;i++)  ... do stuff with i ...

Using “int” is traditional, but is a really bad idea, because “int” is only 32 bits on essentially all modern machines.  This means any loop that should do more than 2 binary billion iterations will fail (binary billion == 1,073,741,824 == 1<<30).

If the loop iteration count exceeds what “int” can store, one option is a crash as int wraps around to -2147483648:

long length=3*1000*1000*1000L;
std::vector<char> v(length);
size_t count=0;
for (int i=0;i<length;i++) {
	v[i]=1; // write to the array (this is where the crash happens)
	count++;
	if (0 == i%(128*1024*1024)) std::cout<<"i="<<i<<"\n";
}
return count;

(Try this in NetRun now!)

On my 64-bit Linux machine, this produces no compile warnings and works fine for lengths below 2 binary billion.  It all runs in about one second, including the time for std::vector to allocate and zero initialize 3GB of RAM.  But at runtime everything works fine until i overflows and wraps around:

Caught signal SIGSEGV

When i wraps around to -2147483648, the program is actually trying to write to array element v[-2147483648].  So worse than crashing, this introduces a potential security hole if an attacker can arrange for this index to point to valid memory.

If the loop body does not access memory, this acts as an infinite loop: in practice i wraps around and can never reach the loop target (and since signed overflow is undefined behavior, the compiler is free to do even stranger things).

In the program above, if we replace “long length” with “size_t length”, we at least get a compiler warning about comparison between signed (int) and unsigned (size_t) types.  But now the loop silently stops before finishing the correct number of iterations:

Program complete.

Here, i has wrapped around, but the compiler casts it to unsigned before comparing it with length: the loop is essentially converted to “for (int i=0;(size_t)i<length;i++)”.  Since casting a negative int i to unsigned size_t results in a huge value, the loop breaks out after 2 binary billion iterations, leaving the rest of the array untouched.

Needless to say, this will result in a very confusing data corruption bug.

Typecasting the comparison to work the opposite way, “for (int i=0;i<(int)length;i++)”, eliminates the warning but results in the long length converting to a negative int, and the loop executes zero iterations.

Hence using “int” as the loop index can result in:

  • A crash and/or security hole
  • An infinite loop
  • Skipping the last loop iterations
  • Skipping the loop entirely

Instead you should:

  • Prefer C++11 range-based for loops, since they’re clean and reliable
    • for (char &element : myVector)
  • If you can’t use range-based for, use size_t as a loop index (on 64-bit machines, size_t is 64 bits)
    • for (size_t i=0;i<length;i++)

If you’re stuck with int, you can’t reliably process data with more than 2 binary billion entries.  In a world where phones can have 8GB of RAM, people regularly process files exceeding 2GB, and your CPU can sling multiple gigs around in under a second, this is not a good idea!

April 26, 2016

Terrain Rendering in 3D

Filed under: Graphics, Linux, Programming, Random Thoughts — Dr. Lawlor @ 4:49 pm

Back in 2003 I worked on a terrain simplification algorithm, a modification of Lindstrom’s method, to allow interactive exploration of big terrains in 3D.

A binary distribution is available for Windows (.zip) or Linux (.tar.gz); the binaries should work, and let you view your own binary DEM and JPEG texture.  I made an attempt to include the source in there, but I built both binaries using my custom build system, so getting it to compile is likely to take quite a bit more work, and possibly some missing custom libraries.

September 15, 2015

Rescuing Data from a Banished Daemon

Filed under: Linux, Programming — Dr. Lawlor @ 9:12 pm

Last week we managed to accidentally delete the entire directory housing our custom database server (superstar), including the on-disk backups and historical backups of its robot info database.  Not good.

Amazingly, ps showed the server was still running, and poking around we could even see the in-memory copy of the database was still fine, but it was trapped inside this banished daemon.

# ps
no_priv 5149 0.1 0.0 14932 1640 ? S 18:04 0:08 ./superstar

Checking /proc/5149 shows the cwd and exe have been deleted, which makes a lot of tools break:

# ls -al /proc/5149
lrwxrwxrwx 1 no_priv no_priv 0 Sep 11 18:09 cwd -> /home/itest/git3/superstar (deleted)
-r-------- 1 no_priv no_priv 0 Sep 11 18:20 environ
lrwxrwxrwx 1 no_priv no_priv 0 Sep 11 18:09 exe -> /home/itest/git3/superstar/superstar (deleted)

We tried for a while to recreate or change the cwd, so we could tickle the daemon into backing up its database somewhere accessible.  Recreating the path doesn’t seem to change the (deleted) directory the daemon still references, although inode-level reconstruction of the old directory might possibly have worked.

But I could still copy off the running executable for analysis:

# cp /proc/5149/exe /tmp/necrodaemon
# nm /tmp/necrodaemon | grep back
000000000061ca40 b _ZL15backup_filename

This is the in-memory address of the string holding the filename where the daemon tries to back up its in-memory database.  Currently it reads “db.bak”, a relative path, which is a problem since the process’s current working directory is gone.

Now just attach to the daemon in gdb, and we can see and change what’s inside it.

# gdb -p 5149

Of course, the daemon wasn’t compiled with debug symbols, so we can’t call methods or even functions.  So step one is verifying there’s anything actually at that “backup_filename” memory address–it’s a std::string, which has a single pointer to a separate refcounted object that actually holds the bytes of the string.  We then overwrote the bytes of this string to make the relative path into an absolute path, “/a.bak”, a file which we’d previously created and given the correct permissions.

(gdb) p *(long *)0x61ca40
$1 = 27717720
(gdb) p *(char **)0x61ca40
$7 = 0x1a6f058 "db.bak"
(gdb) set {char}0x1a6f058 = '/'
(gdb) p *(char **)0x61ca40
$8 = 0x1a6f058 "/b.bak"
(gdb) set {char}0x1a6f059 = 'a'
(gdb) p *(char **)0x61ca40
$9 = 0x1a6f058 "/a.bak"
(gdb) detach
Detaching from program: , process 5149
(gdb) quit

As soon as the backup ran, the daemon had written its in-memory database to the newly created file “/a.bak”, so our data was saved–daemon necromancy to the rescue!

August 12, 2015

Parrot “Rolling Spider” UAV Hacking: Dumping the Filesystem

Filed under: Hardware, Linux, Programming, Rolling Spider — Dr. Lawlor @ 11:53 pm

I just got a new tiny UAV, the Parrot “Rolling Spider” ($80), which is very fun to fly via Bluetooth with my phone.  But it’s also a Linux-based computer, which makes it fun to hack!

The easiest way to get a root shell is to just plug it in via the USB cable: it not only shows up as a removable USB drive, it also shows up as a network device (at least as of the 1.99.2 firmware version).  This means you can immediately get a root shell with:


That was easy!  Now to dump the filesystem to netcat on port 1234.  (The [^p] avoids /proc, which has infinite recursive root links; the [^y] avoids /sys, which has files that change in size.)

tar cpf - [^p][^y]* | nc -l -p 1234

To get the filesystem as a file on your desktop computer, now just:

nc 1234 > rootfs.tar

This has *everything*, from:

drwxr-xr-x root/root 0 1969-12-31 14:00 bin/
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/getopt -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/dd -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/cp -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/df -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ip -> busybox
-rwxrwxr-x root/root 35 1969-12-31 14:00 bin/kk
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ln -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ls -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/mv -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ps -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/rm -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/sh -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/vi -> busybox
-rwxrwxr-x root/root 305 1969-12-31 14:00 bin/blink_led_orangeleft.sh
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ash -> busybox


drwxr-xr-x root/root 0 1969-12-31 14:00 usr/share/avahi/
-rw-r--r-- root/root 560 1969-12-31 14:00 usr/share/avahi/avahi-service.dtd
-rw-r--r-- root/root 5104 1969-12-31 14:00 usr/share/avahi/service-types
drwxr-xr-x root/root 0 1969-12-31 14:00 var/
lrwxrwxrwx root/root 0 1969-12-31 14:00 var/log -> /tmp/
-rw-rw-r-- root/root 7 1969-12-31 14:00 version.txt
drwxr-xr-x root/root 0 1969-12-31 14:00 www/
-rw-rw-r-- root/root 485 1969-12-31 14:00 www/index.html

For example, now I can see the contents of the control shell scripts:

$ cat bin/set_led_greenleft.sh 

# temp behaviour : red light right on
gpio 33 -d ho 1
# temp behaviour : red light left off
gpio 30 -d ho 0

#green light right off
gpio 31 -d ho 0

#green light left on
gpio 32 -d ho 1

I can also see the details of how the code was built:

$ file bin/busybox
busybox: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.16, stripped

Of course, eventually I’ll want to permanently modify this filesystem, by re-flashing the UAV with a reverse engineered PLF firmware file, which is similar to the Parrot AR Drone PLF format.  I’m nearly there with “plftool -e raw -i rollingspider_update.plf -o .”, but each resulting file has the filename prepended in some sort of fixed-length binary header.

Stay tuned!

March 16, 2014

Arduino millis() jumps by 2 every 43 milliseconds

Filed under: Arduino, Programming — Dr. Lawlor @ 1:07 am

I’ve been doing some software-modulated infrared light detection work, and for the signal to be sent correctly I need millisecond-accurate timings.  I realize professionals do this sort of thing with interrupts, but I hate debugging interrupt code, and I’m always afraid there’s some rare timing glitch that will kill the entire project at the worst possible moment.

I was using Arduino’s millis() function as my time base, and I kept getting weird screwy timing every 43 milliseconds.  I assumed it was my code, which does a bunch of other stuff like serial communication, but the glitch didn’t change regardless of what I did.  Turns out, the Arduino Uno’s timer overflows at a power-of-two rate, once every 1.024 milliseconds, so after every 1/0.024 ≈ 41.7 overflows (about 43 milliseconds of real time) the count is a full millisecond behind.  Arduino’s wiring.c fixes this by keeping track of the fractional milliseconds, and adds a sort of “leap millisecond” every 43 milliseconds to keep things in sync.  This results in millis() instantly jumping up by more than one.

Here’s an example Arduino Uno sketch that demonstrates these timing jumps, and shows two fixes.

/* Demonstrate timing gaps in the millis() function */
void setup(void) {
 Serial.begin(57600); // to report the horror
}

extern "C" volatile unsigned long timer0_overflow_count; // from wiring.c

unsigned long last=0; // last value of millis()
void loop(void) {
 unsigned long next;
 do { // watch the timer until it changes
   next=millis(); // jumps by 2 every 43 ms
   //Two possible fixes:
   // next=micros()/1000; // slow, but works
   // cli(); next=timer0_overflow_count; sei(); // faster, but 1024 microseconds/tick
 } while (next==last);

 int delta=next-last;
 if (delta!=1) {
   Serial.print(delta);
   Serial.print(" jump at ");
   Serial.println(next);
 }
 if (last%10000==0) {
   Serial.print("Still running at ");
   Serial.println(last);
 }
 last=next;
}

On my Arduino Uno, the millis() version of this reports regular timing glitches:

Still running at 0
2 jump at 43
2 jump at 86
2 jump at 128
2 jump at 171
2 jump at 214
2 jump at 256
2 jump at 299
2 jump at 342

The easy way to avoid these timing jumps is never to use millis() at all!  Just using micros()/1000 works, although looking at wiring.c’s implementation of delay(), you need to be a bit careful about micros() overflowing roughly every 70 minutes.  You can also directly read the raw timer0_overflow_count as shown above, although the actual rate it increments depends on the Arduino clock rate, and you need to wrap the read in cli/sei to avoid jumps of ±256, since the interrupt function updates the bytes of that value non-atomically.  Neither fix has leap milliseconds, so both result in dependable timing.

January 11, 2014

CPU Performance Counters with “likwid”

Filed under: C++11, Programming — Dr. Lawlor @ 12:01 am

I’m having fun playing with a little command line performance counter toolkit with the silly name of “likwid”, which stands for “Like I Knew What I’m Doing”.  If you haven’t used performance counters while programming, they can very quickly tell you surprising things about your code’s performance, like the fact that you’re getting clobbered by cache misses, not floating point arithmetic.

To get the toolkit running on a recent Ubuntu linux machine (tested on 13.10):


April 29, 2013

Chrome JavaScript InvalidStateError: DOM Exception 11

Filed under: Programming — Dr. Lawlor @ 2:35 pm

There’s an annoyingly cryptic error message in Chrome when your JavaScript code does an XMLHttpRequest incorrectly–“Uncaught Error: InvalidStateError: DOM Exception 11”.  This error shows up when you access the request’s “status” inside onreadystatechange prior to hitting readyState 4.  Firefox doesn’t complain about this, and Chrome only seems to care if you’re doing it inside onreadystatechange.

For example, this works fine:

var req=new XMLHttpRequest();
req.onreadystatechange=function() {
  if (req.readyState==4 && req.status==200) { ... handle req.responseText ... }
};

But interchanging the order of the req.readyState and req.status checks results in “Uncaught Error: InvalidStateError: DOM Exception 11”.


December 30, 2012

C++11 Performance Analysis

Filed under: C++11, Programming — Dr. Lawlor @ 1:29 pm

My favorite new language is C++11, also known as C++0x.  Contrary to popular belief, compiled languages aren’t dead, and in fact they’re more important than ever: the two big problems with the popular dynamically typed, interpreted languages (JavaScript, Perl, Python, PHP, Ruby, Scheme, etc) are (1) speed and (2) robustness.  The speed problem has been getting better as people build dynamic translation engines, but the robustness problem is harder to fix.  In interpreted languages, developers don’t get compile errors; instead your *users* get runtime errors.  This is really the wrong time and place to find out you’ve passed two arguments to a function taking three, or a string where you should have a number or object.

Anyway, C++11 has lots of cool new features, mostly aiming for more expressive code.  As a performance-oriented programmer, I always feel the need to understand the cross-platform delivered performance of any new feature before I use it very much.  So I worked my way through Alex Sinyakov’s excellent slideshow summarizing many of the new features, and built quite a few performance tests.  The platforms I tested are:

  • g++ 4.7.2, on Linux.  Generally, g++ had faster allocations, and has a little more of the language implemented.  Note that 4.7.0, from this March, is missing support for delegated constructors and literal operators, but these are supported in the most recent 4.7.2.
  • clang 3.3, on Linux. Generally this performed almost identically to g++, possibly since it’s using the same libstdc++ headers.  You want the latest version from SVN, since the 3.0 that ships with Ubuntu 12.10 doesn’t support initializer lists, seems to segfault on various simple lambdas, and doesn’t link with libstdc++.
  • Microsoft Visual Studio 2012, Release Mode, on Windows 8.  Generally, VS was better at automatically inlining functions.  But it’s missing initializer lists, delegated constructors, raw string literals, variadic templates, and a few other pieces.  Error messages for templated code are totally useless, not even giving you the types being instantiated.

I am really impressed at how much of C++11 is already working, and working the same way on these three modern compilers.  It’s also amazing how many cool language features have *zero* runtime performance cost: they’re inlined and optimized away at compile time.  The performance of most pieces of code seems to boil down to one question: is there memory allocation?  If so, the function is really slow, taking tens of nanoseconds on every platform.  If not, the function is fast, taking only a couple of nanoseconds.  This means std::string and std::vector are *much* slower to create than character buffers and arrays, although accessing them is the same speed.

  • auto, nullptr, <type_traits>, static_assert, and even <tuple> work on both platforms, and have zero runtime performance cost.  Range-based for is exactly as fast as a loop across integer indices.  for_each is exactly as fast as a loop between iterators.  Returning a tuple from an actual (non-inlined) function call is only a fraction of a nanosecond more expensive than returning a native type, and exactly the same cost as returning a custom struct.
  • std::function is weirdly slow to create (10-20ns), and costs 1-2 extra function call overheads (4-6ns) to execute.  By contrast, lambdas and std::bind are entirely inlined and cost-free if you put the result into an “auto” variable, but slow if put into a std::function.
  • unique_ptr is nearly the same cost as a bare pointer (30-60ns), but much easier to make correct.  shared_ptr costs over twice as much (60-140ns) if you use new, probably because there’s a separate memory allocation for the reference count.  Chris Hartman pointed out “make_shared” not only reduces code duplication, it can merge the refcount allocation, and it is indeed almost as fast as unique_ptr.
  • <random> makes it cheap to create random number engines and distributions, and accessing them is about as fast (10-20ns) as calling the old rand.

I’ve only begun to benchmark the containers, but there are a few interesting results.  Ordered from fastest to slowest access times, reported as write time per integer for a new container with 100 integers:

  • Bare [] arrays are the fastest of all, 0.3-0.4ns
  • <array> is pretty good, 0.4-0.5ns
  • <vector> is OK, 2.7ns if you’ve pre-reserved, 8-13ns if you just push_back and let it reallocate and copy.  In either case writes are far slower than any type of array, which I blame mostly on the dynamic memory allocation required by <vector> compared to the fixed-size stack allocation of <array>.
  • <deque> is decent on gcc at 4ns, slow on Visual at 30ns(!) per element.  I’ve heard gcc does block allocation to make deque faster.
  • <list> and <forward_list> are 30-90ns per element, and clearly do one allocation per element.
  • <map> and <unordered_map> are 60-130ns per element.  Currently <unordered_map> isn’t very compelling, at least for small maps up to a few thousand elements.  In my cursory tests, writes are about the same speed as the old <map>.  Unordered reads are 3x faster than <map> reads under gcc, but nearly 2x slower under Visual.

The fact that storing an integer to a <map> is 200-fold slower (20000% slower!) than storing the same integer to an array[] seems pretty surprising to me.  Too bad the STL allocator interface isn’t easy to improve. Maybe this is decade-old news to everybody else!  Note that reads aren’t nearly as bad, probably because there’s no allocation during a read.

See the detailed numerical results and source code here.  All of these were run on my Ivy Bridge i7-3632QM laptop (quad core, 2.2-3.1Ghz), though spot tests on my Sandy Bridge i5-2400 desktop ran at roughly the same speed.

Blog at WordPress.com.