Dr. Lawlor's Code, Robots, & Things

November 15, 2017

“for int” considered harmful

Filed under: C++11, Programming — Dr. Lawlor @ 2:26 pm

In C or C++, the traditional variable to use in a loop is “int i”, typically something like:

for (int i=0;i<length;i++)  ... do stuff with i ...

Using “int” is traditional, but is a really bad idea, because “int” is only 32 bits on all modern machines.  This means any loop that should do more than 2 binary billion iterations will fail (binary billion == 1,073,741,824 == 1<<20).

If the loop iteration count exceeds what “int” can store, one option is a crash as int wraps around to -2147483648:

long length=3*1000*1000*1000L;
std::vector<char> v(length);
size_t count=0;
for (int i=0;i<length;i++) {
	if (0 == i%(128*1024*1024)) std::cout<<"i="<<i<<"\n";
	v[i]=3;
	count++;
}
return count;

(Try this in NetRun now!)

On my 64-bit Linux machine, this produces no compile warnings and works fine for lengths below 2 binary billion.  It all runs in about one second, including the time for std::vector to allocate and zero initialize 3GB of RAM.  But at runtime everything works fine until i overflows and wraps around:

i=0
i=134217728
i=268435456
i=402653184
i=536870912
i=671088640
i=805306368
i=939524096
i=1073741824
i=1207959552
i=1342177280
i=1476395008
i=1610612736
i=1744830464
i=1879048192
i=2013265920
i=-2147483648
-------------------
Caught signal SIGSEGV

When i wraps around to -2147483648, the program is actually trying to write to array element v[-2147483648].  So worse than crashing, this introduces a potential security hole if an attacker can arrange for this index to point to valid memory.

If the program does not access memory, this acts as an infinite loop, since the int i can never reach the loop target.

In the program above, if we replace “long length” with “size_t length”, we at least get a compiler warning about comparison between signed (int) and unsigned (size_t) types.  But now the loop silently stops before finishing the correct number of iterations:

i=0
i=134217728
i=268435456
i=402653184
i=536870912
i=671088640
i=805306368
i=939524096
i=1073741824
i=1207959552
i=1342177280
i=1476395008
i=1610612736
i=1744830464
i=1879048192
i=2013265920
Program complete.

Here, i has wrapped around, but the compiler casts it to unsigned before comparing it with length: the loop is essentially converted to “for (int i=0;(size_t)i<length;i++)”.  Since casting a negative int i to unsigned size_t results in a huge value, the loop breaks out after 2 binary billion iterations, leaving the rest of the array untouched.

Needless to say, this will result in a very confusing data corruption bug.

Typecasting the comparison to work the opposite way, “for (int i=0;i<(int)length;i++)”, eliminates the warning but results in the long length converting to a negative int, and the loop executes zero iterations.

Hence using “int” as the loop index can result in:

  • A crash and/or security hole
  • An infinite loop
  • Skipping the last loop iterations
  • Skipping the loop entirely

Instead you should:

  • Prefer C++11 range-based for loops, since they’re clean and reliable
    • for (char &element : myVector)
  • If you can’t use range-based for, use size_t as a loop index (on 64-bit machines, size_t is 64 bits)
    • for (size_t i=0;i<length;i++)

If you’re stuck with int, you can’t reliably process data with more than 2 binary billion entries.  In a world where phones can have 8GB of RAM, people regularly process files exceeding 2GB, and your CPU can sling multiple gigs around in under a second, this is not a good idea!

Advertisements

November 8, 2017

Making chroot jails

Filed under: Linux, Sysadmin — Dr. Lawlor @ 8:20 pm

A chroot jail is a UNIX way run a dangerous application, like a network server, inside its own limited subset of the filesystem.  This cuts off access to recurring security holes like setuid executables in /bin or /usr/bin, kernel device files in places like /proc and /dev, and it lets you build your own restricted or sanitized runtime environment including libraries and config files.  Docker or rkt containers use chroot as one of the ways they sandbox containerized applications, and the fact all the libraries are included makes containers portable across systems.

A maximum security chroot jail consists of the prisoner process, and the bare minimum shared libraries necessary for it to run.  You can figure out which libraries are needed using “ldd ./prisoner”:

root@5a2c7cc2357f:/tmp/jailhouse# cp /bin/date ./prisoner
root@5a2c7cc2357f:/tmp/jailhouse# ldd ./prisoner
  linux-vdso.so.1 =>  (0x00007ffdf433d000)
  libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f39e8d18000)
  /lib64/ld-linux-x86-64.so.2 (0x000055751140c000)

(Or, since ldd is just a shell script wrapper around the dynamic linker library /lib64/ld-linux-x86-64.so.2, you can also run “/lib64/ld-linux-x86-64.so.2 –list ./prisoner”, although “ldd” is easier to remember.)

You can ignore linux-vdso, which is injected by the kernel, but if you copy the remaining shared libraries into the jail, you can run the prisoner process inside the jail using chroot.  Here, we just need ld-linux and libc:

root@5a2c7cc2357f:/tmp/jailhouse# mkdir lib64
root@5a2c7cc2357f:/tmp/jailhouse# cp /lib64/ld-linux-x86-64.so.2 lib64/
root@5a2c7cc2357f:/tmp/jailhouse# mkdir lib
root@5a2c7cc2357f:/tmp/jailhouse# mkdir lib/x86_64-linux-gnu
root@5a2c7cc2357f:/tmp/jailhouse# cp /lib/x86_64-linux-gnu/libc.so.6 lib/x86_64-linux-gnu/
root@5a2c7cc2357f:/tmp/jailhouse# chroot --userspec 12345:6789 . ./prisoner
Thu Nov 9 04:43:38 UTC 2017

Here we ran the prisoner as user ID 12345, in group ID 6789, due to the “–userspec 12345:6789” in the chroot call.  Without a userspec, the process runs in the jail as root, which is very bad (see chw00t)!  Neither this user nor group exist in my /etc/passwd or /etc/group files, which means the prisoner only gets a number, not a name.  You can add the username to either the system /etc/passwd or the chroot jailhouse/etc/passwd to get symbolic names, although do remember to set the permissions on jailhouse/etc/passwd carefully, because overwriting passwd with “jailbird:x:0:0::/home/jailbird:/bin/bash” could makes the jailbird user run as root within the jail.

Often the process you want to run inside the jail is more complex, and so it needs quite a few libraries to work.  Sometimes those libraries load more files and libraries at runtime, and they’re not always well documented.  In these cases, “strace ./prisoner” is quite useful, because it shows you every system call, including calls like open (finding config files), exec (calling sub-programs), or mmap (usually loading shared libraries or allocating memory).  I often copy strace into the chroot so I can “chroot . strace ./prisoner” and watch what the program is doing inside the jail.

A particularly complex program may need /dev entries created (CUDA in a chroot needed basically all of /dev/nvidia*), or things like /proc mounted into the chroot using “mount -o bind /proc proc”.  These sorts of comforts make the jail more like a halfway house, in that it allows the prisoners more functionality but does pose more of a danger to society.

I’ve built chroot jails packed with ancient library versions so I can run ancient programs in their old environment.

I’ve also built chroot jails to contain student code (e.g., in NetRun).

Further reading:

March 9, 2017

Fairbanks PM2.5 Air Quality Data

Filed under: Air Quality — Dr. Lawlor @ 1:49 am

Tomorrow night the borough assembly will consider Ordinance 2017-18 (page 70 in the agenda) which strengthens the current air quality prohibitions.  It’s been two years since the prohibition ordinance first passed, so I thought it was a good time to look at the AKDEC data on whether the air is now cleaner.

The biggest factor contributing to air quality here is the exterior temperature–when it’s cold, not only are people burning more home heating fuel and leaving their cars running, there’s usually a temperature inversion that traps that pollution near the ground.  So these charts use the exterior temperature as the horizontal axis; the R^2 coefficient below shows that temperature explains about half the variance in PM2.5.

Here are the last 4 years of PM2.5 data for Fairbanks.  The blue data is from the two winters before the ordinance; the orange data is from the two winters after the ordinance took effect.

fairbanks_PM2.5_matched

Here’s the same chart for North Pole, which has much bigger air quality problems, and has had frequent air quality alerts (burn bans) in effect the past few winters.

north_pole_PM2.5_matched

I personally dislike the entire prohibit, tax, and fine approach that the current air quality ordinance takes–the long term solution is cheap natural gas in Fairbanks; the short term or summertime solution is indoor HEPA filters, they’re cheap and they work.

But to me the data clearly show a 15-25% reduction in PM2.5 in the post-ordinance years, even after controlling for the weather effect.  That’s a much bigger reduction than I’d expected!

February 20, 2017

Materials Breaking: Collection of Videos

Filed under: Random Thoughts — Dr. Lawlor @ 3:06 pm

High magnification (electron microscope?) image of lathe tools turning various steels.  Steel’s deformation is ductile, like clay.

Ultra high speed footage of breaking glass, showing a brittle fracture that propagates at about half the speed of sound in glass (kilometers per second).

Even higher speed footage (500ns between frames) of plexiglass cracking:

Tensile stretching steel rebar to failure.  There is a lot of plastic deformation before the final break.

Without steel reinforcing, concrete fails in a brittle fashion, with cracks opening up in the areas in tension.

At supersonic velocity, a crash test dummy can obliterate a cinder block wall.

 

January 23, 2017

Sandstruder: metering sand for 3D printing

Filed under: Random Thoughts — Dr. Lawlor @ 10:59 am

For the NASA 3D printed habitat challenge, I’m upgrading my 3D printer to be able to shuffle around fluffy basalt dust.  For that, I need a sort of extruder capable of moving sand around.  My weekend project was a “sandstruder”, designed to extract sand from a hopper, and send it down a (to be determined) sort of sand Bowden tube.

model_of_sandstruder

OpenSCAD model

cimg7040

3D printed in two halves

cimg7047

Extruding sand!

Sand extrusion seems quite reliable, although metering is not very consistent.  Wired up as an “extruder”, and programmed like a Wade’s geared extruder, a software commanded “10mm of filament” emits between 8 and 14 grams of dust, with an average around 11 grams.  It seems to run reliably up to the speed limit of the stepper, a few thousand mm per minute (so a few kilograms of sand per minute).

If the auger bit gets jammed up on a rock, the stepper just skips steps rather than destroying itself, which is one advantage of this low-torque direct drive setup.  I will need to pre-screen the sand to eliminate the rocks, however.

Like it?  Download an STL or the OpenSCAD source code on thingiverse (or github)!

April 26, 2016

Terrain Rendering in 3D

Filed under: Graphics, Linux, Programming, Random Thoughts — Dr. Lawlor @ 4:49 pm

Back in 2003 I worked on a terrain simplification algorithm, a modification of Lindstrom’s method, to allow interactive exploration of big terrains in 3D.

A binary distribution is available for Windows (.zip) or Linux (.tar.gz), and the binaries should work, and let you view your own binary DEM and JPEG texture.  I made an attempt to include the source in there, although I built both of the binaries using my custom build system, so it’s likely to take quite a bit more work, and possibly even some missing custom libraries, to get it to compile.

November 15, 2015

Interactive Web-Based Visualization Tools

Filed under: Random Thoughts — Dr. Lawlor @ 11:05 pm

Many datasets can be most easily viewed in 3D:

  • ArmsGlobe displays small arms and ammunition shipments through time and space.
  • ViziCities loads OpenStreetMap data into 3D, in browser.

A variety of physical simulations can use WebGL for interactive visualization.  One thing I love about WebGL is your pixel and vertex shaders run on the GPU with exactly the same performance you’d get in a full application, but it’s delivered instantly on any platform in a browser.

Several modern interactive 3D computer-aided design (CAD) programs are shifting to web-based tools:

  • OpenJSCAD provides a 3D constructive solid geometry programming language in-browser using JavaScript for computational geometry, and WebGL for rendering.
  • Onshape provides high end 3D computer aided design features similar to SolidWorks, but runs entirely in browser.

We’ve been building a web-based robot configuration, programming, and visualization system called RobotMoose.

Short link here for D2D workshophttp://tinyurl.com/d2dwebviz

September 28, 2015

Earthquake P-wave and nighttime anxiety

Filed under: Random Thoughts — Dr. Lawlor @ 9:24 pm

About 2am last night, I woke up feeling extremely anxious, and the hairs on my arms were standing up–neither of which is at all typical for me!  A few seconds later, I felt the bed start shaking very gently, and I realized it was probably an earthquake.

Checking the USGS Earthquake Map, there was indeed a magnitude 3.1 earthquake in our hills at this time, about 60km away and 11km deep.  Clearly something about the earthquake woke me up, but it’s surprising I subconsciously managed to detect the P wave while sound asleep, considering the later S wave (typically 2-3 times larger) could barely be felt.

September 15, 2015

Rescuing Data from a Banished Daemon

Filed under: Linux, Programming — Dr. Lawlor @ 9:12 pm

Last week we managed to accidentally delete the entire directory housing our custom database server (superstar), including the on-disk backups and historical backups of its robot info database.  Not good.

Amazingly, ps showed the server was still running, and poking around we could even see the in-memory copy of the database was still fine, but it was trapped inside this banished daemon.

# ps
...
no_priv 5149 0.1 0.0 14932 1640 ? S 18:04 0:08 ./superstar

Checking /proc/5149 shows the directory and exe have been deleted, which make a lot of tools break:

# ls -al /proc/5149
...
lrwxrwxrwx 1 no_priv no_priv 0 Sep 11 18:09 cwd -> /home/itest/git3/superstar (deleted)
-r-------- 1 no_priv no_priv 0 Sep 11 18:20 environ
lrwxrwxrwx 1 no_priv no_priv 0 Sep 11 18:09 exe -> /home/itest/git3/superstar/superstar (deleted)

We tried for a while to recreate or change the cwd, so we could tickle the daemon into backing up its database somewhere accessible–relinking won’t seem to change this directory, although possibly inode-level reconstruction of the old directory could have worked.

But I could still copy off the running executable for analysis:

# cp /proc/5149/exe /tmp/necrodaemon
# nm /tmp/necrodaemon | grep back
000000000061ca40 b _ZL15backup_filename

This is the in-memory address of the string holding the filename where the daemon tries to back up its in-memory database.  Currently it reads “db.bak”, a relative path, which is a problem since the process’s current working directory is gone.

Now just attach to the daemon in gdb, and we can see and change what’s inside it.

# gdb -p 5149

Of course, the daemon wasn’t compiled with debug symbols, so we can’t call methods or even functions.  So step one is verifying there’s anything actually at that “backup_filename” memory address–it’s a std::string, which has a single pointer to a separate refcounted object that actually holds the bytes of the string.  We then overwrote the bytes of this string to make the relative path into an absolute path, “/a.bak”, a file which we’d previously created and given the correct permissions.

(gdb) p *(long *)0x61ca40
$1 = 27717720
(gdb) p *(char **)0x61ca40
$7 = 0x1a6f058 "db.bak"
(gdb) set {char}0x1a6f058 = '/'
(gdb) p *(char **)0x61ca40
$8 = 0x1a6f058 "/b.bak"
(gdb) set {char}0x1a6f059 = 'a'
(gdb) p *(char **)0x61ca40
$9 = 0x1a6f058 "/a.bak"
(gdb) detach
Detaching from program: , process 5149
(gdb) quit

As soon as the backup ran, the daemon had written its in-memory database to the newly created file “/a.bak”, so our data was saved–daemon necromancy to the rescue!

August 12, 2015

Parrot “Rolling Spider” UAV Hacking: Dumping the Filesystem

Filed under: Hardware, Linux, Programming, Rolling Spider — Dr. Lawlor @ 11:53 pm

I just got a new tiny UAV, the Parrot “Rolling Spider” ($80), which is very fun to fly via bluetooth with my phone.  But it’s also a linux-based computer, so it’s also fun to hack!

The easiest way to get a root shell is to just plug it in via the USB cable, which not only shows up as a removable USB drive, it also shows up as a network device (at least, as of the 1.99.2 firmware version).  This means you can immediately get a root shell with:

telnet 192.168.2.1

That was easy!  Now to dump the filesystem, to netcat on port 1234.  (The ^p avoids /proc, which has infinite recursive root links; the ^y avoids /sys, which has files that change in size.)

tar cpf - [^p][^y]* | nc -l -p 1234

To get the filesystem as a file on your desktop computer, now just:

nc 192.168.2.1 1234 > rootfs.tar

This has *everything*, from:

drwxr-xr-x root/root 0 1969-12-31 14:00 bin/
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/getopt -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/dd -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/cp -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/df -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ip -> busybox
-rwxrwxr-x root/root 35 1969-12-31 14:00 bin/kk
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ln -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ls -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/mv -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ps -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/rm -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/sh -> busybox
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/vi -> busybox
-rwxrwxr-x root/root 305 1969-12-31 14:00 bin/blink_led_orangeleft.sh
lrwxrwxrwx root/root 0 1969-12-31 14:00 bin/ash -> busybox

through:

drwxr-xr-x root/root 0 1969-12-31 14:00 usr/share/avahi/
-rw-r--r-- root/root 560 1969-12-31 14:00 usr/share/avahi/avahi-service.dtd
-rw-r--r-- root/root 5104 1969-12-31 14:00 usr/share/avahi/service-types
drwxr-xr-x root/root 0 1969-12-31 14:00 var/
lrwxrwxrwx root/root 0 1969-12-31 14:00 var/log -> /tmp/
-rw-rw-r-- root/root 7 1969-12-31 14:00 version.txt
drwxr-xr-x root/root 0 1969-12-31 14:00 www/
-rw-rw-r-- root/root 485 1969-12-31 14:00 www/index.html

For example, now I can see the contents of the control shell scripts:

$ cat bin/set_led_greenleft.sh 
#!/bin/sh


# temp behaviour : red light right on
gpio 33 -d ho 1
# temp behaviour : red light left off
gpio 30 -d ho 0

#green light right off
gpio 31 -d ho 0

#green light left on
gpio 32 -d ho 1

I can also see the details of how the code was built:

$ file bin/busybox
busybox: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.16, stripped

Of course, eventually I’ll want to permanently modify this filesystem, by re-flashing the UAV with a reverse engineered PLF firmware file, which is similar to the Parrot AR Drone PLF format.  I’m nearly there with “plftool -e raw -i rollingspider_update.plf -o .”, but each resulting file has the filename prepended in some sort of fixed-length binary header.

Stay tuned!

Older Posts »

Create a free website or blog at WordPress.com.