Dr. Lawlor's Code, Robots, & Things

November 15, 2017

“for int” considered harmful

Filed under: C++11, Programming — Dr. Lawlor @ 2:26 pm

In C or C++, the traditional variable to use in a loop is “int i”, typically something like:

for (int i=0;i<length;i++)  ... do stuff with i ...

Using “int” is traditional, but it is a really bad idea, because “int” is only 32 bits on virtually all modern machines, including 64-bit ones.  This means any loop that should do more than 2 binary billion iterations will fail (binary billion == 1,073,741,824 == 1<<30).
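As a quick sanity check on those numbers, here is a minimal sketch of the arithmetic.  The names binary_billion, int32_limit, and fits_in_int32 are made up for illustration, and the sketch assumes “int” is the 32-bit type std::int32_t, as it is on typical desktop and server compilers:

```cpp
#include <cstdint>
#include <limits>

// One "binary billion" as used in the text: 2^30.
constexpr std::int64_t binary_billion = std::int64_t(1) << 30;

// Largest value a 32-bit signed int can hold: 2^31 - 1,
// one short of 2 binary billion.
constexpr std::int64_t int32_limit = std::numeric_limits<std::int32_t>::max();

// Hypothetical helper: true if a loop of `length` iterations can be indexed
// by a 32-bit int without the index ever being incremented past its maximum.
constexpr bool fits_in_int32(std::int64_t length) {
    return length <= int32_limit;
}

static_assert(int32_limit == 2 * binary_billion - 1,
              "a 32-bit int tops out just below 2 binary billion");
```

With these definitions, a 2,000,000,000 iteration loop still fits, but a 3,000,000,000 iteration loop does not.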

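If the loop iteration count exceeds what “int” can store, one possible outcome is a crash as int wraps around to -2147483648 (signed overflow is formally undefined behavior in C and C++, so wraparound is only what typically happens):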

long length=3*1000*1000*1000L;
std::vector<char> v(length);
size_t count=0;
for (int i=0;i<length;i++) {
	if (0 == i%(128*1024*1024)) std::cout<<"i="<<i<<"\n";
	v[i]=3;
	count++;
}
return count;


On my 64-bit Linux machine, this produces no compile warnings, and works fine for lengths below 2 binary billion.  It all runs in about one second, including the time for std::vector to allocate and zero initialize 3GB of RAM.  At runtime, though, everything works fine only until i overflows and wraps around:

i=0
i=134217728
i=268435456
i=402653184
i=536870912
i=671088640
i=805306368
i=939524096
i=1073741824
i=1207959552
i=1342177280
i=1476395008
i=1610612736
i=1744830464
i=1879048192
i=2013265920
i=-2147483648
-------------------
Caught signal SIGSEGV

When i wraps around to -2147483648, the program is actually trying to write to array element v[-2147483648].  So worse than crashing, this introduces a potential security hole if an attacker can arrange for this index to point to valid memory.

If the program does not access memory, this acts as an infinite loop, since the int i can never reach the loop target.

In the program above, if we replace “long length” with “size_t length”, we at least get a compiler warning about comparison between signed (int) and unsigned (size_t) types.  But now the loop silently stops before finishing the correct number of iterations:

i=0
i=134217728
i=268435456
i=402653184
i=536870912
i=671088640
i=805306368
i=939524096
i=1073741824
i=1207959552
i=1342177280
i=1476395008
i=1610612736
i=1744830464
i=1879048192
i=2013265920
Program complete.

Here, i has wrapped around, but the compiler converts it to unsigned before comparing it with length: the loop is effectively “for (int i=0;(size_t)i<length;i++)”.  Since converting a negative int i to a 64-bit unsigned size_t results in a huge value, the loop breaks out after 2 binary billion iterations, leaving the rest of the array untouched.
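The comparison can be isolated into a tiny sketch.  The helper name loop_continues is hypothetical, and the static_assert records the assumption that size_t is 64 bits (with a 32-bit size_t the numbers work out differently):

```cpp
#include <cstddef>

static_assert(sizeof(std::size_t) == 8, "this sketch assumes a 64-bit size_t");

// Hypothetical helper: mirrors what "i < length" means when i is an int and
// length is a size_t. The int is converted to size_t before the comparison,
// so a negative i becomes a value near 2^64.
bool loop_continues(int i, std::size_t length) {
    return static_cast<std::size_t>(i) < length;
}
```

For a length of 3,000,000,000, loop_continues is true for i=2013265920 but false for i=-2147483648, which is why the loop exits instead of crashing.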

Needless to say, this will result in a very confusing data corruption bug.

Typecasting the comparison to work the opposite way, “for (int i=0;i<(int)length;i++)”, eliminates the warning, but the large length converts to a negative int, so the loop executes zero iterations.
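Here is a small sketch of that zero-iteration case.  The function name count_iterations is invented for this example, and the result of converting an out-of-range value to int is implementation-defined before C++20; the sketch assumes the usual two's complement truncation done by gcc and clang, which turns 3,000,000,000 into -1294967296:

```cpp
#include <cstddef>

// Hypothetical helper: runs the "cast the length to int" loop from the text
// and reports how many iterations actually happened.
std::size_t count_iterations(std::size_t length) {
    std::size_t count = 0;
    // For lengths above INT_MAX the cast typically yields a negative limit,
    // so the condition is false on the very first test.
    for (int i = 0; i < (int)length; i++) count++;
    return count;
}
```

Under that assumption, count_iterations(1000) returns 1000, while count_iterations(3000000000) returns 0.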

Hence using “int” as the loop index can result in:

  • A crash and/or security hole
  • An infinite loop
  • Skipping the last loop iterations
  • Skipping the loop entirely

Instead you should:

  • Prefer C++11 range-based for loops, since they’re clean and reliable
    • for (char &element : myVector)
  • If you can’t use range-based for, use size_t as a loop index (on 64-bit machines, size_t is 64 bits)
    • for (size_t i=0;i<length;i++)
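Both recommended forms can be sketched as small functions.  The names fill_range_for and fill_indexed are made up here, and the functions take any std::vector<char>, so they can be tried on a small vector before a multi-gigabyte one:

```cpp
#include <cstddef>
#include <vector>

// Range-based for: there is no index variable at all, so nothing can overflow.
void fill_range_for(std::vector<char> &v, char value) {
    for (char &element : v) element = value;
}

// Index loop with size_t: the index has the same width as v.size(),
// so it can reach every element. Returns the number of iterations run.
std::size_t fill_indexed(std::vector<char> &v, char value) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < v.size(); i++) {
        v[i] = value;
        count++;
    }
    return count;
}
```

Calling fill_indexed on a vector of 3*1000*1000*1000L elements should report the full element count on a 64-bit machine with enough RAM, rather than stopping at 2 binary billion.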

If you’re stuck with int, you can’t reliably process data with more than 2 binary billion entries.  In a world where phones can have 8GB of RAM, people regularly process files exceeding 2GB, and your CPU can sling multiple gigs around in under a second, this is not a good idea!


November 8, 2017

Making chroot jails

Filed under: Linux, Sysadmin — Dr. Lawlor @ 8:20 pm

A chroot jail is a UNIX way to run a dangerous application, like a network server, inside its own limited subset of the filesystem.  This cuts off access to recurring security holes like setuid executables in /bin or /usr/bin and kernel device files in places like /proc and /dev, and it lets you build your own restricted or sanitized runtime environment including libraries and config files.  Docker or rkt containers use chroot as one of the ways they sandbox containerized applications, and the fact that all the libraries are included makes containers portable across systems.

A maximum security chroot jail consists of the prisoner process, and the bare minimum shared libraries necessary for it to run.  You can figure out which libraries are needed using “ldd ./prisoner”:

root@5a2c7cc2357f:/tmp/jailhouse# cp /bin/date ./prisoner
root@5a2c7cc2357f:/tmp/jailhouse# ldd ./prisoner
  linux-vdso.so.1 =>  (0x00007ffdf433d000)
  libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f39e8d18000)
  /lib64/ld-linux-x86-64.so.2 (0x000055751140c000)

(Or, since ldd is just a shell script wrapper around the dynamic linker library /lib64/ld-linux-x86-64.so.2, you can also run “/lib64/ld-linux-x86-64.so.2 --list ./prisoner”, although “ldd” is easier to remember.)

You can ignore linux-vdso, which is injected by the kernel, but if you copy the remaining shared libraries into the jail, you can run the prisoner process inside the jail using chroot.  Here, we just need ld-linux and libc:

root@5a2c7cc2357f:/tmp/jailhouse# mkdir lib64
root@5a2c7cc2357f:/tmp/jailhouse# cp /lib64/ld-linux-x86-64.so.2 lib64/
root@5a2c7cc2357f:/tmp/jailhouse# mkdir lib
root@5a2c7cc2357f:/tmp/jailhouse# mkdir lib/x86_64-linux-gnu
root@5a2c7cc2357f:/tmp/jailhouse# cp /lib/x86_64-linux-gnu/libc.so.6 lib/x86_64-linux-gnu/
root@5a2c7cc2357f:/tmp/jailhouse# chroot --userspec 12345:6789 . ./prisoner
Thu Nov 9 04:43:38 UTC 2017

Here we ran the prisoner as user ID 12345, in group ID 6789, due to the “--userspec 12345:6789” in the chroot call.  Without a userspec, the process runs in the jail as root, which is very bad (see chw00t)!  Neither this user nor this group exists in my /etc/passwd or /etc/group files, which means the prisoner only gets a number, not a name.  You can add the username to either the system /etc/passwd or the chroot jailhouse/etc/passwd to get symbolic names.  Do remember to set the permissions on jailhouse/etc/passwd carefully, though, because overwriting passwd with “jailbird:x:0:0::/home/jailbird:/bin/bash” could make the jailbird user run as root within the jail.
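If the server is your own code, it can lock itself up instead of relying on the chroot command.  The following is only a sketch of roughly equivalent steps using the standard chroot, chdir, setgroups, setgid, and setuid calls on Linux; the helper name enter_jail is hypothetical, and this is not taken from the chroot command's source:

```cpp
#include <grp.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: confine the calling process to jail_dir, then drop
// from root to the given numeric user and group IDs.
// Returns false as soon as any step fails.
bool enter_jail(const char *jail_dir, uid_t uid, gid_t gid) {
    if (chroot(jail_dir) != 0) return false;      // requires root privileges
    if (chdir("/") != 0) return false;            // no working directory left outside the jail
    if (setgroups(0, nullptr) != 0) return false; // drop supplementary groups
    if (setgid(gid) != 0) return false;           // change group first, while still root
    if (setuid(uid) != 0) return false;           // finally give up root
    return true;
}
```

A caller would use something like enter_jail("/tmp/jailhouse", 12345, 6789) and exit immediately on failure, since a partially completed sequence can leave the process chrooted but still running as root.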

Often the process you want to run inside the jail is more complex, and so it needs quite a few libraries to work.  Sometimes those libraries load more files and libraries at runtime, and they’re not always well documented.  In these cases, “strace ./prisoner” is quite useful, because it shows you every system call, including calls like open (finding config files), exec (calling sub-programs), or mmap (usually loading shared libraries or allocating memory).  I often copy strace into the chroot so I can “chroot . strace ./prisoner” and watch what the program is doing inside the jail.

A particularly complex program may need /dev entries created (CUDA in a chroot needed basically all of /dev/nvidia*), or things like /proc mounted into the chroot using “mount -o bind /proc proc”.  These sorts of comforts make the jail more like a halfway house, in that it allows the prisoners more functionality but does pose more of a danger to society.

I’ve built chroot jails packed with ancient library versions so I can run ancient programs in their old environment.

I’ve also built chroot jails to contain student code (e.g., in NetRun).
