
Pressurized Abstractions

Posted on May 31, 2015

Leaky abstraction is a term coined by Joel Spolsky to describe a pattern in software engineering. We often want to use abstraction to hide the details of something and replace it with something easier to work with. For example, it’s hard to build an application on top of “sending packets over the internet that might get lost, arrive out of order, or get corrupted in transit”. So we have TCP, an abstraction layer that lets us pretend we have a reliable byte stream instead. It protects us from the details of packet loss by automatically retransmitting. It protects us from packet corruption with checksums. It protects us from thinking about packets at all, by automatically splitting our byte stream into appropriately sized packets and combining them when needed. But a total loss of connectivity will break the abstraction; our messages might not arrive, and we can’t always tell whether a message that was sent was actually received. If we care about timing, that will break the abstraction; a packet loss and retransmission will cause a hiccup in latency. If we care about security, that will break the abstraction; we can’t always be sure messages are going to or coming from who we think they are. So we say that this abstraction is leaky.
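
To make that concrete, here’s a minimal sketch of what programming against the byte-stream abstraction looks like, using Python’s standard socket module; the host and request are illustrative, not something from the original example:

    import socket

    # We speak in bytes; TCP handles packetization, ordering, retransmission,
    # and checksums underneath.
    with socket.create_connection(("example.com", 80), timeout=5) as conn:
        conn.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n"
                     b"Connection: close\r\n\r\n")
        reply = b""
        while True:
            # A lost-and-retransmitted packet shows up here only as extra latency;
            # a dead link shows up as a socket.timeout. Those are the leaks.
            chunk = conn.recv(4096)
            if not chunk:
                break
            reply += chunk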

What’s missing from this picture, however, is what happens when an abstraction fails. In some cases, extra facets of the layer below the abstraction simply become extra facets of the layer above it. For example, a database abstracts over files on a disk; if you run out of disk space, you won’t be able to add more rows to your tables. This is natural and intuitive; if a database server with no disk space starts refusing writes, we don’t blame the database software.

But sometimes, especially when problems arise, abstractions don’t fail gracefully or intuitively. They interfere with debugging and understanding. They take issues that should be simple, and make them hard. Rather than leak, they burst. I’ve come to think of these as pressurized abstractions.

Take build systems, for example, which abstract over running programs and giving them command-line arguments. These are notoriously finicky and difficult to debug. Why? All they’re really doing is running a few command-line tools and passing them arguments. The problem is that, in the course of trying to work around those tools’ quirks and differences, they separate programmers from what’s really going on. Consider what happens when you run Apache Ant (one of the more popular Java build systems) and something goes wrong. Investigating the problem will lead into a mess of interconnected XML files, which reference other XML files hidden in unfamiliar places in framework directories. It’s easy to regress to guess-and-check fixes. The solution, in this case, is to temporarily remove the abstraction: find out what commands are really being executed (by passing -verbose) and debug those. This is a safety valve; the abstraction is temporarily bypassed, so that you can debug in terms of the underlying interface.

Pressurized abstractions are everywhere. Some common examples are:

  • Package managers (apt-get, rpm, etc), which abstract over files in the filesystem.
  • Object/relational mapping libraries (Hibernate), which abstract over database schemas.
  • RPC protocol libraries (dbus, SOAP, COM), which abstract over calls between processes and machines.

Ruby on Rails is sometimes criticized for having “too much magic”. What’s meant by that is that Ruby on Rails contains many pressurized abstractions and few safety valves.

An intuition for which abstractions will burst is one of the major skills of programming. Generally speaking, pressurized abstractions should be used sparingly, and when they must be used, it’s vital to understand what’s underneath them and to find and use the safety valves. The key to building these sorts of abstractions is to make them as transparent as possible, by writing good documentation, having good logging, and putting information in conspicuous places rather than hiding it.

Shovel-Ready AGI Safety Work for Programmers

Posted on May 29, 2015

The AI Containment (or Boxing) problem is, given an artificial general intelligence or a prospective artificial general intelligence, how do you keep it from influencing the world? Containment is vital to safety, because without it, other safety strategies fail to work. If you have an AGI that you’re still testing, you need to keep it securely contained, or else when you find a dangerous bug, it will be too late. If you have a tripwire that’s supposed to warn you if the AGI is planning something dangerous, you need to keep it securely contained, or else it will disable the tripwire or intercept the warning. If you have a multi-part system where your two AGIs have to agree before they proceed with a plan, you have to keep them contained, or else one of the AGIs may decide to remove the other’s veto.

Others have observed that AGI containment isn’t a sufficient safety measure, because an AGI could convince its gatekeepers to let it out. While this is true, what it means is that containment is necessary, but not sufficient. And containment has one major advantage over most other subproblems of AGI safety: we know what needs to be done.

I recently applied for a grant to work on the AGI containment problem. If it’s granted, I’ll be working on the problem full time starting in September, with two other collaborators. In the meantime, I’ll be spending as much of my spare time on it as I can, because despite the problem being massively important, to the best of my knowledge the three of us are the only people in the world working on it. In the process of writing that grant application, I spent some time exploring the space of possible containment strategies and technologies, and discovered a massive amount of important work to be done.

There should be more people working on this. If you have programming skill, you can get started right now. Here, I present some software projects that need doing.

  1. Write a Linux kernel module for a tripwired network interface: a device which looks like an inactive network interface from userspace but, rather than being an actual interface, triggers a tripwire/alarm when anything tries to activate or use it.
  2. Automated seccomp configuration: Write a tool that takes output from strace, which is a log of the syscalls a program made, and uses it to generate a BPF filter for Linux’s seccomp mode that allows those syscalls and no others (a rough sketch follows this list).
  3. Fuzz test the Torch7 library interfaces for buffer overflows.
  4. Write a library that uses the Intel CPU performance counters to detect suspicious instruction mixes (e.g. RowHammer).
  5. GPU API filtering. Look through a GPGPU library such as CUDA, identify a relatively-safe subset which preserves most of the important functionality, and write a wrapper which runs in a separate process and provides only that subset via an IPC interface.
  6. Work on anti-steganography techniques for images, audio and video. I.e., make a tool such that if an AGI draws a picture, the gatekeeper can run it through the tool, publish the picture, and be confident that no one will find a hidden message inside it later (a rough image-scrubbing sketch also follows this list).
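
For project 2, here’s a minimal sketch of the strace-to-seccomp tool, assuming strace’s usual “name(args) = result” log format and the libseccomp Python bindings (python3-libseccomp); the bindings, file names, and policy choices are my assumptions, not part of the project description:

    import re
    import sys

    import seccomp  # libseccomp Python bindings (an assumed dependency)

    # Matches "openat(..." and also "1234  openat(..." (strace -f prefixes a pid).
    SYSCALL_RE = re.compile(r"^(?:\d+\s+)?([a-z_][a-z0-9_]*)\(")

    def syscalls_from_strace(path):
        """Collect the set of syscall names that appear in an strace log."""
        names = set()
        with open(path) as log:
            for line in log:
                match = SYSCALL_RE.match(line)
                if match:
                    names.add(match.group(1))
        return names

    def build_filter(names):
        """Allow exactly the observed syscalls; kill the process on anything else."""
        filt = seccomp.SyscallFilter(defaction=seccomp.KILL)
        for name in sorted(names):
            filt.add_rule(seccomp.ALLOW, name)
        return filt

    if __name__ == "__main__":
        filt = build_filter(syscalls_from_strace(sys.argv[1]))
        with open("filter.bpf", "wb") as out:
            filt.export_bpf(out)  # raw BPF program for seccomp

A real version would also need to cope with multi-threaded traces (strace’s “unfinished”/“resumed” lines) and decide how far to generalize beyond the exact syscalls observed.

And for project 6, a crude first pass at image scrubbing, assuming Pillow and NumPy are available; re-encoding and adding small noise defeats naive least-significant-bit steganography, but not robust watermarking, which is part of why the problem needs real work:

    import numpy as np
    from PIL import Image

    def scrub(in_path, out_path, noise_amplitude=2):
        """Re-encode an image, dropping metadata and perturbing low-order bits."""
        img = Image.open(in_path).convert("RGB")  # discards EXIF, ICC, alpha, etc.
        pixels = np.asarray(img, dtype=np.int16)
        noise = np.random.randint(-noise_amplitude, noise_amplitude + 1, pixels.shape)
        scrubbed = np.clip(pixels + noise, 0, 255).astype(np.uint8)
        Image.fromarray(scrubbed).save(out_path, format="PNG")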

If you’re convinced that AGI safety is important but have had trouble finding an affordance to actually work on it, hopefully this will help you find a project. These are things that I am not planning to do myself, because I already have a long list of things I’m doing that are this good or better. If no one has jumped in here to say they’re doing it, it probably isn’t getting done.