Leaky abstraction is a term coined by Joel Spolsky for a pattern in software engineering. We often want to use abstraction to hide the details of something and replace it with something easier to work with. For example, it’s hard to build an application on top of “sending packets over the internet that might get lost, arrive out of order, or get corrupted in transit”. So we have TCP, an abstraction layer that lets us pretend we have a reliable byte stream instead. It protects us from packet loss by automatically retransmitting. It protects us from packet corruption with checksums. It protects us from thinking about packets at all, by splitting our byte stream into appropriately sized packets and reassembling them on the other end. But a total loss of connectivity will break the abstraction; our messages might not arrive, and we can’t always tell whether a message that was sent was actually received. If we care about timing, that will break the abstraction; a packet loss and retransmission will cause a hiccup in latency. If we care about security, that will break the abstraction; we can’t always be sure messages are going to or coming from who we think they are. So we say that this abstraction is leaky.
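To see what the abstraction buys us, here’s a minimal sketch (Python standard library only, talking to itself over localhost): the application code on each side sees only a reliable byte stream, never packets, retransmission, or checksums.

```python
import socket
import threading

def run_server(results, ready):
    srv = socket.create_server(("127.0.0.1", 0))  # OS picks a free port
    results.append(srv.getsockname()[1])
    ready.set()
    conn, _ = srv.accept()
    data = b""
    while chunk := conn.recv(4096):  # bytes arrive intact and in order
        data += chunk
    conn.close()
    srv.close()
    results.append(data)

results, ready = [], threading.Event()
t = threading.Thread(target=run_server, args=(results, ready))
t.start()
ready.wait()
cli = socket.create_connection(("127.0.0.1", results[0]))
cli.sendall(b"hello " * 10000)  # TCP splits this into packets for us
cli.close()
t.join()
print(len(results[1]))  # all 60,000 bytes, reassembled in order
```

Nothing in this code mentions packets; that’s the abstraction working. The leaks appear only when you ask the questions the byte-stream model can’t answer.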
What’s missing from this picture, however, is what happens when an abstraction fails. In some cases, facets of the layer below the abstraction simply become facets of the layer above it. For example, a database abstracts over files on a disk; if you run out of disk space, you won’t be able to add more rows to your tables. This is natural and intuitive, and if a database server with no disk space starts refusing writes, we don’t blame the database software.
But sometimes, especially when problems arise, abstractions don’t fail gracefully or intuitively. They interfere with debugging and understanding. They take issues that should be simple, and make them hard. Rather than leak, they burst. I’ve come to think of these as pressurized abstractions.
Take build systems, for example, which abstract over running programs and passing them command-line arguments. These are notoriously finicky and difficult to debug. Why? All they’re really doing is running a few command-line tools with some arguments. The problem is that, in the course of working around those tools’ quirks and differences, they separate programmers from what’s really going on. Consider what happens when you run Apache Ant (one of the more popular Java build systems) and something goes wrong. Investigating the problem leads into a mess of interconnected XML files, which reference other XML files hidden in unfamiliar places in framework directories. It’s easy to regress to guess-and-check fixes. The solution, in this case, is to temporarily remove the abstraction: find out what commands are really being executed (by passing -verbose) and debug those. This is a safety valve; the abstraction is temporarily bypassed so that you can debug in terms of the underlying interface.
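The safety-valve idea is easy to sketch. Everything here is hypothetical, not any real build tool’s API: `compile_step` and its command line are invented for illustration, and `echo` stands in for a real compiler so the example runs anywhere.

```python
import shlex
import subprocess

def compile_step(sources, verbose=False):
    # A build abstraction ultimately boils down to a command line like this.
    cmd = ["echo", "javac", "-d", "build"] + sources
    if verbose:
        # The safety valve: show the raw command instead of hiding it,
        # so the user can re-run and debug it directly.
        print("exec:", shlex.join(cmd))
    return subprocess.run(cmd, capture_output=True, text=True).stdout

output = compile_step(["Main.java"], verbose=True)
```

With `verbose=True`, the tool prints the exact command it runs; a stuck user can copy it, run it by hand, and debug in terms of the underlying interface.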
Pressurized abstractions are everywhere. Some common examples are:
- Package managers (apt-get, rpm, etc), which abstract over files in the filesystem.
- Object/relational mapping libraries (Hibernate), which abstract over database schemas.
- RPC protocol libraries (dbus, SOAP, COM), which abstract over function calls across process and machine boundaries.
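The ORM case shows what a safety valve looks like there: expose the generated SQL. Here’s a deliberately toy “ORM” (the class and its `show_sql` switch are invented for illustration, loosely modeled on settings real ORMs offer) built on Python’s stdlib `sqlite3`.

```python
import sqlite3

class Table:
    """Toy object-to-SQL mapper; the point is the safety valve, not the ORM."""

    def __init__(self, conn, name, columns, show_sql=False):
        self.conn, self.name = conn, name
        self.show_sql = show_sql  # the safety valve switch
        self._run(f"CREATE TABLE {name} ({', '.join(columns)})")

    def _run(self, sql, params=()):
        if self.show_sql:
            print("SQL>", sql)  # reveal what the abstraction really does
        return self.conn.execute(sql, params)

    def insert(self, **row):
        cols = ", ".join(row)
        marks = ", ".join("?" for _ in row)
        self._run(f"INSERT INTO {self.name} ({cols}) VALUES ({marks})",
                  tuple(row.values()))

conn = sqlite3.connect(":memory:")
users = Table(conn, "users", ["id", "name"], show_sql=True)
users.insert(id=1, name="ada")
```

When a query misbehaves, the printed SQL lets you drop down a layer and debug against the database directly, rather than guessing at what the mapper generated.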
Ruby on Rails is sometimes criticized for having “too much magic”. What that means is that Rails contains many pressurized abstractions and few safety valves.
An intuition for which abstractions will burst is one of the major skills of programming. Generally speaking, pressurized abstractions should be used sparingly, and when they must be used, it’s vital to understand what’s underneath them and to find and use the safety valves. The key to making these sorts of abstractions is to make them as transparent as possible, by writing good documentation, having good logging, and putting information in conspicuous places rather than hiding it.
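As one sketch of “conspicuous rather than hidden”: an illustrative cache wrapper (the names here are made up, not any particular library’s API) that logs every decision it makes, so its behavior is visible the moment something looks wrong.

```python
import logging

logging.basicConfig(level=logging.DEBUG, format="%(name)s: %(message)s")
log = logging.getLogger("cache")

_store = {"answer": 42}

def get(key):
    if key in _store:
        log.debug("hit: %r", key)   # every decision is visible in the log
        return _store[key]
    log.debug("miss: %r", key)      # a miss is logged, not silent
    return None

value = get("answer")
```

A user who sees stale or missing data doesn’t have to guess what the cache did; the log says so, which is exactly the transparency that keeps pressure from building up.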