Tahoe and Tor: Building Privacy on Strong Foundations
Many people want to build secure Internet services that protect their users against surveillance or the illegal seizure of their data. When EFF is asked how to build these tools, our advice is: don't start from scratch. Find a public, respected project that provides the privacy-protecting quality you want in your own work, and find a way to implement your dream atop these existing contributions.
So, for instance, the New Yorker's Strongbox, a dropbox for anonymous sources, uses Tor as its basis to provide anonymity to its users. If you want anonymity in your app, building your tool on top of Tor's backbone means you can take advantage of its experience and future improvements, as well as letting you contribute back to the wider community.
Anonymity is only one part of what will make the Net secure and privacy-friendly, though. The recent NSA revelations, as well as glitches and attacks on single services like GitHub, Amazon, Twitter and the New York Times, have prompted demand for online data storage that doesn't depend on companies who might hand over such data or compromise security to comply with government demands, and that doesn't depend on one centralised service that could be taken down through external pressure.
The Principle of Least Authority
In computer science, the principle of least authority means granting the minimum set of permissions necessary to accomplish a task. For example, someone who is a contributing blogger on a website doesn't need full administrator access to a site. Tahoe attempts to apply this principle to online file storage by ensuring through encryption that the organization storing your data can't see all your data, and that users can be given fine-grained access through cryptographic capabilities.
Cryptographic capabilities
In Tahoe-LAFS, you can read or write a file in the system only if you know a (rather long) string of characters, or key. The capability keys are different for each file, which means you can share a picture by sending a friend one capability key without giving them access to everything. You can also give people the power to create or even edit files by sending them different keys. Using capability-based security means there's no central authority that manages access control for you, as there is with Dropbox or Google Docs. You're in charge of spreading (or withholding) your capability keys.
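To make the idea concrete, here is a minimal sketch of how separate read and write keys for a file might relate to each other. It is an illustration only, not Tahoe-LAFS's actual capability format or API: the names `derive`, `ToyStore` and the tag strings are hypothetical, and this toy store holds plaintext and is trusted to check keys, whereas Tahoe relies on encryption so that the storage servers need not be trusted.

```python
import hashlib
import secrets


def derive(tag: bytes, secret: bytes) -> bytes:
    """One-way derivation: easy to compute, infeasible to reverse."""
    return hashlib.sha256(tag + b":" + secret).digest()


class ToyStore:
    """Hypothetical in-memory store keyed by per-file capabilities."""

    def __init__(self) -> None:
        self._files: dict[bytes, bytes] = {}

    def create(self, content: bytes) -> tuple[bytes, bytes]:
        # Each file gets its own random write key; the read key is
        # derived from it, so writers can read but readers cannot write.
        write_cap = secrets.token_bytes(32)
        read_cap = derive(b"read", write_cap)
        self._files[derive(b"index", read_cap)] = content
        return write_cap, read_cap

    def read(self, read_cap: bytes) -> bytes:
        # Raises KeyError if the key does not match any file.
        return self._files[derive(b"index", read_cap)]

    def write(self, write_cap: bytes, content: bytes) -> None:
        index = derive(b"index", derive(b"read", write_cap))
        if index not in self._files:
            raise PermissionError("not a valid write capability")
        self._files[index] = content


store = ToyStore()
photo_write, photo_read = store.create(b"holiday photo")
diary_write, diary_read = store.create(b"private diary")

# Handing a friend photo_read shares one file and nothing else.
shared = store.read(photo_read)
```

In this sketch, a friend who holds only `photo_read` cannot derive `photo_write` or either of the diary keys, because the hash cannot be run backwards and each file starts from an independent random secret.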
Erasure coding
Erasure coding is a method of redundantly storing data over a number of servers that allows the data to be reconstructed even if a certain number of those servers get shut down or corrupted. In the default Tahoe network, data is spread over ten servers, and can be read even if seven of those servers are lost. That means you don't have to rely on one provider, and it makes Tahoe storage harder to disrupt.
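The "any three of ten" property can be sketched in a few lines. The code below is a toy illustration, not the erasure-coding library Tahoe actually uses: it treats every three bytes of a file as points on a polynomial over the integers modulo 257 and hands each of ten servers one further point on that curve. The function names and the choice of field are assumptions made for this example.

```python
from itertools import combinations

P = 257        # smallest prime above 255, so every byte is a field element
K, N = 3, 10   # any K of the N shares are enough to rebuild the data


def lagrange_eval(points: list[tuple[int, int]], x0: int) -> int:
    """Evaluate at x0 the unique polynomial of degree < len(points)
    that passes through the given (x, y) points, modulo P."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, k: int = K, n: int = N) -> dict[int, list[int]]:
    """Return n shares, numbered 1..n, one per storage server."""
    padded = data + bytes(-len(data) % k)   # zero-pad to a multiple of k
    shares: dict[int, list[int]] = {x: [] for x in range(1, n + 1)}
    for i in range(0, len(padded), k):
        # The k data bytes are the polynomial's values at x = 1..k.
        points = list(enumerate(padded[i:i + k], start=1))
        for x in shares:
            shares[x].append(lagrange_eval(points, x))
    return shares


def decode(surviving: dict[int, list[int]], length: int, k: int = K) -> bytes:
    """Rebuild the original bytes from any k surviving shares."""
    xs = sorted(surviving)[:k]
    if len(xs) < k:
        raise ValueError("too few shares survived")
    out = bytearray()
    for c in range(len(surviving[xs[0]])):
        points = [(x, surviving[x][c]) for x in xs]
        out.extend(lagrange_eval(points, x0) for x0 in range(1, k + 1))
    return bytes(out[:length])


message = b"Tahoe and Tor"
shares = encode(message)
# Lose seven servers: keep only shares 2, 6 and 9.
survivors = {x: shares[x] for x in (2, 6, 9)}
recovered = decode(survivors, len(message))
```

Each share here is about a third the size of the original file, so ten of them cost roughly 3.3 times the storage of a single copy, in exchange for surviving the loss of any seven servers.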
The Tahoe Least Authority File System (Tahoe-LAFS) has been actively developed since 2007. Just as Tor concentrates on anonymity, Tahoe-LAFS's developers have worked hard to create a resilient, decentralized infrastructure that lets you store online both data you want to keep private and data you want to share with selected groups of friends. It's also able to protect against a single point of failure or censorship, like a commercial service being attacked or responding to a takedown.
Tahoe-LAFS is open source, but this month, some of the Tahoe project's founders launched S4, a commercial, "PRISM-proof", secure off-site backup service that uses Tahoe as a backend and Amazon as a storage site.
Tahoe's protections against third-party snooping and deletion have the kind of strong mathematical guarantees that reassure security experts that Tahoe-LAFS is well-defended against certain kinds of attack. That also means its privacy and resilience are not dependent on the good behaviour or policies of its operators (see the box for more info).
Secure online backups like S4 are one possible use for Tahoe's time-tested code and approach. You and your friends can run your own Tahoe network, sharing storage space across a number of servers, confident that your friends can only see and change what they have the capability keys to see, and that even if a sizeable number of those servers disappear, your data will still be retrievable. Services like git-annex-assistant, a decentralised Dropbox-like folder synchroniser, already offer Tahoe as an optional backend. Some privacy activists have run private Tahoe networks over Tor, creating an anonymous, distributed, and largely censorship-proof storage system.
It's great to see commercial services like S4 emerging in the face of our new knowledge about pervasive online surveillance. Even better is the possibility that others, including entrepreneurs, designers and usability experts, will stand on the shoulders of the secure possibilities that protocols like Tor and Tahoe provide, and give us all innovative Internet tools that can truly keep users and their data safe and sound.