Earlier this month we wrote about potential malicious behavior in Adobe's e-reader software, “Digital Editions.” There were several independent reports claiming that Adobe's software was sending back to Adobe—in the clear—a list of books read in the software. There were also independent reports that the program was sending back lists of books on an attached e-reader, even if those books had never been opened in ADE itself—in other words, collecting information not just about the book you are reading now, but your electronic library.

On the other hand, not everyone was able to replicate all of this behavior, so we decided to run our own tests. We were able to confirm that Adobe Digital Editions 4.0.0 was sending back metadata, including the title and pages read, about books read in the software. Even more troubling, the software was sending back information about books loaded onto certain attached e-readers – contrary to Adobe's claim that it collected information solely “for the eBook currently being read by the user and not for any other eBook in the user’s library or read/available in any other reader.”

To perform these tests we ran Wireshark, an open source program that records network traffic so that researchers can analyze it. With Wireshark running, we opened Adobe Digital Editions and performed tasks such as adding books to the library, reading books, and deleting books. Each time the software started, it sent back metadata about the previous session, including book titles, pages read, and time spent reading.

Figure: Wireshark packet capture showing data being sent to Adobe's servers, including book title and pages read.
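For readers who want to run a similar check on their own captures, here is a minimal sketch of one way to test whether book titles appear verbatim in an unencrypted payload. It assumes you have already exported a payload from a capture tool as raw bytes. The function name, the sample payload, and the host name are hypothetical and invented for illustration; they are not Adobe's actual request format.

```python
from typing import Iterable, List


def find_plaintext_titles(payload: bytes, titles: Iterable[str]) -> List[str]:
    """Return the titles that appear verbatim (ignoring case) in a captured payload."""
    # Decode leniently: captured traffic may contain bytes that are not valid UTF-8.
    text = payload.decode("utf-8", errors="replace").lower()
    return [title for title in titles if title.lower() in text]


# Hypothetical payload, made up for this example (not a real ADE request).
sample_payload = (
    b"POST /log HTTP/1.1\r\n"
    b"Host: logging.example.invalid\r\n"
    b"\r\n"
    b'{"title": "Moby-Dick", "pagesRead": 12}'
)

# Titles we know are in our own library.
library = ["Moby-Dick", "Walden"]

leaked = find_plaintext_titles(sample_payload, library)
print(leaked)
```

A match only shows that a title was readable in that payload. It says nothing about where the data was sent or why, which still has to be read off the capture itself.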

We were also able to reproduce the results of the experiment run by The Digital Reader, again using Wireshark. We plugged a Sony Reader PRS-600 into a computer with ADE installed. When we started ADE with the reader plugged in, we observed ADE sending back data about activity on the reader, such as books added to and deleted from it. This included books that were never opened in Adobe Digital Editions.

We were also able to confirm that Adobe Digital Editions gets information from other e-readers that simply have Adobe software installed on them, such as the Sony Reader, Nook, and Boyue. Of course, there may be other readers that are also susceptible.
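The reasoning behind the e-reader test is a simple comparison: any title that shows up in the transmitted data but was never opened in ADE must have come from somewhere else, such as the attached device. A rough sketch of that comparison follows. The function name and the sample titles are hypothetical, and both lists are assumed to have been compiled by hand from a capture and from your own notes.

```python
from typing import Iterable, List


def reported_but_never_opened(reported: Iterable[str], opened: Iterable[str]) -> List[str]:
    """Return reported titles that were never opened in the desktop software."""
    # casefold() gives a case-insensitive comparison that also handles non-ASCII titles.
    opened_set = {title.casefold() for title in opened}
    return [title for title in reported if title.casefold() not in opened_set]


# Hypothetical data for illustration only.
titles_seen_in_capture = ["Moby-Dick", "Walden", "Middlemarch"]
titles_opened_in_ade = ["moby-dick"]

suspicious = reported_but_never_opened(titles_seen_in_capture, titles_opened_in_ade)
print(suspicious)
```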

Last week, responding to criticism about these privacy violations, Adobe released a new version of their reader software. The changelog states that it has “Enhanced security for transmitting rights management and licensing validation information. With this latest version of Digital Editions 4.0.1, the data is sent to Adobe in a secure transmission (using HTTPS).”

We decided to run more tests to determine exactly what data, if any, Adobe was still collecting about reading habits. To perform these tests we used Fiddler, a local proxy that intercepts HTTPS traffic and allows you to decrypt it. It does this by performing a “man in the middle” attack on your own machine: it presents the client with a certificate signed by a root certificate that you control and have chosen to trust, decrypts the traffic so you can inspect it, and then re-encrypts it for the real server.

With this test we were able to determine that Adobe is now encrypting the connections between ADE and Adobe servers. But more importantly, it appears that Adobe is no longer sending back metadata on what books you read. When we performed tests with the new version, the only time we saw data going back to an Adobe server was when an ebook with DRM was opened for the first time. This data is most likely being sent back for DRM verification purposes, and it is being sent over HTTPS. It even seems that Adobe has gone one step further and shut down plaintext HTTP access to their logging servers, so that even ADE 4.0 is no longer able to send back data about what books you are reading.
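One quick sanity check when reviewing a list of requests exported from a proxy or capture tool is to flag any that still use plaintext HTTP. The sketch below assumes you have the request URLs as strings. The function name and the example URLs are hypothetical and do not correspond to real Adobe endpoints.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def plaintext_requests(urls: Iterable[str]) -> List[str]:
    """Return the URLs that use unencrypted HTTP rather than HTTPS."""
    return [url for url in urls if urlsplit(url).scheme.lower() == "http"]


# Hypothetical request log for illustration only.
observed = [
    "http://logging.example.invalid/log",
    "https://drm.example.invalid/verify",
]

print(plaintext_requests(observed))
```

An empty result means only that no request in the list used plain HTTP. To learn what an HTTPS request actually contained, you still need to decrypt it with a proxy you control.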

It appears the problem is solved, for now. So, what can we learn from this mess?

  1. If you make a mistake that violates your users' privacy, you must immediately and completely fix the problem. We applaud Adobe for taking action to fix the privacy problems in their Digital Editions software.
  2. Adobe has a lot more to do to restore reader trust. First, they developed and marketed a product that seriously compromised reader privacy. Second, when the flaw was exposed, they admitted one error (transmitting data in the clear) but continued to deny collecting information about reader libraries.
  3. We can't trust vendors to protect our privacy for us. We expect Adobe didn't deliberately set out to undermine our privacy – but it happened anyway, and could have continued indefinitely if The Digital Reader hadn't done a little investigating. Which leads to the final lesson:
  4. Doctorow's Law: Anytime someone puts a lock on something you own, against your wishes, and doesn't give you the key, they're not doing it for your benefit. ADE is not exactly a lock, but it collects a host of information about the reader in order to, among other things, “facilitate the implementation of different licensing models by publishers.” In other words, to assist sellers, not readers. So let us suggest a corollary to Doctorow's Law: Anytime someone collects information about you, without your knowledge and against your wishes, they're not doing it for your benefit.