July 20, 2007 | By Seth Schoen

Harry Potter and the Digital Fingerprints

A few days before Friday's release of Harry Potter and the Deathly
Hallows, someone leaked a (genuine) copy of the book widely
using file-sharing networks and photo-sharing web sites -- photographing
every single page with a digital camera. The quality isn't
great -- the leaker evidently didn't have a nifty Internet Archive Scribe station -- but the text is legible.

Perhaps the leaker didn't realize that the digital camera he or she
used -- a Canon Rebel 300D -- left digital
fingerprints behind in every image. We downloaded a copy of the
leak and took a look at the images with an open-source tool,
one of dozens of programs capable of reading the industry-standard
EXIF digital photo
metadata format. As the press reported, the camera's serial number is
in there, along with over 100 other facts including the date and time
that the photos were taken and an assortment of photo-geek details about
focus and lighting conditions.

It may be, then, that the leaker can be traced; there are several ways
Canon might know who owns (or used to own) this camera, including a
possible warranty registration or service or repair on the camera. A retailer might also have kept relevant records when it originally sold the camera.
Another prospect: if images taken with the same camera were uploaded
to a photo-sharing site like
Flickr, their EXIF metadata might
associate use of that camera with a particular account. (Flickr and
other sites usually don't allow the public to search by EXIF tag values. But it's possible that Flickr itself, or a third-party spider that had downloaded all of its images, could perform such a search.)

Last year, we received a letter expressing surprise that many
digital cameras embed their serial
numbers (and other information)
into every photo they take. A large
number of photographers are apparently unaware of this possibility,
although it's not a secret and is described in some camera manuals
(as well as digital photography tutorials and other documentation).
It's also possible to remove (or change) the EXIF tag data using
photo-editing software. Camera manufacturers say that they add this data
for the convenience of photographers (for example, to help them keep
track of which cameras and settings they used to achieve particular
effects), not to enable spying and tracking. For example, a Kodak employee
told a concerned photographer:

Inclusion of serial number is a standard part of the EXIF data package and it's one of the ways the computer can identify one camera from another so if you own multiple units of the same model camera your computer can identify them separately. Obviously that's not something that most users would have but it does happen a lot in the business world - large insurance companies and real estate firms are a couple of examples.

(He added that users who were unhappy about EXIF data in their photographs could remove it if they chose.)
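For readers curious what removing the data involves, here is a rough Python sketch under some simplifying assumptions. A JPEG file is a sequence of marked segments, and EXIF data normally sits in an APP1 segment whose payload begins with "Exif" followed by two zero bytes. The function name `strip_exif` is ours, and the sketch ignores rarer cases (padding bytes between segments, other metadata formats such as XMP), so treat it as an illustration rather than a substitute for a proper photo-editing tool.

```python
import struct

def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte string with EXIF APP1 segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    out = bytearray(jpeg[:2])
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, so stop parsing.
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length
    out += jpeg[pos:]  # copy the image data (and anything left) unchanged
    return bytes(out)
```

Calling `strip_exif(open("photo.jpg", "rb").read())` (with a hypothetical file name) would return the same image bytes minus the EXIF block; the pixels themselves are untouched.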

Some recent camera setups can
even use GPS to include ("geocode") information about the physical geographic
location where a photo was taken -- a boon to hobbyists, tourists, and
others, but an obvious privacy risk if future photographers somehow remain
unaware that this information is being embedded into their images.
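The location data is easy to turn into a point on a map. EXIF stores each coordinate as three fractions (degrees, minutes, seconds) plus a hemisphere letter. The following Python sketch converts that form into signed decimal degrees; the function name and the sample coordinates are made up for illustration.

```python
from fractions import Fraction

def gps_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) fractions to decimal degrees.

    dms is three (numerator, denominator) pairs; ref is "N", "S", "E" or "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):   # south and west are negative by convention
        value = -value
    return float(value)

# Illustrative values only, not taken from any real photo.
latitude = gps_to_decimal(((37, 1), (46, 1), (2997, 100)), "N")
longitude = gps_to_decimal(((122, 1), (25, 1), (0, 1)), "W")
print(round(latitude, 6), round(longitude, 6))
```

A few lines like these are all it takes to go from a shared photo to a street address, which is why the privacy risk is real for anyone who doesn't know the tags are there.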

Of course, digital cameras aren't the only devices that may keep a
record that could track a document back to its creator. We've
extensively discussed how
most color laser printers
invisibly embed the printer serial number and date and
time of printing on every page, in a pattern of tiny yellow dots.
Although customers have been complaining, printer manufacturers
have so far refused to
let customers disable the tracking. (HP, for example, recently wrote
to one customer to correct its initial claim that it was unable to
disable the tracking; it now says instead that it "will not" do so.)

Most computer users are unaware that CD burners in their PCs also
contain a similar tracking mechanism that embeds a unique serial
number, called a Recorder Identification Code, on every CD they burn.
(As far as we know, this mechanism has also been extended to DVD
burners.) This requirement is enforced by
Philips via its patents on
the CD formats. The standards for the RID code are not directly
available to the public, but Philips writes:

As result of the discussion in March of 1995, between the consumer electronics manufacturers and the recording industry [...] it will be possible to trace each disc back to the exact machine on which it was made using coded information in the recording itself. [...] The RID coding system, which has been incorporated in the various Orange Books which contain the CD-R and CD-RW Standard Specifications, specifies a system which enables every CD recorder/rewriter to write its unique ID to every CD disc recorded by that CD recorder. [...] THE USE OF THE RID CODE IS MANDATORY.

So, for a start, we have digital cameras, color laser printers, color
photocopiers, CD burners, and DVD burners all invisibly embedding their
unique serial numbers -- and much more -- into every document they
produce. And more and more devices are using a serial number of some
kind as an integral part of their communications. Network cards contain
unique, usually
persistent MAC addresses
that are seen by routers and other nearby
computers (and could track a computer from wireless access point to
wireless access point, or wired network to wired network). Cell phones contain -- and transmit -- a variety
of serial numbers, such as IMEI codes, allowing an individual
subscriber to be
tracked in real time even while not making a telephone call. It seems that our devices
are increasingly acquiring unique digital fingerprints and are not
particularly shy about leaving these fingerprints behind in interactions.
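These identifiers have a simple, well-known structure. The Python sketch below is our own illustration, with hypothetical function names: it splits a MAC address into its manufacturer prefix and device-specific part, and checks the final digit of a 15-digit IMEI, which is a Luhn check digit. The sample MAC addresses are arbitrary, and the IMEI is a commonly cited example value, not a real subscriber's.

```python
def parse_mac(mac: str) -> dict:
    """Split a MAC address into manufacturer prefix and device-specific part."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    return {
        # The first three octets identify the manufacturer (the OUI).
        "oui": ":".join(f"{b:02x}" for b in octets[:3]),
        # The last three octets are assigned by the manufacturer per device.
        "device_part": ":".join(f"{b:02x}" for b in octets[3:]),
        # This bit is set when the address was overridden in software.
        "locally_administered": bool(octets[0] & 0x02),
    }

def imei_check_digit_ok(imei: str) -> bool:
    """Validate the Luhn check digit of a 15-digit IMEI."""
    if len(imei) != 15 or not imei.isdigit():
        return False
    total = 0
    for i, ch in enumerate(imei):
        digit = int(ch)
        if i % 2 == 1:        # every second digit, counting from the check digit
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

print(parse_mac("00:1A:2B:3C:4D:5E"))
print(imei_check_digit_ok("490154203237518"))
```

Because the manufacturer-assigned part of each identifier normally stays the same for the life of the device, anyone who sees it twice can link the two sightings.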

And here we're talking only about digital fingerprints that were intentionally
built into devices -- whether in the name of improved functionality
or in response to secret requests from the government or the recording
industry -- and not about tracking mechanisms that merely result by accident
from the physical construction of devices (like
pattern noise
in digital cameras and scanners). Clearly, the digital
fingerprints devices leave behind are proliferating much faster than users
are being made aware of them.

