Electronic Frontier Foundation
EFF is the leading civil liberties group defending
your rights in the digital world. » About Us
  • Home
  • About
  • Our Work
  • Deeplinks Blog
  • Press Room
  • Take Action
  • Shop
Home » Deeplinks Blog » September, 2009
  • Deeplinks Archives
    • October, 2011
    • September, 2011
    • August, 2011
    • July, 2011
    • June, 2011
    • May, 2011
    • April, 2011
    • March, 2011
    • More Archives
  • Blog Categories
    • Accessibility for the Reading Disabled
    • Analog Hole
    • Announcement
    • Anonymity
    • Anti-Counterfeiting Trade Agreement
    • Bloggers' Rights
    • Broadcast Flag
    • Broadcasting Treaty
    • CALEA
    • Call To Action
    • CDA 230
    • Cell Tracking
    • Coders' Rights Project
    • Commentary
    • Copyright Trolls
    • Council of Europe
    • Criminalizing Content
    • Development Agenda
    • Digital Books
    • Digital Radio
    • Digital Rights Management
    • Digital Video
    • DMCA
    • DMCA Rulemaking
    • Do Not Track
    • E-Voting Rights
    • EFF Europe
    • EFF15
    • File Sharing
    • FOIA Litigation for Accountable Government
    • Free Speech
    • Free Trade Agreement of the Americas
    • Hollywood v. DVD
    • Innovation
    • Intellectual Property
    • International
    • International Privacy Standards
    • Legal Analysis
    • Legislative Analysis
    • Locational Privacy
    • Mandatory Data Retention
    • miniLinks
    • National Security Letters
    • Net Neutrality
    • News Roundup
    • News Update
    • No Downtime for Free Speech Campaign
    • NSA Spying
    • Online Behavioral Tracking
    • Patents
    • PATRIOT Act
    • Pen Trap
    • Printers
    • Privacy
    • Real ID
    • RFID
    • Search Engines
    • Search Incident to Arrest
    • Security
    • Social Networks
    • Technical Analysis
    • Terms Of (Ab)Use
    • Test Your ISP
    • The COICA Internet Censorship and Copyright Bill
    • The Global Network Initiative
    • Transparency
    • Travel Screening
    • Trusted Computing
    • Video Games
    • Wikileaks
    • WIPO
September 21st, 2009
Email This Digg This Post this to Reddit Share this blog post with delicious Share this on Facebook Tweet this blog post Dent this blog post

How Online Tracking Companies Know Most of What You Do Online (and What Social Networks Are Doing to Help Them)

Technical Analysis by Peter Eckersley
This post is Part 2 of a series on user tracking on the web today. You can read Part 1 here and Part 3 here.

3rd party advertising and tracking firms are ubiquitous on the modern web. When you visit a webpage, there's a good chance that it contains tiny images or invisible JavaScript that exists for the sole purpose of tracking and recording your browsing habits. This sort of tracking is performed by many dozens of different firms. In this post, we're going to look at how this tracking occurs, and how it is being combined with data from accounts on social networking sites to build extensive, identified profiles of your online activity.

How 3rd parties get to see what you do on the web.

Let's start with an example of 3rd party tracking: when we went to CareerBuilder.com, which is the largest online jobs site in the United States, and searched for a job, CareerBuilder included JavaScript code from 10 (!) different tracking domains: Rubicon Project, AdSonar, Advertising.com, Tacoda.net (all three are divisions of AOL advertising), Quantcast, Pulse 360, Undertone, AdBureau (part of Microsoft Advertising), Traffic Marketplace, and DoubleClick (which is owned by Google). On other visits we've also seen CareerBuilder include tracking scripts and non-JavaScript web bugs from several other domains. There are pretty sound reasons to hope that when you search for a job online, that fact isn't broadcast to dozens of companies you've never heard of — but that's precisely what's happening here.

Ten 3rd party tracking sites' content is included in CareerBuilder search results
(in this screenshot, NoScript is being used to identify the third parties whose code is embedded in the page)

Each of these tracking companies can track you over multiple different websites, effectively following you as you browse the web. They use either cookies, or hard-to-delete "super cookies", or other means, to link their records of each new page they see you visit to their records of all the pages you've visited in the previous minutes, months and years. The widespread presence of 3rd party web bugs and tracking scripts on a large proportion of the sites on the Web means that these companies can build up a long term profile of most of the things we do with our web browsers.

They can track us, but do they know who we are?

Given how much tracking firms know about our browsing history, it's worth asking whether these companies also know who we are. The answer, unfortunately, appears to be "yes", at least for those of us who use social networking sites.

A recent research paper by Balachander Krishnamurthy and Craig Wills shows that social networking sites like Facebook, LinkedIn and MySpace are giving the hungry cloud of tracking companies an easy way to add your name, lists of friends, and other profile information to the records they already keep on you.

The main theme of the paper is that when you log in to a social networking site, the social network includes advertising and tracking code in such a way that the 3rd party can see which account on the social network is yours. They can then just go to your profile page, record its contents, and add them to their file. Of the 12 social networks surveyed in the paper, only one (Orkut) didn't leak any personally identifying information to 3rd parties.

There are some interesting technical details in how the social networking sites leak this data. In some cases, the leakage may be unintentional, but in others, there is clever and surreptitious anti-privacy engineering at work.

Paths for Data Leakage from Social Networks to 3rd party Tracking Firms

The most obvious way that a 3rd party tracker might learn which account on a social networking site is yours is via the HTTP Referrer header. A typical URL on a social networking site includes a username or user ID number, and any 3rd party will be able to see that.1

A second and slightly more revealing method that some social networks use to leak personal information is through URL/URI parameters for the 3rd party content. Here's an anonymized example from the paper:

GET /track/?...&fb_sig_time=1236041837.3573&
     fb_sig_user=123456789&...
Host: adtracker.socialmedia.com
Referer: http://apps.facebook.com/kick_ass/...
(In this request, a Facebook app is sending the user's facebook user ID and signin time to to adtracker.socialmedia.com)

The third and most surprising method for leaking personal information is to alias 3rd party tracking servers into the host site's domain name in such a way that the 3rd party can see the host site's cookies, in violation of the same origin policy. Here's an example from the paper:

GET /st?ad_type=iframe&age=29&gender=M&e=&zip=11301&...
Host: ad.hi5.com
Referer: http://www.hi5.com/friend/profile/displaySameProfile.do?userid=123456789
Cookie: LoginInfo=M_AD_MI_MS|US_0_11301; Userid=123456789;Email=jdoe@email.com;
(ad.hi5.com is actually ad.yieldmanager.com, and it's receiving different bits of personal information via referrer, URI parameters, and the hi5.com cookie which the same origin policy wouldn't have allowed it to have — so it's an example of all three leakage methods at once)

What can I do to protect myself?

Unfortunately, there is no easy way to use modern, cookie- and JavaScript-dependent websites and social networking sites and avoid tracking at the same time. In order to be substantially protected against these tracking mechanisms, you'd need to do the following:

  1. Pick a good cookie policy for your browser, like "only keep cookies until I close my browser", or manual approval of all cookies.
  2. Disable Flash Cookies and all the other kinds of "super cookies". You can test for these here.
  3. Use the Firefox extensions RequestPolicy and NoScript to control when 3rd party sites can include content in your pages or run code in your browser, respectively. These tools are very effective, but be aware that they're hard to use: lots of sites that depend on JavaScript will need to be whitelisted before they work correctly.
  4. Use the Targeted Advertising Cookie Opt-Out plugin. This will automatically opt you out of any 3rd party trackers who have an opt out somewhere that requires you to accept a cookie. Be aware that not all 3rd parties will offer opt outs, or that some of them may interpret "opt out" to mean "do not show me targeted ads", rather than "do not track my behavior online".
  5. As always, it doesn't hurt to use Tor via TorButton to hide your IP address and other browser characteristics when you want maximal browser privacy.

Unfortunately, many of the steps above are quite difficult to follow, and we're fearful that the vast majority of Internet users will continue to be tracked by dozens of companies — companies they've never heard of, companies they have no relationship with, companies they would never choose to trust with their most private thoughts and reading habits.

It isn't going to be easy to fix this mess. On the technical side, all of this tracking follows from the design of the Web as an interactive hypertext system, combined with the fact that so many websites are willing to assist advertisers in tracking their visitors. Browsers could be altered to make them harder to track, but great care and clever design will be required to achieve that without undermining the virtues of interactive hypertext in the first place. It's not clear that anyone has found the right way to do that yet.

On the legal side, it's clear that the current U.S. privacy regime isn't working: behavioral tracking companies can put whatever they want in the fine print of their privacy policies, and few of the visitors to CareerBuilder or any other website will ever realize that the trackers are there, let alone read their policies. It's time we found legal rules to ensure that people actually know when their privacy is part of the price they pay to visit a site.

  1. 1. One subtlety here is that sometimes the 3rd party won't be able to tell whether a profile is yours or belongs to someone else. But there are several ways around that: they can look for URLs associated with profile editing or other activites that your friends can't do with to your profile; they can see which profile you visit first when you log in to the site, and they can see which profile you visit most often over time.

Related Issues: Online Behavioral Tracking, Privacy

[Permalink]

Donate to EFF
Make a One-Time Contribution
Join EFF
Become a Member

Subscribe to EFFector

EFF's Newsletter and Action Alerts

Headlines

  • Victory for Reader Privacy
  • The Humble Frozen Synapse Bundle
  • Don't Let Privacy Law Get Stuck in 1986
  • Open Source Security
  • 2011 Pioneer Awards!

Projects

  • Bloggers' Rights
  • Coders' Rights
  • FOIA Project
  • Follow EFF
  • Patent Busting
  • Surveillance Self-Defense
  • Teaching Copyright
  • Takedown Hall of Shame
  • TOSBack
  • Ways To Help
Want to learn how you can defend free speech, stand up for privacy, fight for government transparency, support consumer rights, and protect your right to innovation in the digital world? Visit http://eff.org/fight to find ways to help.
Creative Commons Licensed
  • Thanks
  • RSS Feeds
  • Copyright Policy
  • Privacy Policy
  • Contact EFF