Governments Shouldn’t Use “Centralized” Proximity Tracking Technology

Companies and governments across the world are building and deploying a dizzying number of systems and apps to fight COVID-19. Many groups have converged on using Bluetooth-assisted proximity tracking for the purpose of exposure notification. Even so, there are many ways to approach the problem, and dozens of proposals have emerged.

One way to categorize them is based on how much trust each proposal places in a central authority. In more “centralized” models, a single entity—like a health organization, a government, or a company—is given special responsibility for handling and distributing user information. This entity has privileged access to information that regular users and their devices do not. In “decentralized” models, on the other hand, the system doesn’t depend on a central authority with special access. A decentralized app may share data with a server, but that data is made available for everyone to see—not just whoever runs the server.

Both centralized and decentralized models can claim to make a slew of privacy guarantees. But centralized models all rest on a dangerous assumption: that a “trusted” authority will have access to vast amounts of sensitive data and choose not to misuse it. As we’ve seen, time and again, that kind of trust doesn’t often survive a collision with reality. Carefully constructed decentralized models are much less likely to harm civil liberties. This post will go into more detail about the distinctions between these two kinds of proposals, and weigh the benefits and pitfalls of each.

Centralized Models

There are many different proximity tracking proposals that can be considered “centralized,” but generally, it means a single “trusted” authority knows things that regular users don’t. Centralized proximity tracking proposals are favored by many governments and public health authorities. A central server usually stores private information on behalf of users, and makes decisions about who may have been exposed to infection. The central server can usually learn which devices have been in contact with the devices of infected people, and may be able to tie those devices to real-world identities.

For example, a European group called PEPP-PT has released a proposal called NTK. In NTK, a central server generates a private key for each device, but keeps the keys to itself. This private key is used to generate a set of ephemeral IDs for each user. Users get their ephemeral IDs from the server, then exchange them with other users. When someone tests positive for COVID-19, they upload the set of ephemeral IDs from other people they’ve been in contact with (plus a good deal of metadata). The authority links those IDs to the private keys of other people in its database, then decides whether to reach out to those users directly. The system is engineered to prevent users from linking ephemeral IDs to particular people, while allowing the central server to do exactly that.
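
To make that asymmetry concrete, here is a minimal Python sketch of a scheme in which the server holds every key. It is not the NTK specification: the HMAC-SHA256 derivation, the 16-byte ID length, and the names CentralServer, register_device, and link_contacts are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


class CentralServer:
    """Hypothetical central authority that holds every device's key."""

    def __init__(self):
        self._device_keys = {}   # device name -> secret key, never leaves the server
        self._id_owner = {}      # ephemeral ID -> device name (the linking table)

    def register_device(self, device, epochs=4):
        """Create a key for the device and hand back only its ephemeral IDs."""
        key = secrets.token_bytes(32)
        self._device_keys[device] = key
        ids = []
        for epoch in range(epochs):
            eph = hmac.new(key, epoch.to_bytes(4, "big"), hashlib.sha256).digest()[:16]
            self._id_owner[eph] = device
            ids.append(eph)
        return ids

    def link_contacts(self, uploaded_ids):
        """Map IDs uploaded by an infected user back to the devices behind them."""
        return {self._id_owner[e] for e in uploaded_ids if e in self._id_owner}


server = CentralServer()
alice_ids = server.register_device("alice")
bob_ids = server.register_device("bob")
carol_ids = server.register_device("carol")

# Alice's phone heard two of Bob's IDs and none of Carol's.
alice_heard = [bob_ids[0], bob_ids[2]]

# Alice tests positive and uploads what she heard; only the server can resolve it.
exposed = server.link_contacts(alice_heard)
print(exposed)  # {'bob'}
```

To Alice, the IDs she heard are opaque bytes. To the server, they are rows in a table that names Bob.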

Some proposals, like Inria’s ROBERT, go to a lot of trouble to be pseudonymous—that is, to keep users’ real identities out of the central database. This is laudable, but not sufficient, since pseudonymous IDs can often be tied back to real people with a little bit of effort. Many other centralized proposals, including NTK, don’t bother. Singapore’s TraceTogether and Australia’s COVIDSafe apps even require users to share their phone numbers with the government so that health authorities can call or text them directly. Centralized solutions may collect more than just contact data, too: some proposals have users upload the time and location of their contacts as well.

Decentralized Models

In a “decentralized” proximity tracking system, the role of a central authority is minimized. Again, there are a lot of different proposals under the “decentralized” umbrella. In general, decentralized models don’t trust any central actor with information that the rest of the world can’t also see. There are still privacy risks in decentralized systems, but in a well-designed proposal, those risks are greatly reduced.

EFF recommends the following characteristics in decentralized proximity tracking efforts:

The goal should be exposure notification. That is, an automated alert to the user that they may have been infected by proximity to a person with the virus, accompanied by advice to that user about how to obtain health services. The goal should not be automated delivery to the government or anyone else of information about the health or person-to-person contacts of individual people.
A user’s ephemeral IDs should be generated and stored on their own device. The ephemeral IDs can be shared with devices the user comes into contact with, but nobody should have a database mapping sets of IDs to particular people.
When a user learns they are infected, as confirmed by a physician or health authority, it should be the user’s absolute prerogative to decide whether or not to provide any information to the system’s shared server.
When a user reports ill, the system should transmit from the user’s device to the system’s shared server the minimum amount of data necessary for other users to learn their exposure risk. For example, they may share either the set of ephemeral IDs they broadcast, or the set of IDs they came into contact with, but not both.
No single entity should know the identities of the people who have been potentially exposed by proximity to an infected person. This means that the shared server should not be able to “push” warnings to at-risk users; rather, users’ apps must “pull” data from the shared server without revealing their own status, and use it to determine whether to notify their user of risk. For example, in a system where ill users report their own ephemeral IDs to a shared server, other users’ apps should regularly pull from the shared server a complete set of the ephemeral IDs of ill users, and then compare that set to the ephemeral IDs already stored on the app because of proximity to other users.
Ephemeral IDs should not be linkable to real people or to each other. Anyone who gathers lots of ephemeral IDs should not be able to tell whether they come from the same person.
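
The characteristics above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than any real app’s design: the Phone class, the meet helper, and the use of plain random 16-byte IDs are all hypothetical, and the “shared server” is just a list.

```python
import secrets


class Phone:
    """Hypothetical app state; everything here lives on the user's device."""

    def __init__(self):
        self.broadcast_ids = []   # IDs this phone generated and sent out
        self.heard_ids = set()    # IDs this phone received from nearby phones

    def new_ephemeral_id(self):
        # Fresh random bytes each time, so two IDs cannot be linked to each other.
        eph = secrets.token_bytes(16)
        self.broadcast_ids.append(eph)
        return eph

    def hear(self, eph):
        self.heard_ids.add(eph)

    def check_exposure(self, published_ids):
        # The comparison runs locally; the server never learns the result.
        return bool(self.heard_ids & set(published_ids))


def meet(a, b):
    """Two phones in proximity exchange one fresh ID each."""
    b.hear(a.new_ephemeral_id())
    a.hear(b.new_ephemeral_id())


alice, bob, carol = Phone(), Phone(), Phone()
meet(alice, bob)      # Alice and Bob were near each other
meet(bob, carol)      # Bob and Carol too, but Alice and Carol never met

# Alice tests positive and chooses to publish only the IDs she broadcast.
shared_server = list(alice.broadcast_ids)

# Every phone pulls the same complete list and checks it on the device.
print(bob.check_exposure(shared_server))    # True
print(carol.check_exposure(shared_server))  # False
```

Nothing in the sketch maps an ID to a person, and the server hands the same list to everyone who asks, so it cannot tell which phones found a match.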

Decentralized models don’t have to be completely decentralized. For example, public data about which ephemeral IDs correspond to devices that have reported ill may be hosted in a central database, as long as that database is accessible to everyone. No blockchains need to be involved. Furthermore, most models require users to get authorization from a physician or health authority before reporting that they have COVID-19. This kind of “centralization” is necessary to prevent trolls from flooding the system with fake positive reports.
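
One way such an authorization step could work is a one-time code handed to the patient along with a confirmed diagnosis. The sketch below is only an assumption about how that might look; ReportGate, issue_code, and submit_report are hypothetical names, and real deployments differ in how codes are issued and verified.

```python
import secrets


class ReportGate:
    """Hypothetical check that a positive report was authorized by a clinician."""

    def __init__(self):
        self._unused_codes = set()
        self.published_ids = []

    def issue_code(self):
        # Called by the health authority after a confirmed diagnosis.
        code = secrets.token_hex(8)
        self._unused_codes.add(code)
        return code

    def submit_report(self, code, ephemeral_ids):
        # Reject reports without a valid, unused code; each code works once.
        if code not in self._unused_codes:
            return False
        self._unused_codes.remove(code)
        self.published_ids.extend(ephemeral_ids)
        return True


gate = ReportGate()
code = gate.issue_code()
accepted = gate.submit_report(code, ["id-1", "id-2"])
replayed = gate.submit_report(code, ["id-3"])
forged = gate.submit_report("made-up-code", ["id-4"])
print(accepted, replayed, forged)  # True False False
```

The code proves that some diagnosis was confirmed. It does not need to say whose, so this step need not add a mapping from IDs to people.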

Apple and Google’s exposure notification API is an example of a (mostly) decentralized system. Keys are generated on individual devices, and nearby phones exchange ephemeral IDs. When a user tests positive, they can upload their private keys—now called “diagnosis keys”—to a publicly accessible database. It doesn’t matter if the database is hosted by a health authority or on a peer-to-peer network; as long as everyone can access it, the contact tracing system functions effectively.
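
The idea of publishing a key instead of a long list of IDs can be sketched as follows. The real API defines its own key schedule; the HMAC-SHA256 derivation, the 144 intervals per day, and the function names below are stand-ins chosen for illustration, not the API’s actual construction.

```python
import hashlib
import hmac
import secrets

INTERVALS_PER_DAY = 144  # assumed: one ID per 10-minute interval


def rolling_id(daily_key, interval):
    """Derive the ephemeral ID for one interval from that day's key."""
    msg = b"rolling-id" + interval.to_bytes(2, "big")
    return hmac.new(daily_key, msg, hashlib.sha256).digest()[:16]


def ids_for_key(daily_key):
    """Re-derive every ID a phone broadcast on the day covered by this key."""
    return {rolling_id(daily_key, i) for i in range(INTERVALS_PER_DAY)}


# Each phone generates its daily key locally.
alice_key = secrets.token_bytes(16)
dave_key = secrets.token_bytes(16)

# Bob's phone overheard Alice during intervals 50 and 51.
bob_heard = {rolling_id(alice_key, 50), rolling_id(alice_key, 51)}

# Alice tests positive and publishes her daily key as a diagnosis key.
diagnosis_keys = [alice_key]

# Bob downloads the keys, re-derives every ID, and matches on his own phone.
matches = set()
for key in diagnosis_keys:
    matches |= bob_heard & ids_for_key(key)
print(len(matches))  # 2
```

Until a key is published, the IDs derived from it look unrelated to one another. Once it is published, anyone can re-derive them, which is what lets matching happen on each phone instead of on a server.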

What Are the Trade-Offs?

There are benefits and risks associated with both models. However, for the most part, centralized models benefit governments, and the risks fall on users.

Centralized models make more data available to whoever sets themselves up as the controlling authority, and they could potentially use that data for far more than contact tracing. The authority has access to detailed logs of everyone that infected people came into contact with, and it can easily use those logs to construct detailed social graphs that reveal how people interact with one another. This is appealing to some health authorities, who would like to use the data gathered by these tools to do epidemiological research or measure the impact of interventions. But personal data collected for one purpose should not be used for another (no matter how righteous) without the specific consent of the data subjects. Some decentralized proposals, like DP-3T, include ways for users to opt in to sharing certain kinds of data for epidemiological studies. The data shared in that way can be de-identified and aggregated to minimize risk.

More important, the data collected by proximity tracking apps isn’t just about COVID—it’s really about human interactions. A database that tracks who interacts with whom could be extremely valuable to law enforcement and intelligence agencies. Governments might use it to track who interacts with dissidents, and employers might use it to track who interacts with union organizers. It would also make an attractive target for plain old hackers. And history has shown that, unfortunately, governments don’t tend to be the best stewards of personal data.

Centralization means that the authority can use contact data to reach out to exposed people directly. Proponents argue that notifications from public health authorities will be more effective than exposure notification from apps to users. But that claim is speculative. Indeed, more people may be willing to opt in to a decentralized proximity tracking system than a centralized one. Moreover, the privacy intrusion of a centralized system is too great.

Even in an ideal, decentralized model, there is an unavoidable risk of infection unmasking: when someone reports that they are sick, everyone they’ve been in contact with (and anyone with enough Bluetooth beacons) can, in theory, learn that they are sick. This is because lists of infected ephemeral IDs are shared publicly. Anyone with a Bluetooth device can record the time and place they saw a particular ephemeral ID, and when that ID is marked as infected, they learn when and where they saw the ID. In some cases this may be enough information to determine who it belonged to.
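
The unmasking risk amounts to a simple lookup. In this sketch the observer’s log, the timestamps, and the place names are all invented for illustration.

```python
import secrets

# Hypothetical observer log: ephemeral ID -> (time, place) where it was seen.
alice_id = secrets.token_bytes(16)
bob_id = secrets.token_bytes(16)
observer_log = {
    alice_id: ("2020-04-20 09:15", "apartment lobby"),
    bob_id: ("2020-04-20 13:40", "grocery store"),
}

# Later, the public list of infected IDs includes the one Alice broadcast.
published_infected_ids = {alice_id}

# Joining the public list against the private log reveals when and where.
sightings = [observer_log[e] for e in published_infected_ids if e in observer_log]
print(sightings)  # [('2020-04-20 09:15', 'apartment lobby')]
```

If only one person was in the lobby at 9:15, the observer now knows who reported sick, even though no name was ever published.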

Some centralized models, like ROBERT, claim to eliminate this risk. In ROBERT’s model, users upload the list of IDs they have encountered to the central authority. If a user has been in contact with an infected person, the authority will tell them, “You have been potentially exposed,” but not when or where. This is similar to the way traditional contact tracing works, where health authorities interview infected people and then reach out directly to those they’ve been in contact with. In truth, ROBERT’s model makes it less convenient to learn who’s infected, but not impossible.

Automatic systems are easy to game. If a bad actor only turns on Bluetooth when they’re near a particular person, they’ll be able to learn whether their target is infected. If they have multiple devices, they can target multiple people. Actors with more technical resources could more effectively exploit the system. It’s impossible to solve the problem of infection unmasking completely—and users need to understand that before they choose to share their status with any proximity app. Meanwhile, it’s easy to avoid the privacy risks involved with granting a central authority privileged access to our data.
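
Here is a small sketch of why a yes-or-no answer does not stop a targeted attacker. The ExposureAuthority class and the string IDs are hypothetical, and the model is far simpler than ROBERT itself; it only captures the idea that the authority reveals whether a user was exposed, not when or where.

```python
class ExposureAuthority:
    """Hypothetical central service that answers only 'exposed' or 'not exposed'."""

    def __init__(self, infected_ids):
        self._infected_ids = set(infected_ids)

    def query(self, encountered_ids):
        # No time or place is returned, just a single yes/no answer.
        return any(e in self._infected_ids for e in encountered_ids)


authority = ExposureAuthority(infected_ids={"id-target"})

# An ordinary user has many contacts, so "yes" does not point to one person.
ordinary_contacts = ["id-a", "id-b", "id-target", "id-c"]

# The attacker's device recorded only while standing next to the target.
attacker_contacts = ["id-target"]

print(authority.query(ordinary_contacts))  # True, but ambiguous
print(authority.query(attacker_contacts))  # True, and it can only mean the target
```

Hiding the time and place protects users only as long as each device has many contacts. A device that recorded a single contact turns the answer into a statement about one person.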

Conclusion

EFF remains wary of proximity tracking apps. It is unclear how much they will help; at best, they will supplement tried-and-tested disease-fighting techniques like widespread testing and manual contact tracing. We should not pin our hopes on a techno-solution. And even with the best-designed apps, there is always a risk that personal information about who we’ve been in contact with as we go about our days will be misused.

One point is clear: governments and health authorities should not turn to centralized models for automatic exposure notification. Centralized systems are unlikely to be more effective than decentralized alternatives. They will create massive new databases of human behavior that are going to be difficult to secure, and more difficult to destroy once this crisis is over.
