Special thanks to Yael Grauer for additional writing and research.
In June 2020, Santa Cruz, California became the first city in the United States to ban municipal use of predictive policing, a method of deploying law enforcement resources according to data-driven analytics that supposedly are able to predict perpetrators, victims, or locations of future crimes. Especially interesting is that Santa Cruz was one of the first cities in the country to experiment with the technology when it piloted, and then adopted, a predictive policing program in 2011. That program used historic and current crime data to break down some areas of the city into 500 foot by 500 foot blocks in order to pinpoint locations that were likely to be the scene of future crimes. However, after nine years, the city council voted unanimously to ban it over fears of how it perpetuated racial inequality.
Predictive policing is a self-fulfilling prophecy. If police focus their efforts in one neighborhood and arrest dozens of people there during the span of a week, the data will reflect that area as a hotbed of criminal activity. The system also considers only reported crime, which means that neighborhoods and communities where the police are called more often might see a higher likelihood of having predictive policing technology concentrate resources there. This system is tailor-made to further victimize communities that are already overpoliced—namely, communities of color, unhoused individuals, and immigrants—by using the cloak of scientific legitimacy and the supposed unbiased nature of data.
Santa Cruz’s experiment, and eventual banning of the technology is a lesson to the rest of the country: technology is not a substitute for community engagement and holistic crime reduction measures. The more police departments rely on technology to dictate where to focus efforts and who to be suspicious of, the more harm those departments will cause to vulnerable communities. That’s why police departments should be banned from using supposedly data-informed algorithms to inform which communities, and even which people, should receive the lion’s share of policing and criminalization.
What Is Predictive Policing?
The Santa Cruz ordinance banning predictive policing defines the technology as “means software that is used to predict information or trends about crime or criminality in the past or future, including but not limited to the characteristics or profile of any person(s) likely to commit a crime, the identity of any person(s) likely to commit crime, the locations or frequency of crime, or the person(s) impacted by predicted crime.”
Predictive policing analyzes a massive amount of information from historical crimes including the time of day, season of the year, weather patterns, types of victims, and types of location in order to infer when and in which locations crime is likely to occur. For instance, if a number of crimes have been committed in alleyways on Thursdays, the algorithm might tell a department they should dispatch officers to alleyways every Thursday. Of course, then this means that police are predisposed to be suspicious of everyone who happens to be in that area at that time.
The technology attempts to function similarly while conducting the less prevalent “person-based” predictive policing. This takes the form of opaque rating systems that assign people a risk value based on a number of data streams including age, suspected gang affiliation, and the number of times a person has been a victim as well as an alleged perpetrator of a crime. The accumulated total of this data could result in someone being placed on a “hot list”, as happened to over 1,000 people in Chicago who were placed on one such “Strategic Subject List.” As when specific locations are targeted, this technology cannot actually predict crime—and in an attempt to do so, it may expose people to targeted police harassment or surveillance without any actual proof that a crime will be committed.
There is a reason why the use of predictive policing continues to expand despite its dubious foundations: it makes money. Many companies have developed tools for data-driven policing; some of the biggest are PredPol, HunchLab, CivicScape, and Palantir. Academic institutions have also developed predictive policing technologies, such as Rutgers University’s RTM Diagnostics or Carnegie Mellon University’s CrimeScan, which is used in Pittsburgh. Some departments have built such tools with private companies and academic institutions. For example, in 2010, the Memphis Police Department built its own tool, in partnership with the University of Memphis Department of Criminology and Criminal Justice, using IBM SPSS predictive analytics.
As of summer 2020, the technology is used in dozens of cities across the United States.
What Problems Does it Pose?
One of the biggest flaws of predictive policing is the faulty data fed into the system. These algorithms depend on data informing them of where criminal activity has happened to predict where future criminal activity will take place. However, not all crime is recorded—some communities are more likely to report crime than others, some crimes are less likely to be reported than other crimes, and officers have discretion in deciding whether or not to make an arrest. Predictive policing only accounts for crimes that are reported, and concentrates policing resources in those communities, which then makes it more likely that police may uncover other crimes. This all creates a feedback loop that makes predictive policing a self-fulfilling prophecy. As professor Suresh Venkatasubramanian put it:
If you build predictive policing, you are essentially sending police to certain neighborhoods based on what they told you—but that also means you’re not sending police to other neighborhoods because the system didn’t tell you to go there. If you assume that the data collection for your system is generated by police whom you sent to certain neighborhoods, then essentially your model is controlling the next round of data you get.
Police are already policing minority neighborhoods and arresting people for things that may have gone unnoticed or unreported in less heavily patrolled neighborhoods. When this already skewed data is entered into a predictive algorithm, it will deploy more officers to the communities that are already overpoliced.
A recent deep dive into the predictive program used by the Pasco County Sheriff's office illustrates the harms that getting stuck in an algorithmic loop can have on people. After one 15-year-old was arrested for stealing bicycles out of a garage, the algorithm continuously dispatched police to harass him and his family. Over the span of five months, police went to his home 21 times. They showed up at his gym and his parent’s place of work. The Tampa Bay Times revealed that since 2015, the sheriff's office has made more than 12,500 similar preemptive visits on people.
These visits often resulted in other, unrelated arrests that further victimized families and added to the likelihood that they would be visited and harassed again. In one incident, the mother of a targeted teenager was issued a $2,500 fine when police sent to check in on her child saw chickens in the backyard. In another incident, a father was arrested when police looked through the window of the house and saw a 17-year-old smoking a cigarette. These are the kinds of usually unreported crimes that occur in all neighborhoods, across all economic strata—but which only those marginalized people who live under near constant policing are penalized for.
As experts have pointed out, these algorithms often draw from flawed and non-transparent sources such as gang databases, which have been the subject of public scrutiny due to their lack of transparency and overinclusion of Black and Latinx people. In Los Angeles, for instance, if police notice a person wearing a sports jersey or having a brief conversation with someone on the street, it may be enough to include that person in the LAPD’s gang database. Being included in a gang database often means being exposed to more police harassment and surveillance, and also can lead to consequences once in the legal system, such as harsher sentences. Inclusion in a gang database can impact whether a predictive algorithm identifies a person as being a potential threat to society or artificially projects a specific crime as gang-related. In July 2020, the California Attorney General barred police in the state from accessing any of LAPD’s entries into the California gang database after LAPD officers were caught falsifying data. Unaccountable and overly broad gang databases are the type of flawed data flowing from police departments into predictive algorithms, and exactly why predictive policing cannot be trusted.
To test racial disparities in predictive policing, Human Rights Data Analysis Group (HRDAG) looked at Oakland Police Department’s recorded drug crimes. It used a big data policing algorithm to determine where it would suggest that police look for future drug crimes. Sure enough, HRDAG found that the data-driven model would have focused almost exclusively on low-income communities of color. But public health data on drug users combined with U.S. Census data show that the distribution of drug users does not correlate with the program’s predictions, demonstrating that the algorithm’s predictions were rooted in bias rather than reality.
All of this is why a group of academic mathematicians recently declared a boycott against helping police create predictive policing tools. They argued that their credentials and expertise create a convenient way to smuggle racist ideas about who will commit a crime based on where they live and who they know, into the mainstream through scientific legitimacy. “It is simply too easy,” they write, “to create a 'scientific' veneer for racism.”
In addition, there is a disturbing lack of transparency surrounding many predictive policing tools. In many cases, it’s unclear how the algorithms are designed, what data is being used, and sometimes even what the system claims to predict. Vendors have sought non-disclosure clauses or otherwise shrouded their products in secrecy, citing trade secrets or business confidentiality. When data-driven policing tools are black boxes, it’s difficult to assess the risks of error rates, false positives, limits in programming capabilities, biased data, or even flaws in source code that affect search results.
For local departments, the prohibitive cost of using these predictive technologies can also be a detriment to the maintenance of civil society. In Los Angeles, the LAPD paid $20 million over the course of nine years to use Palantir’s predictive technology alone. That’s only one of many tools used by the LAPD in an attempt to predict the future.
Finally, predictive policing raises constitutional concerns. Simply living or spending time in a neighborhood or with certain people may draw suspicion from police or cause them to treat people as potential perpetrators. As legal scholar Andrew Guthrie Furgeson has written, there is tension between predictive policing and legal requirements that police possess reasonable suspicion to make a stop. Moreover, predictive policing systems sometimes utilize information from social media to assess whether a person might be likely to engage in crime, which also raises free speech issues.
Technology cannot predict crime, it can only weaponize a person’s proximity to police action. An individual should not have their presumption of innocence eroded because a casual acquaintance, family member, or neighbor commits a crime. This just opens up members of already vulnerable populations to more police harassment, erodes trust between public safety measures and the community, and ultimately creates more danger. This has already happened in Chicago, where the police surveil and monitor the social media of victims of crimes—because being a victim of a crime is one of the many factors Chicago’s predictive algorithm uses to determine if a person is at high risk of committing a crime themselves.
What Can Be Done About It?
As the Santa Cruz ban suggests, cities are beginning to wise up to the dangers of predictive policing. As with the growing movement to ban government use of face recognition and other biometric surveillance, we should also seek bans on predictive policing. Across the country, from San Francisco to Boston, almost a dozen cities have banned police use of face recognition after recognizing its disproportionate impact on people of color, its tendency to falsely accuse people of crimes, its erosion of our presumption of innocence, and its ability to track our movements.
Before predictive policing becomes even more widespread, cities should now take advantage of the opportunity to protect the well-being of their residents by passing ordinances that ban the use of this technology or prevent departments from acquiring it in the first place. If your town has legislation like a Community Control Over Police Surveillance (CCOPS) ordinance, which requires elected officials to approve police purchase and use of surveillance equipment, the acquisition of predictive policing can be blocked while attempts to ban the technology are made.
The lessons from the novella and film Minority Report still apply, even in the age of big data: people are innocent until proven guilty. People should not be subject to harassment and surveillance because of their proximity to crime. For-profit software companies with secretive proprietary algorithms should not be creating black box crystal balls exempt from public scrutiny and used without constraint by law enforcement. It’s not too late to put the genie of predictive policing back in the bottle, and that is exactly what we should be urging local, state, and federal leaders to do.