Computer scientists often build algorithms with a keen focus on “solving the problem,” without considering the larger implications and potential misuses of the technology they’re creating. That’s how we wind up with machine learning that prevents qualified job applicants from advancing, or blocks mortgage applicants from buying homes, or creates miscarriages of justice in parole and other aspects of the criminal justice system.

James Mickens—a lifelong hacker, perennial wisecracker, and would-be philosopher-king who also happens to be a Harvard University professor of computer science—says we must educate computer scientists to consider the bigger picture early in their creative process. In a world where much of what we do each day involves computers of one sort or another, the process of creating technology must take into account the society it’s meant to serve, including the most vulnerable.

Mickens speaks with EFF's Cindy Cohn and Danny O’Brien about some of the problems inherent in educating computer scientists, and how fixing those problems might help us fix the internet.


This episode is also available on the Internet Archive and on YouTube.

In this episode you’ll learn about:

  • Why it’s important to include non-engineering voices, from historians and sociologists to people from marginalized communities, in the engineering process
  • The need to balance paying down our “tech debt” (cleaning up the messy, haphazard systems of yesteryear) with innovating new technologies
  • How to embed ethics education within computer engineering curricula so students can identify and overcome challenges before they’re encoded into new systems
  • Fostering transparency about how and by whom your data is used, and for whose profit
  • What we can learn from Søren Kierkegaard and Stan Lee about personal responsibility in technology

Music:

Music for How to Fix the Internet was created for us by Reed Mathis and Nat Keefe of BeatMower.

This podcast is licensed Creative Commons Attribution 4.0 International, and includes the following music licensed Creative Commons Attribution 3.0 Unported by their creators: 

Resources:

Machine Learning Ethics:

Algorithmic Bias in Policing, Healthcare, and More:

Adversarial Interoperability and Data Fiduciaries:


Transcript: 

James: One of the fun things about being a computer scientist, as opposed to, let's say, a roboticist, someone who actually builds physical things, is that I'm never going to get my eye poked out because my algorithm went wrong. I'm never going to lose an arm or just be ruined physically because my algorithm didn't work, at least on paper. Right? And so I think computer science does tend to draw people who like some of these very stark sort of contrasts, like either my algorithm worked or it didn't. But I think that what's ended up happening is that in the infancy of the field, you could kind of sort of take that approach and nothing too bad would happen.

But now when you think about everything we do in a day, there's a computer involved in almost all of that. And so as a result, you can no longer afford to say, I'm not going to think about the bigger implications of this thing, because I'm just a hobbyist, I'm just working on some little toy that's not going to be used by thousands or millions of people.

Cindy: That's James Mickens. He's a professor of computer science at Harvard School of Engineering and Applied Sciences and a director at the Berkman Klein Center for Internet and Society. He's also a lifelong hacker.

Danny: James is going to tell us about some of the problems in educating ethical computer scientists and we're going to talk about how fixing those problems might help us fix the internet.

Cindy: I'm Cindy Cohn, EFF's executive director.

Danny: And I'm Danny O'Brien, special advisor to EFF. Welcome to How to Fix the Internet, a podcast of the Electronic Frontier Foundation.

Cindy: James, thank you so much for joining us. It’s really exciting to talk to you about how computer scientists and other technically minded people will help us move toward a better future and what that future looks like when we get there.

James: Well, hello. Thank you for that great introduction and thank you for inviting me to have a chat.

Cindy: So let's wallow in the bad for a minute before we get to the good. What's broken in our internet society now, or at least the specific pieces that are most concerning to you?

James: Well, there are just so many things. I mean, I could just give you a woodcut, like from the medieval period: people are on fire, there are weird people with bird masks running around. It's a scene. But if I had to just pick a couple things, here are a couple things that I think are bad. I think that at a high level, one of the big challenges with technology right now is the careless application of various techniques or various pieces of software in a way that doesn't really think about what the collateral damage might be, and in a way that doesn't really think about, should we be deploying this software in the first place. At this point, sort of a classic example is machine learning, right? So machine learning seems pretty neat. But when you look at machine learning being applied to things like determining which job applications get forwarded up to the next level, determining who gets mortgages and who does not, determining who gets sentenced to parole versus a harsher sentence, for example, what you end up seeing is that you have these really non-trivial applications of technology that have these real impacts in the actual world. It's not some abstract exercise where we're trying to simulate the thought process of an agent in a video game or something like this.

Danny: Is there something special about computer scientists that makes them like this? Is it hubris? Is it just a feeling like they've got the answer to all of the world's problems?

James: The way that we're sort of trained as computer scientists is to say here's a crisp description of what a problem is and then here are a concrete set of steps which can "fix that problem". And going through that series of steps of identifying the problem, coming up with an algorithm to "solve it" and then testing it, at first glance that seems very clean. And in fact, there are a couple simple problems we could think of that are very clean to solve.

So for example, I give you a bunch of numbers, how do you sort them. It seems like a pretty objective thing to do. We all have a very clear understanding of what numbers are and what order means. But now if I ask you to do something like find the best applicant for a particular job, even if you were to ask different humans what the answer to that question is, they would probably give you a bunch of different answers.
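(A minimal sketch, not from the episode, of the contrast James is drawing here: sorting is a fully specified problem with one right answer, while "find the best applicant" forces someone to pick weights, and those weights are themselves a value judgment. Every name and number below is invented for illustration; Python is used only as an example language.)

```python
# Sorting: well specified, everyone agrees on the answer.
numbers = [7, 3, 9, 1]
print(sorted(numbers))  # [1, 3, 7, 9]

# "Best applicant": under-specified. Any ranking function has to encode
# someone's judgment about what matters and by how much.
applicants = [
    {"name": "A", "years_experience": 10, "degree": 0, "referral": 1},
    {"name": "B", "years_experience": 2,  "degree": 1, "referral": 0},
]

def score(a, w_exp=1.0, w_deg=3.0, w_ref=2.0):
    # These weights are invented; different reviewers would choose differently.
    return w_exp * a["years_experience"] + w_deg * a["degree"] + w_ref * a["referral"]

print(max(applicants, key=score)["name"])  # the "winner" changes if the weights change
```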

And so this idea that somehow, because computers manipulate binary data, zeros and ones, that somehow we're always going to have clean answers for things, or somehow always be able to take these intractable social problems and represent them in this very clean way in the digital space, it's just absolutely false. And I think machine learning is a particular example of how this goes astray. Because you end up seeing that you get this data, this data has biases in it, you train an algorithm that replicates the biases in the training data, and that just perpetuates the social problem that we see sort of in the pre digital world.
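(A minimal, hypothetical sketch of the dynamic James describes: a model trained on biased historical hiring decisions reproduces that bias for equally qualified applicants. The data, the group "penalty," and the feature names are all invented, and the sketch assumes NumPy and scikit-learn are installed.)

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

skill = rng.normal(size=n)               # a genuinely relevant qualification
group = rng.integers(0, 2, size=n)       # an irrelevant attribute (0 or 1)

# Biased historical labels: past reviewers penalized group 1 regardless of skill.
past_decision = (skill - 0.8 * group + rng.normal(scale=0.5, size=n)) > 0

X = np.column_stack([skill, group])
model = LogisticRegression().fit(X, past_decision)

# Two applicants with identical skill but different group membership.
same_skill = np.array([[0.5, 0], [0.5, 1]])
print(model.predict_proba(same_skill)[:, 1])  # the group-1 applicant scores lower
```

The model is never told to discriminate; it simply learns the pattern that is already in the labels, which is the point being made.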

Cindy: When we were first looking at predictive policing, for instance, which is a set of technologies that allegedly try to predict where crime is going to happen, the short answer is that it actually just predicts what the police are going to do. If you define the problem as, well, police know where crime is, then you've missed a whole lot of crime that police never see and don't focus on and don't prioritize. So that was an early example, I think, of that kind of problem.

James: People who live in let's say underprivileged communities or over policed communities, if you asked them what would happen if you were to apply one of these predictive policing algorithms, I bet a lot of them could intuitively tell you from their personal experience, well, the police go where they think the police need to go. And of course, that sets up a feedback circle. And just to be clear, I'm not trying to take out some sort of maximalist anti-police position here, I'm just saying there are experiences in the world that are important to bring to bear when you design technical artifacts, because these technical artifacts have to relate to society. So I think it's really important when you're getting a technical education that you also learn about things involving history or sociology or economics, things like that.
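(A toy simulation, again hypothetical rather than from the episode, of the feedback loop being described: patrols are sent where past recorded crime is highest, and crime only gets recorded where patrols are present, so early noise in the records compounds even though the two neighborhoods are statistically identical. All numbers are invented.)

```python
import random

random.seed(1)
TRUE_CRIME_RATE = [0.10, 0.10]   # both neighborhoods are identical in reality
recorded = [1, 1]                # historical recorded counts seed the allocation

for _ in range(1000):
    total = sum(recorded)
    # Send 10 patrols proportionally to recorded (not true) crime.
    patrols = [round(10 * recorded[i] / total) for i in range(2)]
    for i in range(2):
        for _ in range(patrols[i]):
            # Crime only enters the dataset if a patrol is there to record it.
            if random.random() < TRUE_CRIME_RATE[i]:
                recorded[i] += 1

print(recorded)  # often ends up noticeably lopsided even though true rates are equal
```

In most runs the recorded counts drift apart even though the underlying rates never differ, which is the "police go where they think the police need to go" loop in miniature.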

Cindy: I want to switch just a little bit, because we're trying to fix the internet here and I want to hear what's your vision of what it looks like if we get this right. I want to live in that world, what does that world look like from where you sit?

James: Well, a key aspect of that world is that I have been nominated as the philosopher king.

Cindy: Cool.

James: And that's the first thing and really everything sort of follows.

Danny: We'll get right on that.

James: Good to see everyone agrees with it.

Cindy: Yeah.

James: Yeah. Thank you. Thank you. So I think we've sort of hinted at one of the things that needs to change in my opinion, which is the way that "technical education" is carried out. A lot of engineers go through their formal engineering training and they're taught things like calculus and linear algebra. They learn about various programming languages. They learn how to design algorithms that run quickly. These are all obviously very important things, but they oftentimes don't receive in that formal education an understanding of how the artifacts that they build will interact with larger society. And oftentimes they don't receive enough education in what are sort of the historical and social and economic trends independent of technology, that have existed for hundreds or thousands of years that you should really think about if you want to create technology that helps the common good.

Cindy: And the other thing I hear in this is community involvement, right? That the people who are going to be impacted by the artifact you build need to be some of the people you listen to and that you check in with: that you go to the neighborhoods where this might be applied, or you talk to the people who are trying to figure out how to get a mortgage, and you begin to understand what the world looks like in shoes that are not yours.

Are there any places in machine learning where you think that people are starting to get it right or is it still just a wasteland of bad ideas?

Danny: Allegedly.

James: It is. Yeah. The wasteland word is, I still think, generally applicable, but people are starting to awaken. People are starting to look at notions of, can we rigorously define transparency in terms of explaining what these algorithms do? Can we sort of rigorously think about bias and how we might try to address that algorithmically in collaboration with people. The field is starting to get better. I think there is still a lot of pressure to "innovate". There's still pressure to publish a lot of papers, get your cool new ML technology out there, how else am I going to get venture capital, things like this. So I think there's still a lot of pressure towards not being thoughtful, but I do see that changing.

Danny: So one of the things that we've seen in other podcast interviews is that actually we are going to have to go and redo some of the fundamentals because we're building on weak foundations, that we didn't think about computer security when we first started writing operating systems for general use and so forth. Do you think that's part of this as well? Not only do we have to change what we're going to do in the future, but we actually have to go and redo some stuff that engineers made in the past?

James: I think it speaks to these larger issues of tech debt, which is a term that you may have heard before. This idea that we've already built a bunch of stuff and so for us to go back and then fix it, for some definition of fix. So would you prefer us to just tackle that problem and not innovate further or would you prefer... What should we do? I think you're right about that. That is an important thing. If you look at, for example, how a lot of the internet protocols work or like how a lot of banking protocols work or things like this, systems for doing airline reservations, in some cases, this code is COBOL code. It came from the stone age, at least in computer science terms. 

And the code is very creaky. It has security problems. It's not fast in many cases, but would society tolerate no flights for a year, let's say, as we go back and we modernize that stuff? The answer is no, obviously. So then as a result, we kind of creak forward. If you think about the basic core internet infrastructure, when it was designed, roughly speaking, it was like a small neighborhood. Most people on the internet knew everybody. Why would Sally ever try to attack my computer? I know her, our kids go to the same school, that would just be outrageous. But now we live in a world where the Internet's pervasive. That's good, but now everyone doesn't know everyone. And now there are bad actors out there. And so we can try to add security incrementally, that's what HTTPS does. The S stands for security, right? So we can try to layer security atop these sort of creaky ships, but it's hard. I think a lot of our software and hardware artifacts are like that.

It's really getting back, I think, to Cindy's question too, about what would I want to see improved about the future? I always tell this to my students and I wish more people would think about this, it's easier to fix problems early, rather than later. That seems like a very obvious thing that Yoda would say, but it’s actually quite profound.  Because once you get things out in the world and once they get a lot of adoption, for you to change any little thing about it is going to be this huge exercise. And so it's really helpful to be thoughtful at the beginning in the design process.

Cindy: You've thought a little bit of about how we could get more thoughtfulness into the design process. And I'd love for you to talk about some of those ideas.

James: Sure. One thing that I'm really proud of working on is this embedded ethics program that we have at Harvard, and that's starting to be adopted by other institutions. And it gets back to this idea of what does it mean to train an engineer? And so what we're trying to do in this program is ensure that in every class that a computer scientist takes, there'll be at least one lecture that talks about ethical considerations, concerns involving people and society and the universe that are specific to that class. Now, I think the specific to that class part is very important, right? Because I think another thing that engineers sometimes get confused about is they might say, oh, well, these ethical concerns are only important for machine learning.

I get it, machine learning interacts with people, but it's not important for people who build data centers. Why should I care about those things? But let's interrogate that for a second. Where do you build a data center? Well, data centers require a lot of power. So where is that electricity going to come from? How is that electricity going to be generated? What is the impact on the surrounding community? Things like this. There's also sort of like these interesting geopolitical concerns there. So how many data centers should we have in North America versus Africa? What does the decision that we come to say about how we value different users in different parts of the world?

As computer scientists, we have to accept this idea: we don't know everything, close to everything, but not everything, right? And so one of the important aspects of this embedded ethics program is that we bring in philosophers and collaborate with them and help use their knowledge to ground our discussions of these philosophical challenges in computer science.   

Cindy: Do you have any success stories yet, or is it just too early?

James: Well, some of the success stories involve students saying I was thinking about going to company X, but now I've actually decided not to go there because I've actually thought about what these companies are doing. I'm not here to name or shame, but suffice it to say that I think that's a really big metric for success. And we're actually trying to look at assessment instruments, talk to people from sociology or whatnot who know how to assess effectiveness and then tweak pedagogical programs to make sure that we're actually having the impact that we want.

Cindy: Well, I hope that means that we're going to have a whole bunch of these students beat a path to EFF's door and want to come and do tech for good with us because we've been doing it longer than anyone.

Danny: “How to Fix the Internet” is supported by The Alfred P. Sloan Foundation’s Program in Public Understanding of Science. Enriching people’s lives through a keener appreciation of our increasingly technological world and portraying the complex humanity of scientists, engineers, and mathematicians.

Cindy: We're landing some societal problems on the shoulders of individual computer scientists and expecting them to kind of incorporate a lot of things that really are kind of built into our society, like the venture capital interest in creating new products as quickly as possible, the profit motive, or these other things. And I'm just wondering how poor little ethics can do standing up against some of these other forces.

James: I think sort of the high-level prompt is: late-stage capitalism, what do we do about it?

Cindy: Fair enough.

James: You are right there. And alas, I don't have immediate solutions to that problem.

Cindy: But you're supposed to be the philosopher king, my friend.

James: Fair enough. So you're right. I think that there's not like a magic trick we can do where we can say, oh, well, we'll just teach computer scientists ethics and then all of a sudden the incentives for VCs will be changed, because the incentives for VCs are make a lot of money, frequently make a lot of money over the short term. They are not incentivized by the larger economy to act differently. But I think that the fact that better trained engineers can't solve all problems shouldn't prevent us from trying to help them to solve some problems.

I think that there's a lot of good that those types of engineers can do and try to start changing some of these alignments. And there's a responsibility that should come with making products that affect potentially millions of people. So I sometimes hear this from students though. You're exactly right. Sometimes they'll say it's not my job to change sort of the larger macroeconomic incentive structures that make various things happen.

But then I say, well, but what are some of the biggest drivers of those macroeconomic incentive structures? It's tech companies. When you look at sort of stock market valuations and economic influence, it's these companies that you, the student, will be going to, that are helping to shape these narratives. And also too, it's you, the students, you'll go out, you'll vote. You'll think about ballot referendums, things like that. So there are things that we all have the responsibility to think about and to do individually, even though any one of us can't just sort of snap our fingers and make the change be immediate. We have to do that because otherwise society falls apart.

Danny: So some of this discussion assumes that we have like universal ethics that we all agree on, but I think there's always, I mean, part of the challenge in society is that we have room to disagree. Is there a risk that if we inject this sort of precautionary principle into what we are doing, we're actually missing out on some of the benefits of this rapid change? If we hold back and go, well, maybe we shouldn't do this, we're excluding the chance that these things will actually make society much, much better for everyone?

James: As an engineer, trying to design a system to be "value neutral" is, in and of itself, an ethical decision. You've made the decision to say, like, not considering social or economic factors X, Y, and Z is the right thing to do. That is an ethical decision. And so I think a lot of engineers, though, they fall into that fallacy. They say, well, I'm just going to focus on the code. I'm just going to focus on the thing I'm going to build. And it'll be the users of that software that have to determine how to use it ethically or not.

But that argument just doesn't work. The mere fact that people may disagree over values does not absolve us of the responsibility of thinking about those values nonetheless.

Cindy: To me, especially in a situation in which you're building something that's going to impact people who aren't involved in the building of it, right? I mean, you can build your own machine learning to tell you what you want about your life. And I don't have much to say about that, but a lot of these systems are making decisions for people who have no input whatsoever into how these things are being built, no transparency into how they're working and no ability to really interrogate the conclusions that are made. And to me, that's where it gets the riskiest.

James: I often turn to existential philosophy in cases like this. For the listeners who aren't familiar with philosophy, or think that it's all very obtuse, that's true about some of it. But if you read the existentialists, it's quite beautiful, a lot of the prose. It's just really fun to read, and it has these really impactful observations. And one of my favorite passages is from this guy, Kierkegaard. And Kierkegaard's talking about sort of like this burden of choice that we have. And he has this really beautiful metaphor where he says we are each the captain of our own ship.

And even if we choose not to put our hand on the rudder to point the ship in some direction, the wind will nevertheless push us towards some shore. And so in deciding where you want to go, you make a decision. If you decide not to make an active decision about where to sail your boat, you're basically deciding I will let the wind tell me where to go. The metaphor is telling us that your boat's still going to go in some direction even if you don't actively become the captain of it.

And I think about that a lot, because a lot of engineers want to abdicate the responsibility for being the captain of their own boat. And they say, I'm just going to focus on the boat and that's it. But in this metaphor, sort of, society and built-in biases and things like that, those are the winds. Those are the currents. And they're going to push your product. They're going to push your software towards some shore, and that's going to happen regardless of whether you think that's going to happen or not. So we really have this responsibility to choose and decide.

Danny: I hate to follow Kierkegaard with Stan Lee, but isn't that "with great power comes great responsibility"? And I wonder if part of these ethical discussions is whether that's not the problem: that you are asking engineers and the creators of this technology to make ethical decisions that will affect the rest of society. And the problem is that actually it should be the rest of society that makes those decisions, and not the engineers. Maybe the harder work is to spread that power more equally and give everyone a little element of being an engineer, so that they can change the technology in front of them.

James: I think that what you're talking about, sort of at a broad level, is governance. How do we do governance of online systems? And it's a mess right now. It's a combination of internal company policies, which are not made public; external, that is to say publicly visible, policies; regulation; and the behavior of individual users on the platform. And it's a big mess. Because I think that right now, a lot of times what happens is a disaster happens, and then all of a sudden there's some movement by both the companies and maybe regulators to change something, and then that'll be it for a bit. And then things kind of creak along, and then another disaster happens. So it'd be nice to think about, in a more systemic way, how we should govern these platforms.

Cindy: As a free speech, Fourth Amendment lawyer, having governments have more say over the things that we say and our privacy and those kinds of things, well, that hasn't always worked out all that well for individual rights either, right? But we have these gigantic companies. They have a lot of power and it's reasonable to think, well, what else has a lot of power that might be able to be a check on them? Well, there's government. And that's all true, but the devil really is in the details and we worry as much about bad corporate behavior as we do bad governmental behavior. And you have to think about both.

Cindy: So let's say you're the philosopher king, or in your great new world, what does it look like for me as a user in this future world?

James: I think one important aspect is more transparency about how your data is used, who it gets shared with, what is the value that companies are getting from it. And we're moving a little bit in that direction slowly but surely. Laws like GDPR, CCPA, they're trying to slowly nudge us in this direction. It's a very hard problem though, as we all know. I mean, engineers may not fully understand what their systems do. So then how are they going to explain that in a transparent way to users. But in sort of this utopia, that's an important aspect of online services. There's more transparency in how things work. I think there's also more consent in how things work. So these things go hand in hand. So users would have more of an ability to opt into or opt out of various manipulations or sharings of their data.

Once again, we're starting to go a little bit closer towards that. I think we can do much, much more. I think that in terms of content moderation, I think, and this is going to be tricky, it's going to be hard, this speaks to sort of Cindy's observations about, well, we can't fully trust government or the companies. But in my opinion, I mean, I'm the philosopher king in this experiment. So in my opinion, what I want to have is a floor that defines sort of minimal standards for protections against hate speech, harassment, things like that. Of course the devil's in the details. But I think that's actually something that we don't really have right now. There's also this important aspect of having educated citizens, right? So having more technical education and technical literacy for laypeople so that they can better understand the consequences of their actions.

Cindy: That we know what choices we're making, we're in charge of these choices and have actual choices, I think are all tremendously important. EFF has worked a lot around adversarial interoperability and other things which are really about being able to leave a place that isn't serving you. And to me, that's got to be a piece of the choice. A choice that doesn't really let you leave is not actually a choice.

James: As you may know, there have been some recent proposals that want to solve this portability issue essentially by saying, let's have users store all their data on user owned machines and then the companies have to come to us for permission to use that data. There's a sort of push and pull there in terms of, on the one hand wanting to give people literal power over their data, such that it's actually their machines that are storing it versus saying, well, if I look at like the computers that are administered by my relatives, for example, who are not computer scientists, these computers are offline all the time. They've got like terrible, ridiculous programs on them. They're not reliable. Now in contrast, you look at a data center, that's administered by paid professionals whose job it is to keep those machines online. So there's an advantage to using that model.

Do we want to still keep our data in centralized places, but then make sure there's plumbing to move stuff between those centralized places or do we want to, in the extreme, go towards this peer to peer decentralized model and then lose some of the performance benefits we get from the data center model?

Cindy: That's a good articulation of some of the trade-offs here. And of course the other way to go is kind of on the lawyer side of things is a duty of care that people who hold your data have a fiduciary or something similar kind of duty to you in the same way that your accountant or lawyer might have. So they have your data, but they don't have the freedom to do with it what they want. In fact, they're very limited in what they can do with it.  I feel very optimistic in a certain way that there are mechanisms on the technical side and the non-technical side to try to get us to this kind of control. Again, none of them are without trade-offs, but they exist all across the board.

James: Yes. And I think an interesting area of research, it's an area that I'm a bit interested in myself, is what are specific technical things that software developers can do to provide obvious compliance with legal regulations. Because these laws, they're just like any human creation. They can be vague or ambiguous in some cases, they can be difficult to implement. 

And I think that part of this gets down to having these different communities talk to each other. One reason it's difficult for computer scientists to write code that complies with legal requirements is that we don't understand some of these legal requirements. The lawyers need to learn a little bit more about code and the computer scientists need to learn a little bit more about the law.

Cindy: It's also the case, of course, that sometimes laws get written without a clear idea of how one might reduce it to ones and zeros. And so that may be a bug if you're a computer scientist, it might be a feature if you're a lawyer, right? Because then we let judges sort out in the context of individual situations what things really mean. 

James: So one of the gifts of the philosopher king will be to lure people out from under these semantic morasses.

Cindy: Thank you so much king.

James: No problem of course. It's been great sitting here chatting with you. Let me return back to my kingdom.

Danny: James Mickens, thank you very much.

James: Thank you.

Cindy: Well, James teaches computer science at Harvard, so it's right that his focus is on education and personal ethics and transparency. This is the work of the computer scientists. And I appreciate that he's working and thinking hard about how we build more ethical builders and also that he's recognizing that we need to kind of move beyond the silos that computer science often finds itself in and reach out to people with other kinds of expertise, especially philosophy. But we also heard from him about the importance of the role of the impacted community, which is something we've heard over and over again in this podcast and the need to make sure that the people who are impacted by technology understand how it works and have a voice.

Danny: It wasn't just sort of this literally academic kind of discussion. He had some practical points too, I mean, for instance, that if we do need to improve things and fix things, we've found some ways of doing incremental security improvements like HTTPS, but some fixes really have to overcome a lot of tech debt. And I don't think we're going to be in a situation where we can ask people not to book airplane tickets while we fix the fundamentals, which again points to what he's saying, which is that we need to get this stuff right earlier rather than later in the process.

Cindy: And I loved hearing about this embedded ethics program that he's working on at Harvard and at other places. The idea that we need to build ethics into every class and every situation, not just something we tack on separately at the end, is, I think, a very good start. And of course, if it leads to a line of students who want to do ethical tech beating their way to EFF's doors, that would be an extra bonus for us.

Danny: It does make everything a little bit more complicated to think of ethics and the wider impact. I mean, I did take on board his comparison of the ease of building a centralized internet, which might have deleterious effects on society, with the obvious solution, which is to decentralize things. But you have to make that just as easy to use for the end user, and as somebody who's hacking away trying to build a decentralized web, that's something I definitely took personally and will take on board.

Cindy: There's trade-offs everywhere you go. And I think in that way, James is just a true educator, right? He's requiring us all to look at the complexities in all directions so that we can really bring all those complexities into thinking about the solutions we embrace. After this conversation, I kind of want to live in the world where James is our philosopher king.

Danny: Thanks to you, James Mickens, our supreme leader, and thank you for listening today. Please visit eff.org/podcast for other episodes, or to become a member. Members are the only reason we can do this work. Plus you can get cool stuff like an EFF hat or an EFF hoodie, or even an EFF camera cover for your laptop. Music for How to Fix the Internet was created for us by Reed Mathis and Nat Keefe of BeatMower. This podcast is licensed Creative Commons Attribution 4.0 International and includes music licensed under the Creative Commons Attribution 3.0 Unported license by their creators. You can find those creators' names and links to their music in our episode notes or on our website at eff.org/podcast. How to Fix the Internet is supported by the Alfred P. Sloan Foundation's Program in Public Understanding of Science and Technology. I'm Danny O'Brien.

Cindy: And I'm Cindy Cohn.

 

 

James Mickens is a professor of computer science at the Harvard School of Engineering and Applied Sciences and a director at the Berkman Klein Center for Internet and Society. He studies how to make distributed systems faster, more robust, and more secure; much of his work focuses on large-scale web services, and how to design principled system interfaces for those services. Before Harvard, he spent seven years as a researcher at Microsoft; he was also a visiting professor at MIT. Mickens received a B.S. from the Georgia Institute of Technology and a Ph.D. from the University of Michigan, both in computer science.