Similarly, New York City is considering Int 1894, a law that would introduce mandatory audits of “automated employment decision tools,” defined as “any system whose function is governed by statistical theory, or systems whose parameters are defined by such systems.” Notably, both bills mandate audits but provide only high-level guidelines on what an audit is.
As decision-makers in both government and industry create standards for algorithmic audits, disagreements about what counts as an algorithm are likely. Rather than trying to agree on a common definition of “algorithm” or a particular universal auditing technique, we suggest evaluating automated systems primarily based on their impact. By focusing on outcome rather than input, we avoid needless debates over technical complexity. What matters is the potential for harm, regardless of whether we’re discussing an algebraic formula or a deep neural network.
Impact is a critical assessment factor in other fields. It’s built into the classic DREAD framework in cybersecurity, which was first popularized by Microsoft in the early 2000s and is still used at some corporations. The “A” in DREAD asks threat assessors to quantify “affected users” by asking how many people would suffer the impact of an identified vulnerability. Impact assessments are also common in human rights and sustainability analyses, and we’ve seen some early developers of AI impact assessments create similar rubrics. For example, Canada’s Algorithmic Impact Assessment provides a score based on qualitative questions such as “Are clients in this line of business particularly vulnerable? (yes or no).”
There are certainly difficulties to introducing a loosely defined term such as “impact” into any assessment. The DREAD framework was later supplemented or replaced by STRIDE, in part because of challenges with reconciling different beliefs about what threat modeling entails. Microsoft stopped using DREAD in 2008.
In the AI field, conferences and journals have already introduced impact statements with varying degrees of success and controversy. It’s far from foolproof: impact assessments that are purely formulaic can easily be gamed, while an overly vague definition can lead to arbitrary or impossibly lengthy assessments.
Still, it’s an important step forward. The term “algorithm,” however defined, shouldn’t be a shield to absolve the humans who designed and deployed any system of responsibility for the consequences of its use. This is why the public is increasingly demanding algorithmic accountability—and the concept of impact offers a useful common ground for different groups working to meet that demand.
Kristian Lum is an assistant research professor in the Computer and Information Science Department at the University of Pennsylvania.
Rumman Chowdhury is the director of the Machine Ethics, Transparency, and Accountability (META) team at Twitter. She was previously the CEO and founder of Parity, an algorithmic audit platform, and global lead for responsible AI at Accenture.
Uber’s facial recognition is locking Indian drivers out of their accounts
Uber checks that a driver’s face matches what the company has on file through a program called “Real-Time ID Check.” It was rolled out in the US in 2016, in India in 2017, and then in other markets. “This prevents fraud and protects drivers’ accounts from being compromised. It also protects riders by building another layer of accountability into the app to ensure the right person is behind the wheel,” Joe Sullivan, Uber’s chief security officer, said in a statement in 2017.
But the company’s driver verification procedures are far from seamless. Adnan Taqi, an Uber driver in Mumbai, ran into trouble with it when the app prompted him to take a selfie around dusk. He was locked out for 48 hours, a big dent in his work schedule—he says he drives 18 hours straight, sometimes as much as 24 hours, to be able to make a living. Days later, he took a selfie that locked him out of his account again, this time for a whole week. That time, Taqi suspects, it came down to hair: “I hadn’t shaved for a few days and my hair had also grown out a bit,” he says.
More than a dozen drivers interviewed for this story detailed instances of having to find better lighting to avoid being locked out of their Uber accounts. “Whenever Uber asks for a selfie in the evenings or at night, I’ve had to pull over and go under a streetlight to click a clear picture—otherwise there are chances of getting rejected,” said Santosh Kumar, an Uber driver from Hyderabad.
Others have struggled with scratches on their cameras and low-budget smartphones. The problem isn’t unique to Uber. Drivers with Ola, which is backed by SoftBank, face similar issues.
Some of these struggles can be explained by natural limitations in face recognition technology. The software starts by converting your face into a set of points, explains Jernej Kavka, an independent technology consultant with access to Microsoft’s Face API, which is what Uber uses to power Real-Time ID Check.
“With excessive facial hair, the points change and it may not recognize where the chin is,” Kavka says. The same thing happens when there is low lighting or the phone’s camera doesn’t have a good contrast. “This makes it difficult for the computer to detect edges,” he explains.
But the software may be especially brittle in India. In December 2021, tech policy researchers Smriti Parsheera (a fellow with the CyberBRICS project) and Gaurav Jain (an economist with the International Finance Corporation) posted a preprint paper that audited four commercial facial processing tools—Amazon’s Rekognition, Microsoft Azure’s Face, Face++, and FaceX—for their performance on Indian faces. When the software was applied to a database of 32,184 election candidates, Microsoft’s Face failed to even detect the presence of a face in more than 1,000 images, throwing an error rate of more than 3%—the worst among the four.
It could be that the Uber app is failing drivers because its software was not trained on a diverse range of Indian faces, Parsheera says. But she says there may be other issues at play as well. “There could be a number of other contributing factors like lighting, angle, effects of aging, etc.,” she explained in writing. “But the lack of transparency surrounding the use of such systems makes it hard to provide a more concrete explanation.”
The Download: Uber’s flawed facial recognition, and police drones
One evening in February last year, a 23-year-old Uber driver named Niradi Srikanth was getting ready to start another shift, ferrying passengers around the south Indian city of Hyderabad. He pointed the phone at his face to take a selfie to verify his identity. The process usually worked seamlessly. But this time he was unable to log in.
Srikanth suspected it was because he had recently shaved his head. After further attempts to log in were rejected, Uber informed him that his account had been blocked. He is not alone. In a survey conducted by MIT Technology Review of 150 Uber drivers in the country, almost half had been either temporarily or permanently locked out of their accounts because of problems with their selfie.
Hundreds of thousands of India’s gig economy workers are at the mercy of facial recognition technology, with few legal, policy or regulatory protections. For workers like Srikanth, getting blocked from or kicked off a platform can have devastating consequences. Read the full story.
I met a police drone in VR—and hated it
Police departments across the world are embracing drones, deploying them for everything from surveillance and intelligence gathering to even chasing criminals. Yet none of them seem to be trying to find out how encounters with drones leave people feeling—or whether the technology will help or hinder policing work.
A team from University College London and the London School of Economics is filling in the gaps, studying how people react when meeting police drones in virtual reality, and whether they come away feeling more or less trusting of the police.
MIT Technology Review’s Melissa Heikkilä came away from her encounter with a VR police drone feeling unnerved. If others feel the same way, the big question is whether these drones are effective tools for policing in the first place. Read the full story.
Melissa’s story is from The Algorithm, her weekly newsletter covering AI and its effects on society. Sign up to receive it in your inbox every Monday.
I met a police drone in VR—and hated it
It’s important because police departments are racing way ahead and starting to use drones anyway, for everything from surveillance and intelligence gathering to chasing criminals.
Last week, San Francisco approved the use of robots, including drones that can kill people in certain emergencies, such as when dealing with a mass shooter. In the UK most police drones have thermal cameras that can be used to detect how many people are inside houses, says Pósch. This has been used for all sorts of things: catching human traffickers or rogue landlords, and even targeting people holding suspected parties during covid-19 lockdowns.
Virtual reality will let the researchers test the technology in a controlled, safe way among lots of test subjects, Pósch says.
Even though I knew I was in a VR environment, I found the encounter with the drone unnerving. My opinion of these drones did not improve, even though I’d met a supposedly polite, human-operated one (there are even more aggressive modes for the experiment, which I did not experience.)
Ultimately, it may not make much difference whether drones are “polite” or “rude” , says Christian Enemark, a professor at the University of Southampton, who specializes in the ethics of war and drones and is not involved in the research. That’s because the use of drones itself is a “reminder that the police are not here, whether they’re not bothering to be here or they’re too afraid to be here,” he says.
“So maybe there’s something fundamentally disrespectful about any encounter.”
GPT-4 is coming, but OpenAI is still fixing GPT-3
The internet is abuzz with excitement about AI lab OpenAI’s latest iteration of its famous large language model, GPT-3. The latest demo, ChatGPT, answers people’s questions via back-and-forth dialogue. Since its launch last Wednesday, the demo has crossed over 1 million users. Read Will Douglas Heaven’s story here.
GPT-3 is a confident bullshitter and can easily be prompted to say toxic things. OpenAI says it has fixed a lot of these problems with ChatGPT, which answers follow-up questions, admits its mistakes, challenges incorrect premises, and rejects inappropriate requests. It even refuses to answer some questions, such as how to be evil, or how to break into someone’s house.