What these factors do not take into account is exposure to patients with covid-19, say residents. That means the algorithm did not distinguish between those who had caught covid from patients and those who got it from community spread—including employees working remotely. And, as first reported by ProPublica, residents were told that because they rotate between departments rather than maintain a single assignment, they lost out on points associated with the departments where they worked.
The algorithm’s third category refers to the California Department of Public Health’s vaccine allocation guidelines. These focus on exposure risk as the single highest factor for vaccine prioritization. The guidelines are intended primarily for county and local governments to decide how to prioritize the vaccine, rather than how to prioritize between a hospital’s departments. But they do specifically include residents, along with the departments where they work, in the highest-priority tier.
It may be that the “CDPH range” factor gives residents a higher score, but still not high enough to counteract the other criteria.
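To see how that can happen, here is a minimal, hypothetical sketch of a points-based formula like the one described above. The factor names, weights, and numbers are illustrative assumptions, not Stanford's actual algorithm.

```python
# Hypothetical sketch of a points-based prioritization formula like the one
# described above. Factor names, weights, and numbers are illustrative
# assumptions, not Stanford's actual algorithm.

def priority_score(age_points, dept_points, cdph_tier_points,
                   w_age=1.0, w_dept=1.0, w_cdph=1.0):
    """Weighted sum of per-factor points; a higher score means earlier vaccination."""
    return w_age * age_points + w_dept * dept_points + w_cdph * cdph_tier_points

# A resident who rotates between departments earns no single department's
# points (dept_points = 0), so even a top CDPH-tier score may not offset
# the other factors.
rotating_resident = priority_score(age_points=2, dept_points=0, cdph_tier_points=10)
remote_administrator = priority_score(age_points=8, dept_points=6, cdph_tier_points=3)
print(rotating_resident, remote_administrator)  # 12.0 vs 17.0
```

In a weighted sum like this, the rotating resident's zero department score drags the total down even when the CDPH tier score is maxed out.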
“Why did they do it that way?”
Stanford tried to factor in a lot more variables than other medical facilities, but Jeffrey Kahn, the director of the Johns Hopkins Berman Institute of Bioethics, says the approach was overcomplicated. “The more there are different weights for different things, it then becomes harder to understand—‘Why did they do it that way?’” he says.
Kahn, who sat on Johns Hopkins’ 20-member committee on vaccine allocation, says his university allocated vaccines based simply on job and risk of exposure to covid-19.
He says that decision was based on discussions that purposefully included different perspectives—including those of residents—and in coordination with other hospitals in Maryland. Elsewhere, the University of California San Francisco’s plan is based on a similar assessment of risk of exposure to the virus. Mass General Brigham in Boston categorizes employees into four groups based on department and job location, according to an internal email reviewed by MIT Technology Review.
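By contrast, the simpler approaches described above amount to a direct lookup from role or department to a priority group, with no weighted formula. A hypothetical sketch, with made-up tiers rather than any hospital's actual policy:

```python
# Hypothetical sketch of a tier-based allocation: each role maps directly to a
# priority group. The group assignments are illustrative, not any hospital's policy.

EXPOSURE_TIER = {
    "covid_icu_nurse": 1,
    "emergency_resident": 1,
    "inpatient_physician": 2,
    "outpatient_clinic_staff": 3,
    "remote_administrator": 4,
}

def allocation_order(staff):
    """Sort staff by exposure tier (1 = vaccinated first)."""
    return sorted(staff, key=lambda s: EXPOSURE_TIER[s["role"]])

staff = [{"name": "A", "role": "remote_administrator"},
         {"name": "B", "role": "emergency_resident"}]
print([s["name"] for s in allocation_order(staff)])  # ['B', 'A']
```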
“It’s really important [for] any approach like this to be transparent and public … and not something really hard to figure out,” Kahn says. “There’s so little trust around so much related to the pandemic, we cannot squander it.”
Algorithms are commonly used in health care to rank patients by risk level in an effort to distribute care and resources more equitably. But the more variables used, the harder it is to assess whether the calculations might be flawed.
For example, in 2019, a study published in Science showed that an algorithm widely used in the US to distribute care ended up favoring white patients over Black ones. The problem, it turned out, was that the algorithm’s designers assumed that patients who spent more on health care were sicker and needed more help. In reality, higher spenders are also richer, and more likely to be white. As a result, the algorithm allocated less care to Black patients with the same medical conditions as white ones.
Irene Chen, an MIT doctoral candidate who studies the use of fair algorithms in health care, suspects this is what happened at Stanford: the formula’s designers chose variables that they believed would serve as good proxies for a given staffer’s level of covid risk. But they didn’t verify that these proxies led to sensible outcomes, or respond in a meaningful way to the community’s input when the vaccine plan came to light on Tuesday last week. “It’s not a bad thing that people had thoughts about it afterward,” says Chen. “It’s that there wasn’t a mechanism to fix it.”
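A rough illustration of both the proxy problem and the kind of outcome check Chen describes. The numbers and threshold below are made up; this is not the code from the Science study or Stanford's plan.

```python
# Illustrative sketch (not the actual Science study or Stanford code) of how a
# proxy label can skew a risk score, and the kind of outcome audit that would
# catch it. All numbers are made up.

# Two patients with identical illness severity, but one group historically
# spends less on care (for example, because of unequal access).
patients = [
    {"group": "A", "severity": 7, "annual_spend": 9000},
    {"group": "B", "severity": 7, "annual_spend": 4000},
]

def risk_from_spend(p):
    # Proxy: treat past spending as a stand-in for medical need.
    return p["annual_spend"] / 1000

# The proxy ranks equally sick patients differently.
for p in patients:
    print(p["group"], risk_from_spend(p))   # A 9.0, B 4.0

# A basic audit: compare scores across groups at the same severity level.
# A large gap flags that the proxy is not tracking actual need.
gap = abs(risk_from_spend(patients[0]) - risk_from_spend(patients[1]))
if gap > 1.0:
    print(f"audit flag: score gap of {gap} for equally sick patients")
```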
A canary in the coal mine?
After the protests, Stanford issued a formal apology, saying it would revise its distribution plan.
Hospital representatives did not respond to questions about who they would include in new planning processes, or whether the algorithm would continue to be used. An internal email summarizing the medical school’s response, shared with MIT Technology Review, states that neither program heads, department chairs, attending physicians, nor nursing staff were involved in the original algorithm design. Now, however, some faculty are pushing for a bigger role: discarding the algorithm’s results entirely and instead giving division chiefs and chairs the authority to make decisions for their own teams.
Google has a lot riding on this launch. Microsoft partnered with OpenAI to make an aggressive play for Google’s top spot in search. Meanwhile, Google blundered straight out of the gate when it first tried to respond. In a teaser clip for Bard that the company put out in February, the chatbot was shown making a factual error. Google’s value fell by $100 billion overnight.
Google won’t share many details about how Bard works: large language models, the technology behind this wave of chatbots, have become valuable IP. But it will say that Bard is built on top of a new version of LaMDA, Google’s flagship large language model. Google says it will update Bard as the underlying tech improves. Like ChatGPT and GPT-4, Bard is fine-tuned using reinforcement learning from human feedback, a technique that trains a large language model to give more useful and less toxic responses.
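At the heart of that technique is a reward model trained on human preference data. The snippet below is a generic, minimal sketch of that step in PyTorch, not Google's or OpenAI's implementation; the tiny network and random tensors are placeholders.

```python
# Minimal sketch of the reward-model step used in RLHF (reinforcement learning
# from human feedback). Generic illustration only; the tiny model and random
# data are placeholders, not Google's or OpenAI's implementation.

import torch
import torch.nn as nn

# Toy "reward model": maps a response embedding to a scalar score.
reward_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Human feedback arrives as pairs: an embedding of the response raters
# preferred and one they rejected (random placeholders here).
chosen = torch.randn(8, 16)
rejected = torch.randn(8, 16)

# Standard pairwise preference loss: push the chosen response's score
# above the rejected one's.
loss = -torch.nn.functional.logsigmoid(
    reward_model(chosen) - reward_model(rejected)
).mean()
loss.backward()
opt.step()
```

The trained reward model then scores candidate responses so the chatbot can be tuned, typically with a policy-optimization step, toward answers the raters would prefer: more useful, less toxic.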
Google has been working on Bard for a few months behind closed doors but says that it’s still an experiment. The company is now making the chatbot available for free to people in the US and the UK who sign up to a waitlist. These early users will help test and improve the technology. “We’ll get user feedback, and we will ramp it up over time based on that feedback,” says Google’s vice president of research, Zoubin Ghahramani. “We are mindful of all the things that can go wrong with large language models.”
But Margaret Mitchell, chief ethics scientist at AI startup Hugging Face and former co-lead of Google’s AI ethics team, is skeptical of this framing. Google has been working on LaMDA for years, she says, and she thinks pitching Bard as an experiment “is a PR trick that larger companies use to reach millions of customers while also removing themselves from accountability if anything goes wrong.”
Google wants users to think of Bard as a sidekick to Google Search, not a replacement. A button that sits below Bard’s chat widget says “Google It.” The idea is to nudge users to head to Google Search to check Bard’s answers or find out more. “It’s one of the things that help us offset limitations of the technology,” says Jack Krawczyk, Bard’s product lead at Google.
“We really want to encourage people to actually explore other places, sort of confirm things if they’re not sure,” says Ghahramani.
This acknowledgement of Bard’s flaws has shaped the chatbot’s design in other ways, too. Users can interact with Bard only a handful of times in any given session. This is because the longer large language models engage in a single conversation, the more likely they are to go off the rails. Many of the weirder responses from Bing Chat that people have shared online emerged at the end of drawn-out exchanges, for example.
Google won’t confirm what the conversation limit will be for launch, but it will be set quite low for the initial release and adjusted depending on user feedback.
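Enforcing such a cap is straightforward in principle. Here is a hypothetical sketch of a per-session turn limit; the real number is unconfirmed, so MAX_TURNS below is an arbitrary placeholder.

```python
# Hypothetical sketch of a per-session turn cap like the one described above.
# The real limit is unconfirmed; MAX_TURNS is an arbitrary placeholder.

MAX_TURNS = 5

def generate_response(prompt: str) -> str:
    # Stand-in for the actual model call.
    return f"(model reply to: {prompt})"

class ChatSession:
    def __init__(self):
        self.turns = 0

    def ask(self, prompt: str) -> str:
        if self.turns >= MAX_TURNS:
            return "This conversation has reached its limit. Please start a new chat."
        self.turns += 1
        return generate_response(prompt)

session = ChatSession()
print(session.ask("What should I read this weekend?"))
```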
Bard in action
Google is also playing it safe in terms of content. Users will not be able to ask for sexually explicit, illegal, or harmful material (as judged by Google) or personal information. In my demo, Bard would not give me tips on how to make a Molotov cocktail. That’s standard for this generation of chatbot. But it would also not provide any medical information, such as how to spot signs of cancer. “Bard is not a doctor. It’s not going to give medical advice,” says Krawczyk.
Perhaps the biggest difference between Bard and ChatGPT is that Bard produces three versions of every response, which Google calls “drafts.” Users can click between them and pick the response they prefer, or mix and match between them. The aim is to remind people that Bard cannot generate perfect answers. “There’s the sense of authoritativeness when you only see one example,” says Krawczyk. “And we know there are limitations around factuality.”
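One plausible way to produce drafts like these is to sample several responses to the same prompt and surface them side by side. The sketch below illustrates that general idea under assumed sampling settings; it is not a description of how Bard actually generates its drafts.

```python
# Hypothetical sketch of the "drafts" idea: sample several candidate responses
# to one prompt and let the user choose. The sampling setup is an assumption,
# not how Bard actually works.

import random

def sample_response(prompt: str, temperature: float, seed: int) -> str:
    # Placeholder for a sampled model call; higher temperature -> more variety.
    rng = random.Random(seed)
    return f"draft (t={temperature}, variant {rng.randint(0, 999)}) for: {prompt}"

def generate_drafts(prompt: str, n: int = 3) -> list[str]:
    return [sample_response(prompt, temperature=0.8, seed=i) for i in range(n)]

drafts = generate_drafts("Explain photosynthesis in one paragraph.")
for i, d in enumerate(drafts, 1):
    print(f"Draft {i}: {d}")
# The UI would then let the user pick one draft, or mix and match.
```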
Hoffman got access to the system last summer and has since been writing up his thoughts on the different ways the AI model could be used in education, the arts, the justice system, journalism, and more. In the book, which includes copy-pasted extracts from his interactions with the system, he outlines his vision for the future of AI, uses GPT-4 as a writing assistant to get new ideas, and analyzes its answers.
A quick final word … GPT-4 is the cool new shiny toy of the moment for the AI community. There’s no denying it is a powerful assistive technology that can help us come up with ideas, condense text, explain concepts, and automate mundane tasks. That’s a welcome development, especially for white-collar knowledge workers.
However, it’s notable that OpenAI itself urges caution around use of the model and warns that it poses several safety risks, including infringing on privacy, fooling people into thinking it’s human, and generating harmful content. It also has the potential to be used for other risky behaviors we haven’t encountered yet. So by all means, get excited, but let’s not be blinded by the hype. At the moment, there is nothing stopping people from using these powerful new models to do harmful things, and nothing to hold them accountable if they do.
Deeper Learning
Chinese tech giant Baidu just released its answer to ChatGPT
So. Many. Chatbots. The latest player to enter the AI chatbot game is Chinese tech giant Baidu. Late last week, Baidu unveiled a new large language model called Ernie Bot, which can solve math questions, write marketing copy, answer questions about Chinese literature, and generate multimedia responses.
A Chinese alternative: Ernie Bot (the name stands for “Enhanced Representation through kNowledge IntEgration”; its Chinese name is 文心一言, or Wenxin Yiyan) performs particularly well on tasks specific to Chinese culture, like explaining a historical fact or writing a traditional poem. Read more from my colleague Zeyi Yang.
Even Deeper Learning
Language models may be able to “self-correct” biases—if you ask them to
Large language models are infamous for spewing toxic biases, thanks to the reams of awful human-produced content they get trained on. But if the models are large enough, they may be able to self-correct for some of these biases. Remarkably, all we might have to do is ask.
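In practice, the “just ask” approach amounts to appending an instruction to the prompt. A minimal sketch, with a placeholder standing in for any real model API:

```python
# Minimal sketch of the "just ask" idea described above: append an instruction
# telling the model to avoid relying on stereotypes before it answers.
# `query_model` is a placeholder, not a real API.

DEBIAS_INSTRUCTION = (
    "Please answer in a way that does not rely on stereotypes "
    "about race, gender, or other group identities."
)

def query_model(prompt: str) -> str:
    # Stand-in for a call to a large language model.
    return f"(model answer to: {prompt})"

def ask_with_self_correction(question: str) -> str:
    return query_model(f"{question}\n\n{DEBIAS_INSTRUCTION}")

print(ask_with_self_correction("Describe a typical software engineer."))
```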
Texas is trying to limit access to abortion pills by cracking down on internet service providers and credit card processing companies. These tactics reflect the reality that, post-Roe, the internet is a critical channel for people seeking information about abortion or trying to buy pills to terminate a pregnancy—especially in states where they can no longer access these things in physical pharmacies or medical centers.
Texas has long been a laboratory for anti-abortion political tactics, and on March 15, a US District Judge heard arguments in a case that’s seeking to reverse the FDA approval of mifepristone, a drug that can be used to terminate an early pregnancy. The case would limit online-facilitated abortions and would have far-reaching consequences even in states that are not trying to restrict abortion.
Earlier this month, Republicans in the Texas state legislature introduced two bills to restrict access to abortion pills. The first bill, HB 2690, would require internet service providers (ISPs) to ban sites that provide access to the pills or information about obtaining them. Companies like AT&T and Spectrum would have to “make every reasonable and technologically feasible effort to block Internet access to information or material intended to assist or facilitate efforts to obtain an elective abortion or an abortion-inducing drug.” The bill would also forbid both publishers and ordinary people from providing information about access to abortion-inducing drugs.
The second bill, SB 1440, would make it a felony for credit card companies to process transactions for abortion pills, and would also make them liable to lawsuits from the public.
Blair Wallace, a policy and advocacy strategist at the ACLU of Texas, a nonprofit that advocates for civil liberties and reproductive choice, said the recent developments mark “a new frontier for the ways in which they’re coming for [abortion access],” adding: “It is really terrifying.”
Wallace sees it as a continuation of a strategy that seeks to criminalize whole abortion care networks with the aim of isolating people seeking abortions. More broadly, this strategy of censoring information and language has become a popular tactic in US culture wars in the last several years, and the proposed bill could incentivize platforms to aggressively remove information about abortion access out of concern for legal risk. Some sites, like Meta’s Instagram and Facebook, have reportedly removed information about abortion pills in the past.
So what might the outcome of all the Texas action be? Both the bill targeting ISPs and this week’s mifepristone case are unprecedented, which means neither is likely to succeed. That said, the tactics are likely to stay. “Will we see it again next session? Will we see parts of this bill stripped down and put into amendments? There’s like a million ways that this can play out,” says Wallace. Anti-abortion political strategy is coordinated nationally even though the fights are playing out at a state level, and it’s likely that other states will target online spaces going forward.
Online abortion resources can pose risks to privacy. But there are lots of ways to access them more safely. Here are some resources I recommend.