It also muddies the origin of certain data sets. This can mean that researchers miss important features that skew the training of their models. Many unwittingly used a data set that contained chest scans of children who did not have covid as their examples of what non-covid cases looked like. As a result, the AIs learned to identify kids, not covid.
Driggs’s group trained its own model using a data set that contained a mix of scans taken when patients were lying down and standing up. Because patients scanned while lying down were more likely to be seriously ill, the AI wrongly learned to predict serious covid risk from a person’s position.
In yet other cases, some AIs were found to be picking up on the text font that certain hospitals used to label the scans. As a result, fonts from hospitals with more serious caseloads became predictors of covid risk.
Errors like these seem obvious in hindsight. They can also be fixed by adjusting the models, if researchers are aware of them. It is possible to acknowledge the shortcomings and release a less accurate, but less misleading model. But many tools were developed either by AI researchers who lacked the medical expertise to spot flaws in the data or by medical researchers who lacked the mathematical skills to compensate for those flaws.
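One simple way to catch shortcuts of this kind, such as scan position standing in for severity of illness, is to check how strongly a single metadata field tracks the label before any model is trained. The following is a minimal sketch in Python, not the method used by any group mentioned here. The field names `position` and `severe`, the helper `label_rate_by_group`, and the toy records are all hypothetical and invented for illustration.

```python
from collections import defaultdict


def label_rate_by_group(records, group_key, label_key):
    """Return the fraction of positive labels within each metadata group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec[label_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical toy data, made up for illustration only (not from any study).
records = (
    [{"position": "lying", "severe": True}] * 8
    + [{"position": "lying", "severe": False}] * 2
    + [{"position": "standing", "severe": True}] * 1
    + [{"position": "standing", "severe": False}] * 9
)

rates = label_rate_by_group(records, "position", "severe")
# A large gap means the metadata field alone nearly predicts the label,
# so a model could learn the field instead of the disease.
gap = max(rates.values()) - min(rates.values())
print(rates)
print(f"gap between groups: {gap:.2f}")
```

A large gap does not prove that a model relies on the shortcut, but it flags a field worth balancing across classes or discussing with clinicians before training.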
A more subtle problem Driggs highlights is incorporation bias, or bias introduced at the point a data set is labeled. For example, many medical scans were labeled according to whether the radiologists who created them said they showed covid. But that embeds, or incorporates, any biases of that particular doctor into the ground truth of a data set. It would be much better to label a medical scan with the result of a PCR test rather than one doctor’s opinion, says Driggs. But there isn’t always time for statistical niceties in busy hospitals.
That hasn’t stopped some of these tools from being rushed into clinical practice. Wynants says it isn’t clear which ones are being used or how. Hospitals will sometimes say that they are using a tool only for research purposes, which makes it hard to assess how much doctors are relying on it. “There’s a lot of secrecy,” she says.
Wynants asked one company that was marketing deep-learning algorithms to share information about its approach but did not hear back. She later found several published models from researchers tied to this company, all of them with a high risk of bias. “We don’t actually know what the company implemented,” she says.
According to Wynants, some hospitals are even signing nondisclosure agreements with medical AI vendors. When she asked doctors what algorithms or software they were using, they sometimes told her they weren’t allowed to say.
How to fix it
What’s the fix? Better data would help, but in times of crisis that’s a big ask. It’s more important to make the most of the data sets we have. The simplest move would be for AI teams to collaborate more with clinicians, says Driggs. Researchers also need to share their models and disclose how they were trained so that others can test them and build on them. “Those are two things we could do today,” he says. “And they would solve maybe 50% of the issues that we identified.”
Getting hold of data would also be easier if formats were standardized, says Bilal Mateen, a doctor who leads research into clinical technology at the Wellcome Trust, a global health research charity based in London.
The Download: a long covid app, and California’s wind plans
1 The Twitter Files weren’t the bombshell Elon Musk billed them as
His carelessness triggered the harassment of some of Twitter’s content moderators, too. (WP $)
+ The files didn’t violate the First Amendment, either. (The Atlantic $)
+ Hate speech has exploded on the platform since he took over. (NYT $)
+ Journalists are staying on Twitter—for now. (Vox)
+ The company’s advertising revenue isn’t looking very healthy. (NYT $)
2 Russia is trying to freeze Ukrainians by destroying their electricity
It’s the country’s vulnerable who will suffer the most. (Economist $)
+ How Ukraine could keep the lights on. (MIT Technology Review)
3 Crypto is at a crossroads
Investors, executives, and advocates are unsure what’s next. (NYT $)
+ FTX and the Alameda Research trading firm were way too close. (FT $)
+ It’s okay to opt out of the crypto revolution. (MIT Technology Review)
4 Taylor Swift fans are suing Ticketmaster
They’re furious they weren’t able to buy tickets in the botched sale last month. (The Verge)
6 We need a global deal to safeguard the natural world
COP15, held this week in Montreal, is our best bet to thrash one out. (Vox)
+ Off-grid living is more viable these days than you may think. (The Verge)
7 What ultra-dim galaxies can teach us about dark matter
We’re going to need new telescopes to seek more of them out. (Wired $)
+ Japanese billionaire Yusaku Maezawa has some big plans for space. (Reuters)
+ A super-bright satellite could hamper our understanding of the cosmos. (Motherboard)
+ Here’s how to watch Mars disappear behind the moon. (New Scientist $)
8 An elite media newsletter wants to cover “power, money, and ego.”
It promises unparalleled access to prolific writers—and their audiences. (New Yorker $)
+ How to sign off an email sensibly. (Economist $)
9 The metaverse has a passion for fashion 👗
Here’s what its best-dressed residents are wearing. (WSJ $)
10 We’ve been sending text messages for 30 years 💬
Yet we’re still misunderstanding each other. (The Guardian)
Quote of the day
“There is certainly a rising sense of fear, justifiable fear. And I would say almost horror.”
—Pamela Nadell, director of American University’s Jewish Studies program, tells the Washington Post she fears that antisemitism has become normalized in the US, in the light of Kanye West’s recent comments praising Hitler.
The big story
California’s coming offshore wind boom faces big engineering hurdles
Research groups estimate that the costs could fall from around $200 per megawatt-hour to between $58 and $120 by 2030. That would leave floating offshore wind more expensive than solar and onshore wind, but it could still serve an important role in an overall energy portfolio.
The technology is improving as well. Turbines themselves continue to get taller, generating more electricity and revenue from any given site. Some research groups and companies are also developing new types of floating platforms and delivery mechanisms that could make it easier to work within the constraints of ports and bridges.
The Denmark-based company Stiesdal has developed a modular, floating platform with a keel that doesn’t drop into place until it’s in the deep ocean, enabling it to be towed out from relatively shallow ports.
Meanwhile, San Francisco startup Aikido Technologies is developing a way of shipping turbines horizontally and then upending them in the deep ocean, enabling the structures to duck under bridges en route. The company believes its designs provide enough clearance for developers to access any US port. Some 80% of these ports have height limits owing to bridges or airport restrictions.
A number of federal, state, and local organizations are conducting evaluations of California and other US ports, assessing which ones might be best positioned to serve floating wind projects and what upgrades could be required to make it possible.
Government policies in the US, the European Union, China, and elsewhere are also providing incentives to develop offshore wind turbines, domestic manufacturing, and supporting infrastructure. That includes the Inflation Reduction Act that Biden signed into law this summer.
Finally, as for California’s permitting challenges, Hochschild notes that the same 2021 law requiring the state’s energy commission to set offshore wind goals also requires it to undertake the long-term planning necessary to meet them. That includes mapping out a strategy for streamlining the approval process.
For all the promise of floating wind, there’s little question that ensuring it’s cost-competitive and achieving the targets envisioned will require making massive investments in infrastructure, manufacturing, and more, and building big projects at a pace that the state hasn’t shown itself capable of in the recent past.
If it can pull it off, however, California could become a leading player in a critical new clean energy sector, harnessing its vast coastal resources to meet its ambitious climate goals.
How Twitter’s “Teacher Li” became the central hub of China protest information
It’s hard to describe the feeling that came after. It’s like everyone is coming to you and all kinds of information from all over the world is converging toward you and [people are] telling you: Hey, what’s happening here; hey, what’s happening there; do you know, this is what’s happening in Guangzhou; I’m in Wuhan, Wuhan is doing this; I’m in Beijing, and I’m following the big group and walking together. Suddenly all the real-time information is being submitted to me, and I don’t know how to describe that feeling. But there was also no time to think about it.
My heart was beating very fast, and my hands and my brain were constantly switching between several software programs—because you know, you can’t save a video with Twitter’s web version. So I was constantly switching software, editing the video, exporting it, and then posting it on Twitter. [Editor’s note: Li adds subtitles, blocks out account information, and compiles shorter videos into one.] By the end, there was no time to edit the videos anymore. If someone shot and sent over a 12-second WeChat video, I would just use it as is. That’s it.
I got the largest amount of [private messages] around 6:00 p.m. on Sunday night. At that time, there were many people on the street in five major cities in China: Beijing, Shanghai, Chengdu, Wuhan, and Guangzhou. So I basically was receiving a dozen private messages every second. In the end, I couldn’t even screen the information anymore. I saw it, I clicked on it, and if it was worth posting, I posted it.
People all over the country are telling me about their real-time situations. In order for more people not to be in danger, they went to the [protest] sites themselves and sent me what was going on there. Like, some followers were riding bikes near the presidential palace in Nanjing, taking pictures, and telling me about the situation in the city. And then they asked me to inform everyone to be cautious. I think that’s a really moving thing.
It’s like I have gradually become an anchor sitting in a TV studio, getting endless information from reporters on the scene all over the country. For example, on Monday in Hangzhou, there were five or six people updating me on the latest news simultaneously. But there was a break because all of them were fleeing when the police cleared the venue.
On the importance of staying objective
There are a lot of tweets that embellish the truth. From their point of view, they think it’s the right thing to do. They think you have to maximize the outrage so that there can be a revolt. But for me, I think we need reliable information. We need to know what’s really going on, and that’s the most important thing. If we were doing it for the emotion, then in the end I really would have been part of the “foreign influence,” right?
But if there is a news account outside China that can record what’s happening objectively, in real time, and accurately, then people inside the Great Firewall won’t have doubts anymore. At this moment, in this quite extreme situation of a continuous news blackout, to be able to have an account that can keep posting news from all over the country at a speed of almost one tweet every few seconds is actually a morale boost for everyone.
Chinese people grow up with patriotism, so they become shy or don’t dare to say something directly or oppose something directly. That’s why the crowd was singing the national anthem and waving the red flag, the national flag [during protests]. You have to understand that the Chinese people are patriotic. Even when they are demanding things [from the government], they do it with that sentiment.