N3C, on the other hand, is auditable by, and accountable to, thousands of researchers at hundreds of participating institutions, with a strong focus on transparency and reproducibility. Everything users do within the interface, which uses Palantir’s GovCloud platform, is carefully preserved, so anyone with access can retrace their steps.
“This isn’t rocket science, and it isn’t really new. It’s just hard work. It’s tedious, it has to be done carefully, and we have to validate every step,” says Christopher Chute, a professor of medicine at Johns Hopkins who also co-leads N3C. “The worst thing we could do is methodically transform data into garbage that would give us wrong answers.”
Haendel points out that these efforts haven’t come easy. “The diversity in expertise that it took to make this happen, the perseverance, dedication, and, frankly, brute force, is just unprecedented,” she says.
That brute force has come from many different fields, many of them not traditionally part of medical research.
“Having everyone on board from all aspects of science really helped. During covid people were much more willing to collaborate,” says Mary Boland, a professor of informatics at the University of Pennsylvania. “You could have engineers, you could have computer scientists, physicists, all these people who might not normally participate in public health research.”
Boland is part of a group using the N3C data to investigate whether covid increases irregular bleeding in women with polycystic ovarian syndrome. Outside of covid, most researchers have to use insurance claims data to get a large enough database for population-level analyses, she says.
Claims data can answer some questions about how well drugs work in the real world, for instance. But those databases are missing huge amounts of information, including lab results, what symptoms people are reporting, and even whether patients die.
Collecting and cleaning
Outside of insurance claims databases, most health data collaboratives in the US use a federated model. Participants agree to put their own datasets into a common format and then run queries posed by the collective, such as the proportion of serious covid cases by age group, against their own data. Several international covid research collectives, including Observational Health Data Sciences and Informatics (OHDSI, pronounced “Odyssey”), operate this way, avoiding legal and political problems with cross-border patient data.
OHDSI, which was founded in 2014, has researchers from 30 countries, who together hold records for 600 million patients.
“That allows each institution to keep their data behind their own firewalls, with their own data protections in place. It doesn’t require any patient data to move back and forth,” says Boland. “That’s comforting for a lot of places, especially with all the hacking that’s been going on lately.”
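The federated pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how OHDSI or any real network is implemented: the site records, the field names, and the `local_counts` and `combine` helpers are all hypothetical. The point it shows is that each site returns only aggregate counts, so no patient rows cross the firewall.

```python
from collections import Counter

# Hypothetical records held at two institutions, already in a shared format.
# In a federated network these rows would never leave the site.
SITE_A = [
    {"age_group": "18-39", "serious": False},
    {"age_group": "18-39", "serious": True},
    {"age_group": "65+", "serious": True},
    {"age_group": "65+", "serious": True},
]
SITE_B = [
    {"age_group": "18-39", "serious": False},
    {"age_group": "18-39", "serious": False},
    {"age_group": "65+", "serious": True},
    {"age_group": "65+", "serious": False},
]


def local_counts(records):
    """Run inside each site: return per-age-group counts, not patient rows."""
    totals, serious = Counter(), Counter()
    for record in records:
        totals[record["age_group"]] += 1
        if record["serious"]:
            serious[record["age_group"]] += 1
    return {"totals": dict(totals), "serious": dict(serious)}


def combine(site_results):
    """Run by the coordinator: sum the counts and compute proportions."""
    totals, serious = Counter(), Counter()
    for result in site_results:
        totals.update(result["totals"])
        serious.update(result["serious"])
    return {group: serious[group] / totals[group] for group in totals}


if __name__ == "__main__":
    proportions = combine([local_counts(SITE_A), local_counts(SITE_B)])
    for group, share in sorted(proportions.items()):
        print(f"{group}: {share:.0%} of cases were serious")
```

The coordinator only ever sees the dictionaries of counts. Real networks add further safeguards that this sketch leaves out, such as suppressing very small counts.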
How do I know if egg freezing is for me?
The tool is currently being trialed in a group of research volunteers and is not yet widely available. But I’m hoping it represents a move toward more transparency and openness about the real costs and benefits of egg freezing. Yes, it is a remarkable technology that can help people become parents. But it might not be the best option for everyone.
Read more from Tech Review’s archive
Anna Louie Sussman had her eggs frozen in Italy and Spain because services in New York were too expensive. Luckily, there are specialized couriers ready to take frozen sex cells on international journeys, she wrote.
Michele Harrison was 41 when she froze 21 of her eggs. By the time she wanted to use them, two years later, only one was viable. Although she did have a baby, her case demonstrates that egg freezing is no guarantee of parenthood, wrote Bonnie Rochman.
What happens if someone dies with eggs in storage? Frozen eggs and sperm can still be used to create new life, but it’s tricky to work out who can make the decision, as I wrote in a previous edition of The Checkup.
Meanwhile, the race is on to create lab-made eggs and sperm. These cells, which might be made from a person’s blood or skin cells, could potentially solve a lot of fertility problems—should they ever prove safe, as I wrote in a feature for last year’s magazine issue on gender.
Researchers are also working on ways to mature eggs from transgender men in the lab, which could allow them to store and use their eggs without having to pause gender-affirming medical care or go through other potentially distressing procedures, as I wrote last year.
From around the web
The World Health Organization is set to decide whether covid still represents a “public health emergency of international concern.” It will probably decide to keep this status, because of the current outbreak in China. (STAT)
Researchers want to study the brains, genes, and other biological features of incarcerated people to find ways to stop them from reoffending. Others warn that this approach is based on shoddy science and racist ideas. (Undark)
A watermark for chatbots can expose text written by an AI
For example, since OpenAI’s chatbot ChatGPT was launched in November, students have already started cheating by using it to write essays for them. News website CNET has used ChatGPT to write articles, only to have to issue corrections amid accusations of plagiarism. Building the watermarking approach into such systems before they’re released could help address such problems.
In studies, these watermarks have already been used to identify AI-generated text with near certainty. Researchers at the University of Maryland, for example, were able to spot text created by Meta’s open-source language model, OPT-6.7B, using a detection algorithm they built. The work is described in a paper that’s yet to be peer-reviewed, and the code will be available for free around February 15.
AI language models work by predicting and generating one word at a time. After each word, the watermarking algorithm randomly divides the language model’s vocabulary into words on a “greenlist” and a “redlist” and then prompts the model to choose words on the greenlist.
The more greenlisted words in a passage, the more likely it is that the text was generated by a machine. Text written by a person tends to contain a more random mix of words. For example, after the word “beautiful,” the watermarking algorithm could classify the word “flower” as green and “orchid” as red. The AI model with the watermarking algorithm would be more likely to use the word “flower” than “orchid,” explains Tom Goldstein, an assistant professor at the University of Maryland, who was involved in the research.
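To make the greenlist idea concrete, here is a toy Python sketch. It is not the Maryland team's code: the ten-word vocabulary, the hash-based seeding, and the helper names are assumptions made for illustration, and the stand-in "model" simply picks a random green word rather than being nudged toward one. The detector counts how many words fall on the greenlist set by the word before them and reports how far that count sits above what chance would give.

```python
import hashlib
import math
import random

# Hypothetical toy vocabulary; a real language model has tens of thousands of tokens.
VOCAB = ["beautiful", "bird", "flower", "garden", "light",
         "orchid", "river", "sky", "stone", "tree"]


def greenlist(prev_word, fraction=0.5):
    """Split the vocabulary pseudo-randomly, seeded by the previous word."""
    digest = hashlib.sha256(prev_word.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    shuffled = sorted(VOCAB)
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])


def generate(start, length, rng):
    """Stand-in for a watermarked model: always pick a green word (a hard rule)."""
    words = [start]
    for _ in range(length):
        words.append(rng.choice(sorted(greenlist(words[-1]))))
    return words


def green_fraction(words):
    """Share of words that are on the greenlist set by the preceding word."""
    pairs = list(zip(words, words[1:]))
    return sum(cur in greenlist(prev) for prev, cur in pairs) / len(pairs)


def z_score(words, fraction=0.5):
    """How many standard deviations the green count sits above chance."""
    n = len(words) - 1
    hits = green_fraction(words) * n
    return (hits - fraction * n) / math.sqrt(n * fraction * (1 - fraction))


if __name__ == "__main__":
    marked = generate("beautiful", 50, random.Random(0))
    unmarked = ["beautiful"] + random.Random(1).choices(VOCAB, k=50)
    print("watermarked z-score:", round(z_score(marked), 2))
    print("unmarked z-score:", round(z_score(unmarked), 2))
```

Because the split depends only on the previous word, anyone who knows the seeding rule can rebuild each greenlist and run the check without access to the model itself. Text chosen without regard to the lists should land near half green, which is why a high score points to a machine.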
The Download: watermarking AI text, and freezing eggs
That’s why the team behind a new decision-making tool hope it will help to clear up some of the misconceptions around the procedure—and give would-be parents a much-needed insight into its real costs, benefits, and potential pitfalls. Read the full story.
This story is from The Checkup, MIT Technology Review’s weekly newsletter giving you the inside track on all things health and biotech. Sign up to receive it in your inbox every Thursday.
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 Elon Musk held a surprise meeting with US political leaders
Allegedly in the interest of ensuring Twitter is “fair to both parties.” (Insider $)
+ Kanye West’s presidential campaign advisors have been booted off Twitter. (Rolling Stone $)
+ Twitter’s trust and safety head is Musk’s biggest champion. (Bloomberg $)
2 We’re treating covid like flu now
Annual covid shots are the next logical step. (The Atlantic $)
3 The worst thing about Sam Bankman-Fried’s spell in jail?
Being cut off from the internet. (Forbes $)
+ Most crypto criminals use just five exchanges. (Wired $)
+ Collapsed crypto firm FTX has objected to a new investigation request. (Reuters)
4 Israel’s tech sector is rising up against its government
Tech workers fear its hardline policies will harm startups. (FT $)
5 It’s possible to power the world solely using renewable energy
At least, according to Stanford academic Mark Jacobson. (The Guardian)
+ Tech bros love the environment these days. (Slate $)
+ How new versions of solar, wind, and batteries could help the grid. (MIT Technology Review)
6 Generative AI is wildly expensive to run
And that’s why promising startups like OpenAI need to hitch their wagons to the likes of Microsoft. (Bloomberg $)
+ How Microsoft benefits from the ChatGPT hype. (Vox)
+ BuzzFeed is planning to make quizzes supercharged by OpenAI. (WSJ $)
+ Generative AI is changing everything. But what’s left when the hype is gone? (MIT Technology Review)
7 It’s hard not to blame self-driving cars for accidents
Even when it’s not technically their fault. (WSJ $)
8 What it’s like to swap Google for TikTok
It’s great for food suggestions and hacks, but hopeless for anything work-related. (Wired $)
+ The platform really wants to stay operational in the US. (Vox)
+ TikTok is mired in an eyelash controversy. (Rolling Stone $)
9 CRISPR gene editing kits are available to buy online
But there’s no guarantee these experiments will actually work. (Motherboard)
+ Next up for CRISPR: Gene editing for the masses? (MIT Technology Review)
10 Tech workers are livestreaming their layoffs
It’s a candid window into how these notoriously secretive companies treat their staff. (The Information $)