Yes, but: In recent years, studies have found that these data sets can contain serious flaws. ImageNet, for example, contains racist and sexist labels as well as photos of people’s faces obtained without consent. The latest study now looks at another problem: many of the labels are just flat-out wrong. A mushroom is labeled a spoon, a frog is labeled a cat, and a high note from Ariana Grande is labeled a whistle. The ImageNet test set has an estimated label error rate of 5.8%. Meanwhile, the test set for QuickDraw, a compilation of hand drawings, has an estimated error rate of 10.1%.
How was it measured? Each of the 10 data sets used for evaluating models has a corresponding data set used for training them. The researchers, MIT graduate students Curtis G. Northcutt and Anish Athalye and alum Jonas Mueller, used the training data sets to develop a machine-learning model and then used it to predict the labels in the testing data. If the model disagreed with the original label, the data point was flagged up for manual review. Five human reviewers on Amazon Mechanical Turk were asked to vote on which label—the model’s or the original—they thought was correct. If the majority of the human reviewers agreed with the model, the original label was tallied as an error and then corrected.
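The flag-and-vote procedure described above can be sketched in a few lines of Python. This is a simplified illustration, not the researchers' released code: the function names, the "model"/"original" vote encoding, and the toy labels are all assumptions made for the example.

```python
from collections import Counter

def flag_disagreements(given_labels, predicted_labels):
    """Indices where the model's prediction differs from the original label."""
    return [i for i, (given, predicted)
            in enumerate(zip(given_labels, predicted_labels))
            if given != predicted]

def review(given, predicted, votes):
    """Majority vote among reviewers; each vote is 'model' or 'original'.

    Returns the label to keep and whether the original counts as an error.
    """
    tally = Counter(votes)
    if tally["model"] > len(votes) / 2:
        return predicted, True
    return given, False

def correct_test_set(given_labels, predicted_labels, votes_by_index):
    """Apply the reviewers' verdicts and estimate the label error rate."""
    corrected = list(given_labels)
    errors = 0
    for i in flag_disagreements(given_labels, predicted_labels):
        label, is_error = review(given_labels[i], predicted_labels[i],
                                 votes_by_index[i])
        corrected[i] = label
        if is_error:
            errors += 1
    return corrected, errors / len(given_labels)

# Toy data (hypothetical): four test examples, two flagged for review.
given = ["spoon", "cat", "dog", "frog"]
predicted = ["mushroom", "frog", "dog", "frog"]
votes = {
    0: ["model", "model", "model", "model", "original"],       # reviewers side with the model
    1: ["model", "model", "original", "original", "original"], # reviewers keep the original
}
corrected, error_rate = correct_test_set(given, predicted, votes)
```

Only examples where the model and the original label disagree ever reach the reviewers, so a wrong label that the model happens to reproduce goes unnoticed in this sketch.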
Does this matter? Yes. The researchers looked at 34 models whose performance had previously been measured against the ImageNet test set. Then they remeasured each model against the roughly 1,500 examples where the data labels were found to be wrong. They found that the models that didn’t perform so well on the original incorrect labels were some of the best performers after the labels were corrected. In particular, the simpler models seemed to fare better on the corrected data than the more complicated models that are used by tech giants like Google for image recognition and assumed to be the best in the field. In other words, we may have an inflated sense of how great these complicated models are because of flawed testing data.
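A toy example shows how a ranking can flip once labels are fixed. The two models, their predictions, and the labels below are invented for illustration and are not data from the study.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    matches = sum(p == y for p, y in zip(predictions, labels))
    return matches / len(labels)

def rank_models(predictions_by_model, labels):
    """Model names ordered from most to least accurate on the given labels."""
    return sorted(predictions_by_model,
                  key=lambda name: accuracy(predictions_by_model[name], labels),
                  reverse=True)

# Hypothetical subset of examples whose original labels were judged wrong.
original_labels = ["spoon", "cat", "whistle", "dog"]
corrected_labels = ["mushroom", "frog", "singing", "dog"]

# Hypothetical predictions: the "complex" model reproduces the flawed labels.
predictions_by_model = {
    "complex": ["spoon", "cat", "whistle", "dog"],
    "simple": ["mushroom", "frog", "whistle", "dog"],
}

ranking_before = rank_models(predictions_by_model, original_labels)
ranking_after = rank_models(predictions_by_model, corrected_labels)
```

Here the "complex" model ranks first against the original labels only because it repeats their mistakes. Against the corrected labels, the "simple" model moves ahead.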
Now what? Northcutt encourages the AI field to create cleaner data sets for evaluating models and tracking the field’s progress. He also recommends that researchers improve their data hygiene when working with their own data. Otherwise, he says, “if you have a noisy data set and a bunch of models you’re trying out, and you’re going to deploy them in the real world,” you could end up selecting the wrong model. To this end, he open-sourced the code he used in his study for correcting label errors, which he says is already in use at a few major tech companies.
A bot that watched 70,000 hours of Minecraft could unlock AI’s next big thing
The researchers claim that their approach could be used to train AI to carry out other tasks. To begin with, it could be used for bots that use a keyboard and mouse to navigate websites, book flights or buy groceries online. But in theory it could be used to train robots to carry out physical, real-world tasks by copying first-person video of people doing those things. “It’s plausible,” says Stone.
Matthew Guzdial at the University of Alberta, Canada, who has used videos to teach AI the rules of games like Super Mario Bros., does not think it will happen any time soon, however. Actions in games like Minecraft and Super Mario Bros. are performed by pressing buttons. Actions in the physical world are far more complicated and harder for a machine to learn. “It unlocks a whole mess of new research problems,” says Guzdial.
“This work is another testament to the power of scaling up models and training on massive datasets to get good performance,” says Natasha Jaques, who works on multi-agent reinforcement learning at Google and the University of California, Berkeley.
Large internet-sized data sets will certainly unlock new capabilities for AI, says Jaques. “We’ve seen that over and over again, and it’s a great approach.” But OpenAI places a lot of faith in the power of large data sets alone, she says: “Personally, I’m a little more skeptical that data can solve any problem.”
Still, Baker and his colleagues think that collecting more than a million hours of Minecraft videos will make their AI even better. It’s probably the best Minecraft-playing bot yet, says Baker: “But with more data and bigger models I would expect it to feel like you’re watching a human playing the game, as opposed to a baby AI trying to mimic a human.”
The Download: AI conquers Minecraft, and babies after death
+ Scientists have found a way to mature eggs from transgender men in the lab. It could offer them new ways to start a family, without the need for distressing IVF procedures. Read the full story.
+ How reproductive technology is changing what it means to be a parent. Advances could lead to babies with four or more biological parents, forcing us to reconsider parenthood. Read the full story.
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 Elon Musk wants to reinstate banned Twitter accounts
It’s an incredibly dangerous decision with widespread repercussions. (WP $)
+ Recent departures have hit Twitter’s policy and safety divisions hard. (WSJ $)
+ It looks like Musk’s promise of no further layoffs was premature. (Insider $)
+ Meanwhile, Twitter Blue is still reportedly launching next week. (Reuters)
+ Imagine simply transferring your followers to another platform. (FT $)
+ Twitter’s potential collapse could wipe out vast records of recent human history. (MIT Technology Review)
2 Russia’s energy withdrawal could kill tens of thousands in Europe
High fuel costs could result in more deaths this winter than the war in Ukraine. (Economist $)
+ Higher gas prices will also hit Americans as the weather worsens. (Vox)
+ The invasion of Ukraine underscores Europe’s deep reliance on Russian fossil fuels. (MIT Technology Review)
3 FTX is unable to honor the grants it promised various organizations
Many of them are having to seek emergency funding to plug the gaps. (WSJ $)
+ Bahamians aren’t thrilled about what its collapse could mean for them. (WP $)
5 The UK is curbing its use of Chinese surveillance systems
But only on “sensitive” government sites. (FT $)
+ The world’s biggest surveillance company you’ve never heard of. (MIT Technology Review)
7 San Francisco’s police are considering letting robots use deadly force
The force has 12 remotely piloted robots that could, in theory, kill someone. (The Verge)
8 Human hibernation could be the key to getting us to Mars
It could be the closest we can get to time travel. (Wired $)
9 Why TikTok is suddenly so obsessed with dabloons
It’s a form of choose-your-own-adventure fun. (The Guardian)
10 We can’t stop trying to reinvent mousetraps 🧀
There are thousands of versions out there, yet we keep coming up with new designs. (New Yorker $)
We can now use cells from dead people to create new life. But who gets to decide?
His parents told a court that they wanted to keep the possibility of using the sperm to eventually have children that would be genetically related to Peter. The court approved their wishes, and Peter’s sperm was retrieved from his body and stored in a local sperm bank.
We have the technology to use sperm, and potentially eggs, from dead people to make embryos, and eventually babies. And there are millions of eggs and embryos—and even more sperm—in storage and ready to be used. When the person who provided those cells dies, like Peter, who gets to decide what to do with them?
That was the question raised at an online event I attended on Wednesday, held by the Progress Educational Trust, a UK charity for people with infertility and genetic conditions. The panel included a clinician and two lawyers, who addressed plenty of tricky questions but provided few concrete answers.
In theory, the decision should be made by the person who provided the eggs, sperm or embryos. In some cases, the person’s wishes might be quite clear. Someone who might be trying for a baby with their partner may store their sex cells or embryos and sign a form stating that they are happy for their partner to use these cells if they die, for example.
But in other cases, it’s less clear. Partners and family members who want to use the cells might have to collect evidence to convince a court the deceased person really did want to have children. And not only that, but that they wanted to continue their family line without necessarily becoming a parent themselves.
Sex cells and embryos aren’t property—they don’t fall under property law and can’t be inherited by family members. But there is some degree of legal ownership for the people who provided the cells. It is complicated to define that ownership, however, Robert Gilmour, a family law specialist based in Scotland, said at the event. “The law in this area makes my head hurt,” he said.
The law varies depending on where you are, too. Posthumous reproduction is not allowed in some countries, and is unregulated in many others. In the US, laws vary by state. Some states won’t legally recognize a child conceived after a person’s death as that person’s offspring, according to the American Society for Reproductive Medicine (ASRM). “We do not have any national rules or policies,” Gwendolyn Quinn, a bioethicist at New York University, tells me.
Societies like ASRM have put together guidance for clinics in the meantime. But this can also vary slightly between regions. Guidance from the European Society for Human Reproduction and Embryology, for example, recommends that parents and other relatives should not be able to request the sex cells or embryos of the person who died. That would apply to Peter Zhu’s parents. The concern is that these relatives might be hoping for a “commemorative child” or “a symbolic replacement of the deceased.”