The Facebook whistleblower says its algorithms are dangerous. Here’s why.
In her testimony, Haugen also repeatedly emphasized how these phenomena are far worse in regions that don’t speak English because of Facebook’s uneven coverage of different languages.
“In the case of Ethiopia there are 100 million people and six languages. Facebook only supports two of those languages for integrity systems,” she said. “This strategy of focusing on language-specific, content-specific systems for AI to save us is doomed to fail.”
She continued: “So investing in non-content-based ways to slow the platform down not only protects our freedom of speech, it protects people’s lives.”
I explore this more in a different article from earlier this year on the limitations of large language models, or LLMs:
Despite LLMs having these linguistic deficiencies, Facebook relies heavily on them to automate its content moderation globally. When the war in Tigray[, Ethiopia] first broke out in November, [AI ethics researcher Timnit] Gebru saw the platform flounder to get a handle on the flurry of misinformation. This is emblematic of a persistent pattern that researchers have observed in content moderation. Communities that speak languages not prioritized by Silicon Valley suffer the most hostile digital environments.
Gebru noted that this isn’t where the harm ends, either. When fake news, hate speech, and even death threats aren’t moderated out, they are then scraped as training data to build the next generation of LLMs. And those models, parroting back what they’re trained on, end up regurgitating these toxic linguistic patterns on the internet.
How does Facebook’s content ranking relate to teen mental health?
One of the more shocking revelations from the Journal’s Facebook Files was Instagram’s internal research, which found that its platform is worsening mental health among teenage girls. “Thirty-two percent of teen girls said that when they felt bad about their bodies, Instagram made them feel worse,” researchers wrote in a slide presentation from March 2020.
Haugen connects this phenomenon to engagement-based ranking systems as well, which she told the Senate today “is causing teenagers to be exposed to more anorexia content.”
“If Instagram is such a positive force, have we seen a golden age of teenage mental health in the last 10 years? No, we have seen escalating rates of suicide and depression amongst teenagers,” she continued. “There’s a broad swath of research that supports the idea that the usage of social media amplifies the risk of these mental health harms.”
In my own reporting, I heard from a former AI researcher who also saw this effect extend to Facebook.
The researcher’s team…found that users with a tendency to post or engage with melancholy content—a possible sign of depression—could easily spiral into consuming increasingly negative material that risked further worsening their mental health.
But as with Haugen, the researcher found that leadership wasn’t interested in making fundamental algorithmic changes.
The team proposed tweaking the content-ranking models for these users to stop maximizing engagement alone, so they would be shown less of the depressing stuff. “The question for leadership was: Should we be optimizing for engagement if you find that somebody is in a vulnerable state of mind?” he remembers.
But anything that reduced engagement, even for reasons such as not exacerbating someone’s depression, led to a lot of hemming and hawing among leadership. With their performance reviews and salaries tied to the successful completion of projects, employees quickly learned to drop those that received pushback and continue working on those dictated from the top down….
That former employee, meanwhile, no longer lets his daughter use Facebook.
How do we fix this?
Haugen is against breaking up Facebook or repealing Section 230 of the US Communications Decency Act, which protects tech platforms from taking responsibility for the content they distribute.
Instead, she recommends carving out a more targeted exemption in Section 230 for algorithmic ranking, which she argues would “get rid of the engagement-based ranking.” She also advocates for a return to Facebook’s chronological news feed.
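To make the distinction concrete, here is a minimal sketch of the two orderings being contrasted. It is an illustration only, not Facebook's actual system: the `Post` type, the `predicted_engagement` score, and both function names are hypothetical, and a real ranking model weighs many more signals than a single number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: str
    created_at: int              # Unix timestamp in seconds
    predicted_engagement: float  # hypothetical model score in [0, 1]


def rank_by_engagement(posts):
    """Order posts by predicted engagement, highest first."""
    return sorted(posts, key=lambda p: p.predicted_engagement, reverse=True)


def rank_chronologically(posts):
    """Order posts newest first, ignoring engagement predictions."""
    return sorted(posts, key=lambda p: p.created_at, reverse=True)


# Toy data (made up for illustration).
posts = [
    Post("a", created_at=100, predicted_engagement=0.2),
    Post("b", created_at=200, predicted_engagement=0.9),
    Post("c", created_at=300, predicted_engagement=0.5),
]

engagement_order = [p.post_id for p in rank_by_engagement(posts)]
chronological_order = [p.post_id for p in rank_chronologically(posts)]
```

In this toy feed the two orderings differ: the engagement-based version puts whichever post the model expects people to react to most at the top, while the chronological version depends only on when each post was made.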
Newly revealed coronavirus data has reignited a debate over the virus’s origins
Data collected in 2020—and kept from public view since then—potentially adds weight to the animal theory. It highlights a potential suspect: the raccoon dog. But exactly how much weight it adds depends on who you ask. New analyses of the data have only reignited the debate, and stirred up some serious drama.
The current ruckus starts with a study shared by Chinese scientists back in February 2022. In a preprint (a scientific paper that has not yet been peer-reviewed or published in a journal), George Gao of the Chinese Center for Disease Control and Prevention (CCDC) and his colleagues described how they collected and analyzed 1,380 samples from the Huanan Seafood Market.
These samples were collected between January and March 2020, just after the market was closed. At the time, the team wrote that they only found coronavirus in samples alongside genetic material from people.
There were a lot of animals on sale at this market, which sold more than just seafood. The Gao paper features a long list, including chickens, ducks, geese, pheasants, doves, deer, badgers, rabbits, bamboo rats, porcupines, hedgehogs, crocodiles, snakes, and salamanders. And that list is not exhaustive—there are reports of other animals being traded there, including raccoon dogs. We’ll come back to them later.
But Gao and his colleagues reported that they didn’t find the coronavirus in any of the 18 species of animal they looked at. They suggested that it was humans who most likely brought the virus to the market, which ended up being the first known epicenter of the outbreak.
Fast-forward to March 2023. On March 4, Florence Débarre, an evolutionary biologist at Sorbonne University in Paris, spotted some data that had been uploaded to GISAID, a website that allows researchers to share genetic data to help them study and track viruses that cause infectious diseases. The data appeared to have been uploaded in June 2022. It seemed to have been collected by Gao and his colleagues for their February 2022 study, although it had not been included in the actual paper.
Fostering innovation through a culture of curiosity
And so I think a big part of it as a company, by setting these ambitious goals, it forces us to say if we want to be number one, if we want to be top tier in these areas, if we want to continue to generate results, how do we get there using technology? And so that really forces us to throw away our assumptions because you can’t follow somebody, if you want to be number one you can’t follow someone to become number one. And so we understand that the path to get there, it’s through, of course, technology and the software and the enablement and the investment, but it really is by becoming goal-oriented. And if we look at these examples of how do we create the infrastructure on the technology side to support these ambitious goals, we ourselves have to be ambitious in turn because if we bring a solution that’s also a me too, that’s a copycat, that doesn’t have differentiation, that’s not going to propel us, for example, to be a top 10 supply chain. It just doesn’t pass muster.
So I think at the top level, it starts with the business ambition. And then from there we can organize ourselves at the intersection of the business ambition and the technology trends to have those very rich discussions and being the glue of how do we put together so many moving pieces because we’re constantly scanning the technology landscape for new advancing and emerging technologies that can come in and be a part of achieving that mission. And so that’s how we set it up on the process side. As an example, I think one of the things, and it’s also innovation, but it doesn’t get talked about as much, but for the community out there, I think it’s going to be very relevant is, how do we stay on top of the data sovereignty questions and data localization? There’s a lot of work that needs to go into rethinking what your cloud, private, public, edge, on-premise look like going forward so that we can remain cutting edge and competitive in each of our markets while meeting the increasing guidance that we’re getting from countries and regulatory agencies about data localization and data sovereignty.
And so in our case, as a global company that’s listed in Hong Kong and we operate all around the world, we’ve had to really think deeply about the architecture of our solutions and apply innovation in how we can architect for a longer term growth, but in a world that’s increasingly uncertain. So I think there’s a lot of drivers in some sense, which is our corporate aspirations, our operating environment, which has continued to have a lot of uncertainty, and that really forces us to take a very sharp lens on what cutting edge looks like. And it’s not always the bright and shiny technology. Cutting edge could mean going to the executive committee and saying, Hey, we’re going to face a challenge about compliance. Here’s the innovation we’re bringing about architecture so that we can handle not just the next country or regulatory regime that we have to comply with, but the next 10, the next 50.
Laurel: Well, and to follow up with a bit more of a specific example, how does R&D help improve manufacturing in the software supply chain as well as emerging technologies like artificial intelligence and the industrial metaverse?
Art: Oh, I love this one because this is the perfect example of there’s a lot happening in the technology industry and there’s so much back to the earlier point of applied curiosity and how we can try this. So specifically around artificial intelligence and industrial metaverse, I think those go really well together with what are Lenovo’s natural strengths. Our heritage is as a leading global manufacturer, and now we’re looking to also transition to services-led, but applying AI and technologies like the metaverse to our factories. I think it’s almost easier to talk about the inverse, Laurel, which is if we… Because, and I remember very clearly we’ve mapped this out, there’s no area within the supply chain and manufacturing that is not touched by these areas. If I think about an example, actually, it’s very timely that we’re having this discussion. Lenovo was recognized just a few weeks ago at the World Economic Forum as part of the global lighthouse network on leading manufacturing.
And that’s based very much on applying around AI and metaverse technologies and embedding them into every aspect of what we do about our own supply chain and manufacturing network. And so if I pick a couple of examples on the quality side within the factory, we’ve implemented a combination of digital twin technology around how we can design to cost, design to quality in ways that are much faster than before, where we can prototype in the digital world where it’s faster and lower cost and correcting errors is more upfront and timely. So we are able to much more quickly iterate on our products. We’re able to have better quality. We’ve taken advanced computer vision so that we’re able to identify quality defects earlier on. We’re able to implement technologies around the industrial metaverse so that we can train our factory workers more effectively and better using aspects of AR and VR.
And we’re also able to, one of the really important parts of running an effective manufacturing operation is actually production planning, because there’s so many thousands of parts that are coming in, and I think everyone who’s listening knows how much uncertainty and volatility there have been in supply chains. So how do you take such a multi-thousand dimensional planning problem and optimize that? Those are things where we apply smart production planning models to keep our factories fully running so that we can meet our customer delivery dates. So I don’t want to drone on, but I think literally the answer was: there is no place, if you think about logistics, planning, production, scheduling, shipping, where we didn’t find AI and metaverse use cases that were able to significantly enhance the way we run our operations. And again, we’re doing this internally and that’s why we’re very proud that the World Economic Forum recognized us as a global lighthouse network manufacturing member.
Laurel: It’s certainly important, especially when we’re bringing together computing and IT environments in this increasing complexity. So as businesses continue to transform and accelerate their transformations, how do you build resiliency throughout Lenovo? Because that is certainly another foundational characteristic that is so necessary.
The Download: covid’s origin drama, and TikTok’s uncertain future
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.
Newly revealed coronavirus data has reignited a debate over the virus’s origins
This week, we’ve seen the resurgence of a debate that has been swirling since the start of the pandemic—where did the virus that causes covid-19 come from?
For the most part, scientists have maintained that the virus probably jumped from an animal to a human at the Huanan Seafood Market in Wuhan at some point in late 2019. But some claim that the virus leaped from humans to animals, rather than the other way around. And many continue to claim that the virus somehow leaked from a nearby laboratory that was studying coronaviruses in bats.
Data collected in 2020—and kept from public view since then—potentially adds weight to the animal theory. It highlights a potential suspect: the raccoon dog. But exactly how much weight it adds depends on who you ask. Read the full story.
This story is from The Checkup, Jessica’s weekly biotech newsletter. Sign up to receive it in your inbox every Thursday.
Read more of MIT Technology Review’s covid reporting:
+ Our senior biotech editor Antonio Regalado investigated the origins of the coronavirus behind covid-19 in his five-part podcast series Curious Coincidence.
+ Meet the scientist at the center of the covid lab leak controversy. Shi Zhengli has spent years at the Wuhan Institute of Virology researching coronaviruses that live in bats. Her work has come under fire as the world tries to understand where covid-19 came from. Read the full story.
+ This scientist now believes covid started in Wuhan’s wet market. Here’s why. Michael Worobey of the University of Arizona believes that a spillover of the virus from animals at the Huanan Seafood Market was almost certainly behind the origin of the pandemic. Read the full story.
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 TikTok’s future in the US is hanging in the balance
Banning it is a colossal challenge, and officials still lack the legal authority to do so. (WP $)
+ TikTok CEO Shou Zi Chew was grilled by a congressional committee. (FT $)
+ He told lawmakers the company would earn their trust. (WSJ $)
+ Meanwhile, TikTok paid for influencers to travel to DC to lobby its cause. (Wired $)
2 A crypto fugitive has been arrested in Montenegro
Do Kwon has been on the run since TerraUSD stablecoin collapsed last year. (WSJ $)
+ Want to mine Bitcoin? Get yourself to Texas. (Reuters)
+ What’s next for crypto. (MIT Technology Review)
3 Twitter’s getting rid of its legacy blue checks
On the entirely serious date of April 1. (The Verge)
+ The platform’s still an unattractive prospect for advertisers. (Vox)
4 Chatbots are having tough conversations for us
ChatGPT is adept at writing scripts for sensitive talks with kids and colleagues. (NYT $)
+ OpenAI has given ChatGPT access to the web’s live data. (The Verge)
+ How Character.AI became a billion-dollar unicorn. (WSJ $)
+ The inside story of how ChatGPT was built from the people who made it. (MIT Technology Review)
5 Jack Dorsey’s Block has been accused of fraudulent transactions
The payments company denied it. It is also accused of inflating its user numbers. (FT $)
+ Dorsey doesn’t have a track record of caring about this kind of thing. (The Information $)
6 Homeowners associations are secretly installing surveillance systems
The system tracks license plates and follows residents’ movements. (The Intercept)
7 Inside the tricky ethics of using DNA to solve crimes
A new database could help to protect users’ privacy. (Wired $)
+ The citizen scientist who finds killers from her couch. (MIT Technology Review)
8 There are plenty of reasons to be optimistic about the climate
Healthier, more sustainable diets are a good place to start. (Scientific American)
+ Taking stock of our climate past, present, and future. (MIT Technology Review)
9 TikTok keeps hectoring us
It seems we just can’t get enough of being aggressively told what to do. (Vox)
10 Don’t get scammed by a deepfake
CallerID can’t be trusted to protect you from rogue AI calls. (Gizmodo)
Quote of the day
“Wait, I need content.”
—TikTok fashion creator Kristine Thompson refuses to miss a content opportunity during a trip to the US Capitol to lobby against a potential TikTok ban, she tells the New York Times.
The big story
This sci-fi blockchain game could help create a metaverse that no one owns
Dark Forest is a vast universe, and most of it is shrouded in darkness. Your mission, should you choose to accept it, is to venture into the unknown, avoid being destroyed by opposing players who may be lurking in the dark, and build an empire of the planets you discover and can make your own.
But while the video game seemingly looks and plays much like other online strategy games, it doesn’t rely on the servers running other popular online strategy games. And it may point to something even more profound: the possibility of a metaverse that isn’t owned by a big tech company. Read the full story.
We can still have nice things
A place for comfort, fun and distraction in these weird times. (Got any ideas? Drop me a line or tweet ’em at me.)
+ If underwater terrors are your thing, Joe Romiero takes some seriously impressive shark pictures and videos.
+ Try as it might, Ted Lasso’s British dialog falls wide of the mark.
+ Let’s have a good old snoop around some celebrities’ bedrooms.
+ Why we can’t get enough of those fancy candles.
+ Interviewing animals with a tiny microphone: it doesn’t get much better than that.