Among other things, this is what Gebru, Mitchell, and five other scientists warned about in their paper, which calls LLMs “stochastic parrots.” “Language technology can be very, very useful when it is appropriately scoped and situated and framed,” says Emily Bender, a professor of linguistics at the University of Washington and one of the coauthors of the paper. But the general-purpose nature of LLMs—and the persuasiveness of their mimicry—entices companies to use them in areas they aren’t necessarily equipped for.
In a recent keynote at one of the largest AI conferences, Gebru tied this hasty deployment of LLMs to consequences she’d experienced in her own life. Gebru was born and raised in Ethiopia, where an escalating war has ravaged the northernmost Tigray region. Ethiopia is also a country where 86 languages are spoken, nearly all of them unaccounted for in mainstream language technologies.
Despite these linguistic deficiencies in LLMs, Facebook relies heavily on them to automate its content moderation globally. When the war in Tigray first broke out in November, Gebru saw the platform struggle to get a handle on the flurry of misinformation. This is emblematic of a persistent pattern that researchers have observed in content moderation: communities that speak languages not prioritized by Silicon Valley suffer the most hostile digital environments.
Gebru noted that this isn’t where the harm ends, either. When fake news, hate speech, and even death threats aren’t moderated out, they are then scraped as training data to build the next generation of LLMs. And those models, parroting back what they’re trained on, end up regurgitating these toxic linguistic patterns on the internet.
In many cases, researchers haven’t investigated thoroughly enough to know how this toxicity might manifest in downstream applications. But some scholarship does exist. In her 2018 book Algorithms of Oppression, Safiya Noble, an associate professor of information and African-American studies at the University of California, Los Angeles, documented how biases embedded in Google search perpetuate racism and, in extreme cases, perhaps even motivate racial violence.
“The consequences are pretty severe and significant,” she says. Google isn’t just the primary knowledge portal for average citizens. It also provides the information infrastructure for institutions, universities, and state and federal governments.
Google already uses an LLM to optimize some of its search results. With its latest announcement of LaMDA and a recent proposal it published in a preprint paper, the company has made clear it will only increase its reliance on the technology. Noble worries this could make the problems she uncovered even worse: “The fact that Google’s ethical AI team was fired for raising very important questions about the racist and sexist patterns of discrimination embedded in large language models should have been a wake-up call.”
The BigScience project began in direct response to the growing need for scientific scrutiny of LLMs. In observing the technology’s rapid proliferation and Google’s attempted censorship of Gebru and Mitchell, Wolf and several colleagues realized it was time for the research community to take matters into its own hands.
Inspired by open scientific collaborations like CERN in particle physics, they conceived of an idea for an open-source LLM that could be used to conduct critical research independent of any company. In April of this year, the group received a grant to build it using the French government’s supercomputer.
At tech companies, LLMs are often built by only half a dozen people who have primarily technical expertise. BigScience wanted to bring in hundreds of researchers from a broad range of countries and disciplines to participate in a truly collaborative model-construction process. Wolf, who is French, first approached the French NLP community. From there, the initiative snowballed into a global operation encompassing more than 500 people.
The collaborative is now loosely organized into a dozen working groups and counting, each tackling different aspects of model development and investigation. One group will measure the model’s environmental impact, including the carbon footprint of training and running the LLM and factoring in the life-cycle costs of the supercomputer. Another will focus on developing responsible ways of sourcing the training data—seeking alternatives to simply scraping data from the web, such as transcribing historical radio archives or podcasts. The goal here is to avoid toxic language and nonconsensual collection of private information.
Climate tech is back—and this time, it can’t afford to fail
Boston Metal’s strategy is to try to make the transition as digestible as possible for steelmakers. “We won’t own and operate steel plants,” says Adam Rauwerdink, who heads business development at the company. Instead, it plans to license the technology for electrochemical units that are designed to be a simple drop-in replacement for blast furnaces; the liquid iron that flows out of the electrochemical cells can be handled just as if it were coming out of a blast furnace, with the same equipment.
Working with industrial investors including ArcelorMittal, says Rauwerdink, allows the startup to learn “how to integrate our technology into their plants—how to handle the raw materials coming in, the metal products coming out of our systems, and how to integrate downstream into their established processes.”
The startup’s headquarters in a business park about 15 miles outside Boston is far from any steel manufacturing, but these days it’s drawing frequent visitors from the industry. There, the startup’s pilot-scale electrochemical unit, the size of a large furnace, is intentionally designed to be familiar to those potential customers. If you ignore the hordes of electrical cables running in and out of it, and the boxes of electric equipment surrounding it, it’s easy to forget that the unit is not just another part of the standard steelmaking process. And that’s exactly what Boston Metal is hoping for.
The company expects to have an industrial-scale unit ready for use by 2025 or 2026. The deadline is key, because Boston Metal is counting on commitments that many large steelmakers have made to reach zero carbon emissions by 2050. Given that the life of an average blast furnace is around 20 years, that means having the technology ready to license before 2030, as steelmakers plan their long-term capital expenditures. But even now, says Rauwerdink, demand is growing for green steel, especially in Europe, where it’s selling for a few hundred dollars a metric ton more than the conventional product.
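The deadline arithmetic in that paragraph can be written out as a minimal sketch. The function name `latest_install_year` is hypothetical, and the calculation assumes, as the paragraph does, that a furnace runs for its full average life of about 20 years once installed.

```python
def latest_install_year(target_year: int, furnace_life_years: int) -> int:
    """Last year a conventional furnace can be installed and still retire by the target year.

    Assumes the furnace runs for its full expected life (an illustrative simplification).
    """
    return target_year - furnace_life_years


# Figures from the article: zero-carbon commitments for 2050, ~20-year furnace life.
deadline = latest_install_year(target_year=2050, furnace_life_years=20)
print(deadline)  # 2030
```

A blast furnace bought after that year would be expected to keep operating past 2050, which is why the replacement technology has to be licensable before then.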
It’s that kind of blossoming market for clean technologies that many of today’s startups are depending on. The recent corporate commitments to decarbonize, and the IRA and other federal spending initiatives, are creating significant demand in markets “that previously didn’t exist,” says Michael Kearney, a partner at Engine Ventures.
One wild card, however, will be just how aggressively and faithfully corporations pursue ways to transform their core businesses and to meet their publicly stated goals. Funding a small pilot-scale project, says Kearney, “looks more like greenwashing if you have no intention of scaling those projects.” Watching which companies move from pilot plants to full-scale commercial facilities will tell you “who’s really serious,” he says. Putting aside the fears of greenwashing, Kearney says it’s essential to engage these large corporations in the transition to cleaner technologies.
Susan Schofer, a partner at the venture firm SOSV, has some advice for those VCs and startups reluctant to work with existing companies in traditionally heavily polluting industries: Get over it. “We need to partner with them. These incumbents have important knowledge that we all need to get in order to effect change. So there needs to be healthy respect on both sides,” she says. Too often, she says, there is “an attitude that we don’t want to do that because it’s helping an incumbent industry.” But the reality, she says, is that finding ways for such industries to save energy or use cleaner technologies “can make the biggest difference in the near term.”
It’s tempting to dismiss the history of cleantech 1.0. It was more than a decade ago, and there’s a new generation of startups and investors. Far more money is around today, along with a broader range of financing options. Surely we’re savvier these days.
Making an image with generative AI uses as much energy as charging your phone
“If you’re doing a specific application, like searching through email … do you really need these big models that are capable of anything? I would say no,” Luccioni says.
The energy consumption associated with using AI tools has been a missing piece in understanding their true carbon footprint, says Jesse Dodge, a research scientist at the Allen Institute for AI, who was not part of the study.
Comparing the carbon emissions from newer, larger generative models and older AI models is also important, Dodge adds. “It highlights this idea that the new wave of AI systems are much more carbon intensive than what we had even two or five years ago,” he says.
Google once estimated that an average online search used 0.3 watt-hours of electricity, equivalent to driving 0.0003 miles in a car. Today, that number is likely much higher, because Google has integrated generative AI models into its search, says Vijay Gadepally, a research scientist at MIT Lincoln Laboratory, who did not participate in the research.
Not only did the researchers find emissions for each task to be much higher than they expected, but they discovered that the day-to-day emissions associated with using AI far exceeded the emissions from training large models. Luccioni tested different versions of Hugging Face’s multilingual AI model BLOOM to see how many uses would be needed to overtake training costs. It took over 590 million uses to reach the carbon cost of training its biggest model. For very popular models, such as ChatGPT, it could take just a couple of weeks for such a model’s usage emissions to exceed its training emissions, Luccioni says.
This is because large AI models get trained just once, but then they can be used billions of times. According to some estimates, popular models such as ChatGPT have up to 10 million users a day, many of whom prompt the model more than once.
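To make that break-even reasoning concrete, here is a rough sketch of the arithmetic. It borrows the 590 million figure reported for BLOOM's biggest model and the 10 million daily users estimated for ChatGPT; combining the two is an assumption made purely for illustration, as are the function name `days_to_break_even` and the prompts-per-user values.

```python
def days_to_break_even(uses_to_match_training: int, daily_users: int, prompts_per_user: int) -> int:
    """Days of usage needed before inference emissions match training emissions.

    All inputs are illustrative; prompts_per_user in particular is a guess.
    """
    daily_uses = daily_users * prompts_per_user
    # Integer ceiling division: a partial day still counts as a day.
    return -(-uses_to_match_training // daily_uses)


# One prompt per user per day (hypothetical lower bound).
print(days_to_break_even(590_000_000, 10_000_000, 1))  # 59
# Four prompts per user per day (hypothetical).
print(days_to_break_even(590_000_000, 10_000_000, 4))  # 15
```

Under these assumptions, a handful of prompts per user per day is enough to bring the crossover down to roughly two weeks, in line with the estimate quoted above.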
Studies like these make the energy consumption and emissions related to AI more tangible and help raise awareness that there is a carbon footprint associated with using AI, says Gadepally, adding, “I would love it if this became something that consumers started to ask about.”
Dodge says he hopes studies like this will help hold companies more accountable for their energy usage and emissions.
“The responsibility here lies with a company that is creating the models and is earning a profit off of them,” he says.
The first CRISPR cure might kickstart the next big patent battle
And really, what’s the point of such a hard-won triumph unless it’s to enforce your rights? “Honestly, this train has been coming down the track since at least 2014, if not earlier. We’re at the collision point. I struggle to imagine there’s going to be a diversion,” says Sherkow. “Brace for impact.”
The Broad Institute didn’t answer any of my questions, and a spokesperson for MIT didn’t even reply to my email. That’s not a surprise. Private universities can be exceedingly obtuse when it comes to acknowledging their commercial activities. They are supposed to be centers of free inquiry and humanitarian intentions, so if employees get rich from biotechnology—and they do—they try to do it discreetly.
There are also strong reasons not to sue. Suing could make a nonprofit like the Broad Institute look bad. Really bad. That’s because it could get in the way of cures.
“It seems unlikely and undesirable, [as] legal challenges at this late date would delay saving patients,” says George Church, a Harvard professor and one of the original scientific founders of Editas, though he’s no longer closely involved with the company.
If a patent infringement lawsuit does get filed, it will happen sometime after Vertex notifies regulators it’s starting to sell the treatment. “That’s the starting gun,” says Sherkow. “There are no hypothetical lawsuits in the patent system, so one must wait until it’s sufficiently clear that an act of infringement is about to occur.”
How much money is at stake? It remains unclear what the demand for the Vertex treatment will be, but it could eventually prove a blockbuster. There are about 20,000 people with severe sickle-cell disease in the US who might benefit. And assuming a price of $3 million per treatment (my educated guess), that’s a total potential market of around $60 billion. A patent holder could potentially demand 10% of the take, or more.
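The back-of-the-envelope math works out as follows. This is only a sketch: the function names are hypothetical, the price is the guess stated above, and it assumes every eligible patient is eventually treated.

```python
def total_market_usd(patients: int, price_usd: int) -> int:
    """Upper-bound market size if every eligible patient is treated once."""
    return patients * price_usd


def royalty_usd(market_usd: int, royalty_percent: int) -> int:
    """Share of the market a patent holder would collect at a given royalty rate."""
    return market_usd * royalty_percent // 100


market = total_market_usd(patients=20_000, price_usd=3_000_000)
print(market)                   # 60000000000, i.e. $60 billion
print(royalty_usd(market, 10))  # 6000000000, i.e. $6 billion
```

At a 10% royalty, then, the patent holder's potential share would be on the order of $6 billion over the life of the market.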
Vertex can certainly defend itself. It’s a big, rich company, and through its partnership with the Swiss firm CRISPR Therapeutics, a biotech co-founded by Charpentier, Vertex has access to the competing set of intellectual-property claims—including those of UC Berkeley, which (though bested by Broad in the US) hold force in Europe and could be used to throw up a thicket of counterarguments.
Vertex could also choose to pay royalties. To do that, it would have to approach Editas, the biotech cofounded by Zhang and Church in Cambridge, Massachusetts, which previously bought exclusive rights to the Broad patents on CRISPR in the arena of human treatments, including sickle-cell therapies.