Pineau helped change how research is published in several of the largest conferences, introducing a checklist of things that researchers must submit alongside their results, including code and details about how experiments are run. Since she joined Meta (then Facebook) in 2017, she has championed that culture in its AI lab.
“That commitment to open science is why I’m here,” she says. “I wouldn’t be here on any other terms.”
Ultimately, Pineau wants to change how we judge AI. “What we call state-of-the-art nowadays can’t just be about performance,” she says. “It has to be state-of-the-art in terms of responsibility as well.”
Still, giving away a large language model is a bold move for Meta. “I can’t tell you that there’s no risk of this model producing language that we’re not proud of,” says Pineau. “It will.”
Weighing the risks
Margaret Mitchell, one of the AI ethics researchers Google forced out in 2020, who is now at Hugging Face, sees the release of OPT as a positive move. But she thinks there are limits to transparency. Has the language model been tested with sufficient rigor? Do the foreseeable benefits outweigh the foreseeable harms—such as the generation of misinformation, or racist and misogynistic language?
“Releasing a large language model to the world where a wide audience is likely to use it, or be affected by its output, comes with responsibilities,” she says. Mitchell notes that this model will be able to generate harmful content not only by itself, but through downstream applications that researchers build on top of it.
Meta AI audited OPT to remove some harmful behaviors, but the point is to release a model that researchers can learn from, warts and all, says Pineau.
“There were a lot of conversations about how to do that in a way that lets us sleep at night, knowing that there’s a non-zero risk in terms of reputation, a non-zero risk in terms of harm,” she says. She dismisses the idea that you should not release a model because it’s too dangerous—which is the reason OpenAI gave for not releasing GPT-3’s predecessor, GPT-2. “I understand the weaknesses of these models, but that’s not a research mindset,” she says.
AI and data fuel innovation in clinical trials and beyond
Laurel: So mentioning the pandemic, it really has shown us how critical and fraught the race is to provide new treatments and vaccines to patients. Could you explain what evidence generation is and then how it fits into drug development?
Arnaub: Sure. So as a concept, generating evidence in drug development is nothing new. It’s the art of putting together data and analyses that successfully demonstrate the safety and the efficacy and the value of your product to a bunch of different stakeholders, regulators, payers, providers, and ultimately, and most importantly, patients. And to date, I’d say evidence generation consists of not only the trial readout itself, but there are now different types of studies that pharmaceutical or medical device companies conduct, and these could be studies like literature reviews or observational data studies or analyses that demonstrate the burden of illness or even treatment patterns. And if you look at how most companies are designed, clinical development teams focus on designing a protocol, executing the trial, and they’re responsible for a successful readout in the trial. And most of that work happens within clinical dev. But as a drug gets closer to launch, health economics, outcomes research, epidemiology teams are the ones that are helping paint what is the value and how do we understand the disease more effectively?
So I think we’re at a pretty interesting inflection point in the industry right now. Generating evidence is a multi-year activity, both during the trial and in many cases long after the trial. And we saw this as especially true for vaccine trials, but also for oncology or other therapeutic areas. In covid, the vaccine companies put together their evidence packages in record time, and it was an incredible effort. And now I think what’s happening is the FDA’s navigating a tricky balance where they want to promote the innovation that we were talking about, the advancements of new therapies to patients. They’ve built in vehicles to expedite therapies such as accelerated approvals, but we need confirmatory trials or long-term follow up to really understand the evidence and to understand the safety and the efficacy of these drugs. And that’s why that concept that we’re talking about today is so important, is how do we do this more expeditiously?
Laurel: It’s certainly important when you’re talking about life-saving innovations, but as you mentioned earlier, with the rapid pace of technology innovation coming together with the data being generated and reviewed, we’re at a special inflection point here. So, how have data and evidence generation evolved in the last couple of years, and how different would this ability to create a vaccine and all the evidence packets have been five or 10 years ago?
Arnaub: It’s important to set the distinction here between clinical trial data and what’s called real-world data. The randomized controlled trial is, and has remained, the gold standard for evidence generation and submission. And we know within clinical trials, we have a really tightly controlled set of parameters and a focus on a subset of patients. And there’s a lot of specificity and granularity in what’s being captured. There’s a regular interval of assessment, but we also know the trial environment is not necessarily representative of how patients end up performing in the real world. And that term, “real world,” is kind of a wild west of a bunch of different things. It’s claims data or billing records from insurance companies. It’s electronic medical records that emerge out of providers and hospital systems and labs, and even increasingly new forms of data that you might see from devices or even patient-reported data. And RWD, or real-world data, is a large and diverse set of different sources that can capture patient performance as patients go in and out of different healthcare systems and environments.
Ten years ago, when I was first working in this space, the term “real-world data” didn’t even exist. It was like a swear word, and it was basically one that was created in recent years by the pharmaceutical and the regulatory sectors. So, I think what we’re seeing now, the other important piece or dimension is that the regulatory agencies, through very important pieces of legislation like the 21st Century Cures Act, have jump-started and propelled how real-world data can be used and incorporated to augment our understanding of treatments and of disease. So, there’s a lot of momentum here. Real-world data is used in 85%, 90% of FDA-approved new drug applications. So, this is a world we have to navigate.
How do we keep the rigor of the clinical trial and tell the entire story, and then how do we bring in the real-world data to kind of complete that picture? It’s a problem we’ve been focusing on for the last two years, and we’ve even built a solution around this during covid called Medidata Link that actually ties together patient-level data in the clinical trial to all the non-trial data that exists in the world for the individual patient. And as you can imagine, the reason this made a lot of sense during covid, and we actually started this with a covid vaccine manufacturer, was so that we could study long-term outcomes, so that we could tie together that trial data to what we’re seeing post-trial. And does the vaccine make sense over the long term? Is it safe? Is it efficacious? And this is, I think, something that’s going to emerge and has been a big part of our evolution over the last couple years in terms of how we collect data.
Laurel: That collecting data story is certainly part of maybe the challenges in generating this high-quality evidence. What are some other gaps in the industry that you have seen?
Arnaub: I think the elephant in the room for development in the pharmaceutical industry is that despite all the data and all of the advances in analytics, the probability of technical success, or regulatory success as it’s called for drugs, moving forward is still really low. The overall likelihood of approval from phase one consistently sits under 10% for a number of different therapeutic areas. It’s sub 5% in cardiovascular, it’s a little bit over 5% in oncology and neurology, and I think what underlies these failures is a lack of data to demonstrate efficacy. It’s where a lot of companies submit or include what the regulatory bodies call a flawed study design, an inappropriate statistical endpoint, or in many cases, trials are underpowered, meaning the sample size was too small to reject the null hypothesis. So what that means is you’re grappling with a number of key decisions if you look at just the trial itself and some of the gaps where data should be more involved and more influential in decision making.
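To make the point about underpowered trials concrete, here is a minimal sketch of a textbook sample-size calculation for a two-arm trial comparing response rates, using the normal approximation. The function names and the 30% versus 40% response rates are illustrative assumptions, not figures from the conversation or from any Medidata tool.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect the difference between
    two response rates with a two-sided test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


def approximate_power(n_per_arm, p_control, p_treatment, alpha=0.05):
    """Approximate chance of rejecting the null hypothesis when the assumed
    difference in response rates is real."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    standard_error = math.sqrt(variance / n_per_arm)
    return NormalDist().cdf(abs(p_treatment - p_control) / standard_error - z_alpha)


# Hypothetical scenario: 30% response on the comparator, 40% on the new therapy.
needed = sample_size_per_arm(0.30, 0.40)
print("patients per arm for 80% power:", needed)
print("power with only 100 per arm:", round(approximate_power(100, 0.30, 0.40), 2))
```

Under these assumed rates, a trial enrolling far fewer patients than the computed figure would more often than not miss a real effect, which is what "underpowered" means in practice.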
So, when you’re designing a trial, you’re evaluating, “What are my primary and my secondary endpoints? What inclusion or exclusion criteria do I select? What’s my comparator? What’s my use of a biomarker? And then how do I understand outcomes? How do I understand the mechanism of action?” It’s a myriad of different choices and a permutation of different decisions that have to be made in parallel, all of this data and information coming from the real world; we talked about the momentum in how valuable an electronic health record could be. But the gap here, the problem is, how is the data collected? How do you verify where it came from? Can it be trusted?
So, while volume is good, these gaps contribute to a significant chance of bias in a variety of different areas. There’s selection bias, meaning there are differences in the types of patients who are selected for treatment. There’s performance bias, detection bias, a number of issues with the data itself. So, I think what we’re trying to navigate here is how can you do this in a robust way where you’re putting these data sets together, addressing some of those key issues around drug failure that I was referencing earlier? Our approach has been to use a curated historical clinical trial data set that sits on our platform to contextualize what we’re seeing in the real world and to better understand how patients are responding to therapy. And that should, in theory, and from what we’ve seen in our work, give clinical development teams a novel way to use data to design a trial protocol, or to improve some of the statistical analysis work that they do.
Power beaming comes of age
The global need for power to provide ubiquitous connectivity through 5G, 6G, and smart infrastructure is rising. This report explains the prospects of power beaming; its economic, human, and environmental implications; and the challenges of making the technology reliable, effective, wide-ranging, and secure.
The following are the report’s key findings:
Lasers and microwaves offer distinct approaches to power beaming, each with benefits and drawbacks. While microwave-based power beaming has a more established track record thanks to lower cost of equipment, laser-based approaches are showing promise, backed by an increasing flurry of successful trials and pilots. Laser-based beaming has high-impact prospects for powering equipment in remote sites, the low-earth orbit economy, electric transportation, and underwater applications. Lasers’ chief advantage is the narrow concentration of beams, which enables smaller transmission and receiver installations. On the other hand, their disadvantage is the disturbance caused by atmospheric conditions and human interruption, although there are ongoing efforts to tackle these deficits.
Power beaming could quicken energy decarbonization, boost internet connectivity, and enable post-disaster response. Climate change is spurring investment in power beaming, which can support more radical approaches to energy transition. Due to solar energy’s continuous availability, beaming it directly from space to Earth offers superior conversion compared to land-based solar panels when averaged over time. Electric transportation—from trains to planes or drones—benefits from power beaming by avoiding the disruption and costs caused by cabling, wiring, or recharge landings.
Beaming could also transfer power from remote renewables sites such as offshore wind farms. Other areas where power beaming could revolutionize energy solutions include refueling space missions and satellites, 5G provision, and post-disaster humanitarian response in remote regions or areas where networks have collapsed due to extreme weather events, whose frequency will be increased by climate change. In the short term, as efficiencies continue to improve, power beaming has the capacity to reduce the number of wasted batteries, especially in low-power, across-the-room applications.
Public engagement and education are crucial to support the uptake of power beaming. Lasers and microwaves may conjure images of death rays and unanticipated health risks. Public backlash against 5G shows the importance of education and information about the safety of new, “invisible” technologies. Based on decades of research, power beaming via both microwaves and lasers has been shown to be safe. The public is comfortable living amidst invisible forces like wi-fi and wireless data transfer; power beaming is simply the newest chapter.
Commercial investment in power beaming remains muted due to a combination of historical skepticism and uncertain time horizons. While private investment in futuristic sectors like nuclear fusion energy and satellites booms, the power-beaming sector has received relatively little investment and venture capital relative to the scale of the opportunity. Experts believe this is partly a “first-mover” problem as capital allocators await signs of momentum. It may be a hangover of past decisions to abandon beaming due to high costs and impracticality, even though such reticence was based on earlier technologies that have now been surpassed. Power beaming also tends to fall between two R&D comfort zones for large corporations: it does not deliver short-term financial gain, but it is also not long term enough to justify a steady financing stream.
This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.
The porcelain challenge didn’t need to be real to get views
“I’ve dabbled in the past with trying to make fake news that is transparent about being fake but spreads nonetheless,” Durfee said. (He once, with a surprising amount of success, got a false rumor started that longtime YouTuber Hank Green had been arrested as a teenager for trying to steal a lemur from a zoo.)
On Sunday, Durfee and his friends watched as #PorcelainChallenge gained traction, and they celebrated when it generated its first media headline (“TikTok’s porcelain challenge is not real but it’s not something to joke about either”). A steady parade of other headlines, some more credulous than others, followed.
But reflex-dependent viral content has a short life span. When Durfee and I chatted three days after he posted his first video about the porcelain challenge, he already could tell that it wasn’t going to catch as widely as he’d hoped. RIP.
Nevertheless, viral moments can be reanimated with just the slightest touch of attention, becoming an undead trend ambling through Facebook news feeds and panicked parent groups. Stripping away their original context can only make them more powerful. And dubious claims about viral teen challenges are often these sorts of zombies—sometimes giving them a second life that’s much bigger (and arguably more dangerous) than the first.
For every “cinnamon challenge” (a real early-2010s viral challenge that made the YouTube rounds and put participants at risk for some nasty health complications), there are even more dumb ideas on the internet that do not trend until someone with a large audience of parents freaks out about them.
Just a couple of weeks ago, for instance, the US Food and Drug Administration issued a warning about boiling chicken in NyQuil, prompting a panic over a craze that would endanger Gen Z lives in the name of views. Instead, as BuzzFeed News reported, the warning itself was the most viral thing about NyQuil chicken, spiking interest in a “trend” that was not trending.
And in 2018, there was the “condom challenge,” which gained widespread media coverage as the latest life-threatening thing teens were doing online for attention—“uncovered” because a local news station sat in on a presentation at a Texas school on the dangers teens face. In reality, the condom challenge had a few minor blips of interest online in 2007 and 2013, but videos of people actually trying to snort a condom up their nose were sparse. In each case, the fear of teens flocking en masse to take part in a dangerous challenge did more to amplify it to a much larger audience than the challenge was able to do on its own.
The porcelain challenge has all the elements of future zombie content. Its catchy name stands out like a bite on the arm. The posts and videos seeded across social media by Durfee’s followers—and the secondary audience coming across the work of those Durfee deputized—are plausible and context-free.