This AI could predict 10 years of scientific priorities—if we let it
The survey committee, which receives input from a host of smaller panels, takes into account a gargantuan amount of information to create research strategies. Although the Academies won’t release the committee’s final recommendation to NASA for a few more weeks, scientists are itching to know which of their questions will make it in, and which will be left out.
“The Decadal Survey really helps NASA decide how they’re going to lead the future of human discovery in space, so it’s really important that they’re well informed,” says Brant Robertson, a professor of astronomy and astrophysics at UC Santa Cruz.
One team of researchers wants to use artificial intelligence to make this process easier. Their proposal isn’t for a specific mission or line of questioning; rather, they say, their AI can help scientists make tough decisions about which other proposals to prioritize.
The idea is that by training an AI to spot research areas that are either growing or declining rapidly, the tool could make it easier for survey committees and panels to decide what should make the list.
“What we wanted was to have a system that would do a lot of the work that the Decadal Survey does, and let the scientists working on the Decadal Survey do what they will do best,” says Harley Thronson, a retired senior scientist at NASA’s Goddard Space Flight Center and lead author of the proposal.
Although members of each committee are chosen for their expertise in their respective fields, it’s impossible for every member to grasp the nuance of every scientific theme. The number of astrophysics publications increases by 5% every year, according to the authors. That’s a lot for anyone to process.
That’s where Thronson’s AI comes in.
It took just over a year to build, but eventually, Thronson’s team was able to train it on more than 400,000 pieces of research published in the decade leading up to the Astro2010 survey. They were also able to teach the AI to sift through thousands of abstracts to identify both low- and high-impact areas from two- and three-word topic phrases like “planetary system” or “extrasolar planet.”
According to the researchers’ white paper, the AI successfully “backcasted” six popular research themes of the last 10 years, including a meteoric rise in exoplanet research and observation of galaxies.
“One of the challenging aspects of artificial intelligence is that they sometimes will predict, or come up with, or analyze things that are completely surprising to the humans,” says Thronson. “And we saw this a lot.”
Thronson and his collaborators think the steering committee should use their AI to help review and summarize the vast amounts of text the panel must sift through, leaving human experts to make the final call.
Their research isn’t the first to try to use AI to analyze and shape scientific literature. Other AIs have already been used to help scientists peer-review their colleagues’ work.
But could it be trusted with a task as important and influential as the Decadal Survey?
ChatGPT is about to revolutionize the economy. We need to decide what that looks like.
When Anton Korinek, an economist at the University of Virginia and a fellow at the Brookings Institution, got access to the new generation of large language models such as ChatGPT, he did what a lot of us did: he began playing around with them to see how they might help his work. He carefully documented their performance in a paper in February, noting how well they handled 25 “use cases,” from brainstorming and editing text (very useful) to coding (pretty good with some help) to doing math (not great).
ChatGPT did explain one of the most fundamental principles in economics incorrectly, says Korinek: “It screwed up really badly.” But the mistake, easily spotted, was quickly forgiven in light of the benefits. “I can tell you that it makes me, as a cognitive worker, more productive,” he says. “Hands down, no question for me that I’m more productive when I use a language model.”
When GPT-4 came out, he tested its performance on the same 25 questions that he documented in February, and it performed far better. There were fewer instances of making stuff up; it also did much better on the math assignments, says Korinek.
Since ChatGPT and other AI bots automate cognitive work, as opposed to physical tasks that require investments in equipment and infrastructure, a boost to economic productivity could happen far more quickly than in past technological revolutions, says Korinek. “I think we may see a greater boost to productivity by the end of the year—certainly by 2024,” he says.
What’s more, he says, in the longer term, the way the AI models can make researchers like himself more productive has the potential to drive technological progress.
That potential of large language models is already turning up in research in the physical sciences. Berend Smit, who runs a chemical engineering lab at EPFL in Lausanne, Switzerland, is an expert on using machine learning to discover new materials. Last year, after one of his graduate students, Kevin Maik Jablonka, showed some interesting results using GPT-3, Smit asked him to demonstrate that GPT-3 is, in fact, useless for the kinds of sophisticated machine-learning studies his group does to predict the properties of compounds.
“He failed completely,” jokes Smit.
It turns out that after being fine-tuned for a few minutes with a few relevant examples, the model performs as well as advanced machine-learning tools specially developed for chemistry in answering basic questions about things like the solubility of a compound or its reactivity. Simply give it the name of a compound, and it can predict various properties based on the structure.
Newly revealed coronavirus data has reignited a debate over the virus’s origins
Data collected in 2020—and kept from public view since then—potentially adds weight to the animal theory. It highlights a potential suspect: the raccoon dog. But exactly how much weight it adds depends on who you ask. New analyses of the data have only reignited the debate, and stirred up some serious drama.
The current ruckus starts with a study shared by Chinese scientists back in February 2022. In a preprint (a scientific paper that has not yet been peer-reviewed or published in a journal), George Gao of the Chinese Center for Disease Control and Prevention (CCDC) and his colleagues described how they collected and analyzed 1,380 samples from the Huanan Seafood Market.
These samples were collected between January and March 2020, just after the market was closed. At the time, the team wrote that they only found coronavirus in samples alongside genetic material from people.
There were a lot of animals on sale at this market, which sold more than just seafood. The Gao paper features a long list, including chickens, ducks, geese, pheasants, doves, deer, badgers, rabbits, bamboo rats, porcupines, hedgehogs, crocodiles, snakes, and salamanders. And that list is not exhaustive—there are reports of other animals being traded there, including raccoon dogs. We’ll come back to them later.
But Gao and his colleagues reported that they didn’t find the coronavirus in any of the 18 species of animal they looked at. They suggested that it was humans who most likely brought the virus to the market, which ended up being the first known epicenter of the outbreak.
Fast-forward to March 2023. On March 4, Florence Débarre, an evolutionary biologist at Sorbonne University in Paris, spotted some data that had been uploaded to GISAID, a website that allows researchers to share genetic data to help them study and track viruses that cause infectious diseases. The data appeared to have been uploaded in June 2022. It seemed to have been collected by Gao and his colleagues for their February 2022 study, although it had not been included in the actual paper.
Fostering innovation through a culture of curiosity
And so I think a big part of it as a company, by setting these ambitious goals, it forces us to say if we want to be number one, if we want to be top tier in these areas, if we want to continue to generate results, how do we get there using technology? And so that really forces us to throw away our assumptions because you can’t follow somebody, if you want to be number one you can’t follow someone to become number one. And so we understand that the path to get there, it’s through, of course, technology and the software and the enablement and the investment, but it really is by becoming goal-oriented. And if we look at these examples of how do we create the infrastructure on the technology side to support these ambitious goals, we ourselves have to be ambitious in turn because if we bring a solution that’s also a me too, that’s a copycat, that doesn’t have differentiation, that’s not going to propel us, for example, to be a top 10 supply chain. It just doesn’t pass muster.
So I think at the top level, it starts with the business ambition. And then from there we can organize ourselves at the intersection of the business ambition and the technology trends to have those very rich discussions and being the glue of how do we put together so many moving pieces because we’re constantly scanning the technology landscape for new advancing and emerging technologies that can come in and be a part of achieving that mission. And so that’s how we set it up on the process side. As an example, I think one of the things, and it’s also innovation, but it doesn’t get talked about as much, but for the community out there, I think it’s going to be very relevant is, how do we stay on top of the data sovereignty questions and data localization? There’s a lot of work that needs to go into rethinking what your cloud, private, public, edge, on-premise look like going forward so that we can remain cutting edge and competitive in each of our markets while meeting the increasing guidance that we’re getting from countries and regulatory agencies about data localization and data sovereignty.
And so in our case, as a global company that’s listed in Hong Kong and we operate all around the world, we’ve had to really think deeply about the architecture of our solutions and apply innovation in how we can architect for a longer term growth, but in a world that’s increasingly uncertain. So I think there’s a lot of drivers in some sense, which is our corporate aspirations, our operating environment, which has continued to have a lot of uncertainty, and that really forces us to take a very sharp lens on what cutting edge looks like. And it’s not always the bright and shiny technology. Cutting edge could mean going to the executive committee and saying, Hey, we’re going to face a challenge about compliance. Here’s the innovation we’re bringing about architecture so that we can handle not just the next country or regulatory regime that we have to comply with, but the next 10, the next 50.
Laurel: Well, and to follow up with a bit more of a specific example, how does R&D help improve manufacturing in the software supply chain as well as emerging technologies like artificial intelligence and the industrial metaverse?
Art: Oh, I love this one because this is the perfect example of there’s a lot happening in the technology industry and there’s so much back to the earlier point of applied curiosity and how we can try this. So specifically around artificial intelligence and industrial metaverse, I think those go really well together with what are Lenovo’s natural strengths. Our heritage is as a leading global manufacturer, and now we’re looking to also transition to services-led, but applying AI and technologies like the metaverse to our factories. I think it’s almost easier to talk about the inverse, Laurel, which is if we… Because, and I remember very clearly we’ve mapped this out, there’s no area within the supply chain and manufacturing that is not touched by these areas. If I think about an example, actually, it’s very timely that we’re having this discussion. Lenovo was recognized just a few weeks ago at the World Economic Forum as part of the global lighthouse network on leading manufacturing.
And that’s based very much on applying around AI and metaverse technologies and embedding them into every aspect of what we do about our own supply chain and manufacturing network. And so if I pick a couple of examples on the quality side within the factory, we’ve implemented a combination of digital twin technology around how we can design to cost, design to quality in ways that are much faster than before, where we can prototype in the digital world where it’s faster and lower cost and correcting errors is more upfront and timely. So we are able to much more quickly iterate on our products. We’re able to have better quality. We’ve taken advanced computer vision so that we’re able to identify quality defects earlier on. We’re able to implement technologies around the industrial metaverse so that we can train our factory workers more effectively and better using aspects of AR and VR.
And we’re also able to, one of the really important parts of running an effective manufacturing operation is actually production planning, because there’s so many thousands of parts that are coming in, and I think everyone who’s listening knows how much uncertainty and volatility there have been in supply chains. So how do you take such a multi-thousand dimensional planning problem and optimize that? Those are things where we apply smart production planning models to keep our factories fully running so that we can meet our customer delivery dates. So I don’t want to drone on, but I think literally the answer was: there is no place, if you think about logistics, planning, production, scheduling, shipping, where we didn’t find AI and metaverse use cases that were able to significantly enhance the way we run our operations. And again, we’re doing this internally and that’s why we’re very proud that the World Economic Forum recognized us as a global lighthouse network manufacturing member.
Laurel: It’s certainly important, especially when we’re bringing together computing and IT environments in this increasing complexity. So as businesses continue to transform and accelerate their transformations, how do you build resiliency throughout Lenovo? Because that is certainly another foundational characteristic that is so necessary.