Automated techniques could make it easier to develop AI
“BERT takes months of computation and is very expensive—like, a million dollars to generate that model and repeat those processes,” Bahrami says. “So if everyone wants to do the same thing, then it’s expensive—it’s not energy efficient, not good for the world.”
Although the field shows promise, researchers are still searching for ways to make autoML techniques more computationally efficient. For example, methods like neural architecture search currently build and test many different models to find the best fit, and the energy it takes to complete all those iterations can be significant.
AutoML techniques can also be applied to machine-learning algorithms that don’t involve neural networks, like creating random decision forests or support-vector machines to classify data. Research in those areas is further along, with many coding libraries already available for people who want to incorporate autoML techniques into their projects.
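To make the idea concrete, here is a minimal sketch of automated model selection for a classifier that is not a neural network. It is illustrative only: it swaps in a tiny k-nearest-neighbors classifier for the random forests and support-vector machines mentioned above, uses made-up toy data, and the function names (`make_data`, `knn_predict`, `auto_select`) are hypothetical rather than taken from any autoML library.

```python
import random

def make_data(n, rng):
    """Toy 2-D data (an assumption for illustration): two Gaussian blobs labeled 0 and 1."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data

def knn_predict(train, point, k, p):
    """Majority vote among the k nearest training points (Minkowski-p distance, root omitted)."""
    dists = sorted(
        (sum(abs(a - b) ** p for a, b in zip(x, point)), y) for x, y in train
    )
    votes = sum(y for _, y in dists[:k])
    return 1 if 2 * votes > k else 0

def accuracy(train, val, k, p):
    """Fraction of validation points classified correctly."""
    hits = sum(knn_predict(train, x, k, p) == y for x, y in val)
    return hits / len(val)

def auto_select(train, val, n_trials, rng):
    """Random search: sample configurations, keep the one with the best validation accuracy."""
    best = None
    for _ in range(n_trials):
        config = {"k": rng.choice([1, 3, 5, 7, 9, 11]), "p": rng.choice([1, 2])}
        score = accuracy(train, val, config["k"], config["p"])
        if best is None or score > best[0]:
            best = (score, config)
    return best

rng = random.Random(0)
train, val = make_data(200, rng), make_data(100, rng)
best_score, best_config = auto_select(train, val, n_trials=8, rng=rng)
print(best_config, round(best_score, 3))
```

Real autoML libraries search far larger spaces and use smarter strategies than random sampling, but the loop has the same shape: propose a configuration, evaluate it, keep the best.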
The next step is to use autoML to quantify uncertainty and address questions of trustworthiness and fairness in the algorithms, says Hutter, a conference organizer. In that vision, standards around trustworthiness and fairness would be akin to any other machine-learning constraints, like accuracy. And autoML could capture and automatically correct biases found in those algorithms before they’re released.
The search continues
But for something like deep learning, autoML still has a long way to go. Data used to train deep-learning models, like images, documents, and recorded speech, is usually dense and complicated. It takes immense computational power to handle. The cost and time for training these models can be prohibitive for anyone other than researchers working at deep-pocketed private companies.
One of the competitions at the conference asked participants to develop energy-efficient alternative algorithms for neural architecture search. It’s a considerable challenge because this technique has infamous computational demands. It automatically cycles through countless deep-learning models to help researchers pick the right one for their application, but the process can take months and cost over a million dollars.
The goal of these alternative algorithms, called zero-cost neural architecture search proxies, is to make neural architecture search more accessible and environmentally friendly by significantly cutting down on its appetite for computation. Such a proxy takes only a few seconds to run, whereas a full search can take months. These techniques are still in the early stages of development and are often unreliable, but machine-learning researchers predict that they have the potential to make the model selection process much more efficient.
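As a rough illustration of how a proxy avoids training, the sketch below scores candidate networks by the size of their loss gradient at random initialization on a single small batch. Gradient norm is one kind of proxy discussed in the research literature, but this is not the method used in the competition: the tiny one-hidden-layer network, the toy data, the helper names (`grad_norm_proxy`, `rank_candidates`), and the convention that a higher score ranks first are all assumptions made for the example.

```python
import math
import random

def grad_norm_proxy(hidden, batch, rng):
    """Score one candidate with no training: gradient norm of the loss at random init."""
    n_in = len(batch[0][0])
    # Random initialization, scaled by fan-in.
    w1 = [[rng.gauss(0, 1 / math.sqrt(n_in)) for _ in range(n_in)] for _ in range(hidden)]
    w2 = [rng.gauss(0, 1 / math.sqrt(hidden)) for _ in range(hidden)]
    g1 = [[0.0] * n_in for _ in range(hidden)]
    g2 = [0.0] * hidden
    for x, y in batch:
        # Forward pass: tanh hidden layer, linear output.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        out = sum(w * hj for w, hj in zip(w2, h))
        err = out - y  # derivative of 0.5 * (out - y)^2 with respect to out
        # Backward pass: accumulate gradients summed over the batch.
        for j in range(hidden):
            g2[j] += err * h[j]
            back = err * w2[j] * (1 - h[j] ** 2)  # chain rule through tanh
            for i in range(n_in):
                g1[j][i] += back * x[i]
    sq = sum(g * g for g in g2) + sum(g * g for row in g1 for g in row)
    return math.sqrt(sq)

def rank_candidates(widths, batch, seed=0):
    """Rank hidden-layer widths by proxy score, highest first (an assumed convention)."""
    rng = random.Random(seed)
    scores = {h: grad_norm_proxy(h, batch, rng) for h in widths}
    return sorted(scores, key=scores.get, reverse=True), scores

rng = random.Random(1)
# Toy batch of 16 two-feature examples with alternating 0/1 targets.
batch = [((rng.uniform(-1, 1), rng.uniform(-1, 1)), float(i % 2)) for i in range(16)]
ranking, scores = rank_candidates([2, 8, 32], batch)
print(ranking)
```

Each candidate costs one forward and one backward pass instead of a full training run, which is where the savings come from. The ranking is only a guess at which model would train best, so a score like this can mislead.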
The Download: how we can limit global warming, and GPT-4’s early adopters
Time is running short to limit global warming to 1.5 °C (2.7 °F) above preindustrial levels, but there are feasible and effective solutions on the table, according to a new UN climate report.
Despite decades of warnings from scientists, global greenhouse-gas emissions are still climbing, hitting a record high in 2022. If humanity wants to limit the worst effects of climate change, annual greenhouse-gas emissions will need to be cut by nearly half between now and 2030, according to the report.
That will be complicated and expensive. But it is nonetheless doable, and the UN listed a number of specific ways we can achieve it. Read the full story.
How people are using GPT-4
Last week was intense for AI news, with a flood of major product releases from a number of leading companies. But one announcement outshone them all: OpenAI’s new multimodal large language model, GPT-4. William Douglas Heaven, our senior AI editor, got an exclusive preview. Read about his initial impressions.
Unlike OpenAI’s viral hit ChatGPT, which is freely accessible to the general public, GPT-4 is currently accessible only to developers. It’s still early days for the tech, and it’ll take a while for it to feed through into new products and services. Still, people are already testing its capabilities out in the open. Read about some of the most fun and interesting ways they’re doing that, from hustling up money to writing code to reducing doctors’ workloads.
Google just launched Bard, its answer to ChatGPT—and it wants you to make it better
Google has a lot riding on this launch. Microsoft partnered with OpenAI to make an aggressive play for Google’s top spot in search. Meanwhile, Google blundered straight out of the gate when it first tried to respond. In a teaser clip for Bard that the company put out in February, the chatbot was shown making a factual error. Google’s value fell by $100 billion overnight.
Google won’t share many details about how Bard works: large language models, the technology behind this wave of chatbots, have become valuable IP. But it will say that Bard is built on top of a new version of LaMDA, Google’s flagship large language model. Google says it will update Bard as the underlying tech improves. Like ChatGPT and GPT-4, Bard is fine-tuned using reinforcement learning from human feedback, a technique that trains a large language model to give more useful and less toxic responses.
Google has been working on Bard for a few months behind closed doors but says that it’s still an experiment. The company is now making the chatbot available for free to people in the US and the UK who sign up to a waitlist. These early users will help test and improve the technology. “We’ll get user feedback, and we will ramp it up over time based on that feedback,” says Google’s vice president of research, Zoubin Ghahramani. “We are mindful of all the things that can go wrong with large language models.”
But Margaret Mitchell, chief ethics scientist at AI startup Hugging Face and former co-lead of Google’s AI ethics team, is skeptical of this framing. Google has been working on LaMDA for years, she says, and she thinks pitching Bard as an experiment “is a PR trick that larger companies use to reach millions of customers while also removing themselves from accountability if anything goes wrong.”
Google wants users to think of Bard as a sidekick to Google Search, not a replacement. A button that sits below Bard’s chat widget says “Google It.” The idea is to nudge users to head to Google Search to check Bard’s answers or find out more. “It’s one of the things that help us offset limitations of the technology,” says Krawczyk.
“We really want to encourage people to actually explore other places, sort of confirm things if they’re not sure,” says Ghahramani.
This acknowledgement of Bard’s flaws has shaped the chatbot’s design in other ways, too. Users can interact with Bard only a handful of times in any given session. This is because the longer large language models engage in a single conversation, the more likely they are to go off the rails. Many of the weirder responses from Bing Chat that people have shared online emerged at the end of drawn-out exchanges, for example.
Google won’t confirm what the conversation limit will be for launch, but it will be set quite low for the initial release and adjusted depending on user feedback.
Google is also playing it safe in terms of content. Users will not be able to ask for sexually explicit, illegal, or harmful material (as judged by Google) or personal information. In my demo, Bard would not give me tips on how to make a Molotov cocktail. That’s standard for this generation of chatbot. But it would also not provide any medical information, such as how to spot signs of cancer. “Bard is not a doctor. It’s not going to give medical advice,” says Krawczyk.
Perhaps the biggest difference between Bard and ChatGPT is that Bard produces three versions of every response, which Google calls “drafts.” Users can click between them and pick the response they prefer, or mix and match between them. The aim is to remind people that Bard cannot generate perfect answers. “There’s the sense of authoritativeness when you only see one example,” says Krawczyk. “And we know there are limitations around factuality.”
How AI experts are using GPT-4
Hoffman got access to the system last summer and has since been writing up his thoughts on the different ways the AI model could be used in education, the arts, the justice system, journalism, and more. In the book, which includes copy-pasted extracts from his interactions with the system, he outlines his vision for the future of AI, uses GPT-4 as a writing assistant to get new ideas, and analyzes its answers.
A quick final word … GPT-4 is the cool new shiny toy of the moment for the AI community. There’s no denying it is a powerful assistive technology that can help us come up with ideas, condense text, explain concepts, and automate mundane tasks. That’s a welcome development, especially for white-collar knowledge workers.
However, it’s notable that OpenAI itself urges caution around use of the model and warns that it poses several safety risks, including infringing on privacy, fooling people into thinking it’s human, and generating harmful content. It also has the potential to be used for other risky behaviors we haven’t encountered yet. So by all means, get excited, but let’s not be blinded by the hype. At the moment, there is nothing stopping people from using these powerful new models to do harmful things, and nothing to hold them accountable if they do.
Chinese tech giant Baidu just released its answer to ChatGPT
So. Many. Chatbots. The latest player to enter the AI chatbot game is Chinese tech giant Baidu. Late last week, Baidu unveiled a new large language model called Ernie Bot, which can solve math questions, write marketing copy, answer questions about Chinese literature, and generate multimedia responses.
A Chinese alternative: Ernie Bot (the name stands for “Enhanced Representation from kNowledge IntEgration”; its Chinese name is 文心一言, or Wenxin Yiyan) performs particularly well on tasks specific to Chinese culture, like explaining a historical fact or writing a traditional poem. Read more from my colleague Zeyi Yang.
Even Deeper Learning
Language models may be able to “self-correct” biases—if you ask them to
Large language models are infamous for spewing toxic biases, thanks to the reams of awful human-produced content they get trained on. But if the models are large enough, they may be able to self-correct for some of these biases. Remarkably, all we might have to do is ask.