

Learning about AI with Google Brain and Landing AI founder Andrew Ng




This interview has been condensed and lightly edited for clarity.

MIT Technology Review: I’m sure people frequently ask you, “How do I build an AI-first business?” What do you usually say to that?

Andrew Ng: I usually say, “Don’t do that.” If I go to a team and say, “Hey, everyone, please be AI-first,” that tends to focus the team on technology, which might be great for a research lab. But in terms of how I execute the business, I tend to be customer-led or mission-led, almost never technology-led.

You now have this new venture called Landing AI. Can you tell us a bit about what it is, and why you chose to work on it?

After heading the AI teams at Google and Baidu, I realized that AI has transformed the consumer software internet, like web search and online advertising. But I wanted to take AI to all of the other industries, which is an even bigger part of the economy. So after looking at a lot of different industries, I decided to focus on manufacturing. I think that multiple industries are AI-ready, but one of the patterns of an industry being more AI-ready is that it has undergone some digital transformation, so there's some data. That creates an opportunity for AI teams to come in and use the data to create value.

So one of the projects that I’ve been excited about recently is manufacturing visual inspection. Can you look at a picture of a smartphone coming off the manufacturing line and see if there’s a defect in it? Or look at an auto component and see if there’s a dent in it? One huge difference: in the consumer software internet, you might have a billion users and a huge amount of data. But in manufacturing, no factory has manufactured a billion or even a million scratched smartphones. Thank goodness for that. So the challenge is, can you get an AI to work with a hundred images? It turns out often you can. I’ve actually been surprised quite a lot of times by how much you can do with even modest amounts of data. And so even though all the hype and excitement and PR around AI is about giant data sets, I feel like there’s a lot of room we need to grow as well, to break open these other applications where the challenges are quite different.

How do you do that?

A very frequent mistake I see CEOs and CIOs make: they say to me something like “Hey, Andrew, we don’t have that much data—my data’s a mess. So give me two years to build a great IT infrastructure. Then we’ll have all this great data on which to build AI.” I always say, “That’s a mistake. Don’t do that.” First, I don’t think any company on the planet today—maybe not even the tech giants—thinks their data is completely clean and perfect. It’s a journey. Spending two or three years to build a beautiful data infrastructure means that you’re lacking feedback from the AI team to help prioritize what IT infrastructure to build.

For example, if you have a lot of users, should you prioritize asking them questions in a survey to get a little bit more data? Or in a factory, should you prioritize upgrading the sensor from something that records the vibrations 10 times a second to maybe 100 times a second? It is often starting to do an AI project with the data you already have that enables an AI team to give you the feedback to help prioritize what additional data to collect.

In industries where we just don’t have the scale of consumer software internet, I feel like we need to shift in mindset from big data to good data. If you have a million images, go ahead, use it—that’s great. But there are lots of problems that can use much smaller data sets that are cleanly labeled and carefully curated.

Could you give an example? What do you mean by good data?

Let me first give an example from speech recognition. When I was working with voice search, you would get audio clips where you would hear someone say, “Um today’s weather.” The question is, what is the right transcription for that audio clip? Is it “Um (comma) today’s weather,” or is it “Um (dot, dot, dot) today’s weather,” or is the “Um” something we just don’t transcribe? It turns out any one of these is fine, but what is not fine is if different transcribers use each of the three labeling conventions. Then your data is noisy, and it hurts the speech recognition system. Now, when you have millions or a billion users, you can have that noisy data and just average it—the learning algorithm will do fine. But if you are in a setting where you have a smaller data set—say, a hundred examples—then this type of noisy data has a huge impact on performance.
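The consistency point can be made concrete with a small sketch (hypothetical code, not something from the interview): pick one convention for filler words and apply it to every transcriber's output before training, so all three labeling styles collapse to the same label.

```python
import re

# Illustrative only: enforce a single labeling convention for filler
# words ("um", "uh") so that every transcriber's output looks the same.
# The convention chosen here is to drop fillers entirely; the point is
# consistency, not which convention you pick.
FILLERS = {"um", "uh", "erm"}

def normalize_transcript(text: str) -> str:
    # Lowercase, keep only word characters (and apostrophes), drop fillers.
    words = re.findall(r"[a-z']+", text.lower())
    kept = [w for w in words if w not in FILLERS]
    return " ".join(kept)

# Three transcribers, three conventions, one normalized label:
variants = [
    "Um, today's weather",
    "Um... today's weather",
    "today's weather",
]
labels = {normalize_transcript(v) for v in variants}
# All three variants now map to the single label "today's weather".
```

With a hundred training examples instead of a billion, this kind of normalization pass is cheap and removes exactly the label noise Ng describes.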

Another example from manufacturing: we did a lot of work on steel inspection. If you drive a car, the side of your car was once made of a sheet of steel. Sometimes there are little wrinkles in the steel, or little dents or specks on it. So you can use a camera and computer vision to see if there are defects or not. But different labelers will label the data differently. Some will put a giant bounding box around the whole region. Some will put little bounding boxes around the little particles. When you have a modest data set, making sure that the different quality inspectors label the data consistently—that turns out to be one of the most important things.
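One way a team might catch the bounding-box inconsistency Ng describes (an illustrative sketch, not a method from the interview) is to measure agreement between two inspectors' boxes with intersection-over-union (IoU) and flag low-agreement images for review.

```python
# Hypothetical sketch: flag inconsistent defect labels by measuring
# intersection-over-union (IoU) between two inspectors' bounding boxes
# for the same image. Boxes are (x1, y1, x2, y2) corner coordinates.

def iou(a, b):
    # Intersection rectangle between the two boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# One inspector drew a giant box over the whole region; another drew a
# tight box around the particle. The low IoU flags the image for review.
big = (0, 0, 100, 100)
tight = (40, 40, 60, 60)
needs_review = iou(big, tight) < 0.5
```

On a modest data set, reviewing the handful of flagged images by hand is feasible, which is exactly why label consistency becomes the highest-leverage fix.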

For a lot of AI projects, the open-source model you download off GitHub—the neural network that you can get from the literature—is good enough. Not for all problems, but for many of them. So I’ve gone to many of my teams and said, “Hey, everyone, the neural network is good enough. Let’s not mess with the code anymore. The only thing you’re going to do now is build processes to improve the quality of the data.” And it turns out that often results in faster improvements to the performance of the algorithm.

What is the data size you are thinking about when you say smaller data sets? Are you talking about a hundred examples? Ten examples?

Machine learning is so diverse that it’s become really hard to give one-size-fits-all answers. I’ve worked on problems where I had about 200 to 300 million images. I’ve also worked on problems where I had 10 images, and everything in between. When I look at manufacturing applications, I think something like tens or maybe a hundred images for a defect class is not unusual, but there’s very wide variance even within the factory.

I do find that the AI practices switch over when the training set sizes go under, let’s say, 10,000 examples, because that’s sort of the threshold where the engineer can basically look at every example and design it themselves and then make a decision.

Recently I was chatting with a very good engineer in one of the large tech companies. And I asked, “Hey, what do you do if the labels are inconsistent?” And he said, “Well, we have this team of several hundred people overseas that does the labeling. So I’ll write the labeling instructions, get three people to label every image, and then I’ll take an average.” And I said, “Yep, that’s the right thing to do when you have a giant data set.” But when I work with a smaller team and the labels are inconsistent, I just track down the two people that disagree with each other, get both of them on a Zoom call, and have them talk to each other to try to reach a resolution.
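The big-team recipe described here, several labels per image resolved by taking the most common one, can be sketched as follows (illustrative code; the names and data are made up), with no-majority ties routed to a discussion between the disagreeing labelers rather than averaged away:

```python
from collections import Counter

def aggregate(labels_per_image):
    """labels_per_image: dict mapping image id -> list of labels
    from independent labelers. Returns (consensus, disputed)."""
    consensus, disputed = {}, []
    for image_id, labels in labels_per_image.items():
        (label, votes), = Counter(labels).most_common(1)
        if votes > len(labels) // 2:
            # Strict majority: accept the label, big-data style.
            consensus[image_id] = label
        else:
            # No majority: with a small team, route this image to a
            # conversation between the labelers instead of averaging.
            disputed.append(image_id)
    return consensus, disputed

votes = {
    "img_001": ["scratch", "scratch", "dent"],   # 2-of-3 agree
    "img_002": ["scratch", "dent", "speck"],     # three-way tie
}
consensus, disputed = aggregate(votes)
```

At giant scale the `disputed` list is noise you can afford to discard or re-label mechanically; at a hundred examples, each disputed image is worth a Zoom call.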

I want to turn our attention now to talk about your thoughts on the general AI industry. The Algorithm is our AI newsletter, and I gave our readers an opportunity to submit some questions to you in advance. One reader asks: AI development seems to have mostly bifurcated toward either academic research or large-scale, resource-intensive, big company programs like OpenAI and DeepMind. That doesn’t really leave a lot of space for small startups to contribute. What do you think are some practical problems that smaller companies can really focus on to help drive real commercial adoption of AI?

I think a lot of the media attention tends to be on the large corporations, and sometimes on the large academic institutions. But if you go to academic conferences, there’s plenty of work done by smaller research groups and research labs. And when I speak with different people in different companies and industries, I feel like there are so many business applications they could use AI to tackle. I usually go to business leaders and ask, “What are your biggest business problems? What are the things that worry you the most?” so I can better understand the goals of the business and then brainstorm whether or not there is an AI solution. And sometimes there isn’t, and that’s fine.

Maybe I’ll just mention a couple of gaps that I find exciting. I think that today, building AI systems is still very manual. You have a few brilliant machine-learning engineers and data scientists doing things on a computer and then pushing things to production. There are a lot of manual steps in the process. So I’m excited about MLOps [machine learning operations] as an emerging discipline to help make the process of building and deploying AI systems more systematic.

Also, if you look at a lot of the typical business problems—all the functions from marketing to talent—there’s a lot of room for automation and efficiency improvement.

I also hope that the AI community can look at the biggest social problems—see what we can do for climate change or homelessness or poverty. In addition to the sometimes very valuable business problems, we should work on the biggest social problems too.

How do you actually go about the process of identifying whether there is an opportunity to pursue something with machine learning for your business?

I will try to learn a little bit about the business myself and try to help the business leaders learn a little bit about AI. Then we usually brainstorm a set of projects, and for each of the ideas, I will do both technical diligence and business diligence. We’ll look at: Do you have enough data? What’s the accuracy? Is there a long tail when you deploy into production? How do you feed the data back and close the loop for continuous learning? So—making sure the problem is technically feasible. And then business diligence: we make sure that this will achieve the ROI that we’re hoping for. After that process, you have the usual steps, like estimating the resources and milestones, and then hopefully going into execution.

One other suggestion: it’s more important to start quickly, and it’s okay to start small. My first meaningful business application at Google was speech recognition, not web search or advertising. But by helping the Google speech team make speech recognition more accurate, the Brain team gained the credibility and the wherewithal to go after bigger and bigger partnerships. Google Maps was the second big partnership, where we used computer vision to read house numbers and geolocate houses on Google Maps. And only after those first two successful projects did I have a more serious conversation with the advertising team. So I think I see more companies fail by starting too big than fail by starting too small. It’s fine to do a smaller project to get started as an organization, to learn what it feels like to use AI, and then go on to build bigger successes.

What is one thing that our audience should start doing tomorrow to implement AI in their companies?

Jump in. AI is causing a shift in the dynamics of many industries. So if your company isn’t already making pretty aggressive and smart investments, this is a good time.


Audio Postcard: Real-time farming




Pinot Grigio actually makes a white wine. It’s one of a few varieties in California that, uh, it’s a pretty common variety where we actually grow purple grapes that make a white wine. So my name is Dirk Heuvel, and I’m the VP of vineyard operations here at McManis Family Vineyards.

My family actually kind of set roots here, farming almonds. And some people say “almonds” one way; here in Ripon, we say it another.

I feel like, if it was my dad or my grandpa trying to adopt this technology, absolutely, I think there’d be a huge culture shock there for them. I still think they don’t quite understand it, but they’re seeing the results of it. So I think that’s the most important thing—that we’re able to show them that it is working and how it’s working for us.

I will say, today, I feel that we’re growing better-quality grapes than we were 30 years ago, just by adopting a lot of this aerial imagery and modern irrigation technology: running drip-system technology, you know, being able to fertilize through drip systems. And you can actually look at the imaging on your phone and pinpoint and walk out to a specific vine. That might be a vine that died, that shows up on the aerial imaging. You can use the technology and walk right into a specific area, just being able to identify areas, you know, using GPS. We can have field checkers go through the field now, and on their app they’re able to actually drop a pin where we might have mite issues or, you know, leafhopper issues—areas that need to get treated. And that actually allows us to go through and do site-specific treatment. Instead of treating an entire vineyard block, we’re able to just treat specific areas.

Jennifer: It was only, what, five or seven years ago that half of farm workers weren’t using smartphones.

Dirk Heuvel: Yeah. 

Jennifer: So, if people are dropping pins that’s…

Dirk Heuvel: Yeah. You know, 30 years ago, in order to make a phone call, you’d have to drive into town or go to your house to call your irrigator to do stuff. And now it’s almost like real-time farming. Now we can make decisions on the fly. And one of the big advantages to using variable-rate applications is that you’re only applying the amount of nutrients or amendments that are needed for a specific area. Before we adopted this variable-rate technology, we would drive down a row and put down a consistent amount of amendments, whether it be gypsum, lime, or soil sulfur; we would apply that amount evenly throughout the entire vineyard block. Now we realize, going through and using this variable-rate technology, that we might cut the amendments that are needed by 20 to 30% on a specific vineyard block, just by applying the correct amounts of nutrients where they’re needed and not over-applying where they’re not needed.



The Download: dual-driving AI, and Russia’s Telegram propaganda




This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

This startup’s AI is smart enough to drive different types of vehicles

The news: Wayve, a driverless-car startup based in London, has made a machine-learning model that can drive two different types of vehicle: a passenger car and a delivery van. It is the first time the same AI driver has learned to drive multiple vehicles.

Why it matters: While robotaxis have made it to a handful of streets in Phoenix and San Francisco, their success has been limited. Wayve is part of a new generation of startups ditching the traditional robotics mindset—where driverless cars rely on super-detailed 3D maps and modules for sensing and planning. Instead, these startups rely entirely on AI to drive the vehicles.

What’s next: The advance suggests that Wayve’s approach to autonomous vehicles, in which a deep-learning model is trained to drive from scratch, could help it scale up faster than its leading rivals. Read the full story.

—Will Douglas Heaven

Russia’s battle to convince people to join its war is being waged on Telegram

Putin’s propaganda: When Vladimir Putin declared the partial call-up of military reservists on September 21, in a desperate effort to try to turn his long and brutal war in Ukraine in Russia’s favor, he kicked off another, parallel battle: one to convince the Russian people of the merits and risks of conscription. And this one is being fought on the encrypted messaging service Telegram.

Opposing forces: Following the announcement, pro-Kremlin Telegram channels began to line up dutifully behind Putin’s plans, eager to promote the idea that the war he is waging is just and winnable. But whether this vein of propaganda is working is far from certain. For all the work the government is doing to try to control the narrative, there’s a vibrant opposition on the same platform working to undermine it—and offering support for those seeking to dodge the draft. Read the full story.

—Chris Stokel-Walker

NASA’s DART mission is on track to crash into an asteroid today

NASA’s Double Asteroid Redirection Test spacecraft, or DART, is on course to collide with the asteroid Dimorphos at 7:14 p.m. ET today. Though Dimorphos is not about to collide with Earth, DART is intended to demonstrate the ability to deflect an asteroid like it that is headed our way, should one ever be discovered.

Read more about the DART mission, and how the crash is likely to play out.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 The US says Russia will face catastrophe if it uses nuclear weapons
It’s hard to know whether Putin’s threat is a bluff—or deadly serious. (The Guardian)
+ Ukrainian president Volodymyr Zelensky thinks it is very real. (CNBC)
+ What is the risk of a nuclear accident in Ukraine? (MIT Technology Review)

2 YouTube wants to lure creators away from TikTok with cash
But it won’t say how much. (MIT Technology Review)

3 Germany’s zero-tolerance for hate speech is a double-edged sword
While the threat of fines disincentivizes some perpetrators, activists worry that too many people are being targeted. (NYT $)
+ Misinformation is already shaping US voters’ decisions ahead of November’s midterms. (NYT $)

4 Why even the largest companies are vulnerable to hacking
A zero-trust approach is helpful, but will only take you so far. (WSJ $)
+ Hackers can disrupt image-recognition systems using radio waves. (New Scientist $)
+ Microsoft is optimistic that AI can root out bad actors. (Bloomberg $)
+ The hacking industry faces the end of an era. (MIT Technology Review)

5 NASA’s Artemis moon mission has been delayed again
Due to tropical storm Ian. (BBC)
+ Saudi Arabia wants to send its first female astronaut into space. (Insider $)

6 Fighting climate change extends beyond kicking corporations
A more nuanced approach could be required to speed up the transition to cleaner energy. (The Atlantic $)
+ Global wildfires mean that snow is melting quicker than usual. (Slate $)
+ Disaster insurance is increasingly tricky to navigate. (Knowable Magazine)
+ Carbon removal hype is becoming a dangerous distraction. (MIT Technology Review)

7 Crypto’s fired workers don’t know what to do next
But plenty of them haven’t let their experiences put them off the sector. (The Information $)
+ Interpol has issued a red notice for Terraform Labs’ co-founder Do Kwon. (Bloomberg $) 

8 The Danish city that banned Google
The tech giant’s handling of children’s data wasn’t properly assessed. (Wired $)
+ Google says it’s unwilling to pitch in to fund network costs in Europe. (Reuters)

9 Why neuroscience is making a comeback
Some experts are convinced that making neurology and psychiatry departments work closer together is long overdue. (Economist $)

10 How plant-based meat fell out of fashion 🍔
Evangelists are convinced the nascent industry is merely experiencing teething problems. (The Guardian)
+ Your first lab-grown burger is coming soon—and it’ll be “blended”. (MIT Technology Review)

Quote of the day

“There’s definitely the boys’ club that still exists.”

—Taryn Langer, founder of public relations firm Moxie Communications Group, tells the New York Times about her frustrations at the sexist state of the tech industry.

The big story

The quest to learn if our brain’s mutations affect mental health

August 2021

Scientists have struggled in their search for specific genes behind most brain disorders, including autism and Alzheimer’s disease. Unlike problems with some other parts of our body, the vast majority of brain disorder presentations are not linked to an identifiable gene.

But a University of California, San Diego study published in 2001 suggested a different path. What if it wasn’t a single faulty gene—or even a series of genes—that always caused cognitive issues? What if it could be the genetic differences between cells? 

The explanation had seemed far-fetched, but more researchers have begun to take it seriously. Scientists already knew that the 85 billion to 100 billion neurons in your brain work to some extent in concert—but what they want to know is whether there is a risk when some of those cells might be singing a different genetic tune. Read the full story.

—Roxanne Khamsi

We can still have nice things

A place for comfort, fun and distraction in these weird times. (Got any ideas? Drop me a line or tweet ’em at me.)

+ Some gadgets are definitely more useful than others.
+ Calling all cat lovers! This potted history of mischievous felines in French painter Alexandre-François Desportes’ work is heartwarming stuff (thanks Melissa!).
+ A useful guide to working out what you really want from life.
+ A Ukrainian startup is reportedly planning to use AI to clone the iconic voice of James Earl Jones, aka Darth Vader. 
+ The rumors are true—butter really is having a moment.



This startup’s AI is smart enough to drive different types of vehicles




Jay Gierak at Ghost, which is based in Mountain View, California, is impressed by Wayve’s demonstrations and agrees with the company’s overall viewpoint. “The robotics approach is not the right way to do this,” says Gierak.

But he’s not sold on Wayve’s total commitment to deep learning. Instead of a single large model, Ghost trains many hundreds of smaller models, each with a specialism. It then hand codes simple rules that tell the self-driving system which models to use in which situations. (Ghost’s approach is similar to that taken by another AV2.0 firm, Autobrains, based in Israel. But Autobrains uses yet another layer of neural networks to learn the rules.)

According to Volkmar Uhlig, Ghost’s co-founder and CTO, splitting the AI into many smaller pieces, each with specific functions, makes it easier to establish that an autonomous vehicle is safe. “At some point, something will happen,” he says. “And a judge will ask you to point to the code that says: ‘If there’s a person in front of you, you have to brake.’ That piece of code needs to exist.” The code can still be learned, but in a large model like Wayve’s it would be hard to find, says Uhlig.

Still, the two companies are chasing complementary goals: Ghost wants to make consumer vehicles that can drive themselves on freeways; Wayve wants to be the first company to put driverless cars in 100 cities. Wayve is now working with UK grocery giants Asda and Ocado, collecting data from their urban delivery vehicles.

Yet, by many measures, both firms are far behind the market leaders. Cruise and Waymo have racked up hundreds of hours of driving without a human in their cars and already offer robotaxi services to the public in a small number of locations.

“I don’t want to diminish the scale of the challenge ahead of us,” says Hawke. “The AV industry teaches you humility.”


Copyright © 2021 Seminole Press.