
Deep learning set off the latest AI revolution, transforming computer vision and the field as a whole. Hinton believes deep learning should be almost all that’s needed to fully replicate human intelligence.

But despite rapid progress, there are still major challenges. Expose a neural net to an unfamiliar data set or a foreign environment, and it reveals itself to be brittle and inflexible. Self-driving cars and essay-writing language generators impress, but things can go awry. AI visual systems can be easily confused: a coffee mug recognized from the side becomes an unknown object when seen from above if the system was never trained on that view; and with the manipulation of a few pixels, a panda can be mistaken for an ostrich, or even a school bus.

GLOM addresses two of the most difficult problems for visual perception systems: understanding a whole scene in terms of objects and their natural parts; and recognizing objects when seen from a new viewpoint. (GLOM’s focus is on vision, but Hinton expects the idea could be applied to language as well.)

An object such as Hinton’s face, for instance, is made up of his lively if dog-tired eyes (too many people asking questions; too little sleep), his mouth and ears, and a prominent nose, all topped by a not-too-untidy tousle of mostly gray. And given his nose, he is easily recognized even on first sight in profile view.

Both of these factors—the part-whole relationship and the viewpoint—are, from Hinton’s perspective, crucial to how humans do vision. “If GLOM ever works,” he says, “it’s going to do perception in a way that’s much more human-like than current neural nets.”

Grouping parts into wholes, however, can be a hard problem for computers, since parts are sometimes ambiguous. A circle could be an eye, or a doughnut, or a wheel. As Hinton explains it, the first generation of AI vision systems tried to recognize objects by relying mostly on the geometry of the part-whole relationship—the spatial orientation among the parts and between the parts and the whole. The second generation instead relied mostly on deep learning—letting the neural net train on large amounts of data. With GLOM, Hinton combines the best aspects of both approaches.

“There’s a certain intellectual humility that I like about it,” says Gary Marcus, founder and CEO of Robust.AI and a well-known critic of the heavy reliance on deep learning. Marcus admires Hinton’s willingness to challenge something that brought him fame, to admit it’s not quite working. “It’s brave,” he says. “And it’s a great corrective to say, ‘I’m trying to think outside the box.’”

The GLOM architecture

In crafting GLOM, Hinton tried to model some of the mental shortcuts—intuitive strategies, or heuristics—that people use in making sense of the world. “GLOM, and indeed much of Geoff’s work, is about looking at heuristics that people seem to have, building neural nets that could themselves have those heuristics, and then showing that the nets do better at vision as a result,” says Nick Frosst, a computer scientist at a language startup in Toronto who worked with Hinton at Google Brain.

With visual perception, one strategy is to parse parts of an object—such as different facial features—and thereby understand the whole. If you see a certain nose, you might recognize it as part of Hinton’s face; it’s a part-whole hierarchy. To build a better vision system, Hinton says, “I have a strong intuition that we need to use part-whole hierarchies.” Human brains understand this part-whole composition by creating what’s called a “parse tree”—a branching diagram demonstrating the hierarchical relationship between the whole, its parts and subparts. The face itself is at the top of the tree, and the component eyes, nose, ears, and mouth form the branches below.
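
To make the part-whole idea concrete, here is a minimal sketch of such a parse tree in code. The node names and structure below are illustrative choices, not something prescribed by Hinton's paper.

```python
# A minimal sketch of a part-whole parse tree for a face.
# Node names and structure are illustrative, not taken from Hinton's paper.

class ParseNode:
    def __init__(self, name, children=None):
        self.name = name                # e.g. "face", "nose", "nostril"
        self.children = children or []  # parts one level down the hierarchy

    def print_tree(self, depth=0):
        print("  " * depth + self.name)
        for child in self.children:
            child.print_tree(depth + 1)

# The whole sits at the root; parts and subparts branch below it.
face = ParseNode("face", [
    ParseNode("eye", [ParseNode("iris"), ParseNode("pupil")]),
    ParseNode("nose", [ParseNode("bridge"), ParseNode("tip")]),
    ParseNode("mouth"),
    ParseNode("ear"),
])
face.print_tree()
```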

One of Hinton’s main goals with GLOM is to replicate the parse tree in a neural net—this would distinguish it from neural nets that came before. For technical reasons, it’s hard to do. “It’s difficult because each individual image would be parsed by a person into a unique parse tree, so we would want a neural net to do the same,” says Frosst. “It’s hard to get something with a static architecture—a neural net—to take on a new structure—a parse tree—for each new image it sees.” Hinton has made various attempts. GLOM is a major revision of capsule networks, his previous attempt from 2017, combined with other related advances in the field.

“I’m part of a nose!”


A generalized way of thinking about the GLOM architecture is as follows: The image of interest (say, a photograph of Hinton’s face) is divided into a grid. Each region of the grid is a “location” on the image—one location might contain the iris of an eye, while another might contain the tip of his nose. For each location in the net there are about five layers, or levels. And level by level, the system makes a prediction, with a vector representing the content or information. At a level near the bottom, the vector representing the tip-of-the-nose location might predict: “I’m part of a nose!” And at the next level up, in building a more coherent representation of what it’s seeing, the vector might predict: “I’m part of a face at side-angle view!”
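
As a rough sketch of that layout, the net’s state at any moment can be pictured as one embedding vector per location per level. The grid size, number of levels, and vector dimension below are arbitrary assumptions for illustration; Hinton’s paper suggests roughly five levels but fixes none of these numbers.

```python
import numpy as np

# Illustrative sizes only; these are assumptions, not values from the paper.
GRID_H, GRID_W = 16, 16   # the image divided into 16x16 locations
NUM_LEVELS = 5            # e.g. nostril -> nose -> face -> scene
DIM = 64                  # length of each embedding vector

# One vector per location per level; this whole array is the net's state.
state = np.random.randn(GRID_H, GRID_W, NUM_LEVELS, DIM)

# The vector for one location at one level, say the tip-of-the-nose patch
# at a mid-level, encodes a prediction like "I'm part of a nose!"
tip_of_nose = state[8, 10, 2]   # shape: (64,)
```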

But then the question is, do neighboring vectors at the same level agree? When in agreement, vectors point in the same direction, toward the same conclusion: “Yes, we both belong to the same nose.” Or, further up the parse tree: “Yes, we both belong to the same face.”

Seeking consensus about the nature of an object—about what precisely the object is, ultimately—GLOM’s vectors iteratively, location by location and level by level, average with neighboring vectors at the same level, as well as with predicted vectors from the levels above and below.

However, the net doesn’t “willy-nilly average” with just anything nearby, says Hinton. It averages selectively, with neighboring predictions that display similarities. “This is kind of well-known in America, this is called an echo chamber,” he says. “What you do is you only accept opinions from people who already agree with you; and then what happens is that you get an echo chamber where a whole bunch of people have exactly the same opinion. GLOM actually uses that in a constructive way.” The analogous phenomenon in Hinton’s system is those “islands of agreement.”
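
A heavily simplified sketch of one such update for a single location and level might look like this. The similarity-weighted averaging over same-level neighbors is the “echo chamber” step; the identity placeholders standing in for the learned bottom-up and top-down prediction networks, and the softmax weighting itself, are assumptions made here for illustration rather than details taken from the paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def update_location(state, i, j, level):
    """One illustrative GLOM-style update for the vector at one location/level.

    Averages three signals: a bottom-up prediction from the level below,
    a top-down prediction from the level above, and a similarity-weighted
    average of same-level neighbors (the "echo chamber" step). Real GLOM
    would use learned neural nets for the predictions; identity functions
    are used here purely as placeholders.
    """
    H, W, L, D = state.shape
    current = state[i, j, level]

    # Placeholder bottom-up / top-down predictions (identity stand-ins).
    bottom_up = state[i, j, level - 1] if level > 0 else current
    top_down = state[i, j, level + 1] if level < L - 1 else current

    # Gather same-level neighbors in a 3x3 window around this location.
    neighbors = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            ni, nj = i + di, j + dj
            if (di, dj) != (0, 0) and 0 <= ni < H and 0 <= nj < W:
                neighbors.append(state[ni, nj, level])
    neighbors = np.stack(neighbors)

    # Echo-chamber attention: weight neighbors by how much they already
    # agree with this vector, so similar opinions dominate the average.
    weights = softmax(neighbors @ current)
    neighbor_consensus = weights @ neighbors

    # New value: a simple average of the three signals.
    return (bottom_up + top_down + neighbor_consensus) / 3.0
```

Repeating an update like this over every location and level is what lets nearby vectors drift toward shared conclusions, the “islands of agreement” described next.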


“Imagine a bunch of people in a room, shouting slight variations of the same idea,” says Frosst—or imagine those people as vectors pointing in slight variations of the same direction. “They would, after a while, converge on the one idea, and they would all feel it stronger, because they had it confirmed by the other people around them.” That’s how GLOM’s vectors reinforce and amplify their collective predictions about an image.

GLOM uses these islands of agreeing vectors to accomplish the trick of representing a parse tree in a neural net. Whereas some recent neural nets use agreement among vectors for activation, GLOM uses agreement for representation—building up representations of things within the net. For instance, when several vectors agree that they all represent part of the nose, their small cluster of agreement collectively represents the nose in the net’s parse tree for the face. Another smallish cluster of agreeing vectors might represent the mouth in the parse tree; and the big cluster at the top of the tree would represent the emergent conclusion that the image as a whole is Hinton’s face. “The way the parse tree is represented here,” Hinton explains, “is that at the object level you have a big island; the parts of the object are smaller islands; the subparts are even smaller islands, and so on.”
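
Once the vectors have settled, reading off the parse tree amounts to grouping locations whose vectors at a given level are nearly identical. The sketch below uses an explicit cosine-similarity threshold to find those groups; this clustering pass is purely illustrative, since GLOM represents the islands implicitly rather than computing them as a separate step.

```python
import numpy as np

def islands_at_level(level_vectors, threshold=0.95):
    """Group grid locations whose vectors nearly agree (cosine similarity
    above `threshold`) into "islands". Returns a grid of island labels.
    Illustrative only: GLOM does not run an explicit clustering pass.
    """
    H, W, D = level_vectors.shape
    flat = level_vectors.reshape(-1, D)
    normed = flat / np.linalg.norm(flat, axis=1, keepdims=True)

    labels = -np.ones(H * W, dtype=int)
    next_label = 0
    for idx in range(H * W):
        if labels[idx] != -1:
            continue
        # Start a new island with every unlabeled location that agrees.
        similar = normed @ normed[idx] > threshold
        labels[np.logical_and(similar, labels == -1)] = next_label
        next_label += 1
    return labels.reshape(H, W)

# Large islands at the top level correspond to whole objects (the face);
# smaller islands at lower levels correspond to parts (nose, mouth).
```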

Figure 2 from Hinton’s GLOM paper. The islands of identical vectors (arrows of the same color) at the various levels represent a parse tree.


According to Hinton’s long-time friend and collaborator Yoshua Bengio, a computer scientist at the University of Montreal, if GLOM manages to solve the engineering challenge of representing a parse tree in a neural net, it would be a feat—it would be important for making neural nets work properly. “Geoff has produced amazingly powerful intuitions many times in his career, many of which have proven right,” Bengio says. “Hence, I pay attention to them, especially when he feels as strongly about them as he does about GLOM.”

The strength of Hinton’s conviction is rooted not only in the echo chamber analogy, but also in mathematical and biological analogies that inspired and justified some of the design decisions in GLOM’s novel engineering.

“Geoff is a highly unusual thinker in that he is able to draw upon complex mathematical concepts and integrate them with biological constraints to develop theories,” says Sue Becker, a former student of Hinton’s, now a computational cognitive neuroscientist at McMaster University. “Researchers who are more narrowly focused on either the mathematical theory or the neurobiology are much less likely to solve the infinitely compelling puzzle of how both machines and humans might learn and think.”

Turning philosophy into engineering

So far, Hinton’s new idea has been well received, especially in some of the world’s greatest echo chambers. “On Twitter, I got a lot of likes,” he says. And a YouTube tutorial laid claim to the term “MeGLOMania.”

Hinton is the first to admit that at present GLOM is little more than philosophical musing (he spent a year as a philosophy undergrad before switching to experimental psychology). “If an idea sounds good in philosophy, it is good,” he says. “How would you ever have a philosophical idea that just sounds like rubbish, but actually turns out to be true? That wouldn’t pass as a philosophical idea.” Science, by comparison, is “full of things that sound like complete rubbish” but turn out to work remarkably well—for example, neural nets, he says.

GLOM is designed to sound philosophically plausible. But will it work?

A bot that watched 70,000 hours of Minecraft could unlock AI’s next big thing


The researchers claim that their approach could be used to train AI to carry out other tasks. To begin with, it could be used for bots that use a keyboard and mouse to navigate websites, book flights, or buy groceries online. But in theory it could be used to train robots to carry out physical, real-world tasks by copying first-person video of people doing those things. “It’s plausible,” says Stone.

Matthew Guzdial at the University of Alberta, Canada, who has used videos to teach AI the rules of games like Super Mario Bros., does not think it will happen any time soon, however. Actions in games like Minecraft and Super Mario Bros. are performed by pressing buttons. Actions in the physical world are far more complicated and harder for a machine to learn. “It unlocks a whole mess of new research problems,” says Guzdial.

“This work is another testament to the power of scaling up models and training on massive datasets to get good performance,” says Natasha Jaques, who works on multi-agent reinforcement learning at Google and the University of California, Berkeley. 

Large internet-sized data sets will certainly unlock new capabilities for AI, says Jaques. “We’ve seen that over and over again, and it’s a great approach.” But OpenAI places a lot of faith in the power of large data sets alone, she says: “Personally, I’m a little more skeptical that data can solve any problem.”

Still, Baker and his colleagues think that collecting more than a million hours of Minecraft videos will make their AI even better. It’s probably the best Minecraft-playing bot yet, says Baker: “But with more data and bigger models I would expect it to feel like you’re watching a human playing the game, as opposed to a baby AI trying to mimic a human.”


The Download: AI conquers Minecraft, and babies after death


+ Scientists have found a way to mature eggs from transgender men in the lab. It could offer them new ways to start a family—without the need for distressing IVF procedures. Read the full story.
+ How reproductive technology is changing what it means to be a parent. Advances could lead to babies with four or more biological parents—forcing us to reconsider parenthood. Read the full story.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Elon Musk wants to reinstate banned Twitter accounts
It’s an incredibly dangerous decision with widespread repercussions. (WP $) 
+ Recent departures have hit Twitter’s policy and safety divisions hard. (WSJ $)
+ It looks like Musk’s promise of no further layoffs was premature. (Insider $)
+ Meanwhile, Twitter Blue is still reportedly launching next week. (Reuters)
+ Imagine simply transferring your followers to another platform. (FT $)
+ Twitter’s potential collapse could wipe out vast records of recent human history. (MIT Technology Review)

2 Russia’s energy withdrawal could kill tens of thousands in Europe 
High fuel costs could result in more deaths this winter than the war in Ukraine. (Economist $)
+ Higher gas prices will also hit Americans as the weather worsens. (Vox)
+ Russia’s invasion of Ukraine underscores Europe’s deep reliance on Russian fossil fuels. (MIT Technology Review)

3 FTX is unable to honor the grants it promised various organizations 
Many of them are having to seek emergency funding to plug the gaps. (WSJ $)
+ Bahamians aren’t thrilled about what its collapse could mean for them. (WP $)

4 It’s a quieter Black Friday than usual
Shopping isn’t much of a priority right now. (Bloomberg $)
+ If you do decide to shop, make sure you don’t get scammed. (Wired $)

5 The UK is curbing its use of Chinese surveillance systems 
But only on “sensitive” government sites. (FT $)
+ The world’s biggest surveillance company you’ve never heard of. (MIT Technology Review)

6 Long covid is still incredibly hard to treat 
Its symptoms vary wildly, which can make it hard to track, too. (Undark)
+ A universal flu vaccine is looking promising. (New Scientist $)

7 San Francisco’s police department is considering letting robots use deadly force
The force has 12 remotely piloted robots that could, in theory, kill someone. (The Verge)

8 Human hibernation could be the key to getting us to Mars 
It could be the closest we can get to time travel. (Wired $)

9 Why TikTok is suddenly so obsessed with dabloons 
It’s a form of choose-your-own-adventure fun. (The Guardian)

10 We can’t stop trying to reinvent mousetraps 🧀
There are thousands of versions out there, yet we keep coming up with new designs. (New Yorker $)


We can now use cells from dead people to create new life. But who gets to decide?


His parents told a court that they wanted to preserve the possibility of using the sperm to eventually have children who would be genetically related to Peter. The court approved their wishes, and Peter’s sperm was retrieved from his body and stored in a local sperm bank.

We have the technology to use sperm, and potentially eggs, from dead people to make embryos, and eventually babies. And there are millions of eggs and embryos—and even more sperm—in storage and ready to be used. When the person who provided those cells dies, like Peter, who gets to decide what to do with them?

That was the question raised at an online event held by the Progress Educational Trust, a UK charity for people affected by infertility and genetic conditions, which I attended on Wednesday. The panel included a clinician and two lawyers, who addressed plenty of tricky questions but provided few concrete answers.

In theory, the decision should be made by the person who provided the eggs, sperm, or embryos. In some cases, that person’s wishes might be quite clear. Someone trying for a baby with their partner may store their sex cells or embryos and sign a form stating that they are happy for their partner to use those cells if they die, for example.

But in other cases, it’s less clear. Partners and family members who want to use the cells might have to collect evidence to convince a court that the deceased person really did want to have children, and beyond that, that they wanted their family line to continue even though they would never be a parent themselves.

Sex cells and embryos aren’t property—they don’t fall under property law and can’t be inherited by family members. But there is some degree of legal ownership for the people who provided the cells. It is complicated to define that ownership, however, Robert Gilmour, a family law specialist based in Scotland, said at the event. “The law in this area makes my head hurt,” he said.

The law varies depending on where you are, too. Posthumous reproduction is not allowed in some countries, and is unregulated in many others. In the US, laws vary by state. Some states won’t legally recognize a child conceived after a person’s death as that person’s offspring, according to the American Society for Reproductive Medicine (ASRM). “We do not have any national rules or policies,” Gwendolyn Quinn, a bioethicist at New York University, tells me.

Societies like ASRM have put together guidance for clinics in the meantime. But this can also vary slightly between regions. Guidance by the European Society for Human Reproduction and Embryology, for example, recommends that parents and other relatives should not be able to request the sex cells or embryos of the person who died. That would apply to Peter Zhu’s parents. The concern is that these relatives might be hoping for a “commemorative child” or “a symbolic replacement of the deceased.”

