Deep learning set off the latest AI revolution, transforming computer vision and the field as a whole. Hinton believes deep learning should be almost all that’s needed to fully replicate human intelligence.

But despite rapid progress, there are still major challenges. Expose a neural net to an unfamiliar data set or a foreign environment, and it reveals itself to be brittle and inflexible. Self-driving cars and essay-writing language generators impress, but things can go awry. AI visual systems can be easily confused: a coffee mug recognized from the side would be an unknown from above if the system had not been trained on that view; and with the manipulation of a few pixels, a panda can be mistaken for an ostrich, or even a school bus.

GLOM addresses two of the most difficult problems for visual perception systems: understanding a whole scene in terms of objects and their natural parts; and recognizing objects when seen from a new viewpoint. (GLOM’s focus is on vision, but Hinton expects the idea could be applied to language as well.)

An object such as Hinton’s face, for instance, is made up of his lively if dog-tired eyes (too many people asking questions; too little sleep), his mouth and ears, and a prominent nose, all topped by a not-too-untidy tousle of mostly gray. And given his nose, he is easily recognized even on first sight in profile view.

Both of these factors—the part-whole relationship and the viewpoint—are, from Hinton’s perspective, crucial to how humans do vision. “If GLOM ever works,” he says, “it’s going to do perception in a way that’s much more human-like than current neural nets.”

Grouping parts into wholes, however, can be a hard problem for computers, since parts are sometimes ambiguous. A circle could be an eye, or a doughnut, or a wheel. As Hinton explains it, the first generation of AI vision systems tried to recognize objects by relying mostly on the geometry of the part-whole relationship: the spatial orientation among the parts and between the parts and the whole. The second generation instead relied mostly on deep learning, letting the neural net train on large amounts of data. With GLOM, Hinton combines the best aspects of both approaches.

“There’s a certain intellectual humility that I like about it,” says Gary Marcus, founder and CEO of Robust.AI and a well-known critic of the heavy reliance on deep learning. Marcus admires Hinton’s willingness to challenge something that brought him fame, to admit it’s not quite working. “It’s brave,” he says. “And it’s a great corrective to say, ‘I’m trying to think outside the box.’”

The GLOM architecture

In crafting GLOM, Hinton tried to model some of the mental shortcuts—intuitive strategies, or heuristics—that people use in making sense of the world. “GLOM, and indeed much of Geoff’s work, is about looking at heuristics that people seem to have, building neural nets that could themselves have those heuristics, and then showing that the nets do better at vision as a result,” says Nick Frosst, a computer scientist at a language startup in Toronto who worked with Hinton at Google Brain.

With visual perception, one strategy is to parse parts of an object—such as different facial features—and thereby understand the whole. If you see a certain nose, you might recognize it as part of Hinton’s face; it’s a part-whole hierarchy. To build a better vision system, Hinton says, “I have a strong intuition that we need to use part-whole hierarchies.” Human brains understand this part-whole composition by creating what’s called a “parse tree”—a branching diagram demonstrating the hierarchical relationship between the whole, its parts and subparts. The face itself is at the top of the tree, and the component eyes, nose, ears, and mouth form the branches below.

One of Hinton’s main goals with GLOM is to replicate the parse tree in a neural net—this would distinguish it from neural nets that came before. For technical reasons, it’s hard to do. “It’s difficult because each individual image would be parsed by a person into a unique parse tree, so we would want a neural net to do the same,” says Frosst. “It’s hard to get something with a static architecture—a neural net—to take on a new structure—a parse tree—for each new image it sees.” Hinton has made various attempts. GLOM is a major revision of his previous attempt in 2017, combined with other related advances in the field.

“I’m part of a nose!”

A generalized way of thinking about the GLOM architecture is as follows: The image of interest (say, a photograph of Hinton’s face) is divided into a grid. Each region of the grid is a “location” on the image—one location might contain the iris of an eye, while another might contain the tip of his nose. For each location in the net there are about five layers, or levels. And level by level, the system makes a prediction, with a vector representing the content or information. At a level near the bottom, the vector representing the tip-of-the-nose location might predict: “I’m part of a nose!” And at the next level up, in building a more coherent representation of what it’s seeing, the vector might predict: “I’m part of a face at side-angle view!”

But then the question is, do neighboring vectors at the same level agree? When in agreement, vectors point in the same direction, toward the same conclusion: “Yes, we both belong to the same nose.” Or, further up the parse tree: “Yes, we both belong to the same face.”
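As a rough illustration of this layout, the sketch below stores one vector per level at each grid location and scores agreement between two neighbors. The names `make_grid` and `cosine`, the 2x2 grid, and the 3-dimensional vectors are assumptions made for illustration. Only the grid of locations and the roughly five levels come from the description above. Cosine similarity is one plausible way to quantify "pointing in the same direction", not necessarily the measure GLOM uses.

```python
import math

NUM_LEVELS = 5  # "about five" levels per location, per the description above


def make_grid(rows, cols, dim):
    """Hypothetical layout: grid[r][c][level] is a vector (list of floats)."""
    return [[[[0.0] * dim for _ in range(NUM_LEVELS)]
             for _ in range(cols)]
            for _ in range(rows)]


def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


grid = make_grid(rows=2, cols=2, dim=3)
level = 1  # a low level, where a vector might say "I'm part of a nose!"

# Two neighboring locations that nearly agree, and one that does not.
grid[0][0][level] = [1.0, 0.1, 0.0]
grid[0][1][level] = [0.9, 0.2, 0.0]
grid[1][0][level] = [0.0, 0.0, 1.0]

agree = cosine(grid[0][0][level], grid[0][1][level])
disagree = cosine(grid[0][0][level], grid[1][0][level])
print(f"neighbors on the same part: {agree:.3f}")
print(f"neighbors on different parts: {disagree:.3f}")
```

A score near 1 corresponds to the "Yes, we both belong to the same nose" case; a score near 0 means the two locations have reached different conclusions.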

Seeking consensus about the nature of an object, about what precisely the object is, GLOM’s vectors iteratively, location by location and layer upon layer, average with neighboring vectors at the same level, as well as with predicted vectors from the levels above and below.

However, the net doesn’t “willy-nilly average” with just anything nearby, says Hinton. It averages selectively, with neighboring predictions that display similarities. “This is kind of well-known in America, this is called an echo chamber,” he says. “What you do is you only accept opinions from people who already agree with you; and then what happens is that you get an echo chamber where a whole bunch of people have exactly the same opinion. GLOM actually uses that in a constructive way.” The analogous phenomenon in Hinton’s system is the formation of “islands of agreement.”

“Imagine a bunch of people in a room, shouting slight variations of the same idea,” says Frosst—or imagine those people as vectors pointing in slight variations of the same direction. “They would, after a while, converge on the one idea, and they would all feel it stronger, because they had it confirmed by the other people around them.” That’s how GLOM’s vectors reinforce and amplify their collective predictions about an image.
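The echo-chamber idea can be sketched in a few lines. This is a simplified toy, not Hinton's implementation: the hard similarity cutoff `THRESHOLD`, the function names, and the four 2-dimensional vectors are all assumptions for illustration, and a softer similarity weighting would be a reasonable alternative. Each vector averages only with vectors that already point in nearly the same direction.

```python
import math

THRESHOLD = 0.9  # hypothetical cutoff for "already agrees with me"


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned as-is)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)


def cosine(u, v):
    """Cosine similarity of two vectors."""
    u, v = normalize(u), normalize(v)
    return sum(a * b for a, b in zip(u, v))


def echo_chamber_step(vectors):
    """Each vector averages only with vectors that already agree with it."""
    updated = []
    for v in vectors:
        # Fall back to the vector itself if nothing passes the cutoff.
        allies = [u for u in vectors if cosine(u, v) >= THRESHOLD] or [v]
        mean = [sum(u[i] for u in allies) / len(allies)
                for i in range(len(v))]
        updated.append(normalize(mean))
    return updated


# Four locations at one level: two lean toward "nose", two toward "mouth".
vectors = [normalize(v) for v in
           [[1.0, 0.2], [0.9, 0.3], [0.2, 1.0], [0.3, 0.9]]]

for _ in range(5):
    vectors = echo_chamber_step(vectors)

for v in vectors:
    print([round(x, 3) for x in v])
```

In this toy, the first two vectors settle on one shared direction and the last two on another, so two separate islands form instead of everything blending into a single average.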

GLOM uses these islands of agreeing vectors to accomplish the trick of representing a parse tree in a neural net. Whereas some recent neural nets use agreement among vectors for activation, GLOM uses agreement for representation—building up representations of things within the net. For instance, when several vectors agree that they all represent part of the nose, their small cluster of agreement collectively represents the nose in the net’s parse tree for the face. Another smallish cluster of agreeing vectors might represent the mouth in the parse tree; and the big cluster at the top of the tree would represent the emergent conclusion that the image as a whole is Hinton’s face. “The way the parse tree is represented here,” Hinton explains, “is that at the object level you have a big island; the parts of the object are smaller islands; the subparts are even smaller islands, and so on.”
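To make the island picture concrete, here is a minimal sketch that reads a parse tree off a settled set of vectors. It makes several simplifying assumptions: a single row of eight locations instead of a 2D grid, three levels instead of about five, and exact equality standing in for "agreement". The `levels` data and the `islands` helper are hypothetical.

```python
from itertools import groupby

# Hypothetical settled state for a row of eight locations at three levels.
# Each tuple stands in for a vector; identical tuples mean full agreement.
levels = {
    "object": [(9.0,)] * 8,                    # one big island: the face
    "part": [(1.0,)] * 4 + [(2.0,)] * 4,       # two smaller islands
    "subpart": ([(1.0,)] * 2 + [(2.0,)] * 2
                + [(3.0,)] * 2 + [(4.0,)] * 2),  # four smaller islands still
}


def islands(row):
    """Return (start, end) index spans of contiguous identical vectors."""
    spans, start = [], 0
    for _, run in groupby(row):
        length = len(list(run))
        spans.append((start, start + length - 1))
        start += length
    return spans


tree = {name: islands(row) for name, row in levels.items()}
for name, spans in tree.items():
    print(f"{name:8s} {spans}")
```

Each span at a lower level sits inside a span at the level above, which is the nesting a parse tree expresses: subparts inside parts, parts inside the object.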

Figure 2 from Hinton’s GLOM paper. The islands of identical vectors (arrows of the same color) at the various levels represent a parse tree.


According to Hinton’s long-time friend and collaborator Yoshua Bengio, a computer scientist at the University of Montreal, if GLOM manages to solve the engineering challenge of representing a parse tree in a neural net, it would be a feat—it would be important for making neural nets work properly. “Geoff has produced amazingly powerful intuitions many times in his career, many of which have proven right,” Bengio says. “Hence, I pay attention to them, especially when he feels as strongly about them as he does about GLOM.”

The strength of Hinton’s conviction is rooted not only in the echo chamber analogy, but also in mathematical and biological analogies that inspired and justified some of the design decisions in GLOM’s novel engineering.

“Geoff is a highly unusual thinker in that he is able to draw upon complex mathematical concepts and integrate them with biological constraints to develop theories,” says Sue Becker, a former student of Hinton’s, now a computational cognitive neuroscientist at McMaster University. “Researchers who are more narrowly focused on either the mathematical theory or the neurobiology are much less likely to solve the infinitely compelling puzzle of how both machines and humans might learn and think.”

Turning philosophy into engineering

So far, Hinton’s new idea has been well received, especially in some of the world’s greatest echo chambers. “On Twitter, I got a lot of likes,” he says. And a YouTube tutorial laid claim to the term “MeGLOMania.”

Hinton is the first to admit that at present GLOM is little more than philosophical musing (he spent a year as a philosophy undergrad before switching to experimental psychology). “If an idea sounds good in philosophy, it is good,” he says. “How would you ever have a philosophical idea that just sounds like rubbish, but actually turns out to be true? That wouldn’t pass as a philosophical idea.” Science, by comparison, is “full of things that sound like complete rubbish” but turn out to work remarkably well—for example, neural nets, he says.

GLOM is designed to sound philosophically plausible. But will it work?


Ring’s new TV show is a brilliant but ominous viral marketing ploy



Ring’s market domination came, in no small part, as a result of the company’s efforts, starting in 2016, to partner with law enforcement agencies.

At various points, the company offered free cameras to individual officers, as well as entire departments, often in exchange for promoting Ring cameras in the officers’ jurisdictions. For a time, it also offered police partners a special portal to access community videos, stopping only after multiple media outlets reported on the process and public outcry followed. And yet, that didn’t stop Ring’s policing problem: earlier this summer, in response to a 2019 request for information from Senator Ed Markey, the company admitted to handing over video content to law enforcement without the video owner’s consent at least 11 times this year.

“Everything Amazon does prioritizes growth, expansion, and reach,” says Chris Gilliard, a visiting scholar at Harvard Kennedy School Shorenstein Center and vocal critic of surveillance technologies. In that sense, “Ring Nation is best located along a continuum…this new initiative looks like an attempt to cement societal acceptance of Ring,” he adds. 

So now, Gilliard explains, it’s not surprising that the company is turning to a new strategy to further normalize surveillance.  

All in good “fun”

But these darker sides of surveillance technology will not form part of Ring Nation’s narrative. After all, they don’t exactly fit in with the show’s mission to give “friends and family a fun new way to enjoy time with one another,” as Ring founder Jamie Siminoff put it in a press statement.

Instead, in a self-reinforcing cycle, the show will significantly expand the audience for Ring videos, the pool of potential Ring video creators, and then (and most importantly) the number of Ring cameras out in the wild. And many of these new customers likely won’t think twice about what their new Ring camera is really doing.

“Ring prides itself on being incredibly accessible, [but] it’s still kind of a techie thing,” explains Guariglia of the Electronic Frontier Foundation. “But if you park your very non-techie relatives in front of the television all day, and they see the Funniest Home Videos from Ring Cameras, Ring might spread to an audience that perhaps Amazon has had a slower time getting on board.”

In other words, if the company has its way, Ring Nation, the television show, will bring us one step closer to a Ring nation, IRL. 


How the idea of a “transgender contagion” went viral—and caused untold harm



The results were in line with what one might expect given those sources: 76.5% of parents surveyed “believed their child was incorrect in their belief of being transgender.” More than 85% said their child had increased their internet use and/or had trans friends before identifying as trans. The youths themselves had no say in the study, and there’s no telling if they had simply kept their parents in the dark for months or years before coming out. (Littman acknowledges that “parent-child conflict may also explain some of the findings.”) 

Arjee Restar, now an assistant professor of epidemiology at the University of Washington, didn’t mince words in her 2020 methodological critique of the paper. Restar noted that Littman chose to describe the “social and peer contagion” hypothesis in the consent document she shared with parents, opening the door for biases in who chose to respond to the survey and how they did so. She also highlighted that Littman asked parents to offer “diagnoses” of their child’s gender dysphoria, which they were unqualified to do without professional training. It’s even possible that Littman’s data could contain multiple responses from the same parent, Restar wrote. Littman told MIT Technology Review that “targeted recruitment [to studies] is a really common practice.” She also called attention to the corrected ROGD paper, which notes that a pro-gender-affirming parents’ Facebook group with 8,000 members posted the study’s recruitment information on its page, although Littman’s study was not designed to be able to discern whether any of them responded.

But politics is blind to nuances in methodology. And the paper was quickly seized by those who were already pushing back against increasing acceptance of trans people. In 2014, a few years before Littman published her ROGD paper, Time magazine had put Laverne Cox, the trans actress from Orange Is the New Black, on its cover and declared a “transgender tipping point.” By 2016, bills across the country that aimed to bar trans people from bathrooms that fit their gender identity failed, and one that succeeded, in North Carolina, cost its Republican governor, Pat McCrory, his job.  

Yet by 2018 a renewed backlash was well underway—one that zeroed in on trans youth. The debate about trans youth competing in sports went national, as did a heavily publicized Texas custody battle between a mother who supported her trans child and a father who didn’t. Groups working to further marginalize trans people, like the Alliance Defending Freedom and the Family Research Council, began “printing off bills and introducing them to state legislators,” says Gillian Branstetter, a communications strategist at the American Civil Liberties Union.

The ROGD paper was not funded by anti-trans zealots. But it arrived at exactly the time people with bad intentions were looking for science to buoy their opinions. The paper “laundered what had previously been the rantings of online conspiracy theorists and gave it the resemblance of serious scientific study,” Branstetter says. She believes that if Littman’s paper had not been published, a similar argument would have been made by someone else. Despite its limitations, it has become a crucial weapon in the fight against trans people, largely through online dissemination. “It is astonishing that such a blatantly bad-faith effort has been taken so seriously,” Branstetter says.

Littman plainly rejects that characterization, saying her goal was simply to “find out what’s going on.” “This was a very good-faith attempt,” she says. “As a person I am liberal; I’m pro-LGBT. I saw a phenomenon with my own eyes and I investigated, found that it was different than what was in the scientific literature.” 

One reason for the success of Littman’s paper is that it validates the idea that trans kids are new. But Jules Gill-Peterson, an associate professor of history at Johns Hopkins and author of Histories of the Transgender Child, says that is “empirically untrue.” Trans children have only recently started to be discussed in mainstream media, so people assume they weren’t around before, she says, but “there have been children transitioning for as long as there has been transition-related medical technology,” and children were socially transitioning—living as a different gender without any medical or legal interventions—long before that.

Many trans people are young children when they first observe a dissonance between how they are identified and how they identify. The process of transitioning is never simple, but the explanation of their identity might be.


Inside the software that will become the next battle front in US-China chip war



EDA software is a small but mighty part of the semiconductor supply chain, and it’s mostly controlled by three Western companies. That gives the US a powerful point of leverage, similar to the way it wanted to restrict access to lithography machines—another crucial tool for chipmaking—last month. So how has the industry become so American-centric, and why can’t China just develop its own alternative software? 

What is EDA?

Electronic design automation (also known as electronic computer-aided design, or ECAD) is the specialized software used in chipmaking. It’s like the CAD software that architects use, except it’s more sophisticated, since it deals with billions of minuscule transistors on an integrated circuit.

Screenshot of KiCad, a free EDA software suite.


There’s no single dominant software program that represents the best in the industry. Instead, a series of software modules are often used throughout the whole design flow: logic design, debugging, component placement, wire routing, optimization of time and power consumption, verification, and more. Because modern-day chips are so complex, each step requires a different software tool. 

How important is EDA to chipmaking?

Although the global EDA market was valued at only around $10 billion in 2021, making it a small fraction of the $595 billion semiconductor market, it’s of unique importance to the entire supply chain.

The semiconductor ecosystem today can be seen as a triangle, says Mike Demler, a consultant who has been in the chip design and EDA industry for over 40 years. On one corner are the foundries, or chip manufacturers like TSMC; on another corner are intellectual-property companies like ARM, which make and sell reusable design units or layouts; and on the third corner are the EDA tools. All three together make sure the supply chain moves smoothly.

From the name, it may sound as if EDA tools are only important to chip design firms, but they are also used by chip manufacturers to verify that a design is feasible before production. There’s no way for a foundry to make a single chip as a prototype; it has to invest in months of time and production, and each time, hundreds of chips are fabricated on the same semiconductor base. It would be an enormous waste if they were found to have design flaws. Therefore, manufacturers rely on a special type of EDA tool to do their own validation. 

What are the leading companies in the EDA industry?

There are only a few companies that sell software for each step of the chipmaking process, and they have dominated this market for decades. The top three companies—Cadence (American), Synopsys (American), and Mentor Graphics (American but acquired by the German company Siemens in 2017)—control about 70% of the global EDA market. Their dominance is so strong that many EDA startups specialize in one niche use and then sell themselves to one of these three companies, further cementing the oligopoly. 

What is the US government doing to restrict EDA exports to China?

US companies’ outsize influence on the EDA industry makes it easy for the US government to squeeze China’s access. In its latest announcement, it pledged to add certain EDA tools to its list of technologies banned from export. The US will coordinate with 41 other countries, including Germany, to implement these restrictions. 

Copyright © 2021 Seminole Press.