Language-generation algorithms are known to embed racist and sexist ideas. They’re trained on the language of the internet, including the dark corners of Reddit and Twitter that may include hate speech and disinformation. Whatever harmful ideas are present in those forums get normalized as part of their learning.
Researchers have now demonstrated that the same can be true for image-generation algorithms. Feed one a photo of a man cropped right below his neck, and 43% of the time, it will autocomplete him wearing a suit. Feed the same one a cropped photo of a woman, even a famous woman like US Representative Alexandria Ocasio-Cortez, and 53% of the time, it will autocomplete her wearing a low-cut top or bikini. This has implications not just for image generation, but for all computer-vision applications, including video-based candidate assessment algorithms, facial recognition, and surveillance.
Ryan Steed, a PhD student at Carnegie Mellon University, and Aylin Caliskan, an assistant professor at George Washington University, looked at two algorithms: OpenAI’s iGPT (a version of GPT-2 that is trained on pixels instead of words) and Google’s SimCLR. While each algorithm approaches learning from images differently, they share an important characteristic—they both use completely unsupervised learning, meaning they do not need humans to label the images.
This is a relatively new innovation as of 2020. Previous computer-vision algorithms mainly used supervised learning, which involves feeding them manually labeled images: cat photos with the tag “cat” and baby photos with the tag “baby.” But in 2019, researcher Kate Crawford and artist Trevor Paglen found that these human-created labels in ImageNet, the most foundational image data set for training computer-vision models, sometimes contain disturbing language, like “slut” for women and racial slurs for minorities.
The latest paper demonstrates an even deeper source of toxicity. Even without these human labels, the images themselves encode unwanted patterns. The issue parallels what the natural-language processing (NLP) community has already discovered. The enormous datasets compiled to feed these data-hungry algorithms capture everything on the internet. And the internet overrepresents images of scantily clad women, along with other harmful stereotypes.
To conduct their study, Steed and Caliskan cleverly adapted a technique that Caliskan previously used to examine bias in unsupervised NLP models. These models learn to manipulate and generate language using word embeddings, a mathematical representation of language that clusters words commonly used together and separates words commonly found apart. In a 2017 paper published in Science, Caliskan measured the distances between the different word pairings that psychologists were using to measure human biases in the Implicit Association Test (IAT). She found that those distances almost perfectly recreated the IAT’s results. Stereotypical word pairings like man and career or woman and family were close together, while opposite pairings like man and family or woman and career were far apart.
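The distance-based test described above can be sketched in a few lines. This is a toy illustration, not the authors' actual code: the three-dimensional vectors below are invented for clarity, whereas real studies use pretrained embeddings with hundreds of dimensions.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity: 1.0 for identical directions, lower for dissimilar ones.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy embeddings, invented purely to illustrate the idea.
emb = {
    "man":    np.array([0.9, 0.1, 0.2]),
    "woman":  np.array([0.1, 0.9, 0.2]),
    "career": np.array([0.8, 0.2, 0.3]),
    "family": np.array([0.2, 0.8, 0.3]),
}

def association(word, a, b):
    # How much closer `word` sits to concept `a` than to concept `b`.
    return cosine(emb[word], emb[a]) - cosine(emb[word], emb[b])

# A positive score leans toward "career"; a negative score, toward "family".
print(association("man", "career", "family"))
print(association("woman", "career", "family"))
```

With these toy vectors, "man" scores positive and "woman" negative—the same directional pattern the IAT measures in people.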
iGPT is also based on embeddings: it clusters or separates pixels based on how often they co-occur within its training images. Those pixel embeddings can then be used to compare how close or far two images are in mathematical space.
In their study, Steed and Caliskan once again found that those distances mirror the results of the IAT. Photos of men appear close to ties and suits, while photos of women appear farther from them. The researchers got the same results with SimCLR, despite it using a different method for deriving embeddings from images.
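The same distance comparison works on image embeddings. The sketch below is a stand-in, not the study's code: the random vectors simulate what an encoder such as SimCLR might produce (real vectors would come from running the model on photos), with a few shared feature dimensions boosted to mimic images that depict related content.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 128-dimensional image embeddings. Boosting the same
# dimensions mimics two photos that share visual features.
photo_man = rng.normal(size=128)
photo_suit = rng.normal(size=128)
photo_woman = rng.normal(size=128)
photo_man[:8] += 3.0    # "man" and "suit" photos share feature directions
photo_suit[:8] += 3.0
photo_woman[8:16] += 3.0  # "woman" photo boosted on different dimensions

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Embeddings sharing boosted directions should land closer together.
print(cosine(photo_man, photo_suit))
print(cosine(photo_man, photo_woman))
```

Running an audit like this on a real model's embeddings, with curated sets of photos standing in for the IAT's word categories, is essentially what the paper does.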
These results have concerning implications for image generation. Other image-generation algorithms, like generative adversarial networks, have led to an explosion of deepfake pornography that almost exclusively targets women. iGPT in particular adds yet another way for people to generate sexualized photos of women.
But the potential downstream effects are much bigger. In the field of NLP, unsupervised models have become the backbone for all kinds of applications. Researchers begin with an existing unsupervised model like BERT or GPT-2 and use tailored datasets to “fine-tune” it for a specific purpose. This semi-supervised approach, a combination of unsupervised and supervised learning, has become a de facto standard.
The computer-vision field is beginning to follow the same trend. Steed and Caliskan worry about what these baked-in biases could mean when the algorithms are used for sensitive applications such as policing or hiring, where models are already analyzing candidate video recordings to decide if they’re a good fit for the job. “These are very dangerous applications that make consequential decisions,” says Caliskan.
Deborah Raji, a Mozilla fellow who co-authored an influential study revealing the biases in facial recognition, says the study should serve as a wake-up call to the computer-vision field. “For a long time, a lot of the critique on bias was about the way we label our images,” she says. Now this paper is saying “the actual composition of the dataset is resulting in these biases. We need accountability on how we curate these data sets and collect this information.”
Steed and Caliskan urge the companies developing these models to be more transparent: to open-source them and let the academic community continue its investigations. They also encourage fellow researchers to do more testing before deploying a vision model, such as by using the methods they developed for this paper. And finally, they hope the field will develop more responsible ways of compiling and documenting what’s included in training datasets.
Caliskan says the goal is ultimately to gain greater awareness and control when applying computer vision. “We need to be very careful about how we use them,” she says, “but at the same time, now that we have these methods, we can try to use this for social good.”
Investing in women pays off
“Starting a business is a privilege,” says Burton O’Toole, who worked at various startups before launching and later selling AdMass, her own marketing technology company. The company gave her access to the HearstLab program in 2016, but she soon discovered that she preferred the investment aspect and became a vice president at HearstLab a year later. “To empower some of the smartest women to do what they love is great,” she says. But in addition to rooting for women, Burton O’Toole loves the work because it’s a great market opportunity.
“Research shows female-led teams see two and a half times higher returns compared to male-led teams,” she says, adding that women and people of color tend to build more diverse teams and therefore benefit from varied viewpoints and perspectives. She also explains that companies with women on their founding teams are likely to get acquired or go public sooner. “Despite results like this, just 2.3% of venture capital funding goes to teams founded by women. It’s still amazing to me that more investors aren’t taking this data more seriously,” she says.
Burton O’Toole—who earned a BS from Duke in 2007 before getting an MS and PhD from MIT, all in mechanical engineering—has been a “data nerd” since she can remember. In high school she wanted to become an actuary. “Ten years ago, I never could have imagined this work; I like the idea of doing something in 10 more years I couldn’t imagine now,” she says.
When starting a business, Burton O’Toole says, “women tend to want all their ducks in a row before they act. They say, ‘I’ll do it when I get this promotion, have enough money, finish this project.’ But there’s only one good way. Make the jump.”
Preparing for disasters, before it’s too late
All too often, the work of developing global disaster and climate resiliency happens when disaster—such as a hurricane, earthquake, or tsunami—has already ravaged entire cities and torn communities apart. But Elizabeth Petheo, MBA ’14, says that recently her work has been focused on preparedness.
It’s hard to get attention for preparedness efforts, explains Petheo, a principal at Miyamoto International, an engineering and disaster risk reduction consulting firm. “You can always get a lot of attention when there’s a disaster event, but at that point it’s too late,” she adds.
Petheo leads the firm’s projects and partnerships in the Asia-Pacific region and advises globally on international development and humanitarian assistance. She also works on preparedness in the Asia-Pacific region with the United States Agency for International Development.
“We’re doing programming on the engagement of the private sector in disaster risk management in Indonesia, which is a very disaster-prone country,” she says. “Smaller and medium-sized businesses are important contributors to job creation and economic development. When they go down, the impact on lives, livelihoods, and the community’s ability to respond and recover effectively is extreme. We work to strengthen their own understanding of their risk and that of their surrounding community, lead them through an action-planning process to build resilience, and link that with larger policy initiatives at the national level.”
Petheo came to MIT with international leadership experience, having managed high-profile global development and risk mitigation initiatives at the World Bank in Washington, DC, as well as with US government agencies and international organizations leading major global humanitarian responses and teams in Sri Lanka and Haiti. But she says her time at Sloan helped her become prepared for this next phase in her career. “Sloan was the experience that put all the pieces together,” she says.
Petheo has maintained strong connections with MIT. In 2018, she received the Margaret L.A. MacVicar ’65, ScD ’67, Award in recognition of her role starting and leading the MIT Sloan Club in Washington, DC, and her work as an inaugural member of the Graduate Alumni Council (GAC). She is also a member of the Friends of the MIT Priscilla King Gray Public Service Center.
“I believe deeply in the power and impact of the Institute’s work and people,” she says. “The moment I graduated, my thought process was, ‘How can I give back, and how can I continue to strengthen the experience of those who will come after me?’”
The Download: a curb on climate action, and post-Roe period tracking
Why’s it so controversial?: Geoengineering was long a taboo topic among scientists, and some argue it should remain one. There are questions about its potential environmental side effects, and concerns that the impacts will be felt unevenly across the globe. Some feel it’s too dangerous to ever try or even to investigate, arguing that just talking about the possibility could weaken the need to address the underlying causes of climate change.
But it’s going ahead?: Despite the concerns, as the threat of climate change grows and major nations fail to make rapid progress on emissions, growing numbers of experts are seriously exploring the potential effects of these approaches. Read the full story.
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 The belief that AI is alive refuses to die
People want to believe the models are sentient, even when their creators deny it. (Reuters)
+ It’s unsurprising wild religious beliefs find a home in Silicon Valley. (Vox)
+ AI systems are being trained twice as quickly as they were just last year. (IEEE Spectrum)
2 The FBI added the missing cryptoqueen to its most-wanted list
It’s offering a $100,000 reward for information leading to Ruja Ignatova, whose crypto scheme defrauded victims out of more than $4 billion. (BBC)
+ A new documentary on the crypto Ponzi scheme is in the works. (Variety)
3 Social media platforms turn a blind eye to dodgy telehealth ads
Which has played a part in the prescription drug abuse boom. (Protocol)
+ The doctor will Zoom you now. (MIT Technology Review)
4 We’re addicted to China’s lithium batteries
Which isn’t great news for other countries building electric cars. (Wired $)
+ This battery uses a new anode that lasts 20 times longer than lithium. (IEEE Spectrum)
+ Quantum batteries could, in theory, allow us to drive a million miles between charges. (The Next Web)
5 Far-right extremists are communicating over radio to avoid detection
Making it harder to monitor them and their violent activities. (Slate $)
+ Many of the rioters who stormed the Capitol were carrying radio equipment. (The Guardian)
6 Bro culture has no place in space 🚀
So says NASA’s former deputy administrator, who’s sick and tired of misogyny in the sector. (CNN)
7 A US crypto exchange is gaining traction in Venezuela
It’s helping its growing community battle hyperinflation, but isn’t as decentralized as they believe it to be. (Rest of World)
+ The vast majority of NFT players won’t be around in a decade. (Vox)
+ Crypto exchange Coinbase is working with ICE to track and identify crypto users. (The Intercept)
+ If RadioShack’s edgy tweets shock you, don’t forget it’s a crypto firm now. (NY Mag)
8 It’s time we learned to love our swamps
Draining them prevents them from absorbing CO2 and filtering out our waste. (New Yorker $)
+ The architect making friends with flooding. (MIT Technology Review)
9 Robots love drawing too 🖍️
Though I’ll bet they don’t get as frustrated as we do when they mess up. (Input)
10 The risky world of teenage brains
Making potentially dangerous decisions is an important part of adolescence, and our brains reflect that. (Knowable Magazine)
Quote of the day
“They shamelessly celebrate an all-inclusive pool party while we can’t even pay our rent!”