The big data era has created valuable resources for public interest outcomes, like health care. In the last 18 months, the speed with which scientists were able to respond to the covid-19 pandemic, faster than to any other disease in history, demonstrated the benefits of gathering, sharing, and extracting value from data for a wider good.
Access to data from 56 million National Health Service (NHS) patients’ medical records enabled public health researchers in the UK to provide some of the strongest data on risk factors for covid mortality and features of long covid. Access to health records also sped up the development of lifesaving medical treatments like the messenger-RNA vaccines produced by Moderna and Pfizer.
But balancing the benefits of data sharing with the protection of individual and organizational privacy is a delicate process—and rightly so. Governments and businesses are increasingly collecting vast amounts of data, prompting investigations, concerns around privacy, and calls for stricter regulation.
“Data increasingly powers innovation, and it needs to be used for the public good, while individual privacy is protected. This is new and unfamiliar terrain for policymaking, and it requires a careful approach,” wrote David Deming, professor and director of the Malcolm Wiener Center for Social Policy at the Harvard Kennedy School, in a recent New York Times article.
A growing number of startups (some 230 and counting, according to Data Collaboratives) are forming to help citizens, nonprofit groups, and governments gain more control over their data.
These startups are adopting legal and institutional structures like data trusts, cooperatives, and stewards to help provide people and organizations with a means of effectively and securely gathering and using relevant data—and in the process, taking on Big Tech’s control of the data economy.
“The relationship between data and society is fundamentally broken,” says Matt Gee, CEO of Brighthive, which helps networks and organizations set up alternative governance models including data trusts, data commons, and data cooperatives.
“We think it should be more collaborative instead of competitive, it should be more open and transparent, it should be more distributed and democratic instead of monopolistic. This is how we make the gains more equitable and reduce harmful biases in data.”
Access and control
As demonstrated by the pandemic, medical research and public health planning can be enriched by access to electronic health records, prescription and medicines data, and epidemiological data. But health data are also highly sensitive, with understandable public scrutiny over efforts to share them. So-called “secondary use,” which applies personal health information to purposes outside health-care delivery, requires a new governance framework.
Findata is an independent authority in the Finnish Institute for Health and Welfare, established by a government act in May 2019. The agency facilitates researchers’ access to Finnish health data, issuing permits for use or responding to specific statistical requests. In so doing, it aims to protect the interests of citizens while also appreciating the value that their data could offer to medical research, teaching, and health planning.
Prior to the formation of Findata, it was costly and complex for researchers to access this vital research resource. “The purpose of this agency is to streamline and secure the use of health data,” explains Johanna Seppänen, director of Findata.
“Before, if you wanted to have data from different registers or hospitals, you had to make four applications, and there were no standard ways of handling them, no ways to determine prices. It was very time-consuming, difficult, and confusing.”
Findata is the only agency of its kind so far, but it might inspire other countries that want to realize more value from health data in a safe and secure way.
The UK’s NHS recently faced pushback from privacy campaigners over reforms to improve data sharing for public health planning, showing the challenges that can come from attempts to change data collection and sharing protocols.
Empowerment and autonomy
Helping disenfranchised individuals and groups has been another focus area for new data governance organizations.
Data stewards—which range from community-based collectives to public or private organizations—serve as “both intermediaries and guardians during the exchange of data, thereby supporting individuals and communities to better navigate the data economy and better negotiate on their data rights,” says Suha Mohamed, strategy and partnerships associate at Aapti, an organization working on the intersection of technology and society with a focus on data rights.
Data stewards can prove useful, for example, to individuals in the gig economy, a fast-growing labor market that is characterized by short-term contracts or freelance work rather than permanent jobs and that has been rife with power inequalities.
“Asymmetric control of data is one of the primary levers of power that gig platforms use to manage their workforce and shape the narrative and public policy in the arena that they operate in,” says Hays Witt, co-founder and CEO of Driver’s Seat, a driver-owned data cooperative specializing in ride-hailing.
“Very few stakeholders have access to the data they need to engage in productive and constructive ways, starting with gig workers themselves. Our premise [at Driver’s Seat] is: let’s use tech and a data cooperative to empower gig workers to collect, aggregate, and share their data,” says Witt.
Driver’s Seat has developed a proprietary app through which workers can submit their location, work, and earnings information, which is then aggregated and analyzed. Drivers then receive insights that help them understand their real earnings and performance, informing their choices about where, when, on what platforms, and on what terms to work.
Driver’s Seat is developing tools that can tell drivers their average real pay across platforms in their city, compare their pay with averages, and tell them whether their pay is going up or down. All of this could help drivers move to platforms that offer them a better deal, empowering what is an otherwise atomized labor force.
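The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Driver’s Seat’s actual method: the `Shift` record, its field names, the definition of “real pay” as gross pay minus expenses divided by all hours online, and the sample figures are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Shift:
    """One driver-submitted work record (hypothetical schema)."""
    driver: str
    platform: str
    week: int          # week index; a larger number is more recent
    gross_pay: float   # amount paid out by the platform
    expenses: float    # fuel, fees, and similar costs (assumed field)
    hours: float       # all time online, including unpaid waiting


def real_hourly_pay(shifts):
    """Net pay per hour across a group of shifts, weighted by hours."""
    net = sum(s.gross_pay - s.expenses for s in shifts)
    hours = sum(s.hours for s in shifts)
    return net / hours


def average_pay_by_platform(shifts):
    """Average real hourly pay for each platform, pooled over all drivers."""
    grouped = defaultdict(list)
    for s in shifts:
        grouped[s.platform].append(s)
    return {platform: real_hourly_pay(group) for platform, group in grouped.items()}


def pay_trend(shifts, driver):
    """Compare a driver's latest week with all of their earlier weeks."""
    own = [s for s in shifts if s.driver == driver]
    latest = max(s.week for s in own)
    recent = [s for s in own if s.week == latest]
    earlier = [s for s in own if s.week < latest]
    if not earlier:
        return "unknown"  # not enough history to compare
    diff = real_hourly_pay(recent) - real_hourly_pay(earlier)
    if diff > 0:
        return "up"
    if diff < 0:
        return "down"
    return "flat"


# Made-up sample data for illustration only.
shifts = [
    Shift("ana", "A", 1, 100.0, 20.0, 5.0),
    Shift("ana", "A", 2, 120.0, 20.0, 5.0),
    Shift("ben", "A", 2, 90.0, 10.0, 5.0),
    Shift("ben", "B", 2, 150.0, 30.0, 5.0),
]
averages = average_pay_by_platform(shifts)
ana_on_a = real_hourly_pay([s for s in shifts if s.driver == "ana" and s.platform == "A"])
print(averages, ana_on_a, pay_trend(shifts, "ana"))
```

Weighting by hours rather than averaging per-shift rates keeps a short, unusually lucrative shift from distorting the picture. A real cooperative would also have to decide what counts as an expense and how to treat waiting time.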
“Our drivers are really excited to be engaged, because their day-to-day experience is seeing metrics, fed back to them by the platforms, that they don’t trust,” says Witt. “They know that the metrics are influential, their day-to-day experience is totally mediated by data. It impacts their earnings and their life, and they know it.”
Witt believes that in the future, workers will increasingly be able to contribute to crowdsourced information to develop “collective analyses of their problems, which means they can put forward collective policy solutions or agreements to negotiate with the employment platform.”
Balancing social mission and business models
All of these data organizations, whether they are government-sanctioned institutions like Findata or entrepreneurial businesses like Driver’s Seat, face the challenge of balancing their mission with operational sustainability.
Securing a sustainable financial footing is a major challenge for nonprofit groups and social impact businesses. For data equity institutions, the funding mix commonly includes community- and membership-driven approaches, and philanthropic aid.
But some organizations, like Brighthive, have found win-win models where private sector companies are looking to improve data governance and are willing to pay for it.
Brighthive’s Gee describes commercial clients who have “seen what’s happening in the European Union around AI regulation and they want to get ahead of it in the US. They are taking a proactive stance on issues like algorithmic transparency, equity audits, and an alternative governance model for how they use customer data.”
Other data equity platforms have found revenue models in which beneficiary data can be harnessed by third parties in socially positive ways. Hays Witt at Driver’s Seat cites the example of municipal authorities and planning agencies.
Both the authorities and ride-hailing drivers have an incentive to reduce “dead time” in which a driver is circulating without earning money, causing emissions and congestion. If appropriate data can be collected, aggregated, and analyzed in a useful way, it can lead to better traffic and mobility decisions and infrastructure interventions. So, all participants benefit.
Witt points out other “neutral” cases where beneficiary data could be valuable to unrelated private sector entities in ways that do not work against the interests of the drivers. He gives the example of commercial real estate developers who are often forced to make decisions about investments and services based on out-of-date traffic and mobility data.
Driver’s Seat is exploring opportunities to offer aggregated analytics products to such companies, with revenues returned to gig workers as dividends and used to help finance the cooperative.
Many data startups seeking out sustainable revenue opportunities need to decide where to draw the line in terms of the kind of work they are willing to take on or the kind of businesses they’re willing to work with.
Brighthive’s Matt Gee points to growing investor interest in startups that can help companies navigate the end of third-party “cookies,” which have been critical to online advertising but are now being phased out. “Investors are concerned about the death of third-party data and are hungry for companies addressing that,” he says.
But as socially minded startups gain more business from corporate clients, they need to balance their mission for social good with the financial gain of lucrative contracts.
“Is being a public benefit corporation more about what you do and how you do it, or who you work with? If we work on a data collaborative that provides transparency and accountability for marketing organizations pooling customer lists, are we actually reducing societal harm? These are questions that our team is constantly grappling with,” says Gee.
Data startups will inevitably face challenges, including balancing social mission, ethics, and business models. But as the data economy continues to grow, they are in a unique position to carve out new ways of responsibly leveraging the insight that data can provide for citizens, organizations, and governments, wresting some of the power over data away from Big Tech.
“Our data economy needs to anchor on creating value for everyone in society, and that requires user control, trusted intermediation, and collective governance to be embedded in innovative data stewardship models,” says Sushant Kumar, principal of responsible technology at social change venture Omidyar Network.
“Onboarding a critical mass of users, receiving regulatory support, and achieving financial sustainability will also ensure these designs succeed in disrupting the status quo and injecting fairness into the current paradigm.”
This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.
The hunter-gatherer groups at the heart of a microbiome gold rush
The first step to finding out is to catalogue what microbes we might have lost. To get as close to ancient microbiomes as possible, microbiologists have begun studying multiple Indigenous groups. Two have received the most attention: the Yanomami of the Amazon rainforest and the Hadza, in northern Tanzania.
Researchers have made some startling discoveries already. A study by Sonnenburg and his colleagues, published in July, found that the gut microbiomes of the Hadza appear to include bugs that aren’t seen elsewhere—around 20% of the microbe genomes identified had not been recorded in a global catalogue of over 200,000 such genomes. The researchers found 8.4 million protein families in the guts of the 167 Hadza people they studied. Over half of them had not previously been identified in the human gut.
Plenty of other studies published in the last decade or so have helped build a picture of how the diets and lifestyles of hunter-gatherer societies influence the microbiome, and scientists have speculated on what this means for those living in more industrialized societies. But these revelations have come at a price.
A changing way of life
The Hadza people hunt wild animals and forage for fruit and honey. “We still live the ancient way of life, with arrows and old knives,” says Mangola, who works with the Olanakwe Community Fund to support education and economic projects for the Hadza. Hunters seek out food in the bush, which might include baboons, vervet monkeys, guinea fowl, kudu, porcupines, or dik-dik. Gatherers collect fruits, vegetables, and honey.
Mangola, who has met with multiple scientists over the years and participated in many research projects, has witnessed firsthand the impact of such research on his community. Much of it has been positive. But not all researchers act thoughtfully and ethically, he says, and some have exploited or harmed the community.
One enduring problem, says Mangola, is that scientists have tended to come and study the Hadza without properly explaining their research or their results. They arrive from Europe or the US, accompanied by guides, and collect feces, blood, hair, and other biological samples. Often, the people giving up these samples don’t know what they will be used for, says Mangola. Scientists get their results and publish them without returning to share them. “You tell the world [what you’ve discovered]—why can’t you come back to Tanzania to tell the Hadza?” asks Mangola. “It would bring meaning and excitement to the community,” he says.
Some scientists have talked about the Hadza as if they were living fossils, says Alyssa Crittenden, a nutritional anthropologist and biologist at the University of Nevada in Las Vegas, who has been studying and working with the Hadza for the last two decades.
The Hadza have been described as being “locked in time,” she adds, but characterizations like that don’t reflect reality. She has made many trips to Tanzania and seen for herself how life has changed. Tourists flock to the region. Roads have been built. Charities have helped the Hadza secure land rights. Mangola went abroad for his education: he has a law degree and a master’s from the Indigenous Peoples Law and Policy program at the University of Arizona.
The Download: a microbiome gold rush, and Eric Schmidt’s election misinformation plan
Over the last couple of decades, scientists have come to realize just how important the microbes that crawl all over us are to our health. But some believe our microbiomes are in crisis, casualties of an increasingly sanitized way of life. Disturbances in the collections of microbes we host have been associated with a wide range of diseases, from arthritis to Alzheimer’s.
Some of the microbes we have lost might not be completely gone, though. Scientists believe many might still be hiding inside the intestines of people who don’t live in the polluted, processed environment that most of the rest of us share. They’ve been studying the feces of people like the Yanomami, an Indigenous group in the Amazon, who appear to still have some of the microbes that other people have lost.
But there is a major catch: we don’t know whether those in hunter-gatherer societies really do have “healthier” microbiomes or, if they do, whether the benefits could be shared with others. At the same time, members of the communities being studied are concerned about the risk of what’s called biopiracy: taking natural resources from poorer countries for the benefit of wealthier ones.
Eric Schmidt has a 6-point plan for fighting election misinformation
By Eric Schmidt, former CEO of Google and cofounder of the philanthropic initiative Schmidt Futures
The coming year will be one of seismic political shifts. Over 4 billion people will head to the polls in countries including the United States, Taiwan, India, and Indonesia, making 2024 the biggest election year in history.
Navigating a shifting customer-engagement landscape with generative AI
A strategic imperative
Generative AI’s ability to harness customer data in a highly sophisticated manner means enterprises are accelerating plans to invest in and leverage the technology’s capabilities. In a study titled “The Future of Enterprise Data & AI,” Corinium Intelligence and WNS Triange surveyed 100 global C-suite leaders and decision-makers specializing in AI, analytics, and data. Seventy-six percent of the respondents said that their organizations are already using or planning to use generative AI.
According to McKinsey, while generative AI will affect most business functions, “four of them will likely account for 75% of the total annual value it can deliver.” Among these are marketing and sales and customer operations. Yet, despite the technology’s benefits, many leaders are unsure about the right approach to take and mindful of the risks associated with large investments.
Mapping out a generative AI pathway
One of the first challenges organizations need to overcome is senior leadership alignment. “You need the necessary strategy; you need the ability to have the necessary buy-in of people,” says Ayer. “You need to make sure that you’ve got the right use case and business case for each one of them.” In other words, a clearly defined roadmap and precise business objectives are as crucial as understanding whether a process is amenable to the use of generative AI.
The implementation of a generative AI strategy can take time. According to Ayer, business leaders should maintain a realistic perspective on the time required to formulate a strategy, conduct the necessary training across teams and functions, and identify the areas of value addition. And for any generative AI deployment to work seamlessly, the right data ecosystems must be in place.
Ayer cites WNS Triange’s collaboration with an insurer to create a claims process by leveraging generative AI. Thanks to the new technology, the insurer can immediately assess the severity of a vehicle’s damage from an accident and make a claims recommendation based on the unstructured data provided by the client. “Because this can be immediately assessed by a surveyor and they can reach a recommendation quickly, this instantly improves the insurer’s ability to satisfy their policyholders and reduce the claims processing time,” Ayer explains.
None of that would be possible, however, without past claims history, repair costs, transaction data, and the other data sets needed to extract clear value from generative AI analysis. “Be very clear about data sufficiency. Don’t jump into a program where eventually you realize you don’t have the necessary data,” Ayer says.
The benefits of third-party experience
Enterprises are increasingly aware that they must embrace generative AI, but knowing where to begin is another thing. “You start off wanting to make sure you don’t repeat mistakes other people have made,” says Ayer. An external provider can help organizations avoid those mistakes and leverage best practices and frameworks for testing and defining explainability and benchmarks for return on investment (ROI).
Using pre-built solutions from external partners can shorten time to market and increase a generative AI program’s value. These solutions can draw on industry-specific generative AI platforms to accelerate deployment. “Generative AI programs can be extremely complicated,” Ayer points out. “There are a lot of infrastructure requirements, touch points with customers, and internal regulations. Organizations will also have to consider using pre-built solutions to accelerate speed to value. Third-party service providers bring the expertise of having an integrated approach to all these elements.”