The digital revolution is here, but not everyone is benefiting from it equitably. And as Silicon Valley’s ethos of “move fast and break things” spreads around the world, now is the time to pause and consider who is being left out and how we can better distribute the benefits of our new data economy. “Data is the main resource of a new digital economy,” says Parminder Singh, executive director at the nonprofit organization IT for Change. Global society will benefit because the economy will benefit, Singh argues, from decentralization of data and distributed digital models. Data commons, or open data sources, are vital to building an equitable digital economy, but with them comes the challenge of data governance.
“Not everybody is sharing data,” says Singh. Big tech companies are holding onto the data, which stymies not only the growth of an open data economy but also the growth of society, education, science, in other words, everything. According to Singh, “Data is a non-rival resource. It’s not a material resource that if one uses it, others can’t use it.” Singh continues, “If all people can use the resource of data, obviously people can build value over it, and the overall value available to the world, to a country, increases manifold because the same asset is available to everyone.”
One doesn’t have to look far to understand the value of non-personal data collected to help the public; consider GIS data from government satellites. Innovation plus open access to geographic data helped create not only the internet we know today but also many of today’s tech companies. And this is why Singh argues, “These powerful forces should be in the hands of people, in the hands of communities; they should be able to be influenced by regulators for public interest.” Especially now that most data is collected by private companies.
IT for Change is tackling this with a research project called “Unskewing the Data Value Chain,” which is supported by Omidyar Network. The project aims to assess current policy gaps and new policy directions on data value chains that can promote equitable and inclusive economic development. Singh explains, “Our goal is to ensure the value chains are organized in a manner where the distribution of value is fairer. All countries can digitally industrialize at, if not an equal pace, then an equitable pace, and there is a better distribution of benefits from digitalization.”
Business Lab is hosted by Laurel Ruma, editorial director of Insights, the custom publishing division of MIT Technology Review. The show is a production of MIT Technology Review, with production help from Collective Next.
This podcast was produced in partnership with Omidyar Network.
Show notes and links
“Unskewing the Data Value Chain: A Policy Research Project for Equitable Platform Economies,” IT for Change, September 2020
“Treating data as commons,” The Hindu, Parminder Singh, September 2, 2020
“Report by the Committee of Experts on Non-Personal Data Governance Framework,” Ministry of Electronics and Information Technology, Government of India
“A plan for Indian self-sufficiency in an AI-driven world,” Mint, Parminder Singh, July 29, 2020
Laurel Ruma: From MIT Technology Review, I’m Laurel Ruma, and this is Business Lab, the show that helps business leaders make sense of new technologies coming out of the lab and into the marketplace. Our topic today is data governance, and more specifically, how to balance data governance and the collection of non-personal data, and then open that data up for citizen, business, and/or government use. This is a global challenge. Currently, as more people go online, they lose control over their data.
Two words for you: data commons.
My guest is Parminder Singh, the executive director of IT for Change. His expertise is IT for development, internet governance and e-governance in the digital economy. He has worked extensively with a number of United Nations groups, including the Internet Governance Forum and the Global Alliance for Information and Communication Technologies and Development. Parminder is part of the Government of India’s committee on a non-personal data governance framework, which has come out with recommendations for a law on this subject.
This episode of Business Lab is produced in association with Omidyar Network.
Welcome, Parminder.
Parminder Singh: Thank you, Laurel.
Laurel: So, IT for Change is based in India, but your focus is how technology can improve humanity, globally, or at least not harm people. This is certainly a different perspective than the typical Silicon Valley startup ethos.
Parminder: Yes. We want digitalization not to cause harm and to benefit us, as it holds huge potential. We would think of it as something like industrialization, which improved all sectors, all aspects of human life. Digitalization has similar potential. I think about two decades back, the ethos of Silicon Valley was right. Their ethos was move fast, break things. They wanted to be a counterpower to the powerful incumbents in different sectors, starting from telecom to media, and later on in areas like transportation, shopping, health, education, etc.
So, they were the counterpower, but no longer. For the last several years, they have represented the power. They are the most powerful players, increasingly in all sectors. Therefore, the ethos now is somewhat like, “Leave things to us. You just take the services, you take the goodies, don’t ask us questions. We know everything.” That is not what we believe. We think that these powerful forces should be in the hands of people, in the hands of communities; they should be able to be influenced by regulators for public interest, and so on.
Laurel: When you move fast and break things, the third part isn’t, “And take care of other humans along the way,” is it?
Parminder: Absolutely. In one sense, it is okay to disrupt the incumbents, but then you are not really looking at the harm you cause along the way.
Laurel: So, IT for Change is working with Omidyar Network on some ambitious research, focused on studying the data economy in the nine countries of the global south. Could you tell us more about that?
Parminder: So, this project, which is named “Unskewing the Data Value Chain,” is about looking at how the digital economy is currently organized around the value of data. This data value chain, which we see as different from the industrial value chain, and I will come to that presently: how is it organized, and what can be done so that digital value, the value from data and from the intelligence derived from data, is more fairly distributed? And so that it is put to use for the purposes people really want it put to use for.
As you said, it is a nine-country project. These are developing countries. We are looking at how the value which comes from data, and from intelligence derived from data, is put to the best purposes, but also how that value can be equitably distributed. And we are looking at a range of policy options which regulators could have. These range from traditional policy options in the areas of competition policy and taxation, to new-age digital policy options of data governance and putting in place the right digital infrastructure, or as we call it, intelligence infrastructure. They range from the telecom infrastructure, to cloud computing, to basic applications which are available to everyone, to data infrastructures and artificial intelligence infrastructure. So it’s a multi-layer infrastructure. So how would you have the right kind of digital infrastructure policies and data governance, which is the modern side of it, as well as the old traditional competition and taxation policies?
So how do we ensure that this new thing, the digital economy, is regulated in the best possible manner from a public interest viewpoint? Increasingly, it is not the industrial giants who are at the top of global value chains, not even the intellectual property giants. It is the firms which control the data of a sector, and the digital intelligence which comes from the data of that sector, who are at the top of value chains, whether it’s transportation, health, education, media, or industrial production, and these actors now command the whole value chain. So our goal is to ensure the value chains are organized in a manner where the distribution of value is fairer. All countries can digitally industrialize at, if not an equal pace, then an equitable pace, and there is a better distribution of benefits from digitalization, generally.
Laurel: So why those nine countries? What makes them more open or why is this opportunity there? And is it one of those things where we can do it now before the monopolies do set in?
Parminder: Actually, the choice was not determined necessarily by which countries are in a position to be able to do it. I think the choice was more about the researchers available to do work in the four areas I mentioned: competition policy, taxation policy, digital infrastructures, and data governance. So, we had an open call, and we selected people. We did balance the distribution between Latin American countries, African countries, and Asian countries, that is, the developing world. But in general, it was not necessarily a choice of countries; it was an open call where people responded with their proposals. Some countries have a better standing right now to be able to do something on the digital economy, but there is a balance between the choice of countries and the choice of researchers.
Laurel: So, when we think about the urgency of right now, why is data sharing needed? And how can it actually help build an equitable digital economy?
Parminder: Data, as economists said a few years back and almost everybody says now, is the main resource of the new digital economy. This data is valuable because it gives intelligence about whoever the data is about. It could be a person, and that data gives intelligence about that person: her behavior, her friends, her occupation, her health, everything. Or it could be about a bigger group, and that data gives intelligence about that particular community, that particular group. That intelligence has become the most valuable asset. Now, why should it be shared? It’s because economics says that there are two basic requirements: one is growth, and the other is distribution. Generally, these are the two things that economics focuses on. Now, sharing of data meets both the imperative of growth and that of distribution, because the data is not locked up within silos and is available to all people in a sector. And as we know, data is a non-rival resource. It’s not a material resource that if one uses it, others can’t use it. If all people can use the resource of data, obviously people can build value over it, and the overall value available to the world, to a country, increases manifold because the same asset is available to everyone.
But right now, all the people who hold that asset, especially the large platforms, try to keep it to themselves and not make it available to others. Not everybody is sharing data. When data is shared, the size of the pie increases, because everyone is able to have this huge resource. It’s like everybody who uses oil, the major resource of the old industrial economy, now has multiple times the oil. But oil, being a rival commodity, cannot be shared in the same way data can be. Data can be used by others without diminishing its value for you. This, of course, everybody knows. First of all, what happens is the total pie of value increases. We have better health services, better education services, better agriculture services. Everything.
Second, once data sharing starts, you don’t have the kinds of monopolies we see today, because most of these monopolies are based on exclusive access to the data they collect. That does not allow the startups, the competitors, to come up, because the distance between those who have already collected the data and the ones who are starting to collect it is so huge that they are never able to cover that distance. If data sharing takes place, there is also a better distribution of economic power. And as I will probably come to later, if communities own that value, there is much better public interest control. Basically, we have a bigger pie of digital value overall, and that pie is distributed better if data sharing takes place. It meets both the key imperatives of economics.
Laurel: Excellent. And we know that the more data is open and available, the more is possible for innovation as well, so we’re making people’s lives better. The common example given is GIS, or spatial data coming down from satellites. This was a government project, and the data is now available for everyone. Where would Google Maps or Waze or any of us be without this common dataset that is now available for everybody to use as they see fit? Now, of course, this is where governance comes in, right? Because you want to be able to make sure that data is good and clean and updated, and then open in accessible formats.
Parminder: Yes. You rightly pointed to a very important data infrastructure, the first big data infrastructure: the global GIS data, which was made available to the world by the U.S. government. It was a public agency which produced the data and, of its own will, made it available as a free infrastructure to everybody. Without it, a lot, if not most, of digital economy activity would not have been possible, including the big digital firm Google.
Now, the problem is that most of the data today is produced over private platforms. These are platforms like Google, Facebook, Uber, and Amazon, which provide digital services. Most of the world’s data gets gathered in the very process of providing the digital service. The people who interact with those digital services leave their footprints, and that is the biggest data source. These platforms act as data mines. The problem is that these are private data mines, which keep entrenching the advantage of the incumbents, in almost a geometric kind of growth. That’s the reason we see such monopolies in this area. A [new] company simply has no chance, because those who provide services daily get big new hoards of data, using which they again provide better services, and they get more data, and this data becomes more privatized. That is the problem now.
The first issue: you were right that governance also means good data, the right kind of data, but that comes later. First of all, we need to get this data out of these private platform companies and make it available to everybody. Then the issues come up about the quality of data, the right kind of provisioning, prevention of harm, and those kinds of governance questions. But the first governance issue is how to get the data out of those private confines and make it generally available to everyone, and in that way create a new kind of digital economy model where the main competitive advantage is not hoarding data but building over shared data. Your competitive advantage is how well you can use shared data to provide the best AI or the best digital service. Your competitive advantage shifts; currently, it lies in holding data. That is a major shift which would solve a lot of the problems associated with this system.
Laurel: That’s a phenomenal goal, having that mind shift: it’s better to share than it is to keep it for yourself. It is certainly a challenge for most private companies who, you are right, want to hoard the data and keep it to themselves. But how do governments themselves catch up and understand that they need to partner with companies, as well as with intermediary non-governmental organizations, to create this trifecta coming together for the greater good?
Parminder: The way you put the challenge is the right way to frame it. It does not have easy answers, but we need to start moving in that direction, and that is where the committee of which I am a member, the Indian government committee you mentioned, comes in. Its recommendations have come out as a second, almost near-final draft, which introduces this concept of the community as a key actor. We have been talking about the problems of data being with these private monopolies, but there is also the problem of data being with the state.
As with the physical infrastructures of the industrial era, where the big infrastructures were controlled by the state, there is the question of these data hoards: if they are brought out, let’s say, by some kind of legal enforcement, then who controls them? The first step is to have some kind of legal mechanism for getting those private data hoards into a data commons. What this committee does is institute, for the first time anywhere, a community’s right to its data, which means that even if a private company is collecting health data about the citizens of a city, that health data in its raw form, without the derivatives, in some way belongs to the commons of that city. The collective of that city can ask for that raw data back. By law, it is their common property, and that’s exactly the term you used at the start: data commons.
This has legal force. It’s not just a voluntary, persuasive effort to tell companies, “Well, you know, you’ll be better off if you share data,” which can only go so far. This committee recommends that since this data was taken from the community, the community has a right to its data. It doesn’t stop you from using the data; you can carry on doing what you do. But certain data sets, which are considered of an infrastructural kind, will be required to be shared in a common pool. And once data is put into a common pool, the question comes up: who governs it? There are community trustees, community structures, being talked about, which would possibly be at arm’s length from state control over that data.
Laurel: That’s really exciting. As someone who has been involved in open data, especially for governments, for a number of years, seeing this kind of progress and forward thinking come along makes me really optimistic. It really reinforces that data in the aggregate has the most opportunity for collective good. So how can we seize on that and make that promise of collective good: that we’re going to use this data, and that everyone can use this data? What are some examples of openly available non-personal data that, in the future, or maybe even now, you could see everyone having access to, whether they’re nonprofits or other technologists, to build new things, or even non-technological startups?
Parminder: Let’s say in the health sector, there is data about lung scans of hundreds of thousands of lung cancer patients, which is available with many hospitals and many health companies, which today keep that data and do a lot of analysis on it to develop many kinds of medical possibilities for lung cancer. If all such data were available in a common pool, in an anonymized form, you can imagine what kinds of patterns could emerge.
First of all, the patterns which emerge in smaller silos are not as complete as the patterns which will emerge if all the data is put together. That is the first benefit. And second, when all the data is put together, all kinds of medical researchers are working on it. So A may make certain progress, and B may make other progress, all of them working together on making medical progress to treat lung cancer. That is an immediate, many-fold gain, which you can see just because the health data has been shared in a non-personal data form.
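The silo-versus-commons effect Singh describes can be sketched as a toy example. Everything here is invented for illustration (the hospital names, records, and detection threshold are all hypothetical, and real analysis would use proper statistics rather than a raw count cutoff); the point is only that a pattern too rare to stand out in any single silo becomes visible once anonymized records are pooled.

```python
# Toy sketch: a pattern invisible in individual data silos emerges in a pooled commons.
# All records and the threshold are hypothetical, for illustration only.
from collections import Counter

def detectable(records, threshold=5):
    """Return (exposure, outcome) patterns with at least `threshold` supporting cases."""
    return {pattern for pattern, n in Counter(records).items() if n >= threshold}

# Three hypothetical hospitals, each holding anonymized (exposure, outcome) pairs.
silo_a = [("smoker", "tumor")] * 3 + [("non-smoker", "clear")] * 40
silo_b = [("smoker", "tumor")] * 2 + [("non-smoker", "clear")] * 35
silo_c = [("smoker", "tumor")] * 4 + [("non-smoker", "clear")] * 50

# No single silo has enough smoker-tumor cases to clear the threshold...
assert all(("smoker", "tumor") not in detectable(s) for s in (silo_a, silo_b, silo_c))

# ...but the pooled commons does: 3 + 2 + 4 = 9 supporting cases.
pooled = silo_a + silo_b + silo_c
assert ("smoker", "tumor") in detectable(pooled)
```

Because data is non-rival, each hospital can keep analyzing its own copy while the pooled set supports findings none of them could reach alone.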
That is true even of transportation data. If all data about traffic conditions in a city (road conditions, traffic density, events taking place in different parts of the city) is available in a common pool, then many kinds of transport services can be developed on top of it. Right now, that data is largely held by one or two mega-players who provide transport services, and who therefore keep adding more and more possibilities to their offerings, because they are the only ones who can do it. And soon enough, they are the transport giant of a city or a country, and you really can’t do anything. Even a state enterprise cannot meet the might of that digital transportation company. That’s true of agriculture data and education data. In any sector, once you put the data together, people can develop services on top of it.
Laurel: And when we talk about people, too: by opening the data and putting it into this data commons where anyone can access it, it’s not just technologists who benefit. Artists, teachers, anyone who has an idea of what is possible with this data can look at ways to make the entire city better, for example when looking at traffic data and perhaps crosswalks and safety. But Parminder, how do we both share the data and ensure privacy, so everyone is protected, whether it is the community or a government body, and every country can grow in this open data economy?
Parminder: So yes, again, these challenges will take many decades to be fully sorted out, but the right start must be made. That’s the kind of thing we were talking about: the concept of community data, communities’ rights to get data into commons, setting up community trusts, which set up data infrastructures as technical systems that provide safe access to data. Still, with the types of problems you are talking about, once you start doing things, there will be hundreds of possibilities. This committee’s report already talks about how a community member can say that certain uses of data cause the community harm, and the group can go to the court, or to a non-personal data protection authority, and prove that there is a possibility of harm.
So those kinds of possibilities are already mentioned at the concept level, but how exactly it gets done is an enormous challenge. I am not underestimating or minimizing the enormousness of that challenge, but once you have the data under the control of community trusts, which are neutral bodies, I think things will start to move.
Laurel: Yeah. And I think it’s fair to say it’s okay that it’s an enormous challenge, because look where we are now, in just a few decades, with internet technology. So tell us a little bit more about the Government of India’s non-personal data governance framework. What were the goals? How did you all come to agree that the country of India really needed something like this? The EU recently released its own draft data governance act, so it’s clear the time is now. Are you following in the footsteps of the EU, or are you saying: regardless, it’s time for India to make its own start on this process that could take decades?
Parminder: Yes, it’s good you mentioned the EU data governance act, and they also have the digital markets act, which has some very promising data governance possibilities. We have been engaging with it. Just last week, I wrote a 12-page response to the European process, which was asking for feedback on the data governance act. In that paper, I compare the Indian approach and the European approach, and I find certain gaps in both. Interestingly, the two complement each other quite well: some of the gaps of the European approach are very well covered by the Indian approach, and vice versa.
And what motivated India to start this kind of thing is a motivation similar to the one Europe feels. Countries outside the U.S. and China feel that they are fast losing out in the geopolitical and geo-economic digital race. There’s an increasing feeling that the world will become bipolar between the U.S. and China, and almost all global artificial intelligence (AI) will be at one of these two centers. And from these centers, the whole of the world would be controlled: the economics of all sectors, but also social, cultural, and maybe political aspects. That kind of fear motivates Europe, and you can read the statements of European leaders about how they continually feel they are going to be reduced to third-world-country status in the digital space. Countries like India do have certain IT prowess, IT capabilities, but they do not own their own IT platforms. They see a possibility that if they take the right steps toward data governance, and later toward AI governance and other digital infrastructures, they can have a proportionate place in the global digital economy.
So that was a primary motivation for this committee’s work, but there was also the issue of preventing collective harm to communities. Personal harm is often talked about, and there are personal data protections, but there are many kinds of collective harms which cannot be addressed by individuals. So the concept of collective community harm was another motivation. These were the two motivations, but I would admit that the geo-economic one was the stronger one to start with.
Laurel: That’s fascinating, because in your career you have also worked so closely with the United Nations on developing data governance. How do we back off from this idea that it’s an arms race, and instead look at it as a community good and a reduction of community harms?
Parminder: Yes, at the global level, nothing is perfect. The United Nations is not perfect, and everyone agrees to that. It is even less perfect when states get together to decide things about digital and the internet, which is so new. There is also the problem of state data controls. Having said all this, it is the only globally democratic way to at least start talking about some collective norms. It’s not that there will be a global law dictating what the United States does or India does; that’s not the kind of work the UN does. It’s not as if UNESCO controls education in India or the U.S., or WHO controls health services. It helps countries do those kinds of things by developing some common norms, certain common thinking, some common values.
Similar work needs to be done by a UN agency on digital governance. We have been in this struggle for at least the last 15 years. There was the World Summit on the Information Society in 2005, which had a mandate to set up some kind of global platform for internet governance. That was the term at the time, but now we talk more about digital governance and data governance. We do need a globally democratic, UN-based system where discussions of this kind can develop, and we have been fighting for that. Most developing countries have been asking for such a platform. Developed countries have tried to promote private sector-led governance mechanisms in this area. But in the last few years, even the U.S. is starting to feel that private sector leadership for governance is not enough and the state has to step in. I think even in the U.S., there is a greater recognition now than before that you need states to come in to this area as well.
We have been asking for some time for a UN-based body looking at digital governance. Meanwhile, we also work with the WTO, with the UN Conference on Trade and Development, with the WHO, and with the Food and Agriculture Organization on the data and digital issues that connect to their areas of work. There’s a lot of work that we do globally ourselves as IT for Change, and we are also part of a global coalition called Just Net Coalition, which has member organizations from all continents that also take on these engagements. As we agreed, this is a long haul, but we need to start digging.
Laurel: Because, to bring it back to what this is about, it’s about creating a fair economy for people around the world. We’re not just talking about autonomous vehicles; we’re also talking about access to food and water and health services, and the basic data that helps get those things to people.
Parminder: Yes, absolutely. Because when data is closer to the control of communities and cities and states, and real people are able to make decisions about what the data, and the intelligence coming out of the data, will do, the kinds of things you talked about get prioritized. It’s not necessary that we have a shinier telephone in our hands with an improved camera every three or six months. Sometimes the kinds of things you talked about, food requirements, water, climate change, are the important things. Once these powerful resources of digital intelligence and data are in the hands of people and communities, then these decisions get taken. We will also be improving our transportation, and we would still like to have better phones in our hands, but the decision-making about what is important for society and community becomes more democratized. These are the kinds of things which would begin to happen if the control of data and digital intelligence is put in the hands of people and countries.
Laurel: That phrase, democratizing data: that’s where you see the power of it, the strength of it, and the whole purpose of it. Parminder, when you think about the long road we still have to travel, what makes you optimistic about our data economy today and what’s possible for the future?
Parminder: Optimism comes from the rising consciousness of people, of politicians, and of businesses. There is a much better understanding today than five years ago that there is a need for regulation, a need for decentralization of power and more distributive digital models. I think sometimes the pace at which the problems grow, as they have been growing in the digital area, also helps galvanize things. The kind of data governance work happening in the EU, and now in developing countries like India, gives us optimism that societies will take control of their future and not just accept the Big Tech formula of “leave things to us, you just enjoy the goodies.” That, I think, is over.
Laurel: Parminder Singh, thank you so much for joining us today on The Business Lab.
Parminder: Thank you so much, Laurel. It was my pleasure to be talking to you.
Laurel: That was Parminder Singh, the executive director of IT for Change, who I spoke with from Cambridge, Massachusetts, the home of MIT and MIT Technology Review overlooking the Charles River.
That’s it for this episode of Business Lab. I’m your host, Laurel Ruma. I’m the director of Insights, the custom publishing division of MIT Technology Review. We were founded in 1899 at the Massachusetts Institute of Technology. You can find us in print, on the web, and at events each year around the world.
For more information about us and the show, please check out our website at technologyreview.com. This show is available wherever you get your podcasts. If you enjoyed this episode, we hope you’ll take a moment to rate and review us. Business Lab is a production of MIT Technology Review. This episode was produced by Collective Next. Thanks for listening.
This podcast episode was produced by Insights, the custom content arm of MIT Technology Review. It was not produced by MIT Technology Review’s editorial staff.
The AI myth Western lawmakers get wrong
This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.
While the US and the EU may differ on how to regulate tech, their lawmakers seem to agree on one thing: the West needs to ban AI-powered social scoring.
As they understand it, social scoring is a practice in which authoritarian governments—specifically China—rank people’s trustworthiness and punish them for undesirable behaviors, such as stealing or not paying back loans. Essentially, it’s seen as a dystopian superscore assigned to each citizen.
The EU is currently negotiating a new law called the AI Act, which will ban member states, and maybe even private companies, from implementing such a system.
The trouble is, it’s “essentially banning thin air,” says Vincent Brussee, an analyst at the Mercator Institute for China Studies, a German think tank.
Back in 2014, China announced a six-year plan to build a system rewarding actions that build trust in society and penalizing the opposite. Eight years on, China has only just released a draft law that tries to codify past social credit pilots and guide future implementation.
There have been some contentious local experiments, such as one in the small city of Rongcheng, which in 2013 gave every resident a starting personal credit score of 1,000 that could be raised or lowered depending on how their actions were judged. People are now able to opt out, and the local government has removed some controversial criteria.
But these have not gained wider traction elsewhere and do not apply to the entire Chinese population. There is no countrywide, all-seeing social credit system with algorithms that rank people.
As my colleague Zeyi Yang explains, “the reality is, that terrifying system doesn’t exist, and the central government doesn’t seem to have much appetite to build it, either.”
What has been implemented is mostly pretty low-tech. It’s a “mix of attempts to regulate the financial credit industry, enable government agencies to share data with each other, and promote state-sanctioned moral values,” Zeyi writes.
Kendra Schaefer, a partner at Trivium China, a Beijing-based research consultancy, who compiled a report on the subject for the US government, couldn’t find a single case in which data collection in China led to automated sanctions without human intervention. The South China Morning Post found that in Rongcheng, human “information gatherers” would walk around town and write down people’s misbehavior using a pen and paper.
The myth originates from a pilot program called Sesame Credit, developed by Chinese tech company Alibaba. This was an attempt to assess people’s creditworthiness using customer data at a time when the majority of Chinese people didn’t have a credit card, says Brussee. The effort became conflated with the social credit system as a whole in what Brussee describes as a “game of Chinese whispers.” And the misunderstanding took on a life of its own.
The irony is that while US and European politicians depict this as a problem stemming from authoritarian regimes, systems that rank and penalize people are already in place in the West. Algorithms designed to automate decisions are being rolled out en masse and used to deny people housing, jobs, and basic services.
For example, in Amsterdam, authorities have used an algorithm to rank young people from disadvantaged neighborhoods according to their likelihood of becoming criminals. They claim the aim is to prevent crime and to offer better, more targeted support.
But in reality, human rights groups argue, it has increased stigmatization and discrimination. The young people who end up on this list face more stops from police, home visits from authorities, and more stringent supervision from school and social workers.
It’s easy to take a stand against a dystopian algorithm that doesn’t really exist. But as lawmakers in both the EU and the US strive to build a shared understanding of AI governance, they would do better to look closer to home. Americans do not even have a federal privacy law that would offer some basic protections against algorithmic decision making.
There is also a dire need for governments to conduct honest, thorough audits of the way authorities and companies use AI to make decisions about our lives. They might not like what they find—but that makes it all the more crucial for them to look.
A bot that watched 70,000 hours of Minecraft could unlock AI’s next big thing
Research company OpenAI has built an AI that binged on 70,000 hours of videos of people playing Minecraft in order to play the game better than any AI before. It’s a breakthrough for a powerful new technique, called imitation learning, that could be used to train machines to carry out a wide range of tasks by watching humans do them first. It also raises the potential that sites like YouTube could be a vast and untapped source of training data.
Why it’s a big deal: Imitation learning can be used to train AI to control robot arms, drive cars, or navigate websites. Some people, such as Meta’s chief AI scientist, Yann LeCun, think that watching videos will eventually help us train an AI with human-level intelligence. Read Will Douglas Heaven’s story here.
Bits and Bytes
Meta’s game-playing AI can make and break alliances like a human
Diplomacy is a popular strategy game in which seven players compete for control of Europe by moving pieces around on a map. The game requires players to talk to each other and spot when others are bluffing. Meta’s new AI, called Cicero, managed to win by tricking human players.
It’s a big step forward toward AI that can help with complex problems, such as planning routes around busy traffic and negotiating contracts. But I’m not going to lie—it’s also an unnerving thought that an AI can so successfully deceive humans. (MIT Technology Review)
We could run out of data to train AI language programs
The trend of creating ever bigger AI models means we need even bigger data sets to train them. The trouble is, we might run out of suitable data by 2026, according to a paper by researchers from Epoch, an AI research and forecasting organization. This should prompt the AI community to come up with ways to do more with existing resources. (MIT Technology Review)
Stable Diffusion 2.0 is out
The open-source text-to-image AI Stable Diffusion has been given a big facelift, and its outputs are looking a lot sleeker and more realistic than before. It can even do hands. The pace of Stable Diffusion’s development is breathtaking. Its first version only launched in August. We are likely going to see even more progress in generative AI well into next year.
Human creators stand to benefit as AI rewrites the rules of content creation
A game-changer for content creation
Among the AI-related technologies to have emerged in the past several years is generative AI—deep-learning algorithms that allow computers to generate original content, such as text, images, video, audio, and code. And demand for such content will likely jump in the coming years—Gartner predicts that by 2025, generative AI will account for 10% of all data created, compared with 1% in 2022.
“Théâtre D’opéra Spatial” is an example of AI-generated content (AIGC), created with the Midjourney text-to-art generator program. Several other AI-driven art-generating programs also emerged in 2022, capable of creating paintings from single-line text prompts. The diversity of these technologies reflects a wide range of artistic styles and user demands. DALL-E 2 and Stable Diffusion, for instance, focus mainly on Western-style artwork, while Baidu’s ERNIE-ViLG and Wenxin Yige produce images influenced by Chinese aesthetics. At Baidu’s deep learning developer conference Wave Summit+ 2022, the company announced that Wenxin Yige has been updated with new features, including turning photos into AI-generated art, image editing, and one-click video production.
Meanwhile, AIGC can also include articles, videos, and various other media offerings such as voice synthesis. A technology that generates audible speech indistinguishable from the voice of the original speaker, voice synthesis can be applied in many scenarios, including voice navigation for digital maps. Baidu Maps, for example, allows users to customize its voice navigation to their own voice just by recording nine sentences.
Recent advances in AI have also produced generative language models that can fluently compose texts with a single click. They can be used for generating marketing copy, processing documents, extracting summaries, and other text tasks, unlocking creativity that technologies such as voice synthesis have failed to tap. One of the leading generative language models is Baidu’s ERNIE 3.0, which has been widely applied in industries such as health care, education, technology, and entertainment.
“In the past year, artificial intelligence has made a great leap and changed its technological direction,” says Robin Li, CEO of Baidu. “Artificial intelligence has gone from understanding pictures and text to generating content.” Going one step further, Baidu App, a popular search and newsfeed app with over 600 million monthly users, including five million content creators, recently released a video editing feature that can produce a short video accompanied by a voiceover created from data provided in an article.
Improving efficiency and growth
As AIGC becomes increasingly common, it could make content creation more efficient by getting rid of repetitive, time-intensive tasks for creators such as sorting out source assets and voice recordings and rendering images. Aspiring filmmakers, for instance, have long had to pay their dues by spending countless hours mastering the complex and tedious process of video editing. AIGC may soon make that unnecessary.
Besides boosting efficiency, AIGC could also drive business growth in content creation amid rising demand for personalized digital content that users can interact with dynamically. InsightSLICE forecasts that the global digital creation market will grow an average of 12% annually between 2020 and 2030, reaching $38.2 billion. With content consumption fast outpacing production, traditional development methods will likely struggle to meet such increasing demand, creating a gap that could be filled by AIGC. “AI has the potential to meet this massive demand for content at a tenth of the cost and a hundred times or thousands of times faster in the next decade,” Li says.
AI with humanity as its foundation
AIGC can also serve as an educational tool by helping children develop their creativity. StoryDrawer, for instance, is an AI-driven program designed to boost children’s creative thinking, which often declines as the focus in their education shifts to rote learning.
The Download: the West’s AI myth, and Musk v Apple
While the US and the EU may differ on how to regulate tech, their lawmakers seem to agree on one thing: the West needs to ban AI-powered social scoring.
As they understand it, social scoring is a practice in which authoritarian governments—specifically China—rank people’s trustworthiness and punish them for undesirable behaviors, such as stealing or not paying back loans. Essentially, it’s seen as a dystopian superscore assigned to each citizen.
The reality? While there have been some contentious local experiments with social credit scores in China, there is no countrywide, all-seeing social credit system with algorithms that rank people.
The irony is that while US and European politicians try to ban systems that don’t really exist, systems that do rank and penalize people are already in place in the West—and are denying people housing and jobs in the process. Read the full story.
Melissa’s story is from The Algorithm, her weekly AI newsletter covering all of the industry’s most interesting developments. Sign up to receive it in your inbox every Monday.
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 Apple has reportedly threatened to pull Twitter from the App Store
According to Elon Musk. (NYT $)
+ Musk has threatened to “go to war” with Apple after it decided to stop advertising on Twitter. (WP $)
+ Apple’s reluctance to advertise on Twitter right now isn’t exactly unique. (Motherboard)
+ Twitter’s child protection team in Asia has been gutted. (Wired $)
2 Another crypto firm has collapsed
Lender BlockFi has filed for bankruptcy, and is (partly) blaming FTX. (WSJ $)
+ The company is suing FTX founder Sam Bankman-Fried. (FT $)
+ It looks like the much-feared “crypto contagion” is spreading. (NYT $)
3 AI is rapidly becoming more powerful—and dangerous
That’s particularly worrying when its growth is too much for safety teams to handle. (Vox)
+ Do AI systems need to come with safety warnings? (MIT Technology Review)
+ This AI chat-room game is gaining a legion of fans. (The Guardian)
4 A Pegasus spyware investigation is in danger of being compromised
It’s the target of a disinformation campaign, security experts have warned. (The Guardian)
+ Cyber insurance won’t protect you from theft of your data. (The Guardian)
5 Google gave the FBI geofence data for its January 6 investigation
Google identified more than 5,000 devices near the US Capitol during the riot. (Wired $)
6 Monkeypox isn’t going anywhere
But it’s not on the rise, either. (The Atlantic $)
+ The World Health Organization says it will now be known as mpox. (BBC)
+ Everything you need to know about the monkeypox vaccines. (MIT Technology Review)
7 What it’s like to be the unwitting face of a romance scam
James Scott Geras’ pictures have been used to catfish countless women. (Motherboard)