The Key To Making Better AI | “Data Trusts”

“One of the challenges in developing AI applications is obtaining the vast amount of data that’s required. Making matters worse, regulations and privacy issues pose obstacles to firms’ sharing their data. A possible solution is for firms to form a “data trust” that serves as a fiduciary for the data providers and governs their data’s proper use. Willis Towers Watson recently piloted a data trust together with several of its clients. This article shares what they learned about how to create such a trust.”

One of the greatest barriers to adopting and scaling AI applications is the scarcity of varied, high-quality raw data. To overcome it, firms need to share their data. But the many regulatory restrictions and ethical issues surrounding data privacy pose a major obstacle to doing this. A novel solution that my firm is piloting is a data trust: an independent organization that serves as a fiduciary for the data providers and governs their data’s proper use.

Research by MIT Technology Review Insights (The Global AI Agenda: Promise, Reality, and a Future of Data Sharing) shows that companies are becoming increasingly aware of the value of sharing data and are exploring ways to do so with other players in their industry or across industries. Typical use cases for data sharing include fraud detection in financial services, gaining greater speed and visibility across supply chains, improving product development and customer experience, and combining genetics, insurance data, and patient data to develop new digital health solutions and insights. Indeed, the research found that 66% of companies across all industries are willing to share data. Nevertheless, sharing sensitive company data, particularly personal customer data, is subject to strict regulatory oversight and prone to significant financial and reputational risks.

A data trust that is set up as a fiduciary for the data providers could make it much easier for firms to safely share data by instituting a new way of governing the collection, processing, access, and use of the data. That legal and governance setup obliges the data trust administrators (the “fiduciaries”) to represent and prioritize the rights and benefits of the data providers when negotiating and contracting access to their data for use by data consumers, such as other private companies and organizations.

Data trusts can also encourage data interoperability as well as the ethical and compliant governance of data – for example, by ensuring that individuals have consented to the various uses of their data (as required by regulation in several jurisdictions around the world), removing data bias, and de-identifying personal data. Moreover, by adopting a new cohort of cutting-edge technologies such as federated machine learning, homomorphic encryption, and distributed ledger technology, a data trust can guarantee transparency in data sharing as well as auditing of who is using the data at any time and for what purpose (i.e., tracking the chain of custody for data), thus removing the considerable legal and technological friction that currently exists in data sharing.
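To make the chain-of-custody idea concrete, here is a minimal sketch in Python of a hash-chained audit log in which every data-access event records who used which dataset and for what declared purpose, linked to the hash of the previous entry so that any tampering becomes detectable. The names (CustodyLog, AccessEvent, consumer_id, and so on) are illustrative assumptions, not the technology our pilot actually used.

```python
# Sketch only: a tamper-evident audit log for data-access events.
import hashlib
import json
import time
from dataclasses import dataclass, field

@dataclass
class AccessEvent:
    consumer_id: str      # who accessed the data
    dataset_id: str       # which dataset was accessed
    purpose: str          # declared purpose of use
    timestamp: float = field(default_factory=time.time)

class CustodyLog:
    """Append-only log; each entry is chained to the previous entry's hash."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def record(self, event: AccessEvent) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "genesis"
        payload = {"event": event.__dict__, "prev_hash": prev_hash}
        digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
        entry = {**payload, "hash": digest}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash to confirm the chain has not been altered."""
        prev_hash = "genesis"
        for entry in self.entries:
            payload = {"event": entry["event"], "prev_hash": prev_hash}
            expected = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True

# Example: log one access and verify the chain.
log = CustodyLog()
log.record(AccessEvent("insurer-42", "claims-2023", "fraud-detection model training"))
assert log.verify()
```

In practice, a data trust would likely anchor such entries on a shared distributed ledger rather than store them with any single party, which is what gives all members confidence in the audit trail.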

Data consumers who sign contracts with the trust to gain access to its data can then focus on the utility that can be derived from analyzing the data or using it to train AI algorithms, without taking on the compliance and reputational risk. They can do so either on their own (i.e., as direct data consumers) or – perhaps more powerfully – by forming “minimal viable consortia” (MVC), in which data providers and data consumers pool data resources and talent to focus on a specific business case.

How to set up a data trust. My firm, Willis Towers Watson, recently piloted a data trust together with several of its clients. Our purpose was to test the concept and understand how to apply it in a business scenario. The three key objectives were to: (1) identify a business case and form a successful MVC; (2) define the legal and ethical governance framework or frameworks needed to enable data sharing; and (3) understand which technologies we needed to assemble or develop in order to promote transparency and trust in the MVC. Here are some of the lessons we learned during the pilot:

Develop an ethical and legal framework for data sharing. We found that it was important at the outset to establish foundational principles and aspirations to which everyone agreed. For instance, the members of the pilot MVC decided to commit themselves to ensuring the privacy of all individuals whose data the trust held and to delivering not only business value but also social value. We worked closely with legal and privacy experts to formulate a legal framework that would ensure compliance with the European Union’s General Data Protection Regulation (GDPR). The members also decided that for the MVC to go beyond the pilot stage and be commercialized, it would need to be audited by an independent “ethics council” that would explore the ethical and other implications of the use of the data and of the resulting AI algorithms.

Employ a federated/distributed architecture. Organizations are generally not comfortable with the idea of transferring sensitive data from their infrastructure to an external environment. We therefore looked into a federated approach, whereby data remains where it is and algorithms are distributed to the data. We investigated several privacy-preserving technologies, including differential privacy and homomorphic encryption. To ensure transparency in data governance, as well as trusted auditing and chain of custody, we also explored the application of distributed ledger technology (e.g., blockchain) as part of the technology stack. We architected the data trust as a cloud-native peer-to-peer application that would achieve data interoperability, share computational resources, and provide data scientists with a common workspace to train and test AI algorithms.
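As an illustration of the federated pattern described above, the sketch below averages model updates computed locally by each data provider, so raw data never leaves its owner's infrastructure and only parameters are shared. The linear-regression task and the helper names (local_step, federated_round) are assumptions for illustration, not the pilot's actual stack.

```python
# Sketch only: federated averaging over two data providers' private datasets.
import numpy as np

def local_step(weights: np.ndarray, X: np.ndarray, y: np.ndarray, lr: float = 0.1) -> np.ndarray:
    """One gradient step on a provider's private data (linear regression)."""
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(weights: np.ndarray, providers: list[tuple[np.ndarray, np.ndarray]]) -> np.ndarray:
    """Each provider trains locally; the trust averages the returned updates."""
    updates = [local_step(weights.copy(), X, y) for X, y in providers]
    return np.mean(updates, axis=0)

# Toy example: two providers hold disjoint samples of the same problem.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
providers = []
for _ in range(2):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    providers.append((X, y))

w = np.zeros(2)
for _ in range(200):
    w = federated_round(w, providers)
print(w)  # approaches [2.0, -1.0] without any raw data leaving a provider
```

In a real deployment, the locally computed updates themselves could additionally be protected with differential privacy or homomorphic encryption, the other techniques mentioned above.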

The Way Forward. The journey to becoming a data-driven organization fit for the emerging AI economy is long and arduous. Data trusts are an opportunity for collaboration between organizations to make that journey faster, less costly, and less risky. And they can make data-monetization rewards more handsome by co-developing marketable AI applications and giving third parties controlled access to members’ data. Moreover, as we discovered during our pilot, a data trust can also help inspire creativity, cross-functional collaboration, and innovation, and can attract digital talent. As wearables, smart appliances, and 5G networks proliferate and combine into the “Intelligent Internet of Things,” data sharing and collaboration will become the norm. Data trusts can help companies make the leap to this new era.

Originally posted on hbr.org by George Zarkadakis

About Author: George Zarkadakis is the digital lead at Willis Towers Watson and a senior fellow at the Atlantic Council. He is the author of Cyber Republic: Reinventing Democracy in the Age of Intelligent Machines (MIT Press, 2020) and In Our Own Image: The History and Future of Artificial Intelligence (Pegasus Books, 2017). Follow him on Twitter at @zarkadakis.