Every year in early March, Open Data Day is celebrated around the globe. It’s a great opportunity to discuss how we can better navigate the world of AI and data.
Responsible AI and Data Use Starts with You!
The capacity for pretty much anyone to use generative AI such as ChatGPT is raising a lot of questions about how this data-driven technology is leading us into legally and ethically grey areas. How do we navigate the realm of using AI and data responsibly?
Harmful Deepfakes
Some things seem quite obviously wrong when it comes to the use of these technologies: for example, creating a deepfake intimate image of a person without their consent. This famously happened to Taylor Swift earlier this year, but it has been affecting many less well-known people for a few years now. The creation and distribution of intimate deepfakes disproportionately harms women. Few laws protect against this, although most people would say it's wrong. Given Swift’s fame, we might see increased urgency to address this issue with new laws in the US and Canada.
Deepfakes are also being implicated in numerous scams. A scammer using video deepfakes of corporate executives recently tricked a finance worker into transferring $25 million in corporate funds. The ability to deepfake an entire management team on video is extreme, but the technology now exists to make this possible. Companies will need to put better internal controls in place to prevent these kinds of incidents.
Digital Resurrections
We can also consider the George Carlin deepfake and how it raises questions about the appropriate use of data. In this case, the publicly accessible but copyrighted work of the deceased comedian was used to train an AI system. The AI was able to mimic Carlin’s voice and comedic style while talking about topical issues for a work entitled I’m Glad I’m Dead. Carlin’s estate granted no permission to use the copyrighted data, and the whole situation has caused trauma for the Carlin family, which is suing the AI creators.
Carlin is a celebrity whose family has the means to pursue this matter in court. However, given the way technology has advanced, we may see regular people also being ‘resurrected’ in digital format. Increasingly, we’ll need to plan for our digital afterlife, spawning an emerging area of digital estate law.
Is it wrong to digitally resurrect someone?
The answer might depend on several factors including their wishes as outlined in their will, the wishes of their family and estate, how they will be represented in this new digital form, who controls the digital entity and who might be compensated or stand to gain from the digital entity.
In the case of Carlin, the data was used without permission, neither Carlin nor his family gave consent, and any gain from the project seemed to accrue to an unrelated third party, namely the podcasters behind the AI. It should be noted that the creators now claim they did not use AI: they implied that it was used, but now say they wrote the work themselves and performed an impersonation of Carlin. In any case, the potential for this to happen remains.
AI has also been used to digitally recreate a person’s voice or likeness with proper authorization of data. For example, John Lennon’s voice was digitally recreated for a final Beatles song, a project led by bandmate Paul McCartney using a demo given to him by Lennon’s widow Yoko Ono. When it comes to the appropriate use of data and AI, it's not just a question of technical feasibility. Relationships and context matter and need to be considered in ethical decisions about data use.
Data theft?
Generative AI itself is a technology made possible through the questionable acquisition of web-scraped data. Data that is available online has been taken without consent and used to train artificial intelligence. Some creators are losing their livelihoods, and they’ve received no compensation for the use of their data. In addition, there are questions about the massive environmental footprint of large language models and about exploitative labour in the data supply chain used to prepare the training data. We’re in new legal territory, which is being tested by several open copyright cases, including those brought by Sarah Silverman and the New York Times.
Where do you stand?
It’s easy to agree that we should not use AI to make a deepfake of someone that could cause them harm, or to scam people. It’s prudent to think about your digital assets and to set up some guardrails now about how you want your data to be used (or not used) when you are gone.
However, ethically, how should we feel about using a technology that was created in a way that’s causing harm to artists through the unauthorized use of their data? Are we complicit when we prompt ChatGPT, Midjourney or any of these other generative AI systems? These are tough questions, but they deserve our careful deliberation. They’re the questions we should all be asking ourselves as we seek to understand our role in creating a culture of Responsible AI.