Generative AI like ChatGPT consumes copyrighted digital content during training, a practice that has led to a sharp rise in legal actions over AI copyright infringement.
There is a solution to the problem. Read on.
The Lawsuits
Recently, The New York Times filed a lawsuit against ChatGPT maker OpenAI for using content locked behind the Times' paywall. In response, OpenAI has predictably refuted the claims, asserting that the lawsuit is “without merit.”
In another case, The Authors Guild is suing OpenAI in a class action lawsuit. The Guild represents authors such as David Baldacci, John Grisham, The Lincoln Lawyer writer Michael Connelly, and George R.R. Martin, the author of Game of Thrones.
Yet another lawsuit has been initiated by Thomson Reuters against the AI company Ross Intelligence, while Microsoft’s generative AI faces legal action over the computer code it writes.
In the realm of AI image generation, Getty Images has taken legal action against Stability AI for employing more than 12 million of its copyrighted images in the training of an AI image generator. Notably, there are smoking guns: the Getty logo appears, distorted but clearly visible, in images generated by Stability AI.
According to Fortune, the number of AI-related lawsuits has surpassed 100, with many of them centered on what Noam Chomsky has referred to as "high-tech plagiarism."
Possible Legal Decisions
So what will be the outcome of the litigation?
On one extreme, courts could rule that generative AI must take its hands off all copyrighted material. If this happens, generative AI will go the way of Napster. Remember Napster?
Twenty-five years ago, a file-sharing app dubbed Napster was born. It revolutionized the way people shared and downloaded music over the internet, letting users connect their computers to swap music files directly.
I used it. Almost everybody connected to the internet did. Everyone was listening to music without any compensation being given to the artists or the songwriters.
For a brief period, the world had free access to most of the music ever recorded. But Napster's rapid rise in popularity sparked legal battles with the music industry. Record companies and artists argued that Napster was guilty of copyright infringement on a massive scale.
In 2000, a federal court agreed and ruled against Napster, finding the company liable for copyright infringement. In 2001, Napster was forced to shut down its freebie service.
Will generative artificial intelligence like today’s freebie ChatGPT share a similar short window of availability? Will ChatGPT be shut down because it infringes on the copyrights of others?
Without copyrighted material, generative AI will be rendered impotent.
The Telegraph quotes an OpenAI source saying: “Because copyright today covers virtually every sort of human expression … it would be impossible to train today’s leading [generative] AI models [like ChatGPT] without using copyrighted materials.” The source adds: “Limiting training data to [material not copyrighted] … would not provide AI systems that meet the needs of today’s citizens.”
On the other extreme, courts could rule that generative AI can use copyrighted material under US copyright fair use criteria. Fair use requires transformative alteration of the copyrighted material.
In a recent Supreme Court ruling, we read: "This Court has repeatedly made clear that a work of art is 'transformative' for purposes of fair use under the Copyright Act if it conveys a different 'meaning or message' from its source material."
Does ChatGPT convey a “different meaning” than the material it uses for training? That’s for the courts to decide.
Solution
A potential solution lies in finding a middle ground. OpenAI recognizes the importance of compensating content creators and has entered negotiations with entities like the Associated Press over monetary compensation.
However, dealing individually with large companies may disadvantage smaller creators who lack the means to take on Big AI legally.
Looking back at the failed Napster and today’s successful Spotify, there are lessons to be learned.
Performing rights organizations like BMI, ASCAP and SESAC collect and distribute royalties to songwriters and artists, regardless of their fame.
During my time as a DJ in the 1970s, there were periods when I was required to document every song I broadcast on the air and submit the list to such organizations. My radio stations paid a fee that was distributed to the song makers.
Today’s Spotify keeps automatic records of how often each song is played and, from subscribers’ payments, distributes royalties accordingly.
Similar methods could be applied so that generative AI compensates content creators. It may not be possible to trace every AI output back to its sources, so AI content generators would instead need to record and report their use of copyrighted material, as sketched below.
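To make the idea concrete, here is a minimal sketch, in Python, of how a Spotify-style pro-rata scheme might carry over to generative AI. All names and figures are hypothetical; the point is simply that logged uses of copyrighted sources could be turned into proportional shares of a fixed royalty pool, much like a DJ's playlist log or Spotify's play counts.

```python
# Hypothetical sketch: pro-rata royalty distribution for generative AI.
# Assumes the AI provider logs each use of a copyrighted source (during
# training or generation) and splits a fixed royalty pool accordingly.

from collections import Counter

def distribute_royalties(usage_log, royalty_pool):
    """Split royalty_pool among rights holders in proportion to logged uses.

    usage_log    -- iterable of rights-holder identifiers, one per recorded use
    royalty_pool -- total dollars set aside for royalties this period
    """
    counts = Counter(usage_log)
    total_uses = sum(counts.values())
    if total_uses == 0:
        return {}
    return {
        holder: royalty_pool * uses / total_uses
        for holder, uses in counts.items()
    }

# Example with made-up usage numbers (analogous to a DJ's song log).
log = ["NYT"] * 120 + ["Authors Guild member"] * 45 + ["Getty Images"] * 35
print(distribute_royalties(log, royalty_pool=10_000.00))
# -> {'NYT': 6000.0, 'Authors Guild member': 2250.0, 'Getty Images': 1750.0}
```

The bookkeeping is simple; the hard part, as with radio and streaming, is getting the companies to keep and disclose the usage logs in the first place.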
But generative AI companies like OpenAI currently tend to keep their processes secret. A compromise in which they identify the copyrighted material they use and agree to pay royalties for it could resolve the copyright disputes.
But there is no free lunch. Napster was free. Spotify isn’t. If generative AI begins to pay royalties, no more freebies. We users will probably also pay.
Robert J. Marks Ph.D. is Distinguished Professor at Baylor University and Senior Fellow and Director of the Bradley Center for Natural & Artificial Intelligence. He is author of "Non-Computable You: What You Do That Artificial Intelligence Never Will" and "Neural Smithing." Marks is former Editor-in-Chief of the IEEE Transactions on Neural Networks. Read more of Dr. Marks' reports here.