Several authors are suing the companies making the chatbots marketed as “AI” for using their copyrighted material without permission to create that software. I don’t know if this litigation will be successful, but I know that it should be.
We are all entitled to read books and learn from them. However, if I want to use an idea from someone’s book in my own work, I have to give them credit.
This is why we have footnotes and bibliographies in nonfiction. This is why we credit lines of poetry or songs by other people in stories. And this is why, if you're going to do more than quote a small amount of someone's work and point people to the original, you have to pay the creator.
Someone did a lot of work to create that story or essay or poem or song or whatever material you're referring to. They deserve credit, and if you're going to use a whole lot of what they did, they deserve to be paid.
It’s very simple.
One of the many real issues with the large language model chatbots is that they were developed using materials available online, both pictures and words, but the developers refuse to tell us what materials were used. They claim it’s proprietary.
But it’s very obvious that they are using stories and art created by specific people, because if you ask one of those bots to draw you a picture in the style of a specific artist or to write a story in the style of a specific writer, they can do it.
It’s not just famous writers and artists either, much less people who are long dead and whose work is out of copyright. Several of my friends have tried it and had it create works that sound plausibly similar to their own.
If software can "write" a story that sounds like something I would write, its developers must have incorporated my work into their database. That, to me, is the equivalent of stealing my work and publishing it as your own.
I don’t know if the interpreters of copyright law will agree, but it’s certainly worth trying.
I note that the chatbot companies say they are "training" the bots on this material, but that word would only be appropriate if the bots were, in fact, some kind of intelligent being. They're not. They're a repository of data that has been developed to regurgitate information in response to simple prompts.
This is impressive software development and likely has some valuable uses, but the companies are overselling it as artificial intelligence. It can't think, and therefore it cannot replace artists, writers, and others who work with images and words and ideas, because it will never do something new with them.
It will, however, be great for the office memos that no one reads anyway. I don’t think most of the authors of such memos claim copyright protection, so I see no real problem with the chatbots scarfing up that data.
But the real point — and a real problem — is that these chatbots are built using materials created by other people who are not being compensated for the use.
More importantly, no one asked permission to use that material. I mean, I am sure some things I’ve written are in there. I’ve been blogging and otherwise writing things online for a long time. Many of my stories have been published online and most of my books are available as ebooks.
If they had asked, I would have categorically refused to give them permission to use anything I have ever written to build their datasets. In fact, let me make that clear right here and now: OpenAI, Google, and all the rest of you: you do not have permission to use anything I have ever written in developing your chatbots.
I don’t just want to get paid when people use my work. I want to reserve the right to say no, you can’t. I’m not in the business of creating data sets for software companies. (And chatbots are, of course, software.)
Except that, of course, I am in that business, because the companies take our materials whether they have permission or not.
This is one of the many reasons why we need transparency from these companies on the materials they have used in building their bots. I can think of others. I mean, if they are using the speeches of Hitler or George Wallace or other fascists and white supremacists to build their software, we need to know that, too.
The companies keep trying to divert our attention by pointing to their fantasies of existential risk of artificial intelligence. We need to bring them back into the real world of the real harm they’re doing.
I have no objection to building good digital tools for the future. I have a lot of objection to ripping people off to build them, and to failing to build them in ethical ways that address the real dangers they might cause.
I am, in general, opposed to moving fast and breaking things.
Saturday Morning Breakfast Cereal has an excellent take on this issue: https://www.smbc-comics.com/comic/copyright
And keep your bots off my face, too. Even if SAG-AFTRA gets the studios to back down from their position that there's nothing wrong with hiring a background actor once and using their image ever after in crowd scenes in other works, that won't necessarily shut that door. Your likeness could be scraped from social media and used to decorate the AI-generated body of a background character. Coming soon to your metroplex…
Of all the issues I see with the large language models, the fact that they’re using data without permission or compensation with the goal of making the person whose data (including pictures) they used irrelevant is the most striking. I’m not afraid that large language models are going to turn into the Terminator, but I am afraid of corporate moguls.
BTW, The NY Times has a piece on fan fiction authors and others bringing suit over this. Outside the paywall.
https://www.nytimes.com/2023/07/15/technology/artificial-intelligence-models-chat-data.html?unlocked_article_code=U21Age7f-kcLV8dNQvee6ZjuFg1xplrbRD8dvCHN-1hZ5tZzr35rVDhdwHVLPEGVWlDdtsC8nBvAj7aC42Qqgp0RsHa_SRiVOxQODgy14109btNvQmuiNq4jCFNlcvPlap4eNMAE1WhE4gn7t-NIJHQnXcvvhAjTK5hzeX830r1rvyvqhB1nScXAmAPnlmXdilAPT_I2YNpvN0TORLFmmi5fFfVqOJgsC1OFMEOwujHEvaDjmaGRZPr8uW5ITHWUU1PAkfPbYsqVCDJSQ9RpDMPJ_pNJOQC3JtBIL8yRuJnWjzOTKncRZdNe6oIgpdmFWD06qCHytUcsWvUHic1NDvP2vByuvYzQqoPQz_JBdz0is1ZrkA&smid=url-share