A Recent Entrance to Complexity

The United States Copyright Office recently reaffirmed its position that it will not register AI-generated content, because it is not created by a human. The rule is easy to state; the devil is in the details. Attorney Thomas James explains.

Last year, the United States Copyright Office issued a copyright registration to Kristina Kashtanova for the graphic novel, Zarya of the Dawn. A month later, the Copyright Office issued a notice of cancellation of the registration, along with a request for additional information.

The Copyright Office, consistent with judicial decisions, takes the position that copyright requires human authorship. The Office requested additional information regarding the creative process that resulted in the novel because parts of it were AI-generated. Kashtanova complied with the request for additional information.

This week, the Copyright Office responded with a letter explaining that the registration would be cancelled, but that a new, more limited one would be issued. The Office explained that its concern related to the author’s use of Midjourney, an AI-powered image-generating tool, to generate images used in the work:

“Because Midjourney starts with randomly generated noise that evolves into a final image, there is no guarantee that a particular prompt will generate any particular visual output.”

U.S. Copyright Office letter

The Office concluded that the text the author wrote, as well as the author’s selection, coordination and arrangement of written and visual elements, are protected by copyright, and therefore may be registered. The images generated by Midjourney, however, would not be registered because they were “not the product of human authorship.” The new registration will cover only the text and editing components of the work, not the AI-generated images.

A Previous Entrance to Paradise

Early last year, the Copyright Office refused copyright registration for an AI-generated image. Stephen Thaler had filed an application to register a copyright in an AI-generated image called “A Recent Entrance to Paradise.” He listed himself as the copyright owner. The Copyright Office denied registration on the grounds that the work lacked human authorship. Thaler filed a lawsuit in federal court seeking to overturn that determination. The lawsuit is still pending and is currently at the summary judgment stage.

The core issue

The core issue, of course, is whether a person who uses AI to generate content such as text or artwork can claim copyright protection in the content so generated. Put another way, can a user who deploys artificial intelligence to generate a seemingly expressive work (such as artwork or a novel) claim authorship?

This question is not as simple as it may seem. There can be different levels of human involvement in the use of an AI content generating mechanism. At one extreme, there are programs like “Paint,” in which users provide a great deal of input. These kinds of programs may be analogized to paintbrushes, pens and other tools that artists traditionally have used to express their ideas on paper or canvas. Word processing programs are also in this category. It is easy to conclude that the users of these kinds of programs are the authors of works that may be sufficiently creative and original to receive copyright protection.

At the other end of the spectrum are AI services like DALL-E and ChatGPT. Text and images can be generated by these systems with minimal human input. If the only human input is a user’s directive to “Write a story” or “Draw a picture,” then it would be difficult to claim that the author contributed any creative expression. That is to say, it would be difficult to claim that the user authored anything.

Peering into the worm can

The complicating consideration with content-generative AI mechanisms is that they have the potential to allow many different levels of user involvement in the generation of output. The more details a user adds to the instructions s/he gives to the machine, the more it begins to appear that the user is, in fact, contributing something creative to the project.

Is a prompt to “Write a story about a dog” a sufficiently creative contribution to the resulting output to qualify the user as an “author”? Maybe not. But what about, “Write a story about a dog who joins a traveling circus”? Or “Write a story about a dog named Pablo who joins a traveling circus”? Or “Write a story about a dog with a peculiar bark that begins, ‘Once upon a time, there was a dog named Pablo who joined a circus,’ and ends with Pablo deciding to return home”?

At what point along the spectrum of user-provided detail does copyright protectable authorship come into existence?

An equally important question is: How much, if at all, should the Copyright Office involve itself in ascertaining the details of the creative process behind a work?

In a similar vein, should copyright registration applicants be required to disclose whether their works contain AI-generated content? Should they be required to affirmatively disclaim rights in elements of AI-generated content that are not protected by copyright?

Expanding the Rule of Doubt

Alternatively, should the U.S. Copyright Office adopt something like a Rule of Doubt when copyright is claimed in AI-generated content? The Rule of Doubt, in its current form, is the rule that the U.S. Copyright Office will accept a copyright registration of a claim containing software object code, even though the Copyright Office is unable to verify whether the object code contains copyrightable work. In effect, if the applicant attests that the code is copyrightable, then the Copyright Office will assume that it is and will register the claim. Under 37 C.F.R. § 202.20(c)(2)(vii)(B), this may be done when an applicant seeks to register a copyright in object code rather than source code. The same is true of material that is redacted to protect a trade secret.

When the Office issues a registration under the Rule of Doubt, it adds an annotation to the certificate and to the public record indicating that the copyright was registered under the Rule of Doubt.

Under the existing rule, the applicant must file a declaration stating that material for which registration is sought does, in fact, contain original authorship.

This approach allows registration but leaves it to courts (not the Copyright Office) to decide on a case-by-case basis whether material for which copyright is claimed contains copyrightable authorship.  

Expanding the Rule of Doubt to apply to material generated at least in part by AI might not be the most satisfying solution for AI users, but it is one that could result in fewer snags and delays in the registration process.

Conclusion

The Copyright Office has said that it soon will be developing registration guidance for works created in part using material generated by artificial intelligence technology. Public notices and events relating to this topic may be expected in the coming months.


Need help with a copyright matter? Contact attorney Thomas James.

Why Machine Training AI with Protected Works is Not Fair Use


Guest blogger David Newhoff lays out the argument against the claim that training AI systems with copyright-protected works is fair use. David is the author of Who Invented Oscar Wilde? The Photograph at the Center of Modern American Copyright (Potomac Books 2020) and is a copyright advocate/writer at The Illusion of More.


As most copyright watchers already know, two lawsuits were filed at the start of the new year against AI visual works companies. In the U.S., a class-action was filed by visual artists against DeviantArt, Midjourney, and Stability AI; and in the UK, Getty Images is suing Stability AI. Both cases allege infringing use of large volumes of protected works fed into the systems to “train” the algorithms. Regardless of how these two lawsuits might unfold, I want to address the broad defense, already being argued in the blogosphere, that training generative AIs with volumes of protected works is fair use. I don’t think so.

Copyright advocates, skeptics, and even outright antagonists generally agree that the fair use exception, correctly applied, supports the broad aim of copyright law to promote more creative work. In the language of the Constitution, copyright “promotes the progress of science,” but a more accurate, modern description would be that copyright promotes new “authorship” because we do not tend to describe literature, visual arts, music, etc. as “science.”

The fair use doctrine, codified in the federal statute in 1976, originated as judge-made law, and from the seminal Folsom v. Marsh to the contemporary Andy Warhol Foundation v. Goldsmith, the courts have restated, in one way or another, their responsibility to balance the first author’s exclusive rights with a follow-on author’s interest in creating new expression. And as a matter of general principle, it is held that the public benefits from this balancing act because the result is a more diverse market of creative and cultural works.

Fair use defenses are case-by-case considerations and while there may be specific instances in which an AI purpose may be fair use, there are no blanket exceptions. More broadly, though, if the underlying goal of copyright’s exclusive rights and the fair use exception is to promote new “authorship,” this is doctrinally fatal to the proposal that training AIs on volumes of protected works favors a finding of fair use. Even if a court holds that other limiting doctrines render this activity by certain defendants to be non-infringing, a fair use defense should be rejected at summary judgment—at least for the current state of the technology, in which the schematic encompassing AI machine, AI developer, and AI user does nothing to promote new “authorship” as a matter of law.

The definition of “author” in U.S. copyright law means “human author,” and there are no exceptions to this anywhere in our history. The mere existence of a work we might describe as “creative” is not evidence of an author/owner of that work unless there is a valid nexus between a human’s vision and the resulting work fixed in a tangible medium. If you find an anonymous work of art on the street, absent further research, it has no legal author who can assert a claim of copyright in the work that would hold up in any court. And this hypothetical emphasizes the point that the legal meaning of “author” is more rigorous than the philosophical view that art without humans is oxymoronic. (Although it is plausible to find authorship in a work that combines human creativity with AI, I address that subject below.)

As a matter of law, the AI machine itself is disqualified as an “author” full stop. And although the AI owner/developer and AI user/customer are presumably both human, neither is defensibly an “author” of the expressions output by the AI. At least with the current state of technologies making headlines, nowhere in the process—from training the AI, to developing the algorithm, to entering prompts into the system—is there an essential link between those contributions and the individual expressions output by the machine. Consequently, nothing about the process of ingesting protected works to develop these systems in the first place can plausibly claim to serve the purpose of promoting new “authorship.”

But What About the Google Books Case?

Indeed. In the fair use defenses AI developers will present, we should expect to see them lean substantially on the holding in Authors Guild v. Google (the Google Books case), a decision which arguably exceeds the purpose of fair use to promote new authorship. The Second Circuit, while acknowledging that it was pushing the boundaries of fair use, found the Google Books tool to be “transformative” for its novel utility in presenting snippets of books; and because that utility necessitates scanning whole books into its database, a defendant AI developer will presumably want to make the comparison. But a fair use defense applied to training AIs with volumes of protected works should fail, even under the highly utilitarian holding in Google Books.

While people of good intent can debate the legal merits of that decision, the utility of the Google Books search engine does broadly serve the interest of new authorship with a useful research tool—one I have used many times myself. Google Books provides a new means by which one author may research the works of another author, and this is immediately distinguishable from the generative AI which may be trained to “write books” without authors. Thus, not only does the generative AI fail to promote authorship of the individual works output by the system, but it fails to promote authorship in general.

Although the technology is primitive for the moment, these AIs are expected to “learn” exponentially and grow in complexity such that AIs will presumably compete with or replace at least some human creators in various fields and disciplines. Thus, an enterprise which proposes to diminish the number of working authors, whether intentionally or unintentionally, should only be viewed as devastating to the purpose of copyright law, including the fair use exception.

AI proponents may argue that “democratizing” creativity (i.e., putting these tools in every hand) promotes authorship by making everyone an author. But aside from the cultural vacuum this illusion of more would create, the user prompting the AI has a high burden to prove authorship, and it would really depend on what he is contributing relative to the AI. As mentioned above, some AIs may evolve as tools such that the human in some way “collaborates” with the machine to produce a work of authorship. But this hypothetical points to the reason why fair use is a fact-specific, case-by-case consideration. AI Alpha, which autonomously creates, or creates mostly without human direction, should not benefit from the potential fair use defense of AI Beta, which produces a tool designed to aid, but not replace, human creativity.

Broadly Transformative? Don’t Even Go There

Returning to the constitutional purpose of copyright law to “promote science,” the argument has already been floated as a talking point that training AI systems with protected works promotes computer science in general and is, therefore, “transformative” under fair use factor one for this reason. But this argument should find no purchase in court. To the extent that one of these neural networks might eventually spawn revolutionary utility in medicine or finance etc., it would be unsuitable to ask a court to hold that such voyages of general discovery fit the purpose of copyright, to say nothing of the likelihood that the adventure strays inevitably into patent law. Even the most elastic fair use findings to date reject such a broad defense.

It may be shown that no work(s) output by a particular AI infringes (copies) any of the works that went into its training. It may also be determined that the corpus of works fed into an AI is so rapidly atomized into data that even fleeting “reproduction” is found not to exist, and, thus, the 106(1) right is not infringed. Those questions are going to be raised in court before long, and we shall see where they lead. But to presume fair use as a broad defense for AI “training” is existentially offensive to the purpose of copyright, and perhaps to law in general, because it asks the courts to vest rights in non-humans, which is itself anathema to caselaw in other areas.[1]

It is my oft-stated opinion that creative expression without humans is meaningless as a cultural enterprise, but it is a matter of law to say that copyright is meaningless without “authors” and that there is no such thing as non-human “authors.” For this reason, the argument that training AIs on protected works is inherently fair use should be denied with prejudice.


[1] Cetacean Community v. Bush, holding that animals do not have standing in court, was the basis for rejecting PETA’s complaint against photographer Slater for infringing the monkey’s copyright in the “Monkey Selfie” fiasco.

