Let’s Stop Analogizing Human Creators to Machines


[Guest post by David Newhoff, author of The Illusion of More and Who Invented Oscar Wilde? The Photograph at the Center of Modern American Copyright.]

Just as it is folly to anthropomorphize computers and robots, it is also unhelpful to discuss the implications of generative AI in copyright law by analogizing machines to authors.[1] In 2019, I explored the idea that “machine learning” could be analogous to human reading if the human happens to have an eidetic memory. But this was a thought exercise, and in that post, I also imagined machine training that serves a computer science or research purpose—not necessarily generative AIs trained on protected works designed to produce works without authors.

In the present discussion, however, certain parties weighing in on AI and copyright seem to advocate policy that is premised on the language and principles of existing doctrine as applicable to the technological processes of both the input and output sides of the generative AI equation. Of course, policy discussions usually begin with the existing framework, but in this instance, it can be a shaky starting place because generative AI presents some unique challenges—and not just for the practice of copyright law.

We should be wary of analogizing machine functions to human activity for the simple reason that copyright law (indeed all law) has never been anything but anthropocentric. Although it is difficult to avoid speaking in terms of machines “learning” or “creating,” it is essential that we either constantly remind ourselves that these are weak, inaccurate metaphors or develop a new glossary to describe what certain AIs may be doing in the world of creative production.

On the input (training) side of the equation, the moment someone says something like, “Humans learn to make art by looking at art, and generative AIs do the same thing,” the speaker should be directed to the break-out session on sci-fi and excused from any serious conversation about applicable copyright law. Likewise, on the output side, comparisons of AI to other technological developments—from the printing press to Photoshop—should be presumed irrelevant unless the AI at issue can plausibly be described as a tool of the author rather than the primary maker of a work of creative expression.

Copyright Office Guidance Highlights Some Key Difficulties

The exceptional nature of this discussion is underscored by the fact that even experts are somewhat confused by both the doctrinal and administrative aspects of the new guidelines published by the U.S. Copyright Office directing authors how to disclaim AI-generated material in a registration application. The confusion is hardly surprising because generative AI has prompted the Office to ask an unprecedented question: How was this work made?

As noted in several posts, copyrightability has always been agnostic with regard to the creative process. Copyright rights attach to works that show a modicum of originality, and the Copyright Office does not generally ask what tools, methods, etc. the author used to make a work.[2] But this historic practice was then confronted by the now widely reported applications submitted by Stephen Thaler and Kris Kashtanova, both claiming copyright in visual works made with generative AI.

In both cases, the Copyright Office rejected registration applications for the visual works based on the longstanding, bright-line doctrine that copyright rights can only attach to works made by human beings. In Thaler’s case, the consideration is straightforward because the claimant affirmed that the image was produced entirely by a machine. Kashtanova, on the other hand, asserts more than de minimis authorship (i.e., using AI as a tool) in producing the visual elements of a comic book.

Whether in response to Kashtanova or, more certainly, in anticipation of applications yet to come, the Office guidelines, muddy as they are, attempt to address the difficult question of whether copyright attaches to a work that combines authorship and AI generation, and how to draw distinctions between the two. This is not only new territory for the Office as a doctrinal matter but is a potential mess as an administrative one.

The Copyright Office has never been tasked with separating the protectable expression attributable to a human from the unprotectable expression attributable to a machine. Even if it could be said that photography has always provoked this tension (a discussion on its own), the analysis has never been an issue for the Office when registering works, but only for the courts in resolving claims of infringement. In fact, Warhol v. Goldsmith, although a fair use case, is a prime example of how tricky it can be to separate the factual elements of a photograph from the expressive elements.

But now the Copyright Office is potentially tasked with a copyrightability question that, in practice, would ask both the author and the examiner to engage in a version of the idea/expression dichotomy analysis: first separating the machine-generated material from the author’s material and then considering whether the author has a valid claim in the protectable expression.

This is not so easy to accomplish in a work that combines author and machine-made elements in a manner that may be subtly intertwined; it raises new questions about what the AI “contributed” to a given work; and the inquiry is further complicated by the variety of AI tools in the market or in development. Then, because neither the author/claimant nor the Office examiner is likely a copyright attorney (let alone a court), the inquiry is fraught with difficulty as an administrative process, and that’s if the author makes a good-faith effort to disclaim the AI-generated material in the first place.

Many independent authors are confused enough by the Limit of Claim in a registration application or the concept of “published” versus “unpublished.” Asking these same creators to delve into the metaphysics implied by the AI/Author distinction seems like a dubious enterprise, and one that is not likely to foster more faith in the copyright system than the average indie creator has right now.

Copyrightability Could Remain Blind But …

It is understandable that some creators (e.g., filmmakers using certain plug-ins) may be concerned that the Copyright Office has already taken too broad a view, connoting a per se rule that denies copyrightability for any work generated with any AI technology. This concern is a reminder that AI should not be discussed as a monolithic topic because not all AI-enhanced products do the same thing. And again, this may imply a need for some new terms rather than the words we use to describe human activities.

In this light, one could follow a different line of reasoning and argue that the agnosticism of copyrightability vis-à-vis process has always implied a presumption of human authorship where other factors—from technological enhancements to dumb luck—invisibly contribute to the protectable expression. Relatedly, a photographer can add a filter or plug-in that changes the expressive qualities of her image, but doing so is considered part of the selection and arrangement aspect of her authorship and does not dilute the copyrightability of the image.

Some extraordinary visual work has already been produced by professional artists using AI to yield results that are too strikingly well-crafted to believe that the author has not exerted considerable influence over the final image. In this regard, then, perhaps the copyrightability question at the registration stage, no matter how sophisticated the “filter” becomes, should remain blind to process. The Copyright Office could continue to register works submitted by valid claimants without asking the novel How question.

But the more that works may be generated with little or no human spark, the more this agnostic, status-quo approach could unravel the foundation of copyright rights altogether. And it would not be the first time that major tech companies have sought to do exactly that. It is no surprise that an AI developer or a producer using AI would seek the financial benefits of copyright protection; but without a defensible presence of human expression in the work, the exclusive rights of copyright cannot vest in a person with the standing to defend those rights. Nowhere in U.S. law do non-humans have rights of any kind, and this foundational principle reminds us that although machine activity can be compared to human activity as an allegorical construct, this is too whimsical for a serious policy discussion.

Again, I highlight this tangle of administrative and doctrinal factors to emphasize the point that generative AI does not merely present new variations on old questions (e.g., photography), but raises novel questions that cannot easily be answered by analogies to the past. If the challenges presented by generative AI are to be resolved sensibly, and in a way that will serve independent creators, policymakers and thought leaders on copyright law should be skeptical of arguments that too earnestly attempt to transpose centuries of doctrine for human activity into principles applied to machine activity.


[1] I do not distinguish “human” authors, because there is no other kind.

[2] I say “generally” only because I cannot account for every conversation among claimants and examiners.

Why Machine Training AI with Protected Works is Not Fair Use


Guest blogger David Newhoff lays out the argument against the claim that training AI systems with copyright-protected works is fair use. David is the author of Who Invented Oscar Wilde? The Photograph at the Center of Modern American Copyright (Potomac Books 2020) and is a copyright advocate/writer at The Illusion of More.


As most copyright watchers already know, two lawsuits were filed at the start of the new year against AI visual works companies. In the U.S., a class action was filed by visual artists against DeviantArt, Midjourney, and Stability AI; and in the UK, Getty Images is suing Stability AI. Both cases allege infringing use of large volumes of protected works fed into the systems to “train” the algorithms. Regardless of how these two lawsuits might unfold, I want to address the broad defense, already being argued in the blogosphere, that training generative AIs with volumes of protected works is fair use. I don’t think so.

Copyright advocates, skeptics, and even outright antagonists generally agree that the fair use exception, correctly applied, supports the broad aim of copyright law to promote more creative work. In the language of the Constitution, copyright “promotes the progress of science,” but a more accurate, modern description would be that copyright promotes new “authorship” because we do not tend to describe literature, visual arts, music, etc. as “science.”

The fair use doctrine, codified in the federal statute in 1976, originated as judge-made law, and from the seminal Folsom v. Marsh to the contemporary Andy Warhol Foundation v. Goldsmith, the courts have restated, in one way or another, their responsibility to balance the first author’s exclusive rights with a follow-on author’s interest in creating new expression. And as a matter of general principle, it is held that the public benefits from this balancing act because the result is a more diverse market of creative and cultural works.

Fair use defenses are case-by-case considerations and while there may be specific instances in which an AI purpose may be fair use, there are no blanket exceptions. More broadly, though, if the underlying goal of copyright’s exclusive rights and the fair use exception is to promote new “authorship,” this is doctrinally fatal to the proposal that training AIs on volumes of protected works favors a finding of fair use. Even if a court holds that other limiting doctrines render this activity by certain defendants to be non-infringing, a fair use defense should be rejected at summary judgment—at least for the current state of the technology, in which the schematic encompassing AI machine, AI developer, and AI user does nothing to promote new “authorship” as a matter of law.

The definition of “author” in U.S. copyright law means “human author,” and there are no exceptions to this anywhere in our history. The mere existence of a work we might describe as “creative” is not evidence of an author/owner of that work unless there is a valid nexus between a human’s vision and the resulting work fixed in a tangible medium. If you find an anonymous work of art on the street, absent further research, it has no legal author who can assert a claim of copyright in the work that would hold up in any court. And this hypothetical emphasizes the point that the legal meaning of “author” is more rigorous than the philosophical view that art without humans is oxymoronic. (Although it is plausible to find authorship in a work that combines human creativity with AI, I address that subject below.)

As a matter of law, the AI machine itself is disqualified as an “author” full stop. And although the AI owner/developer and AI user/customer are presumably both human, neither is defensibly an “author” of the expressions output by the AI. At least with the current state of technologies making headlines, nowhere in the process—from training the AI, to developing the algorithm, to entering prompts into the system—is there an essential link between those contributions and the individual expressions output by the machine. Consequently, nothing about the process of ingesting protected works to develop these systems in the first place can plausibly claim to serve the purpose of promoting new “authorship.”

But What About the Google Books Case?

Indeed. In the fair use defenses AI developers will present, we should expect to see them lean substantially on the holding in Authors Guild v. Google (the Google Books case), a decision which arguably exceeds the purpose of fair use to promote new authorship. The Second Circuit, while acknowledging that it was pushing the boundaries of fair use, found the Google Books tool to be “transformative” for its novel utility in presenting snippets of books; and because that utility necessitates scanning whole books into its database, a defendant AI developer will presumably want to make the comparison. But a fair use defense applied to training AIs with volumes of protected works should fail, even under the highly utilitarian holding in Google Books.

While people of good intent can debate the legal merits of that decision, the utility of the Google Books search engine does broadly serve the interest of new authorship with a useful research tool—one I have used many times myself. Google Books provides a new means by which one author may research the works of another author, and this is immediately distinguishable from the generative AI which may be trained to “write books” without authors. Thus, not only does the generative AI fail to promote authorship of the individual works output by the system, but it fails to promote authorship in general.

Although the technology is primitive for the moment, these AIs are expected to “learn” exponentially and grow in complexity such that AIs will presumably compete with or replace at least some human creators in various fields and disciplines. Thus, an enterprise which proposes to diminish the number of working authors, whether intentionally or unintentionally, should only be viewed as devastating to the purpose of copyright law, including the fair use exception.

AI proponents may argue that “democratizing” creativity (i.e., putting these tools in every hand) promotes authorship by making everyone an author. But aside from the cultural vacuum this illusion of more would create, the user prompting the AI has a high burden to prove authorship, and it would really depend on what he is contributing relative to the AI. As mentioned above, some AIs may evolve as tools such that the human in some way “collaborates” with the machine to produce a work of authorship. But this hypothetical points to the reason why fair use is a fact-specific, case-by-case consideration. AI Alpha, which autonomously creates, or creates mostly without human direction, should not benefit from the potential fair use defense of AI Beta, which produces a tool designed to aid, but not replace, human creativity.

Broadly Transformative? Don’t Even Go There

Returning to the constitutional purpose of copyright law to “promote science,” the argument has already been floated as a talking point that training AI systems with protected works promotes computer science in general and is, therefore, “transformative” under fair use factor one for this reason. But this argument should find no purchase in court. To the extent that one of these neural networks might eventually spawn revolutionary utility in medicine or finance etc., it would be unsuitable to ask a court to hold that such voyages of general discovery fit the purpose of copyright, to say nothing of the likelihood that the adventure strays inevitably into patent law. Even the most elastic fair use findings to date reject such a broad defense.

It may be shown that no work(s) output by a particular AI infringes (copies) any of the works that went into its training. It may also be determined that the corpus of works fed into an AI is so rapidly atomized into data that even fleeting “reproduction” is found not to exist, and, thus, the 106(1) right is not infringed. Those questions are going to be raised in court before long, and we shall see where they lead. But to presume fair use as a broad defense for AI “training” is existentially offensive to the purpose of copyright, and perhaps to law in general, because it asks the courts to vest rights in non-humans, which is itself anathema to caselaw in other areas.[1]

It is my oft-stated opinion that creative expression without humans is meaningless as a cultural enterprise, but it is a matter of law to say that copyright is meaningless without “authors” and that there is no such thing as non-human “authors.” For this reason, the argument that training AIs on protected works is inherently fair use should be denied with prejudice.


[1] Cetacean Community v. Bush, holding that animals do not have standing in court, was the basis for rejecting PETA’s complaint against photographer Slater for infringing the copyright rights of the monkey in the “Monkey Selfie” fiasco.


A Thousand Cuts: AI and Self-Destruction

David Newhoff comments on generative AI (artificial intelligence) and public policy.

A guest post written by David Newhoff. AI, of course, stands for “artificial intelligence.” David is the author of Who Invented Oscar Wilde? The Photograph at the Center of Modern American Copyright (Potomac Books 2020) and a copyright advocate/writer at The Illusion of More.


I woke up the other day thinking about artificial intelligence (AI) in the context of the Cold War and the nuclear arms race, and curiously enough, the next two articles I read about AI made arms race references. Where my pre-caffeinated mind had gone was back to the early 1980s when, as teenagers, we often asked that futile question as to why any nation needed to stockpile nuclear weapons in quantities that could destroy the world many times over.

Every generation of adolescents believes, and at times confirms, that the adults have no idea what the hell they’re doing; and watching the MADness of what often seemed like a rapturous embrace of nuclear annihilation was, perhaps, the unifying existential threat which shaped our generation’s world view. Since then, reasonable arguments have been made that nuclear stalemate has yielded an unprecedented period of relative global peace, but the underlying question remains: Are we powerless to stop the development of new modes of self-destruction?

Of course, push-button extinction is easy to imagine and, in a way, easy to ignore. If something were to go terribly wrong, and the missiles were to fly, it’s game over in a matter of minutes with no timeouts left. So, it is possible to “stop worrying” if not quite “love the bomb” (h/t Strangelove); but today’s technological threats presage outcomes that are less merciful than swift obliteration. Instead, they offer a slow and seemingly inexorable decline toward the dystopias of science fiction: a future in which we are not wiped out in a flash but instead “amused to death” (h/t Postman) as we relinquish humanity itself to the exigencies of technologies that serve little or no purpose.

The first essay I read about AI, written by Anja Kaspersen and Wendell Wallach for the Carnegie Council, advocates a “reset” in ethical thinking about AI, arguing that giant technology investments are once again building systems with little consideration for their potential effect on people. “In the current AI discourse we perceive a widespread failure to appreciate why it is so important to champion human dignity. There is risk of creating a world in which meaning and value are stripped from human life,” the authors write. Later, they quote Robert Oppenheimer …

It is not possible to be a scientist unless you believe that the knowledge of the world, and the power which this gives, is a thing which is of intrinsic value to humanity, and that you are using it to help in the spread of knowledge, and are willing to take the consequences.

I have argued repeatedly that generative AI “art” is devoid of meaning and value and that the question posed by these technologies is not merely how they might influence copyright law, but whether they should exist at all. It may seem farfetched to contemplate banning or regulating the development of AI tech, but it should not be viewed as an outlandish proposal. If certain AI developments have the capacity to dramatically alter human existence—perhaps even erode what it means to be human—why is this any less a subject of public policy than regulating a nuclear power plant or food safety?

Of course, public policy means legislators, and it is quixotic to believe that any Congress, let alone the current one, could sensibly address AI before the industry causes havoc. At best, the tech would flood the market long before the most sincere, bipartisan efforts of lawmakers could grasp the issues; and at worst, far too many politicians have shown that they would sooner exploit these technologies for their own gain than they would seek to regulate them in the public interest. “AI applications are increasingly being developed to track and manipulate humans, whether for commercial, political, or military purposes, by all means available—including deception,” write Kaspersen and Wallach. I think it’s fair to read that as Cambridge Analytica 2.0 and to recognize that the parties who used the Beta version are still around, and many have offices on Capitol Hill.

Kaspersen and Wallach predict that we may soon discover that generative AI will have the same effect on education that “social media has had on truth.” In response, I would ask the following: In the seven years since the destructive power of social media became headline news, have those revelations significantly changed the conversation, let alone muted the cyber-libertarian dogma of the platform owners? I suspect that AI in the classroom threatens to exacerbate rather than parallel the damage done by social media to truth (i.e., reason). If social media has dulled Socratic skills with the flavors of narcissism, ChatGPT promises a future that does not remember what Socratic skills used to mean.

And that brings me to the next article I read, in which Chris Gilliard and Pete Rorabaugh, writing for Slate, use “arms race” as a metaphor to criticize technological responses to the prospect of students cheating with AI systems like ChatGPT. Their article begins:

In the classroom of the future—if there still are any—it’s easy to imagine the endpoint of an arms race: an artificial intelligence that generates the day’s lessons and prompts, a student-deployed A.I. that will surreptitiously do the assignment, and finally, a third-party A.I. that will determine if any of the pupils actually did the work with their own fingers and brain. Loop complete; no humans needed. If you were to take all the hype about ChatGPT at face value, this might feel inevitable. It’s not.

In what I feared might be another tech-apologist piece labeling concern about AI a “moral panic,” Gilliard and Rorabaugh make the opposite point. Their criticism of software solutions to mitigate student cheating is that it is small thinking which erroneously accepts as a fait accompli that these AI systems are here to stay whether we like it or not. “Telling us that resistance to a particular technology is futile is a favorite talking point for technologists who release systems with few if any guardrails out into the world and then put the onus on society to address most of the problems that arise,” they write.

In other words, here we go again. The ethical, and perhaps legal, challenges posed by AI are an extension of the same conversation we generally failed to have about social media and its cheery promises to be an engine of democracy. “It’s a failure of imagination to think that we must learn to live with an A.I. writing tool just because it was built,” Gilliard and Rorabaugh argue. I would like to agree but am skeptical that the imagination required to reject certain technologies exists outside the rooms where ethicists gather. And this is why I wake up thinking about AI in the context of the Cold War, except of course that the doctrine of Mutually Assured Destruction was rational by contrast.


Photo by the author.

View the original article on The Illusion of More.

