A mixed decision from the US Copyright Office

We received the decision today relative to Kristina Kashtanova's case about the comic book Zarya of the Dawn. Kris will keep the copyright registration, but it will be limited to the text and the whole work as a compilation.

In one sense this is a success, in that the registration is still valid and active. However, it is the most limited a copyright registration can be and it doesn't resolve the core questions about copyright in AI-assisted works. Those works may be copyrightable, but the USCO did not find them so in this case.

Nevertheless I am surprised at this result. In the USCO's recent filing in the Thaler case, the Office said that it was "preparing registration guidance for works generated by using AI." The question is whether the images Kris generated using Midjourney are copyrightable. If the Office is preparing guidance to allow registration of AI-assisted works, that strongly suggests that the USCO believes there is some threshold of human involvement that is sufficient to allow registration. The Office recognizes that Kris had input into the images that were created for Zarya of the Dawn, but it just doesn't seem to feel that the human input is sufficient.

The crux of the USCO's argument appears to be that the author has limited control over what picture is generated using Midjourney (or a similar tool). They recognize that the author has some control over what comes out of Midjourney but not enough: "the process is not controlled by the user because it is not possible to predict what Midjourney will create ahead of time" (p. 8) or "Rather than a tool that Ms. Kashtanova controlled and guided to reach her desired image, Midjourney generates images in an unpredictable way." (p. 9)

There are a number of errors with the Office's arguments, some legal and some factual. However, they all seem to stem from a core factual misunderstanding of the role that randomness plays in Midjourney's image generation.

The Office seems to think that the outputs from Midjourney are almost totally "random" and "unpredictable," so whatever the artist puts into Midjourney just doesn't matter. At most it's a "suggestion" that can be ignored.

First, that's not the right legal standard. The standard is whether there is a "modicum of creativity," not whether Kris could "predict what Midjourney [would] create ahead of time." In other words, the Office incorrectly focused on the output of the tool rather than the input from the human.

Jackson Pollock famously couldn't predict how the paint he used would drip onto the canvas. Pollock designed his paintings - he knew what he wanted the end result to be - but he used a process involving random dripping and flicking of paint to make his art. In music, each performance of John Cage's 4'33" is entirely defined by the random sounds that are made by the audience and the world around the stage.

In photography, the closest analogous art, there are famous photographs that captured animals, people, or humorous situations entirely by mistake. Nevertheless, the output is still copyrightable, because a human had at least a "modicum" of input into the shot.

When examined from the correct legal standard - did Kris provide a "modicum" of input - the Office's answer seems inconsistent. The Office recognizes that Kris personally authored the prompts and other inputs. It just doesn't think Kris did enough. But a "modicum" is the merest sliver of originality and creativity - "a very modest quantum of originality will suffice" (1 Nimmer on Copyright § 2.08). Courts have found that almost any decision that goes into a photograph will do - even decisions like which camera to use, choosing a brand of film, and taking several shots and picking one. Kris made exactly analogous decisions and had the additional element of the personally composed prompts instead of just a simple "button press" as would occur on a camera.

The second error is encapsulated in the Office's statement that the fact that "Midjourney’s specific output cannot be predicted by users makes Midjourney different for copyright purposes than other tools used by artists." (p. 9) The problem is that if this statement is construed narrowly enough to make it true, then it promulgates an incorrect standard. If this statement is construed more broadly, then it is factually false.

If the Office's statement is interpreted narrowly, it is true that the exact output is not predictable. However, having 100% control over the output is also not the correct standard. The standard is a modicum of creative input leading to the output.

If the Office's statement is interpreted broadly, it is factually false. Basic experimentation will show that when you type "cute purple dinosaur" into Midjourney, you get back images of a cute purple dinosaur, not a motorcycle or a cloud. Further, the more inputs given by the artist, the more control is exerted over the output. Again, not 100% control - but far more than the Office seems to understand.

Third, the Office seems to think we were making a "sweat of the brow"-type argument when we pointed out that generation of the pictures in Zarya of the Dawn took time. The Office apparently thought that Kris just hit the "generate" button a thousand times, hoping that the right picture would come up. If generation in Midjourney was random in the way the Office thinks, that might have been so. But the reality is much different.

Kris' creation of the images was selective and iterative. Kris would prompt for the generation of a first set of images. Kris would then select one of the first generation of images, and use that first-generation image and a tweaked prompt to create a second generation of images. Kris would then use a selected image from the second generation as part of the input to generate a third generation, and so on. The images in Zarya of the Dawn were evolved under close artistic direction. There may have been randomness inherent in the tool, but the images were nevertheless designed over many iterations to have a specific subject, lighting, content, layout, and feel. It should not matter that the subject, lighting, content, and layout were generated instead of captured.
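The difference between pressing "generate" repeatedly and the selective, iterative process described above can be sketched with a toy model. This is only an illustration under loud assumptions: an "image" is reduced to a single number, the hypothetical `generate` function stands in for the tool's random variation, and the artist's judgment is modeled as picking the candidate closest to a target. None of this reflects Midjourney's actual interface.

```python
import random


def generate(seed_image, n, rng, spread=1.0):
    """Hypothetical stand-in for the tool: n random variations around seed_image."""
    return [seed_image + rng.gauss(0.0, spread) for _ in range(n)]


def iterate(target, rounds, rng, n=4):
    """Each round, generate n candidates and keep the one closest to the target.

    The previous pick stays in the pool, so the artist never has to accept
    a worse image than the one already in hand.
    """
    current = 0.0
    for _ in range(rounds):
        candidates = generate(current, n, rng)
        candidates.append(current)
        current = min(candidates, key=lambda c: abs(c - target))
    return current


TARGET = 10.0  # the artist's conception, reduced to a number for illustration

one_round = iterate(TARGET, rounds=1, rng=random.Random(0))
many_rounds = iterate(TARGET, rounds=50, rng=random.Random(0))

print("distance from target after 1 round:  ", abs(one_round - TARGET))
print("distance from target after 50 rounds:", abs(many_rounds - TARGET))
```

Every individual candidate is random, yet the selected result can only move toward the target from round to round, because the selection step is made by the person and not by the tool.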

Fourth, there seems to be a subtle factual error associated with how latent diffusion models work. The Office quotes from Midjourney's documentation describing diffusion models generally, and based on that reading, comes to the conclusion that "Because Midjourney starts with randomly generated noise that evolves into a final image, there is no guarantee that a particular prompt will generate any particular visual output." The Office compares the prompt to a mere "suggestion" that may or may not be followed by the tool.

The subtle error comes in a misunderstanding about the "randomly generated noise." Midjourney technically has two layers: a semantic "latent" layer associated with meaning and a visual "pixel" layer associated with images. When a person inputs a prompt into Midjourney, the effect is to focus the attention of the tool on one or more specific spots in the latent domain, places that are statistically associated with particular meanings. The visual layer evolves the final image from the noise based upon the latent "meaning" in the prompt.

This is how the artist controls the outputs that come from Midjourney. The control is only approximate, not perfect. The widespread practice of "prompt engineering" is actually an exploration through latent space - a probabilistic landscape of ideas and meanings - to match the expression to the artist's conception. The goal of the artist is to develop the exact set of inputs - images, words, and options - that will lead to the generation of the desired output.
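A toy model can make the "approximate, not perfect" point concrete. The sketch below is not how Midjourney is implemented. It assumes a made-up two-dimensional "latent space" with three hypothetical concept locations, and a denoising loop that starts from random noise and moves step by step toward the location picked out by the prompt.

```python
import random

# Hypothetical 2-D "latent space": each concept sits at an invented coordinate.
LATENT = {
    "dinosaur": (5.0, 0.0),
    "motorcycle": (-5.0, 0.0),
    "cloud": (0.0, 5.0),
}


def generate(prompt, seed, steps=20):
    """Start from random noise and drift toward the prompt's region."""
    rng = random.Random(seed)
    x, y = rng.gauss(0, 3), rng.gauss(0, 3)  # the "randomly generated noise"
    tx, ty = LATENT[prompt]
    for _ in range(steps):
        # Each step closes part of the gap to the prompt's location
        # and adds a small amount of fresh randomness.
        x += 0.3 * (tx - x) + rng.gauss(0, 0.1)
        y += 0.3 * (ty - y) + rng.gauss(0, 0.1)
    return x, y


def nearest_concept(point):
    """Which concept location is this output closest to?"""
    return min(
        LATENT,
        key=lambda k: (LATENT[k][0] - point[0]) ** 2 + (LATENT[k][1] - point[1]) ** 2,
    )


outputs = [generate("dinosaur", seed) for seed in range(10)]
for seed, point in enumerate(outputs):
    print(seed, point, nearest_concept(point))
```

In this sketch the exact coordinates differ for every seed, so no specific output can be predicted in advance, but every output lands in the region the prompt selected. That is the narrow sense in which the output is random and the broad sense in which it is controlled.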

So when the Office compares the prompt to a "suggestion" like a patron might give to a painter, it is anthropomorphizing the tool and coming to an invalid conclusion. Midjourney can't take "suggestions." It can only do exactly as it is programmed to do and pull from an artist-chosen place in its massive table of probabilities to drive the generation of an image.

I have come to the conclusion that almost every work created by an AI tool should be copyrightable, even without the iterative refinement and post-processing that Kris performed. The more I search, the more I see similarities with photography and the long copyright battles over what minimum amount of creativity is needed to support the copyright in a photograph.

Photographs are difficult doctrinally for copyright because of the high amount of tool involvement. The USCO repeatedly quoted Burrow-Giles in its response, a case that turned on the creativity and artistry that the photographer showed in posing the subject. However, Burrow-Giles was not the last word. The Office didn't even quote Bleistein, which recognized that even prosaic photos had enough human input to be copyrightable.

The standard for copyrightability in photographs has been discussed several times, but the most common cite is from Judge Learned Hand in Jewelers' Circular Pub. Co. v. Keystone Pub. Co., 274 F. 932 (S.D.N.Y. 1921). He stated: "no photograph, however simple, can be unaffected by the personal influence of the author" (id. at 934). In reviewing all the cases about photographic copyright, Nimmer on Copyright concluded that this "has become the prevailing view," and therefore "almost any[] photograph may claim the necessary originality to support a copyright merely by virtue of the photographer's personal choice of subject matter, angle of photograph, lighting, and determination of the precise time when the photograph is to be taken." 1 MELVIN B. NIMMER & DAVID NIMMER, NIMMER ON COPYRIGHT § 2.08[E][1], at 2-130 (2d ed. 1999).

AI-assisted art is going to need to be treated like photography. It is just a matter of time.

By VanL

The Cryptographic Autonomy License

I want to introduce the Cryptographic Autonomy License. (PDF Link) (NB: The CAL is currently at version 1.0-Beta. As the CAL is still open for revision, this link will be periodically updated to reflect the current "canonical" version of the license.) The Cryptographic Autonomy License, or CAL, is a new strong "network" copyleft license especially appropriate for distributed systems. The CAL has not yet been certified as open source. It will be submitted to the Open Source Initiative for approval after the conclusion of the public comment period.

Warning: This post is long and assumes a high level of familiarity with the intricacies around open source law. If that doesn't sound exciting, this may not be the post for you. ☺

Why a new license?

I have become increasingly interested in reciprocal licenses. I'm not the only one. There have been efforts to create new strong copyleft licenses, such as the Parity Public License and the Server Side Public License. Nevertheless, none of these new proposed licenses has as yet been accepted as open source by the Open Source Initiative.

Clearly there is a desire for stronger reciprocal licenses. And while I personally tend to use more permissive licenses, copyleft has a place in helping preserve software freedom, which I consider important.

Thus I was excited when Holo contacted me and described their need for a strong network copyleft license for Holochain, and engaged me to work with them drafting it.

Key Principles

When considering the structure of what eventually became the CAL, there were a couple of key principles we used when drafting.[1]

  1. The CAL must comply with the Open Source Definition—including in spirit.

The Open Source Definition (OSD) is valuable from both a legal and community-building perspective. Legally, OSD-compliant licenses have clear boundaries that encourage understanding between licensors and licensees.

But open source licenses are even more important as community-building tools. As I have written before:

Open source licenses are always about communities. You pick a community by picking a license. The license isn't just a legal document—it ends up being part of a social compact between participants in the community.

This community aspect is the most important aspect of the CAL. The CAL is agnostic as to particular business models. It meets Holo's needs, but it is not specific to Holo. It has no special rules for original contributors. It can be used for inbound == outbound licensing. It may not be for everyone, but it is fair.

  2. The CAL should be primarily structured as a license.

As much as possible, the effects of the license are expressed in terms of either a grant of permissions or an applicable-to-all-parties condition on the grants.

Not every part of the license can be structured this way. But anything not structured as a grant or a condition should be able to be struck from the license by a court without affecting enforceability.

Further, grants and conditions can be automatically applied. There is no need for offer and acceptance. Just in case, however, the CAL also has language expressing the terms as a unilateral offer with acceptance by action. But that is a hopefully-redundant measure.[2]

  3. The CAL should make use of all available reserved rights.

Most open source licenses primarily focus on the reserved rights under copyright. The CAL also addresses the reserved rights under patent law.[3]

This is because copyright rights automatically vest in newly-written software. But software can also be patentable. The OSD also implicates patent rights. But even licenses that include a patent grant do not generally condition making, using, and selling the work on granting the same permissions to others. The CAL does.

It is true that provisions that rely on patent-exclusive rights require that the licensor register for and be granted a patent in order to enforce those provisions. However, this requirement is not substantially different from enforcement under copyright law. The Supreme Court recently decided in Fourth Estate Public Benefit Corp. v. Wall-Street.com LLC[4] that registration is required before an enforcer can sue for copyright infringement as well. Just like a patent application, an application for copyright registration undergoes an examination process to determine if the work is eligible for protection. A registration certificate issues if the work meets the requirements. Thus, both copyright and patent law require registration and examination before enforcement.

Even in a situation in which a licensor does not have a patent, it is ordinary practice to restrict software functionality on the basis of copyright rights. The functional and expressive aspects of software are so interrelated that it is very hard to avoid using the "patent" verbs (e.g., make, use, sell, and offer to sell) when exercising rights under copyright. Thus, the use of patent verbs makes the application of the license consistent regardless of whether the licensor has a patent.

Second, even when considering just copyright, most open source licenses don't address the full spectrum of reserved rights. In particular, copyright law grants authors the exclusive rights to publicly perform and display works. "Public performance" has not been well-defined with regard to software, but it is a recognized right under copyright that appears to apply. The CAL provides a definition of public performance for software and uses it to trigger copyleft provisions.

Breaking Down the CAL

The CAL is designed to be read from top to bottom, but it includes a number of interrelated concepts that appear in many sections. Defined terms are Capitalized. Even though the Definitions are near the bottom (in Section 6), the terms were chosen so that they would read intelligibly in context. In the description below, regular links in each paragraph are to sections in this document; italicized links with arrows (⭷) are to the associated part of the CAL. Reading from the top down:

What is the “Work”?

Watching the debate over the SSPL heightened my concern regarding the interaction of copyleft-style requirements as they relate to OSD element 9, that the "License Must Not Restrict Other Software." I am also sensitized to the possibility of misuse-based defenses.[5] Misuse defenses and OSD 9 compliance issues share a common theme: using one work to control the licensing of another.

One way to address these concerns is by focusing on IP permissions and conditions as discussed above. The conditions in the CAL only discuss the "Work" itself, and the exercise of exclusive rights relative to the Work. Obeying this rule sidesteps the misuse and OSD-related concerns.

So what is the "Work"? Somewhat circularly, the CAL defines the Work ⭷ as anything a licensor may use IP rights to protect. That ties the license to the full scope of copyright and patent law. It also helpfully excludes the "restrict[ion of] other software" as required by OSD 9. Thus the CAL protects the Work as fully as possible under IP law, but no further. The CAL also includes an interpretive clause (CAL § 7.1.2 ⭷) making that clear.

This definition of the "Work" has a number of nuances. It is strictly more expansive than the normal copyright-based definition of works, because it also includes patentable functionality. Technically, the functional elements of software are not (or should not be) subject to copyright protection. They are just "along for the ride," so to speak. But the CAL's definition of the Work extends to all protectable elements of the work, including those that may be protectable solely under patent (or database) protection laws. In fact, this definition results in "maximal copyleft" as described by McCoy Smith in his CopyleftConf 2019 presentation: If some aspect of the work can be subject to the CAL, then it is.

A second implication is that the Work includes APIs and interfaces. This is for two reasons. First, there are many patents that directly reference or claim APIs. More controversially, the core issue in Oracle v. Google is the protection of the Java APIs through copyright. Regardless of a person's policy preferences, APIs are logically covered under patent law, copyright law, or both. Thus the API is part of the Work, so conditions on the use or reimplementation of the API are consistent with only applying the CAL to the Work itself.

Copyleft and the Public Performance of Software

A unique element in the CAL (and in software licensing generally) is the use of the reserved rights of "public performance" and "public display" as licensing hooks. This is the core of the "network" aspect of the CAL, so it requires further explanation.

First, the statutory authority for using public performance and public display comes from 17 U.S.C. § 106:

Subject to sections 107 through 122, the owner of copyright under this title has the exclusive rights to do and to authorize any of the following:

    1. to reproduce the copyrighted work in copies or phonorecords;

    2. to prepare derivative works based upon the copyrighted work;

    3. to distribute copies or phonorecords of the copyrighted work to the public by sale or other transfer of ownership, or by rental, lease, or lending;

    4. in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works, to perform the copyrighted work publicly;

    5. in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work, to display the copyrighted work publicly; and

    6. in the case of sound recordings, to perform the copyrighted work publicly by means of a digital audio transmission.

(Bold emphasis added.) Software is defined as a literary work, so the rights apply as by statute. But what exactly is "public performance"? It is not distribution or reproduction, because it doesn't result in a new fixation of the work. Instead, it is the making of a work perceptible by the public that results in public display or performance. As per 17 U.S.C. § 101:

To perform or display a work “publicly” means—

[. . .]

(2) to transmit or otherwise communicate a performance or display of the work to a place specified by clause (1) or to the public, by means of any device or process, whether the members of the public capable of receiving the performance or display receive it in the same place or in separate places and at the same time or at different times.

As the World Wide Web was just taking off, President Clinton convened a group to discuss issues related to the United States' "National Information Infrastructure."[6] One of the topics discussed was the application of network technologies to intellectual property, including the public performance right. As Bruce Lehman wrote in the intellectual property task force report:

A distinction must be made between transmissions of copies of works and transmissions of performances or displays of works. When a copy of a work is transmitted over wires, fiber optics, satellite signals or other modes in digital form so that it may be captured in a user’s computer without the capability of simultaneous “rendering” or “showing,” it has rather clearly not been performed. Thus, for example, a file comprising the digitized version of a motion picture might be transferred from a copyright owner to an end user via the Internet without the public performance right being implicated. When, however, the motion picture is “rendered”—by showing its images in sequence—so that users with the requisite hardware and software might watch it with or without copying the performance, then, under the current law, a “performance” has occurred.[7]

In the context of network-capable software, especially distributed systems, certain elements of the interaction over a network would appear to conform with this definition. For example, a database that makes its API available over the network would qualify. The software itself is part of a "performable" literary work, and the API is part of the Work. The elements of the API are transmitted to various people at various times (including possibly to the "public"), but they are not used to recreate a "copy" of the work. Rather, they are used to make the work "perceptible" (that is, usable) from a distance.

Thus, the right of public performance appears to apply to software. But because this is a novel legal theory—for software at least—I added a definition of "Public Performance" designed to capture this concept in the CAL directly. The definition is at CAL § 6(m) ⭷:

“Public Performance” (or “Publicly Performing”) means using the Software to take any action that implicates the rights of public performance or public display of a work under copyright law, specifically including making aspects of the Software, including any interfaces used for access to or manipulation of User Data, directly or indirectly available to the public.

The concept of public performance is also implicated in the definition of a "Recipient" of the work. (CAL § 6(n) ⭷) Providing the software to a Recipient requires compliance. A Recipient is defined as:

“Recipient” means any non-Affiliate third party receiving either the Software or a Public Performance of any interface thereof from You.

Maintaining User Autonomy

A second novel aspect of the CAL relates to maintaining user autonomy. At a high level, this concept is a direct descendant of the concept of user freedom from the Free Software Definition. As Richard Stallman wrote:

The freedom to run the program means the freedom for any kind of person or organization to use it on any kind of computer system, for any kind of overall job and purpose, without being required to communicate about it with the developer or any other specific entity. In this freedom, it is the user's purpose that matters, not the developer's purpose; you as a user are free to run the program for your purposes, and if you distribute it to someone else, she is then free to run it for her purposes, but you are not entitled to impose your purposes on her.[8](emphasis added)

Free software is about "user freedom," not "developer freedom." But from a practical perspective, providing users access to source code and the permission to use and modify that code is frequently not enough. Users also need access to their data. This is especially so for SaaS applications: Users may technically "own" the data provided to the SaaS provider, but it is hard to exercise their rights to that data when it is in someone else's hands. The CAL conditions use of the Work on providing access to both the source code and the user's data through the use of several interlocking definitions and clauses.

Definitions: User Data, “Processing” User Data, and Lawful Interests

The first question relates to scope. What is the "User Data"? How is the Software connected to the User Data? Those questions are answered in CAL § 6(q) ⭷ ("User Data") and CAL § 6(l) ⭷ ("Process User Data"), which read:

“User Data” means any data that is either a) an input to, or b) an output from, the Work or a Modified Work, in which a third party other than the Licensee has a Lawful Interest in the data.

“Process User Data” (or “Processing User Data”) means 1) use a system, 2) perform a method, or 3) induce any other party to use a system or perform a method, using at least in part Software provided under this License, where User Data is an input or an output to the system or method.

User Data is a deliberately broad term. It can encompass both copyrightable and non-copyrightable information, so long as it flows into or out of the licensed Software. "Processing" User Data means taking an action that would implicate patent rights in interacting with the User Data. But note the restriction: the definition of "User Data" only pertains to third party data. It excludes the Licensee's own data. The CAL does not attempt to control what a licensee does with their own data. Indeed, it could not do so and be compliant with OSD 9.

But even excluding the Licensee's own data, there is still an effect on other works—the third party User Data. How does this still comply with OSD 9?

The answer is that the CAL only specifies a negative condition: "You must refrain from using the permissions given under this License to interfere with any third party’s Lawful Interest in their own User Data" (CAL § 2.3 ⭷, emphasis added). Declining to extend granted permissions in ways that would hurt users is exactly analogous to the statement in the GPL that "To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights."[9] You can't use the Work to deny others their rights in their own information.

Scoping User Data: Lawful Interest

There is a second scoping issue: It does not make sense to allow any Recipient to access just any User Data. Logically, it should be limited to that User's own Data. But simply asserting "ownership" is not complete enough. For example, I may legally own a copy of a song, but I don't own the song.

Dealing with this issue is the subject of the term "Lawful Interest." Lawful Interest is defined in CAL § 6(e) ⭷, which reads:

“Lawful Interest” means either 1) an ownership interest, 2), a non-ownership property or possessory interest, including but not limited to lawful possession.

The key here is the incorporation of the concept of a "possessory interest," borrowed from (real) property law. Nolo’s Plain-English Law Dictionary defines Possessory Interest as:

In real estate, the right of a person to occupy and/or exercise control over a particular plot of land; distinguished from an ownership interest. For example, a tenant with a long term lease has a possessory interest, but not an ownership interest.[10]

Any Recipient's enforceable rights in their User Data are limited to getting copies of information that they have a legal right to possess.
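The way these definitions interlock can be sketched as a small data model. This is an informal reading of the quoted definitions and not legal advice or part of the license: the class names, field names, and the `may_request_copy` helper are all hypothetical, and the sketch ignores the further conditions in CAL § 2.3.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Interest(Enum):
    OWNERSHIP = auto()   # an ownership interest
    POSSESSORY = auto()  # a non-ownership property or possessory interest
    NONE = auto()


@dataclass
class Datum:
    flows_through_work: bool  # is it an input to or output from the Work?
    interests: dict = field(default_factory=dict)  # party name -> Interest


def has_lawful_interest(datum, party):
    """A party has a Lawful Interest if it holds either kind of interest."""
    return datum.interests.get(party, Interest.NONE) is not Interest.NONE


def is_user_data(datum, licensee):
    """User Data: flows through the Work, and a third party other than
    the licensee has a Lawful Interest in it."""
    return datum.flows_through_work and any(
        party != licensee and has_lawful_interest(datum, party)
        for party in datum.interests
    )


def may_request_copy(datum, licensee, recipient):
    """A recipient's reach is limited to User Data in which that recipient
    itself has a Lawful Interest."""
    return (
        recipient != licensee
        and is_user_data(datum, licensee)
        and has_lawful_interest(datum, recipient)
    )


alice_upload = Datum(True, {"alice": Interest.OWNERSHIP})
licensed_song = Datum(True, {"label": Interest.OWNERSHIP, "bob": Interest.POSSESSORY})
internal_logs = Datum(True, {"saas": Interest.OWNERSHIP})

print(may_request_copy(alice_upload, "saas", "alice"))   # alice owns her upload
print(may_request_copy(alice_upload, "saas", "bob"))     # bob has no interest in it
print(may_request_copy(licensed_song, "saas", "bob"))    # lawful possession suffices
print(is_user_data(internal_logs, "saas"))               # the licensee's own data
```

The second and fourth cases show the two scoping limits at work: a recipient cannot reach someone else's data, and the licensee's own data never becomes User Data at all.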

User Autonomy: Specific Conditions

There are five separate clauses preventing particular attacks on User Autonomy, all included in CAL § 2.3 ⭷. They are:

(a) You may not, by means of cryptographic controls, technological protection measures, or any other method, limit a third party from independently Processing User Data in which they have a Lawful Interest.

This is the most general clause of the five: You can't use the work to inhibit a User's freedom to work with their own data. This is designed to be as broad as possible, and makes reference to both technical ("cryptographic controls") and legal ("technological protection measures") methods of inhibiting user freedom.

(b) During the same period in which You exercise any of the permissions granted to You under this License, You must also provide to any third party with which you have an enforceable legal agreement, a no-charge copy, provided in a commonly used electronic form, of the User Data in your possession in which that third party has a Lawful Interest.

This clause places an affirmative duty on Licensees to make a copy of a User's Data available to that User. Conveniently, anyone interacting with the Software is likely also a Recipient of the Work as defined by the CAL. Thus, this provision applies to essentially the same group that is owed a copy of the source code.

There are a couple of caveats included so as to make this commercially practicable. First, this requirement is only active while a Licensee is exercising the permissions granted. If someone stops using the Software, this requirement also stops applying. Similarly, Licensees must only provide copies of User Data to those with whom they have "an enforceable legal agreement." This is to prevent this User-Data-Download requirement from persisting into the future, even after a user has stopped using a particular SaaS offering.

(c) You may not use the Software to control any cryptographic keys, seeds, or hashes pertaining to third parties where such control would prevent the third party from independently exercising the permissions granted under this License.

The exclusion of cryptographic controls is broader than just encryption. This is one of the ways in which Holo's perspective was vital to the drafting of the CAL. Holochain-based applications use cryptographic keys as core elements protecting and representing a user's identity within the system. In Holochain, controlling someone's key means that you control their data and their identity.

This provision may not apply to everyone right now. But the founders of Holo believe that user-centric, distributed systems are the future. Those systems will necessarily use cryptographic primitives to mediate access to the system. The CAL was designed, in part, to be ready for that future.

(d) You waive any legal power to forbid circumvention of technical protection measures that include use of the Work; and
(e) You disclaim any intention to limit operation or modification of the Work as a means of enforcing the legal rights of third parties against Recipients.

These clauses are modeled after the GNU GPL version 3, section 3, and serve the same purpose.

User Autonomy and OSD 6

One possible concern with the CAL is compliance with OSD 6, "No Discrimination Against Fields of Endeavor." But compare with section 3 of the GNU GPLv3. There is no prohibition on particular business models, fields of use, or geographies. Licensees can form any sort of business that they would like with the software. They just can't lock their users in and deny them their freedom. Or as the CAL would say it, their autonomy.

Other Concepts and Provisions

Most elements of the CAL will be familiar to open source licensing practitioners, and those interested should read the entire license. Below I'll emphasize the elements that are uncommon or new. The listing is in alphabetical order.

Applicable Jurisdiction

For some organizations, it is important that they have some control over where they can be brought into court. The CAL provides licensors the opportunity to declare an "Applicable Jurisdiction." This is done by just providing notice to licensees at the top of the file, by stating:

“The Applicable Jurisdiction for disputes arising from the licensing or use of this Work is _____.”

Application by Reference

Similar to the OSL and the UPL, the CAL allows licensing by reference. The entire text of the license does not have to be incorporated into source files. This streamlines common usage and matches what many developers do anyway. The CAL is versioned so that there is no ambiguity about which terms are contemplated.

The “Combined Work Exception”

The header makes clear that one of the ways in which a licensor can apply the CAL is with reference to a "Combined Work Exception." The Combined Work Exception is a built-in optional clause that turns the CAL into a kind of Affero LGPL: reciprocally licensed, but combinable with other works into a larger program licensable as a whole under different terms. It is similar to the Classpath Exception or the LGPL, but built into the core license. All that needs to be done to invoke it is to label the Work appropriately.

The text of the Combined Work Exception is found in the CAL § 2.4:

As an exception to the conditions in sections 2.2.1 and 2.2.2, any Source Code marked by the Licensor as having the “Combined Work Exception,” or any Object Code created from Source Code so marked, may be combined with other Software into a larger work, and the resulting larger work may be used, distributed, or sold under any other license, so long as You: a) comply with the notice conditions of section 2.1; b) comply with the distribution conditions of 2.2.1 and 2.2.2, relative to the Source Code provided to You; and c) comply with section 2.3.

The Combined Work Exception can either have a "file" scope, like the MPL, or a "library" scope, like the LGPL. It depends on how the Work itself is marked.

Compatible Open Source Licenses

Most reciprocal licenses enforce reciprocality by requiring that any derivative works be licensed under the same license that they received. Using the same license is allowed under the CAL (and must be under OSD 3). But the CAL also allows those who create Modified Works to distribute their modifications under any "Compatible Open Source License." Compatible Open Source Licenses are defined as follows:

“Compatible Open Source License” means an Open Source License that allows Object Code to be distributed that is created using both Source Code provided under this License and Source Code provided under the Open Source License.

This allows modifications to be placed under most permissive licenses and some weak reciprocal licenses. Unfortunately, the GPL and AGPL are incompatible, because the GPL family of licenses is among those that require licensing of derivative works under "this" license. I would hope that, should the CAL be approved, at least the AGPLv3 and GPLv3 might designate the CAL as compatible.

This "Compatible" provision is particularly important for reimplementation of APIs. Under the CAL, a reimplementation of an API is a derivative work. But the reimplementation can have a completely separate license, as long as it is compatible.

Recipient

The Recipient is defined as follows:

“Recipient” means any non-Affiliate third party receiving either the Software or a Public Performance of any interface thereof from You.

There are two elements of note in this definition. First, there is occasionally confusion as to whether sending software to a co-owned corporation is a distribution of the software. The CAL incorporates a common definition of "Affiliate" (CAL § 6(b)) to resolve this question (in the negative). Second, a Recipient is someone who perceives a Public Performance as described above.

Comments Welcome

The CAL has been designed as well as I know how, but that doesn't mean that there aren't areas in which it could be improved. Comments and criticisms would be gladly received. Just click the envelope above.

Acknowledgments

Finally, thanks: The CAL is an original work, but has elements inspired by the following licenses (in alphabetical order): AGPL 3.0, Apache 2.0, the GNU GPL (both v2 and v3), MPL 2.0, the NASA Open Source Agreement v1.3, OSL 3.0, the Parity Public License, and the UPL. Special thanks also go out to Kyle Mitchell, Simon Phipps and Jim Wright for valuable suggestions during drafting.


  1. While every effort was made to properly internationalize the CAL, the primary legal analysis was made relative to US law. This post also uses US law and terminology unless otherwise noted.

  2. I tend to think that many bare licenses would likely be enforceable without contract formation. But see Mark R. Patterson, Must Licenses Be Contracts? Consent and Notice in Intellectual Property, 40 Fla. St. U. L. Rev. 105 (2012) Available at: http://ir.lawnet.fordham.edu/faculty_scholarship/593. Either way, the CAL should apply.

  3. The CAL also includes Database Rights where applicable. Database rights are only recognized in some jurisdictions. To the extent that a jurisdiction recognizes database rights, however, they are usually granted copyright-like protection.

  4. "[N]o civil action for infringement of the copyright in any United States work shall be instituted until . . . registration of the copyright claim has been made in accordance with this title." 586 U. S. ____ (2019) at 1.

  5. "Use of a copyright or patent to exercise exclusive rights beyond the scope of the government grant. As stated by one court: 'Misuse of copyright applies where the copyright owner tries to extend the copyright beyond its intended reach, thereby augmenting the physical scope of copyright protection. It typically arises in situations where it is alleged that the copyright owner projected his unique rights in a work onto other, unrelated products or services.'" (Religious Tech. Ctr. v. Lerma, 1996 U.S. Dist. LEXIS 15454, 1578-1579, emphasis added.)

  6. The Information Infrastructure Task Force (IITF) was created in 1993 by Vice-President Al Gore (through the National Economic Council and the Office of Science and Technology Policy). Its mission was to articulate and implement the Administration's vision for the National Information Infrastructure (NII). The IITF was chaired by Secretary of Commerce Ronald H. Brown and consisted of high-level representatives of the federal agencies that play a role in advancing the development and application of information technologies.

  7. Bruce A. Lehman, Information Infrastructure Task Force, The Report of the Working Group on Intellectual Property Rights, at 71 (Sept. 1995).

  8. Richard Stallman, What is Free Software, "The freedom to run the program as you wish," Version 1.153, https://www.gnu.org/philosophy/free-sw.en.html, 2016.

  9. The GNU General Public License, version 2, "Preamble," available at https://opensource.org/licenses/gpl-2.0.php.

  10. Nolo’s Plain-English Law Dictionary, "Possessory Interest," available at https://www.nolo.com/dictionary/possessory-interest-term.html

Further Comments on the SSPL

My previous post was discussed on License-Review and was shared on Hacker News, where it engendered some discussion. Some good points were made by various people, which I thought it might be useful to respond to here.

First: Community considerations

First, I should say that I don't necessarily disagree with the goals of the SSPL. There are a number of people that feel that a stronger form of copyleft is necessary. I have no issue with that point of view.

But there is a larger point: Open source licenses are always about communities. You pick a community by picking a license. The license isn't just a legal document - it ends up being part of a social compact between participants in the community.

MongoDB evidently feels that its position is sufficiently preeminent that it can change the terms of the social contract unilaterally. This is an aggressive move. Not only are the terms of the SSPL significantly different than the previous license, MongoDB is also making a statement about how they are going to use their privileged position vis-a-vis the code.

The companies that will be affected by this are members of the MongoDB community - but they don't have to be. Those companies have the right to fork the last AGPL release and leave. I won't be surprised if they do.

Turning now to some of the points that have been made, the key arguments seem to be:

  1. Misuse is a defense to copyright infringement, not a flaw in the license.
  2. Impracticability is only a defense to unanticipated changes in circumstance.
  3. The SSPL doesn't actually require the release of all code, just enough code to replicate the service.

I'll address these below. (The summary headlines are mine; if I don't express the point well, don't hold it against anyone else.)

Misuse is a defense to copyright infringement, not a flaw in the license.

Quoting Heather Meeker on License-Review:

Copyright misuse is an equitable defense against infringement claims. It has been acknowledged in several US circuits but not all of them, and it is not often successful.

This is true. I might add, for completeness, that the corresponding doctrine of patent misuse is slightly out of favor.

In almost all cases where courts declined to enforce a copyright license violation due to copyright misuse, the misuse consisted of anticompetitive behavior similar to actions that would compose antitrust liability.

[snip]

This is not a general rule that imposing any license condition not directly relating to copyright is unenforceable.

The courts have been explicit that misuse may rise to the level of an antitrust claim, but that the bar for misuse as a defense is lower. There is no need to attempt actions that would implicate antitrust liability to bring in the doctrine of misuse.

I mostly agree with the statement that there is "not a general rule that imposing any license condition not directly relating to copyright is unenforceable." However: 1) license conditions that reach beyond the scope of the copyrighted work generally sound in contract, not in copyright, and 2) attempts to impose additional control on downstream behavior using copyright infringement as leverage are what give rise to the defense of copyright misuse. Given that the trigger is conditioned on copyright, I maintain that misuse is still an issue.

The basic reason is that the scope of what the SSPL sweeps in is intentionally vast. For example, assume Amazon was sued under the SSPL. There are large amounts of proprietary shared infrastructure (perhaps all of EC2) that would be swept into the scope of the SSPL under the current language. In this example, the proprietary shared infrastructure encompasses a number of unique works, not directly related to MongoDB, but which would need to be SSPL'd.

Given the key issue of the licensing of other intellectual property, I reviewed the cases to see if there was something closer on point. The closest that I can find are a number of cases concerning SanDisk's flash memory licensing program. SanDisk had a patent licensing program that required any licensee to provide a grant-back license to any subsequently-developed patents in the same field of use. Two courts examined SanDisk's program under both the antitrust and patent misuse angles:

  • PNY Techs., Inc. v. SanDisk Corp., N.D.Cal, 2012 U.S. Dist. LEXIS 55965, and
  • Sandisk Corp. v. Kingston Tech. Co., W.D.Wisc, 2010 U.S. Dist. LEXIS 152534

The issue was not fully litigated, but it does seem that forced licensing would be enough to get to court. The court in Footnote 8 in PNY Techs states: "At best, PNY alleges patent misuse through this licensing provision. Complaint 90. While this may suffice as an equitable defense to a patent infringement lawsuit, it stops well short of establishing a Sherman Act violation."

And in Sandisk: "Thus, it is appropriate to consider whether, as a whole, the assorted requirements plaintiff imposes on those who would participate in the flash memory markets are anticompetitive and threaten to harm competition. At this early stage of the proceedings, defendants' allegations suffice....Finally, the licensing terms include cross-license provisions under which plaintiff may use the fruits of a licensee's new inventions. Such cross-license provisions would reduce incentives to create innovative, non-infringing methods that could compete in the flash memory markets because plaintiff would be able to use the innovation."

Of course, we wouldn't know whether the defense would actually be successful in court, and these are patent misuse cases, not copyright misuse cases, so a court would need to adapt the precedent. But these cases strike me as closely analogous.

It is a matter of interpretation as to whether the existence of misuse as a defense, based solely upon the text of the license itself, and not any other conduct, is a flaw in the license. I see that opinion, but just don't agree.

Impracticability is only a defense to unanticipated changes in circumstance.

This issue has been raised in two ways. First, impracticability is not generally a defense to terms that were known at the time of formation of the contract. See, for example, this comment from Bartweiss.

Alternatively, engaging in the license without an intent to comply would constitute unclean hands. See, for example, this comment from Pamela Chestek:

On 10/18/18 8:53 PM, VanL wrote:

the entire purpose of the SSPL is to prevent competition to MongoDB by copies that would otherwise be lawful ...
Van, this is where you're losing me. What are the "lawful copies"? If the licensee hasn't complied with the terms of the license, paragraph 13 in particular, then they don't have lawful copies. You point seems circular to me.

If you're saying that paragraph 13 would not be construed as a condition, then you're in contract territory - and I do agree with that your impossibility argument will often be true. But then query whether the licensee should be taking the license if they know they can't comply. Wouldn't there a counterclaim for that? Fraudulent misrepresentation?

Let's think about the context where this would come up: A party ("Service") takes the SSPL'd MongoDB and implements a service. Service releases some code based on a good faith interpretation of the scope of the release necessary. There is a dispute between MongoDB and Service as to the scope of the necessary code release.

In the ensuing lawsuit, Service raises misuse and argues that the scope is ambiguous. Leaving aside the misuse argument, a court could either a) find for Service, thus restricting the scope of the code to be delivered, or b) find for MongoDB, thus giving rise to an immediate defense of frustration/impracticability, which would undo the contract. The entry of judicial orders can be the intervening event that renders a contract impracticable. (See, e.g. Hicks v. U.S., 89 Fed. Cl. 243 (2009), and see generally Restatement Second, Contracts § 261). However, there would be a good counterargument that the issue was foreseeable, making it less likely that the court would grant the impracticability argument.

So a fair point.

Of course, that doesn't completely solve the problem - as written, this seems to me to be actually impossible to comply with. At that point, it gets into a fight about remedies.

Turning to compliance:

The SSPL doesn't actually require the release of all code, just enough code to replicate the service.

From this comment by metheus:

To reiterate those comments, the SSPL only affects people who are offering the licensed software to the public as a service. This does not include any software that uses MongoDB as a component, even if it's a commercial SaaS offering itself. The FAQ we put out here makes that clear: https://www.mongodb.com/licensing/server-side-public-license.... 99.999% of MongoDB users are not affected by this license change.

People have expressed concerns that the 1) the FAQ is not the license, and 2) the language of the license does not make the intended responsibility clear enough. But it was drafted with that intention (and reviewed by outside counsel, with an eye towards being explicit without giving bad actors loopholes to exploit). Nonetheless, addressing those concerns is extremely important to us. This exact issue is being discussed on the OSI license approval mailing list, and we are considering very seriously all of the feedback.

The article anchoring this thread contains a lengthy discussion of copyright misuse and of impracticability. Those are also the subjects of discussion on the OSI mailing list, where Heather Meeker, writing on MongoDB's behalf, refutes claims that are similar to those made in the article. In particular, the SSPL is not trying to make people release substrate infrastructure, or adjacent tooling, under the SSPL. Consider the last line of section 13: "...all such that a user could run an instance of the service using the Service Source Code you make available." This means that as long as the Service Source Code you release is enough for anyone to run the service, you've fulfilled your obligation. As an example, you would not have to somehow be able to offer CircleCI under the SSPL (an impossibility), as long as your tooling that orchestrates its use is public, because anyone can use CircleCI.

It's our hope that these discussions will lead to an accepted understanding of the actual obligations of the SSPL. The only people we want to be in any way affected by it are those who are literally offering the licensed software as a service, and we want those people to release their management stack under the SSPL. Thanks for helping us with that.

This is a reasonable position. I would construe it as "You only have to release source code to the extent that you are the copyright owner of that source code." But I don't read that as being what the license actually says.

Assume for a moment that the SSPL is updated to state the rule as expressed above. This mostly solves the impracticability issue.

So, would this be enforceable? Perhaps. The misuse issues remain (intentionally - see the intended consequence being the release of "adjacent tooling" or the "management stack"), but let's assume that this will be enforced purely under contract, not copyright.

The five basic remedies for breach of contract are money damages, restitution, rescission, reformation, and specific performance. So what would MongoDB get? The best outcome for MongoDB would be the lost license revenue (given the existence of a commercial license option) or possibly specific performance of the code release.

But even on further analysis I just can't help coming around again to the misuse issue. As I read the cases, the sine qua non of misuse is the use of a copyright grant to exercise control of works outside the scope of the copyright. This is an intended outcome of the SSPL. Thus my earlier conclusion, "an almost textbook case of copyright misuse." This ends up being such a substantial issue because the existence of copyright misuse has significant implications for contract formation.

MongoDB's Server Side Public License is likely unenforceable

Update: A few more thoughts on the SSPL in response to some counterarguments that have been raised.


A few days ago, MongoDB, Inc changed the license of its widely-used database software to the "Server Side Public License" or SSPL. They also submitted the SSPL for review by the Open Source Initiative, where it is under active discussion.

The key concept in the SSPL is that it is a "super AGPL," designed to make it difficult for commercial entities to build services around the underlying software. This goal was made explicit by Eliot Horowitz, the CTO of MongoDB:

Today, Affero GPL 3.0 uses the broadest scope of copyleft, among the commonly used open source licenses. MongoDB has been making its database software available under AGPL for many years now. AGPL was written to close the “SaaS loophole” by requiring those offering software as a service to make source code available. However, for some kinds of software that is popular for cloud deployment, AGPL has not resulted in sufficient legal incentives for some of the largest users of infrastructure software, such as international cloud providers, to participate in the community. Many open source developers are struggling with a similar reality, and some of our competitors have moved to proprietary licensing models. The alternative, to be blunt, is for us to be that last standing unpaid open source database developer for cloud providers, who sell access to our software for significant fees, but may not adequately contribute back to our community. Faced with the choice of moving to a proprietary model by applying licensing restrictions to our software, we prefer instead to continue using the copyleft model to create a workable incentive for cloud providers to share with the rest of the community.

The Remote Network Interaction provision of AGPL has not provided enough incentive to change the behavior of cloud providers for several reasons:

  • It is not clear that it extends to software that controls the functionality of the database software, such as management, automation, monitoring, storage and hosting software.

  • It only applies if the software is modified, and the definition of a modification references back to copyright principles that are not settled law.

We have addressed each of these concerns in the Server Side Public License, by (i) clarifying that the copyleft obligation applies to those who make the functionality of the software available to third parties, (ii) expressly including management, automation, monitoring, storage and hosting software that is integrated with the functionality of the database software, and (iii) removing the modification requirement.

Most of the discussion so far has focused on the broader community context as well as the overall desirability of having a super-AGPL to assist in certain types of monetization.

All of this misses the key point that the license is likely unenforceable.

The Doctrine of Copyright Misuse

"Copyright misuse" is an affirmative defense to copyright infringement that has developed over the past thirty years or so. For those who may not be focused on this issue, this is the copyright version of patent misuse. It is most often associated with fraud on the copyright office, but it also has a significant set of precedents associated with using copyright to expand the scope of control of licensee behavior beyond the bounds of the copyrighted work.

The key cases for this strand of copyright misuse are Lasercomb America, Inc. v. Reynolds, 911 F.2d 970 (4th Cir. 1990), DSC Communications Corp. v. DGI Technologies, Inc., 81 F.3d 597 (5th Cir. 1996), and probably Practice Management Info. Corp. v. American Medical Assoc., 97 Daily Journal D.A.R. 10221 (9th Cir. 1997) because it marks the adoption of copyright misuse as a doctrine in the 9th Circuit where MongoDB is located. Update: Someone pointed out that MongoDB is headquartered in New York. That's what you get for assuming. In which case, see CBS v. American Soc'y of Composers, 562 F.2d 130 (2d Cir. 1977), upholding a judgment asserting misuse and antitrust violations. End update.

Turning to the text of the SSPL, this is the most directly problematic clause:

“Service Source Code” means the Corresponding Source for the Program or the modified version, and the Corresponding Source for all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available.

This clause is designed to sweep in and force the licensing and disclosure of code that is not the same "work" as MongoDB. But, quoting from Lasercomb:

We are of the view, however, that since copyright and patent law serve parallel public interests, a "misuse" defense should apply to infringement actions brought to vindicate either right.... Both patent law and copyright law seek to increase the store of human knowledge and arts by rewarding inventors and authors with the exclusive rights to their works for a limited time. At the same time, the granted monopoly power does not extend to property not covered by the patent or copyright.

Thus, we are persuaded that the rationale of Morton Salt in establishing the misuse defense applies to copyrights. In the passage from Morton Salt quoted above, the phraseology adapts easily to a copyright context:

The grant to the [author] of the special privilege of a [copyright] carries out a public policy adopted by the Constitution and laws of the United States, "to promote the Progress of Science and useful Arts, by securing for limited Times to [Authors] ... the exclusive Right ..." to their ["original" works]. But the public policy which includes [original works] within the granted monopoly excludes from it all that is not embraced in the [original expression]. It equally forbids the use of the [copyright] to secure an exclusive right or limited monopoly not granted by the [Copyright] Office and which it is contrary to public policy to grant. (Lasercomb at 976-977, internal citations omitted.)

Specifically, there are two broad areas of concern:

  1. Use of a copyright or patent to exercise exclusive rights beyond the scope of the government grant. As stated by one court: "Misuse of copyright applies where the copyright owner tries to extend the copyright beyond its intended reach, thereby augmenting the physical scope of copyright protection. It typically arises in situations where it is alleged that the copyright owner projected his unique rights in a work onto other, unrelated products or services." (Religious Tech. Ctr. v. Lerma, 1996 U.S. Dist. LEXIS 15454, 1578-1579).

I think it is beyond dispute that MongoDB is trying to exercise control beyond the scope of the copyrighted work. The question is whether this would implicate the exclusive rights of the MongoDB licensee (the party running the service). In this case, the SaaS provider is likely a copyright holder of non-derivative-work software used to provide the service. As such, the SaaS provider has the exclusive right to control the copying/distribution and overall licensing of its non-derivative-work software. Forcing the licensing and distribution of the non-derivative-work software is "projecting [MongoDB's] unique rights in a work onto other, unrelated products or services."

  2. The use of a copyright or patent to restrict competition (even if it doesn't rise to the level of an antitrust issue). As described above, the entire purpose of the SSPL is to prevent competition to MongoDB from entities lawfully copying MongoDB's source code.

This is a big one. MongoDB is trying to use the SSPL to make certain types of businesses uneconomical, because those types of businesses are substitutes for MongoDB licenses in the market. This is specifically called out as being against public policy in Lasercomb (quoting Compton v. Metal Products, Inc., 453 F.2d 38 (4th Cir. 1971)):

"The need of [Metal Products] to protect its investment does not outweigh the public's right under our system to expect competition and the benefits which flow therefrom, and the total withdrawal of Compton from the mining machine business . . . everywhere in the world for a period of 20 years unreasonably lessens the competition which the public has a right to expect, and constitutes misuse of the patents." (Lasercomb at 979).

As a practical matter, all this means that the second that MongoDB tries to enforce the SSPL, it is likely to meet with a challenge that goes to the enforceability of the license itself, and not to the scope of the work. Further, if copyright misuse is proven, MongoDB will be prevented from enforcing its copyright against any party until it has purged the misuse by abandoning the SSPL and proven that any anticompetitive effects have dissipated. (Id. at 979, see comparison with Morton Salt).

Impracticability

Another issue is impracticability (sometimes called commercial frustration). The AGPL can already be problematic in practice, making it so that many companies completely avoid AGPL software. This mirrors the advice I usually give my clients as well.

The reason why isn't just - or even primarily - because the AGPL is designed to plug the "ASP loophole" and enforce reciprocal licensing on server-side code. The problem is that the AGPL moves the time when compliance must take place from the time of distribution - a discrete, controllable event - to the time when someone accesses the software over the network. It is extremely difficult in an enterprise situation to build an ongoing compliance framework that properly takes this indeterminacy into account.

The SSPL inherits this weakness of the AGPL, and goes further, making compliance impossible. The SSPL states:

Offering the Program as a Service.

If you make the functionality of the Program or a modified version available to third parties as a service, you must make the Service Source Code available via network download to everyone at no charge, under the terms of this License. Making the functionality of the Program or modified version available to third parties as a service includes, without limitation, enabling third parties to interact with the functionality of the Program or modified version remotely through a computer network, offering a service the value of which entirely or primarily derives from the value of the Program or modified version, or offering a service that accomplishes for users the primary purpose of the Software or modified version. (emphasis added)

Let's assume for a moment that I intended to run a service and completely comply with the terms of the SSPL. Let's look again at the definition of the "Service Source Code":

“Service Source Code” means the Corresponding Source for the Program or the modified version, and the Corresponding Source for all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available. (emphasis added)

This is amazingly broad. For example, this would include deployment scripts and software (e.g., Ansible, Salt, and my scripts) - but I don't own the copyrights to that software, and I cannot release it "under the terms of [the SSPL]."

Let's assume that it is ok somehow to pass forward other open source software, solving that problem. What about my continuous integration software (e.g. CircleCI), or my business backup software (e.g. Jungle Disk) or my code hosting service (e.g. Github)? There is no logical bound to this license. Taken on its face, I would theoretically be bound to release the internal source code of services from third parties that I included in or relied upon to deliver my service.

For anyone thinking that construction of the SSPL as a contract rather than a license saves it, impracticability is a defense under contract.

Update: A couple people have pointed out that impracticability as a defense is based upon a changed circumstance. This is a fair point. See the follow-on post with some responses, including the response that I briefly put here. End update.

Thus, assuming that MongoDB was able to successfully argue past the copyright misuse defense, an accused infringer would then, quite rightly, plead impracticability (or frustration) - with the likely result that MongoDB would end up getting approximately the same amount of code that they would receive anyway under the AGPL.

Reduced patent valuation resulting in securities fraud?

I was looking around and saw the following article: "Patent Owners Face Increased Fraud Liability Risk." The core question posed by the article is simple: If the patent portfolio is a significant part of a company's value, do changes in patent law change company valuations, and can business leaders incur fraud liability by not disclosing those changes?

Individual patent decisions clearly affect company value. Business leaders regularly report on the results of significant litigation or note as a risk the expiration of key patents in a portfolio. The question is whether generalized risks to a portfolio such as recent Supreme Court decisions (e.g. Alice, Myriad, Mayo) or recent legislation (e.g. AIA) create a particularized reduction in value for a company.

I am betting that the answer would probably be "no," but it is just a matter of time before an enterprising attorney decides to try this theory out, and generalized changes to patent law become part of the litany of risks recited by some companies.