Friday, June 6, 2014

You Got a License for that Thing?

Hello everyone! First an introduction and disclaimer: I have no credentials yet as an author of open textbooks of the sort that most people here are writing, nor have I yet even used one of them. But I was one of the principal authors of the homotopy type theory book, which is sort of a cross between a research monograph and a graduate textbook, and was developed as a group project on GitHub and released freely.

In this post, I'd like to start a conversation about licensing, beginning with an overview for those who may be new to the options. With good reason, most open textbooks seem to be published under a Creative Commons license. The most permissive CC license is CC BY, which basically allows others to do as they wish with a work as long as they give credit to the original author. The other CC licenses add further restrictions, which can be chosen from the following:
  • ShareAlike (SA), which requires anyone who modifies a work to release the modified version under the same license terms. This is roughly comparable to the "copyleft" provisions in open-source licenses such as the GNU GPL.
  • NoDerivs (ND), which allows redistribution of a work (with credit) but prohibits any modifications.
  • NonCommercial (NC), which prohibits use of a work or its derivatives for commercial purposes.
Of course, it doesn't make sense to combine SA with ND, but other than that, any combination of the above is available.

Now when choosing between the available licenses, of course we ought to think about our goals. Here are some that seem relevant to open textbooks:
  • We want our textbooks to be available for free (as in beer). This can be accomplished with any CC license.
  • We want to get credit for our work. This can also be accomplished with any CC license, since they all include BY.
  • We want to avoid the need to reinvent the wheel. E.g. it's ridiculous for every author of a new calculus textbook to have to come up with all new exercises since all exercises in existing books are under copyright. This argues against the ND clause: we should allow future textbook authors to incorporate our work into theirs. I haven't seen any open textbooks that use ND, so there seems to be fairly wide consensus on this point.
  • We want to break the grip of expensive conventional publishers on the textbook market. Making good textbooks available for free, and avoiding wheel-reinvention, are good starts on this. But an additional worry we might have is that since we're making our work available for free, a commercial publisher might take what we've done, modify and "improve" it a bit, and then sell the result at an exorbitant price. How can we prevent that with a license? We could choose an NC license, preventing anyone from using our work for commercial purposes. Or we could choose an SA license, ensuring that anyone who modifies our work must also make their modified version freely available. Or we could combine the two.
  • We want to spread the philosophy of openness. This argues for SA: we should ask anyone who modifies our work to also join the "open" community.
These goals seem to argue for using either BY-SA or BY-SA-NC. Now, perhaps surprisingly, these two licenses are completely incompatible with each other. If Alice licenses her book under BY-SA, then any derived work must also be licensed under BY-SA; and in particular, it cannot be licensed under BY-SA-NC. This makes sense: Alice doesn't want derived works to add more restrictions on the use of the material than she did, and a prohibition on commercial use is an additional restriction. Similarly, if Bob licenses his book under BY-SA-NC, then any derived work must also be licensed under BY-SA-NC; and in particular, it cannot be licensed under BY-SA. This also makes sense: Bob doesn't want his work to be used for commercial purposes, so he doesn't want any derived work to be used for commercial purposes either, whereas BY-SA would permit that.

However, this means that if Eve wants to create a derived work incorporating elements from both Alice's book and Bob's book, she is stuck: there is no license that she can give to it. Therefore, in the interests of interoperability and avoiding reinvention of the wheel, it might be in the interests of the open math book community to establish conventions about which of these licenses is preferable. In other words, should commercial use of open math books be allowed?
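Eve's predicament can be checked mechanically. The following is a minimal sketch under a deliberately simplified model of the rules described above, not a statement of the legal text: a license is a set of clauses, an ND work allows no derivatives, an SA work forces derivatives to carry exactly the same license, and any other work only requires that its restrictions be kept. All function names here are made up for illustration.

```python
from itertools import combinations

OPTIONAL_CLAUSES = ("SA", "ND", "NC")

def cc(name):
    """Turn a name like 'BY-SA-NC' into a set of clauses (illustrative helper)."""
    return frozenset(name.split("-"))

def name(license_):
    """Render a set of clauses in the order this post uses, e.g. 'BY-SA-NC'."""
    return "-".join(c for c in ("BY", "SA", "ND", "NC") if c in license_)

def all_licenses():
    """Every combination of optional clauses on top of BY, except SA with ND."""
    licenses = []
    for r in range(len(OPTIONAL_CLAUSES) + 1):
        for extra in combinations(OPTIONAL_CLAUSES, r):
            if "SA" in extra and "ND" in extra:
                continue  # ShareAlike is meaningless if no derivatives exist
            licenses.append(frozenset(("BY",) + extra))
    return licenses

def can_relicense(source, derived):
    """Simplified rule: may a derivative of `source` be released as `derived`?"""
    if "ND" in source:
        return False              # no derivatives at all
    if "SA" in source:
        return derived == source  # must carry exactly the same license
    return source <= derived      # must keep every existing restriction

def licenses_for_remix(*sources):
    """Licenses a work derived from all of `sources` could carry."""
    return [d for d in all_licenses()
            if all(can_relicense(s, d) for s in sources)]

alice, bob = cc("BY-SA"), cc("BY-SA-NC")
print(len(all_licenses()))                                   # → 6
print([name(l) for l in licenses_for_remix(alice, bob)])     # → []
print([name(l) for l in licenses_for_remix(cc("BY"), alice)])  # → ['BY-SA']
```

Under these assumptions the remix of Alice's and Bob's books has no admissible license, whereas a plain BY source mixes with either of them without trouble.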

I've noticed that a number of prominent open textbooks use BY-SA-NC. However, there are good arguments to be made in favor of BY-SA instead. The web site Freedom Defined has a very detailed writeup, which I recommend reading. I'll just mention briefly a few of their points:
  • The basic free software licenses, such as the GPL, permit commercial use, and for good reasons.
  • Because BY-SA-NC is incompatible with BY-SA, it makes your work incompatible with anything that uses the latter license. For instance, you can't incorporate content from Wikipedia.
  • SA alone is sufficient to prevent exploitation of our work by evil publishers, since anyone who makes a derived version and sells it must also make it freely available under BY-SA.
  • NC prevents beneficial commercial uses. For instance, if someone in the developing world wants to print copies of our textbook and sell them at a small profit to people without Internet access, NC prevents it.
I do recommend reading the whole thing. Then let's discuss in the comments. Is there a good reason for choosing NC?

Michael Shulman
Department of Mathematics and Computer Science
University of San Diego


  1. This comment has been removed by the author.

    1. (Don't put dollar signs in your posts if you aren't using them for MathJax!)

  2. Thanks for the thoughtful post, Michael. I have always found the NC choice odd, and have argued against it when asked for advice about licenses.

    I surmise that most authors do not like the idea that others (publishers, evil or otherwise) could make money on their work. But how much can they make? The print-on-demand version of my linear algebra book is priced so that I receive 5 dollars per copy, which helps with my costs (ISBN number, promotional copies, etc). And I am very up-front about this. But you could go through the trouble of arranging something similar and undercut my price by 1 dollar. And then someone else could undercut you. I guess the economists would argue that the price will be driven to the point where the profit is zero, or to where the profit equals the cost of the fiddle to manage the business arrangements.

    Lon Mitchell first made Tom Judson's abstract algebra textbook available in print with little or no advance communication with the author. Of course, this was not required since the book has a GFDL license. I considered it a service that I was able to announce on the first day of the class that hardback copies were available for $20 on, with no effort by Tom or me. I have seen Jim Hefferon's book for sale on Lulu at what looked like a profit, and seemingly not sponsored by Jim. I'm not seeing that particular case anymore, but this might be similar:

    I see open licenses as less about remixing and more about promising availability. If the source is free to copy and store indefinitely, then the book never goes out-of-print, nor can a conversion to traditional copyright prevent future use. So I see an open license as a promise to teachers - you can build a course around this text and it will not go away. There are examples of "free-to-download" books that have subsequently been placed under more restrictive terms, such as:

    I think it is a lot harder to remix textbook content than software code or music. Have you ever tried to wrangle somebody else's TeX macros into your work? Or merge somebody else's choices for notation?

    I have been very comfortable with my choice of a license that allows commercial use. If somebody finds a way to make my work more broadly available, and it requires a commercial approach to be viable, then I am all for it. I might even help. But so far, that hasn't happened.

    1. Thanks for your comments! You have a very good point that "free as in beer" right now isn't enough; we need a promise that the content will always stay free, which CC licenses also provide.

      Have you ever tried to wrangle somebody else's TeX macros into your work? Or merge somebody else's choices for notation?

      That's another very good point, and worth a whole separate blog post. In the HoTT book we made a strong push for using only semantic macros, i.e. if you want to use some notation for some concept, define a macro named by the concept and use only that macro. Then if you, or someone else, later want to use a different notation for the same concept, it's easy to change. We did this mainly because we had a hard time coming to consensus on notation, and we didn't want to postpone writing the whole book until we reached a decision; but it would serve equally well to make a textbook "remixable" into another book by matching the notation. If everyone used only semantic macros, then you could paste someone else's TeX code into yours anywhere you want, and just add a few lines to your preamble redefining their macros in terms of yours.
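
      A minimal sketch of what those few preamble lines might look like. The macro names \compose and \comp are hypothetical, chosen only for illustration, and are not taken from the HoTT book's actual source: one author writes composition with a circle, the other prefers diagrammatic order, and a single redefinition makes the pasted text follow the second convention.

      ```latex
      \documentclass{article}
      \usepackage{amsmath}

      % Hypothetical semantic macro from the borrowed book: it is named for
      % the concept (composition), not for the symbol used to typeset it.
      \newcommand{\compose}[2]{#1 \circ #2}

      % Hypothetical macro from your own book for the same concept,
      % typeset in diagrammatic order instead.
      \newcommand{\comp}[2]{#2 \mathbin{;} #1}

      % The bridge: redefine their macro in terms of yours, so pasted
      % text picks up your notation without being edited.
      \renewcommand{\compose}[2]{\comp{#1}{#2}}

      \begin{document}
      Pasted from the other book: $\compose{g}{f}$ now appears as $f ; g$.
      \end{document}
      ```

      This only works because the pasted text never writes \circ directly; a single hard-coded symbol in the borrowed source would have to be found and fixed by hand.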

  3. Perhaps I'm not reading the CC licenses correctly, but do any of them require the textbook's source code (LaTeX in most cases) to be made available? The CC-BY-SA license, for example, doesn't seem to say anything about that. Again, I could be wrong.

    That's why I (and many other authors) have chosen to use the GNU Free Documentation License (GFDL), which does include some stipulations about availability of the textbook source code (called a "Transparent copy" of the book). If someone were to try to take my books and make money off them (e.g. by selling more than 100 copies), then they'd have to make the source code available under the GFDL. While that might not deter such people, at least the public would in theory be able to take the source code and make their own copy, for free. That's my understanding of the GFDL, anyway.

    I have received many requests over the years from people who are presumably instructors to let them adapt my books to their specific needs, and I always give them my permission - which they really don't need as long as they follow the terms of the GFDL (which I make sure to remind them of). So far there haven't been any violations of the GFDL, from what I've been able to determine. I think the bigger reason is that my book and its source code are and always will be freely available, which undercuts any attempted scams. Even the printed versions I've set up at undercut anyone else trying to sell my books at a higher price, since I've set the prices at the bare minimum.

    1. That's an excellent point, and one which I hadn't even considered. I do think that CC BY-SA ought to have the same effect on commercial use described in your second paragraph: even if it doesn't require distribution of the source code, it requires distribution of whatever "derivative work" is created and released, so that people would be able to get copies of the published version freely. The only issue is that without the source code, they might not be able to incorporate whatever modifications were made by the publisher into a "second derivative" work.

      I'm not sure how important I think that is. As Rob said, it's pretty hard to use someone else's LaTeX in the best of situations, and an otherwise unfriendly commercial publisher is unlikely to do anything very helpful with their source code like using semantic macros.

      One question I have about the GFDL: is it possible to apply it to a work for which no "Transparent" version exists? For instance, I know of some cases in which authors of older out-of-print textbooks have chosen to make them available under a free license, but the only version they had available was a scan of the printed version.

      Overall, I would agree that the GFDL is at least as good as BY-SA. Unfortunately, they are also incompatible with each other!

    2. Hmm... for some reason I thought that BY-SA requires derivative works to be freely-as-in-beer available, but that doesn't seem to be the case at all: it would be completely legal for the only copies of a BY-SA work made available by the creator to be physical printed ones sold commercially. (Right? I'd love to be wrong about this...) Of course, once you had such a copy, you could do anything you like with it, such as scanning it, posting the scans online, or even putting them on Lulu at cost.

      But that does mean that SA (with or without NC) is pretty toothless as far as actually ensuring the possibility of second derivatives. I'm feeling pretty disappointed in CC about this; it seems like a fairly basic requirement that if you're going to be modifying something, you need to be able to start from an editable version of it. This doesn't make sense for all "content", but it clearly makes sense much more widely than for textbooks and software documentation.

      Apparently one drawback of the GFDL is that it requires all printed copies (even small excerpts) to contain the entire text of the license, rather than merely a reference to where it can be found. Perhaps not such a huge issue for textbooks, but maybe if you are copying and handing out separate chapters...

    3. I agree with all of the above---and have been using the GFDL for many years now.

      I do not know if still does it for new books, but, inasmuch as I gave up *my* royalties, they gave up *their* fees so that my 337-page book still sells for under ten dollars.

      While you can also freely download the pdf from Lulu's site, for the source you have to go to


  4. I have thought about this a fair amount, and I still think BY-SA-NC is the best choice of the CC licenses. I do wish it were possible with a CC license to require sharing source code, but that's not enough to make me switch to the GFDL and dump the NC.

    "The basic free software licenses, such as the GPL, permit commercial use, and for good reasons."

    But books are not the same as software.

    "Because BY-SA-NC is incompatible with BY-SA, it makes your work incompatible with anything that uses the latter license. For instance, you can't incorporate content from Wikipedia."

    Because BY-SA is incompatible with BY-SA-NC, it makes your work incompatible with anything that uses the latter license. For example, I used material from Keisler's calculus book in mine, but I couldn't have done that if my license were BY-SA.

    "SA alone is sufficient to prevent exploitation of our work by evil publishers, since anyone who makes a derived version and sells it must also make it freely available under BY-SA."

    This is not clear to me. Suppose a company can produce a beautiful color version of the Whitman Calculus book and sell it at a reasonable price. Color printing at Lulu is very expensive, but even if they can't undercut that, the company may have enough marketing and distribution clout to make a lot of money. I want to have something to say about that at the least, and perhaps get a share of the profits as well.

    "NC prevents beneficial commercial uses. For instance, if someone in the developing world wants to print copies of our textbook and sell them at a small profit to people without Internet access, NC prevents it."

    Again, not clear. It is always possible to negotiate different terms with the authors. Just because I distribute under BY-SA-NC doesn't mean I can't grant a commercial license to someone, as I understand it.

    1. David, how do you feel about the GFDL? My understanding is that not only does it require distribution of source code ("Transparent Copies") with any mass publication run, but with it you can require that certain "Invariant Text" be left alone and/or that certain "Cover Text" appear on the front or back cover, in any modified version. So one could, for instance, include an Invariant Text and/or Cover Text saying something like "This textbook is available for free at (URL), and if you paid a lot of money for it from a commercial publisher you probably got ripped off and none of the money went to the author."

    2. I don't think that addresses the commercial color version scenario. There is clearly still a big market for printed calculus books in university bookstores. Maybe that will diminish, but I think it will be around for a while.

      I should say that I think it's very unlikely that anyone will want to launch a commercial venture based on the book, but unlikely things happen, and I want to have something to say about it.

    3. Can you explain what the problem is that you see with that scenario? I.e. what exactly do you want to get to say about it?

    4. Since I don't think this is likely to happen, I haven't spent a lot of time thinking about what my response would be to a request for a commercial license. Off the top of my head:

      What will it be sold for? Who will make money? How much? Will source be available for any enhancements, like graphics? What formats will the book be available in?

    5. We seem to be talking past each other, so maybe we should let it drop.

  5. Excellent! I'm glad someone showed up to defend NC; now we can have a real conversation.

    First of all, you're absolutely right that the second point I mentioned from FD basically boils down to "BY-SA and BY-SA-NC are incompatible", which by itself is not a reason to choose either one over the other. I think the way they're regarding it as an argument for BY-SA is by saying that choosing NC will cut you off from a particular large corpus of BY-SA material on Wikipedia, for instance.

    You're also absolutely right that an author is free to negotiate alternative license terms with specific users. However, the FD article addresses this point in the same paragraph where they mention the developing world:

    Given cultural, language and access barriers, the common argument of -NC proponents that permitting commercial use on request is sufficient to allow for desirable uses, is at odds with reality.

    I think there's a valid point there. I found a similar argument here, along with a couple of others.

    It seems to me that the main argument for NC is of the sort that you give in "Suppose a company". I need to think about that some more; discovering the gaping hole that SA doesn't require distribution of source code has shaken my opinions up a bit.

  6. Here are a few more links I've been reading:

    The problem is enclosure (scroll down to an imagined worst-case scenario about "Omniplex")

    It's hard to figure out what NonCommercial actually means

    Some guidance from CC on that point

    Another option: dual licensing