(posted by Peter Hirtle)
Lorcan Dempsey has an interesting blog posting about new publishing ventures in libraries. He mentions in passing an article in the Economist discussing the Google Book Settlement that opens with an example of two "orphan works." One is The Appalachian frontier: America's first surge westward by John Anthony Caruso; Lorcan found three editions of this in WorldCat. The other work was Blunder out of China. There are two works with that title in WorldCat: one by Geraldine Townsend Fitch and Theodore H. White published in 1946, and one by Walter Henry Judd published by GPO in 1947.
The inclusion of these two examples as "orphan works" in the Economist article illustrates a number of interesting points. First, it shows how sloppy some people have been when they have referred to "orphan works" and the Google Book Settlement. Neither of the titles that the Economist cites could be considered to be an "orphan work." Second, it shows that metadata problems aren't confined to Google, and how difficult it is to determine copyright status based on electronic records alone. Finally, it is a good example of how the Google Book Settlement is the only way that the full text of most books can be made available.
If one checks the Copyright Office's database of registrations and renewals, there is no evidence that copyright in the 1959 edition of the Caruso book was renewed in its 28th year, which puts the text clearly in the public domain. There is a copyright registration for the new matter added to the 2003 edition, and the claimant for that copyrighted matter is the University of Tennessee Press. The press can be found with relative ease, so you can't claim that the 2003 edition is an orphan work.
The Blunder out of China example is somewhat more complicated since we don't know which edition Paul Courant had in his office when he was interviewed for the article. The Economist doesn't help much. It refers to the book as a "seminal text," but I wouldn't describe a pamphlet implying that Theodore H. White was a communist as fitting that description. For copyright purposes, though, it doesn't matter which of the two books in WorldCat was digitized since they are not different works. The Judd work is an offprint of the Congressional Record of 26 July 1947, where Judd reprinted Fitch's pamphlet in its entirety. (The difference in page count between the two editions is due to the small type that the Congressional Record used for congressional "extensions of remarks.") There is only one copyrighted work: Fitch's pamphlet.
So let's look at the metadata problems. First, the WorldCat record gives Theodore H. White as an author. Authorship can be important in determining copyright status. White was not the author of the pamphlet, however, but rather the subject. The pamphlet is an attack on his book, Thunder out of China, which was published in 1946. Second, the WorldCat record gives a publication date of 1946. Publication date can be important when checking renewal records. I have not been able to confirm the date of publication of the pamphlet, but the Library of Congress record has [1947?]. Cornell's copy was received on 26 May 1947 according to a penciled notation in it, and the fact that a Congressman felt the pamphlet was important enough to include in its entirety in the Congressional Record in late July, 1947, suggests to me that the publication date was 1947, not 1946. (If someone really cared about this, they might want to check Judd's files on the republication of the work found in his papers at the Minnesota Historical Society.)
Now on to the physical book. A check of the volume shows that it carries no copyright notice, which would have injected it into the public domain as soon as it was published. Furthermore, there is no registration record for the pamphlet in 1946, 1947, or 1948, nor is there a renewal record for it, further confirming that it entered the public domain when it was published.
The one last issue would be whether copyright could have been restored. For that, we need to know where the item was published and the nationality and domicile of the author. Geraldine Fitch was, I believe, an American married to a missionary and living in China at the time the pamphlet was printed, so there is the possibility that copyright in it could have been restored. The pamphlet, though, states clearly that it was "printed in the United States," so copyright restoration is not an option. This work, too, is in the public domain.
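The checks walked through above can be summarized as a rough sketch. It is only an illustration of the reasoning in this post, not legal advice or a complete test. It assumes a work from the period when notice and renewal were required, and it looks only at the place of publication when considering restoration (a fuller check would also need the author's nationality and domicile). All names are hypothetical, and the sketch assumes the 1959 Caruso edition carried a notice, which the post does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkFacts:
    """Facts gathered by hand; every field name here is hypothetical."""
    has_copyright_notice: bool   # from inspecting the physical volume
    renewal_record_found: bool   # from the Copyright Office records
    published_in_us: bool        # e.g. "printed in the United States"


def public_domain_reason(facts: WorkFacts) -> Optional[str]:
    """Return why the work is in the public domain, or None if unresolved."""
    if not facts.published_in_us:
        # Restoration may apply; the author's nationality and domicile
        # would need research, which this sketch does not attempt.
        return None
    if not facts.has_copyright_notice:
        return "published without a copyright notice"
    if not facts.renewal_record_found:
        return "copyright not renewed in its 28th year"
    return None  # notice present and renewal found: presume still protected


# Fitch's pamphlet: no notice, no renewal record, printed in the United States.
pamphlet = WorkFacts(has_copyright_notice=False,
                     renewal_record_found=False,
                     published_in_us=True)

# 1959 Caruso edition: the notice is ASSUMED present for illustration only.
caruso_1959 = WorkFacts(has_copyright_notice=True,
                        renewal_record_found=False,
                        published_in_us=True)

print(public_domain_reason(pamphlet))
print(public_domain_reason(caruso_1959))
```

Even in this simplified form, each field has to be filled in by inspecting the volume or searching the records, which is where the cost lies.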
Why does this experiment suggest that the Google Book Settlement is needed? Look at the effort it took to establish that these two works were in the public domain. The same effort would have been required to determine that the books were still protected by copyright but out of print. One would then have to make a reasonable effort to locate the copyright owner before the work could be considered to be an orphan work. (Only Brewster Kahle, as far as I can tell, believes that an out-of-print work is an orphan work.) The expense of doing this would be horrendous.
If we want to make the full text of in-copyright but out-of-print books available without massively altering copyright law, there is only one cost-effective option: the settlement.
Eric, I would love to see copyright law changed. My personal preference would be to have everyone admit that joining Berne was a mistake, and that we need a copyright system based on formalities and the active exploitation of copyrighted works. Works that failed to meet those requirements would become part of the commons that we are all free to draw on.
I don't think for a minute, however, that Congress would ever consider such an option. All we have seen for the past 30 years is ever-greater control over copyrighted works given to copyright owners. The fact that simple orphan works legislation (which would still have required expensive copyright investigations) couldn't pass is a sign that we could never get a legislative response to the problem Google tried to address. And orphan works legislation would not affect the bulk of the out-of-print books included in the GBS that are not orphans, but merely out of print.
I should add, too, that there are lots of things that I don't like about the settlement and that I hope the court can fix. The idea that royalties from works whose authors have not elected to participate in the Registry will be distributed to others strikes me as particularly unfair. I wish that Google had included its library partners in the negotiations with the authors and publishers; the libraries might have been able to highlight those elements of the settlement that were not consumer-friendly. I hope as well that critics of the settlement will push Congress on both the idea of a compulsory license and orphan works legislation. Together, they might obviate the need for the settlement. (And Congress has in the past changed copyright law to overturn judicial decisions with which it did not agree.) But until we get a compulsory license (and see what sort of costs are associated with it), I don't see any other way that the full text of out-of-print books can be made available. The settlement is far from perfect, but it is better than any other option that is likely to be implemented.
Posted by: Peter Hirtle | September 14, 2009 at 07:20 AM
Peter,
With respect, I don't see how your argument supports the two implications made in your final sentence: that we don't want a legislative solution to orphan works; and that no other, equally (or more) cost-effective solution is possible.
It is certainly true that putting the full text of all such works online would require cost-prohibitive copyright research if there were not legislation or *a* settlement. But that does not mean that *this* settlement is the only theoretical, or even practical, possibility. The negotiators *could* have come up with settlement language that would have assuaged many of the concerns (though probably not the anti-trust charges), but they did not have an interest in doing so. When I hear the two parties (Google, the AG) pushing this as though it is the only option, it starts to sound like a hustle to me. There are always other options, even if the litigants don't want to try and find them.
Besides, I disagree with the notion that a legislative solution is either not desirable or not possible. For my part, I would greatly favor "massively altering the copyright law" over moving an entire class of works to a license regime, especially one in which neither of the negotiating parties has the public's interest at heart. Legislation would allow anyone, not just Google, to benefit from the availability of orphan works, and would remove all of the anti-trust concerns and many of the privacy concerns. I will grant that legislative solutions have been tried, and have failed so far, but it does not follow that legislation is impossible (could the settlement language itself be ported to legislation?). And it seems to me that legislation is far more likely to have support when big players like Google are frustrated in their efforts.
My two cents anyway.
Eric Harbeson
Posted by: Eric | September 12, 2009 at 03:24 PM
Thanks for this great analysis.
My impression was that the issue was that the settlement gives Google, de facto, the exclusive ability to earn income based on the distribution of materials they have scanned, many of which appear to be orphaned works.
Google's approach is to assume that every work they happen to have scanned falls under the settlement, unless there is an obvious reason it doesn't. It is only to their advantage that they act this way.
It is a bit like not acting on internet copyright violations until the copyright holder complains - sure, it is within the law, but it is sloppy ethics. Except in this case, we're waiting to hear whether what Google proposes is within the law or not.
I don't see your argument that establishing copyright status is too expensive. That's exactly what people said about scanning books in the first place, and look at us now.
Posted by: caleb | September 09, 2009 at 03:29 PM