An interview with Greg Newby (CEO and Director of Project Gutenberg)
Greg Newby—Director of the Arctic Region Supercomputing Center; an advocate of ethical hacking; and an avid dogsled musher—is also the volunteer CEO and Director of Project Gutenberg, a collection of over 40,000 free eBooks. Project Gutenberg got its start in 1971 when founder Michael S. Hart made the Declaration of Independence available for download on an early computer network. Hart died in 2011, and Newby, who has run the Gutenberg archive for many years, spoke with me in December 2012 about his friendship with Hart, their joint advocacy of “everyone their own library,” and the history and future of Project Gutenberg.
Two minutes before we were scheduled to talk, the doorbell rang at my home in the D.C. suburbs. I opened the door, and a woman, a stranger, standing in the dark with a bag of books, looked behind me and said, “Oh, I see you have enough books.” She walked away, and I locked the door and called Newby at his University of Alaska Fairbanks office.
— Ingrid Satelmajer
I. Down the Rabbit Hole
THE BELIEVER: I thought I would ask you about your early involvement with Project Gutenberg—how’d you become involved?
GREG NEWBY: I became aware of electronic books when I was at SUNY Albany. I was there for my master’s degree, around 1986. Somewhere in that period my buddy emailed me a copy of Alice’s Adventures in Wonderland—the Millennium Fulcrum Edition. So I was aware of electronic books and followed them a little bit; I was involved with digital culture. These were pretty early days, the 1980s, when I was on one of the early computer networks of the day. Then I got my first professor job at the University of Illinois in the Graduate School of Library and Information Science in the fall semester of 1991. And there was a local newspaper article where they interviewed Michael [Hart], a local personality. Someone brought it to my attention and I said, Oh! This is the guy that did Alice in Wonderland! So I called him on the phone and we got together. We became friends, and I got involved very immediately with doing stuff for Gutenberg, then I got more involved over the years and today, I’m essentially the guy. Everything’s landed in my lap.
BLVR: Did you ever type in a text?
GN: I actually did two early ones. The whole book I did was #52, René Descartes’s Discourse on Method, around ’93. I did that book on an Apple computer that actually had been donated by the Apple Company to Project Gutenberg. We managed to fry the hard drive. I also did The Gift of the Magi and a couple of other Christmas stories. Those I typed because they were very short. The Discourse on Method I scanned. I still do scans occasionally, and I still get people sending me physical books, so I’m definitely still involved on every level, but I haven’t done a whole book on my own for a long time. Since we use proofreaders, a lot of people don’t do a whole book anyway. They’re just doing their component of a book, as sort of a crowdsourced thing.
BLVR: Yeah, I know it’s a system where people are being asked to proofread a page a day. When Michael Hart was alive, did you—I assume you saw his famous workspace with the tower of books.
GN: It took about a year to clean it out. It’s unfortunate, really, because in the early days his workspace was pretty functional. Over the years—he was basically a hoarder—things got so cluttered, they were actually dangerous in some places.
BLVR: Is there anything being done with the books that were in that space? Are they being donated to a library or something?
GN: We did send off a bunch to the Internet Archive to be scanned. Those are primarily reference materials: a bunch of Alices, encyclopedias and related items. Those are things that Michael thought should be preserved. He was really big on tracking revisionism, the sort of whitewashing of the past. He insisted on keeping those earlier editions because you can actually see what they said at the time, as opposed to the way that we interpret it years in the future.
BLVR: I’ve seen some of the interviews where he’s talking about not wanting to exclude certain documents out of a kind of editing or censorship impulse—
GN: We were, every once in a while, confronted with the decision about whether to accept or not accept something. But usually there was no decision at all—it was, like, of course we accept it. Of course.
BLVR: Do you still use a lot of old computer technology? I know there was a big fascination, or a very strong alliance, with the plain vanilla ASCII.
GN: Right. I actually have sitting here in my office a computer that doesn’t work, one of the early ’90s servers. We actually ended up buying him a new laptop every couple years because he discovered that laptops were a lot better. The main idea he had is that people shouldn’t need a high-end computer or specialized software just to read a book. And that’s really served us well. But the secondary part of that idea is that the cool new digital format of today may not be with us tomorrow. PDF has endured pretty well, but many other digital formats have come and gone and are not compatible anymore. In the case of some of the old HTML, for example, we do things so differently these days than the way we did them 15, 20 years ago that it looks clownish by comparison.
BLVR: I know that another of Michael’s big goals was to have this whole library transportable on a disc.
GN: That’s something we did in later years which resulted in our CD ISO Maker, which makes a custom disc for you. We always said everyone should have whatever books they want. The idea was they could make their own little collection and carry it around with them.
II. “Hey, books are good.”
BLVR: In one of Michael’s essays, he says himself he often didn’t know who the earliest volunteers were. Texts would essentially just show up. He describes a final entry for a text showing up from Hawaii the night his father died, and it had all this significance because he wanted it done on that day. I imagine he was communicating with people through early online methods and that they were just doing this on their own.
GN: I think the short answer is emailing. This was the 1980s. People were using email to communicate, there were some early bulletin board systems and some early versions of what would eventually become the web. People heard, somehow, of Gutenberg. People were writing magazine articles asking, “What is this computer network thing?” Through a variety of mechanisms people came to Michael’s attention via email.
BLVR: It’s interesting hearing your own story about Alice’s Adventures in Wonderland showing up in your email from your friend. It seems that there was a kind of delight in just being able to share things in this way. I graduated college in ’92 and I keep trying to figure out when I was using email.
GN: When I was studying at Syracuse, there was a professor there who mentioned that, in the U.S., no one really remembers their first telephone call, but most people remember their first email. Now, I think that’s changed. I think a lot of people don’t remember their first email anymore—because it happened as part of the whole flow of the net, and often before you were fully an adult.
BLVR: Can you talk to me a little bit about idealism versus pragmatism with the archive?
GN: Well, when you look at the collection for the first 100 or so, what you’ll see is a lot of opportunism. So it’s like, okay, someone did a book of some sort and they released it on the Internet. The Hitchhiker’s Guide to the Internet was an early one. Someone did a book, it’s in digital form, and they’re willing to provide it to us. Fantastic. We’re not going to ask a lot of questions about whether it’s literature or if it should go in the same collection as Shakespeare. We’re going to say, Hey, books are good. We’re not going to make decisions about what books you should read or give guidance to parents. We’re going to try to collect available literature that can fit on the site. For example, the Human Genome Project in the ’90s. It was really cool but it was a stretch for Gutenberg. I was the one that edited a bunch of those files, and I had to use some pretty hefty systems just to process them. We felt that making it available was important even though it was kind of a stretch, and also not exactly literature.
BLVR: Is copyright still one of the largest pragmatic issues that you deal with? Is volunteer interest an issue?
GN: It’s a more complicated environment that we live in these days. The fact is that Project Gutenberg, with almost 45,000 books, is small potatoes compared to what you can find in Amazon and in Google Books and in the Internet Archive’s etext collection, yet we still have tremendous brand name recognition and a higher level of processing. The ebooks that we produce are just more proofread and better suited to different formats. They also don’t lock you out in various ways like some of the Amazon or Kindle books do. We think there’s definitely a signature quality that Gutenberg adds. Every day I hear from people who just came across Project Gutenberg, who never knew that someone was giving away literature. I kind of feel sorry for those folks because they probably went out and paid twenty bucks for Pride and Prejudice in the Kindle Store.
Michael and I were all about “yes.” And the worst part about people getting upset about formatting, about all caps being used in Twain texts rather than italics, is not that someone says “This is trash, I absolutely won’t read it, it’s ridiculous, no one should be reading, even doing this type of thing.” The worst part is they say, “Therefore, you should prevent other people from reading it. You should remove it,” you know, expunge it from the record. Michael’s and my answer was, “You need to get your own library. We’ll give you your own domain name, something dot Gutenberg dot org, and you can go and host your version.” We were always about yes. We’ll give you the power to do anything you want except to stop other people.
BLVR: I was curious about to what degree you see Project Gutenberg being used as seed work for larger projects.
GN: Project Gutenberg runs many servers. There are also mirrors all over the world. So there’s lots of other places you can get our stuff. And we’re not jazzed by increasing our download counts. We’re not trying to consolidate and say, Okay, we have the most popular something. Getting stuff out there is the key, and the idea that you can use our stuff to make what you would call derivative work is part of the intention. That really does speak to why we like plain text and HTML and other editable formats; those are open for various types of re-use. Unlike, say, a PDF where you can’t copy and paste, which recently happened to me. That’s the sort of annoyance that we’re incredibly against. And in fact, if you look at the small print, it basically says if you want to do stuff like that, then you can’t use the Project Gutenberg name.
BLVR: You can’t do anything that’s restrictive.
GN: Yeah. You can’t create a format that removes some of the rights that we have given you at Project Gutenberg. At least not while crediting us. People still do it, though. There was one guy that has taken some Gutenberg stuff and sold it as for-fee ebooks on Amazon, occasionally leaving our header in. People come and complain to me and they say, Hey, I just bought this and it says that it’s free. What’s up? And we have to explain that there are some unscrupulous people out there.
III. Michael Won
BLVR: I’m wondering what you see as the biggest threat to Project Gutenberg, and I find myself thinking about things like SAT reading scores falling to their lowest levels since 1972. Do you remain committed to books, or do you see the effort as needing to go elsewhere?
GN: As far as Project Gutenberg goes, if we never got any further books and the stuff we did just stayed where it was and we didn’t get a lot more hits, that would not be a tragedy. I like running a big, popular site, but there’s also not a shortage of opportunities for literature out there. We’ve done some really innovative stuff, of course, but there’s not that much that we’re doing that’s truly unique at this point. Michael and I often discussed steps to follow Project Gutenberg; self-publishing is one of those, and we have finally launched it. And we’re also looking very closely at what we think is going to be the next attempt to extend copyright, and ways to prevent that from happening. The struggle carries on, and we can just try to do the best we can.
BLVR: I know Michael used to say “a million books by 2015.” Do you see that part of the project in a kind of winding down phase if it becomes more about self-publishing, and that Gutenberg will become about maybe being advocates or leaders of a certain kind instead?
GN: I don’t really know. As far as the million, you know, he won. He was right. If you include the Internet Archive, and Google Books, and Amazon books, and the various other sites for ebooks, and the self-publishing efforts that are going on by individuals, there are millions out there. We don’t have to own all the marbles or whatever. It’s great and appropriate that other people do things differently and that there’s a whole bunch of good things going on that maybe, in some way, Michael started. You can trace a lot of this back to his vision. But who cares? Who cares who gets the credit? That was something he always said very clearly: if he were more worried about getting the credit or consolidating or being the single source for whatever, he would have been a lot less successful because he would have spent effort doing that instead of just spreading the good. It’s like what I mentioned before about saying yes. You don’t say yes, but you have to do it my way, or yes but it has to live on my site or yes but it must be plain text. You say yes. You know. Do it. And if you do it in a way that’s consistent on a basic philosophical level, which we wrote up on the site, then we’ll encourage and link to and praise you and stuff like that. And we don’t care if you don’t like some aspects of the way that we’re doing it or someone else is doing it, you just don’t get to tell them no. So I think that success is there. I gave a public talk a little over a year ago, maybe two months after Michael died. That was basically the essence. Michael was right. He won. He talked about public domain; he talked about people having ebooks so they could carry around their own library. The whole industry of the readers and distribution, it sure isn’t what anyone expected, in terms of Amazon, the iPad, the Nook, and all that stuff. But he won. I mean, he was right. And all that stuff has come to pass.
Ingrid Satelmajer wrote about the YouTube Bible for the Believer in September 2012.
Collage from uncredited photo by Ingrid Satelmajer. Photo editing by Jim Wehtje.