His writing is always excellent (to my easily impressed eyes, at least). Recently on Boing Boing, Cory wrote a lengthy piece on why book publishers should be eager to hop into bed with Google over the book scanning project instead of suing to stop it.
Here’s how GBS works: Google works with libraries to scan in millions of books, most (more than 75 percent) of them out-of-print, some out-of-copyright and some in-print/in-copyright. Google scans these books, converts the scanned images of the pages into text, and indexes the text.
This index will be exposed to the public, who will be able to search the full text of tens of millions of books — eventually this index could comprise the majority of books ever published — and get results back reporting on which books contain their search-terms.
For public domain books, the search-results will contain a link to the whole text of the book. These out-of-copyright works are our collective human property — or no one’s property at all — and Google is perfectly within its rights to distribute copies of any public-domain book that matches a search-request. As an author, I would love to be able to get the full-text of books that matched my search-queries.
For other books — the books that are in copyright — Google will show a brief excerpt: a single sentence with one or two sentences from either side of the match. In some cases, publishers or other copyright holders have granted Google permission to show more than this — a couple pages — and Google will show you this, too.
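For the curious, the index-and-excerpt idea described above can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not how Google actually does it: the `BOOKS` corpus, the function names, and the `context` parameter are all made up for the example, and the sketch hands back the full text of a public-domain book where the real service would link to it.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus: title -> (is_public_domain, text).
BOOKS = {
    "Public Domain Tale": (True, "The ship sailed at dawn. A whale appeared. The crew cheered."),
    "Modern Novel": (False, "She opened the door. A whale of a problem waited. She sighed. Then she left."),
}

WORD = re.compile(r"[a-z']+")


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def build_index(books):
    """Inverted index: each lowercase word maps to the titles containing it."""
    index = defaultdict(set)
    for title, (_, text) in books.items():
        for word in WORD.findall(text.lower()):
            index[word].add(title)
    return index


def search(term, books, index, context=1):
    """Return title -> full text (public domain) or a short excerpt (in copyright)."""
    term = term.lower()
    results = {}
    for title in sorted(index.get(term, ())):
        public_domain, text = books[title]
        if public_domain:
            # Stand-in for the link to the whole book.
            results[title] = text
            continue
        sentences = split_sentences(text)
        for i, sentence in enumerate(sentences):
            if term in WORD.findall(sentence.lower()):
                # Matching sentence plus `context` sentences on either side.
                start, end = max(0, i - context), i + context + 1
                results[title] = " ".join(sentences[start:end])
                break
    return results


if __name__ == "__main__":
    for title, shown in search("whale", BOOKS, build_index(BOOKS)).items():
        print(f"{title}: {shown}")
```

Searching for "whale" here returns the whole of the public-domain book but only three sentences of the in-copyright one, which is the distinction the article draws.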
The full article is about 20 times longer.
[tags]Cory Doctorow, Google Book Scanning project, Google, Boing Boing[/tags]