View Full Version : book digitization
Punk_Rocker_22
2008-09-15, 18:37
I've made it my mission to never buy another textbook.
This semester at uni, three of my books are available online for free, I found pirated ebooks of two more, and two I still don't have. I've just been borrowing copies off other people for short periods of time.
I think it's time to take those borrowed books and convert them into ebooks.
I'm also looking to spend zero money on the process.
I have a DSLR camera, a tripod, a cable shutter release, a lamp, and I'm sure I can find a piece of glass.
That would leave me doing something along the lines of what this guy did:
http://www.youtube.com/watch?v=-DHR1AwN47s&feature=related
But that doesn't look very efficient. Especially if you don't have someone to shoot the camera for you. I'm thinking about building a foot pedal attachment for my shutter release (I built the shutter release myself).
Any ideas on how to improve upon this design without having to spend any money? Or at least less than $20?
I plan on doing about 10-15 books before I graduate.
Punk_Rocker_22
2008-09-15, 19:21
I might make one of these:
http://img382.imageshack.us/img382/7133/attachmentyi6.jpg
Out of cardboard and black spray paint, both of which I have lying around my dorm.
http://img382.imageshack.us/img382/9739/attachmentnq4.jpg
Think it would actually save time? Repositioning the glass looks like a lot of work. And to avoid readjusting the tripod you would first have to shoot all the odd pages and then all the even pages. What is the best way to auto rename a folder full of pictures to 1.jpg, 3.jpg, 5.jpg and another folder to 2.jpg, 4.jpg..?
I tried to do this once for a few chapters of a textbook.
What I did was I mounted my camera on a 2' piece of 1x4 board which I then attached to my desk drawer with the camera facing the floor. The board was mounted so I could adjust the camera position up or down and I could move it back or forth by moving the drawer.
I then set a book on the floor and set it & the camera so it was right in frame. Then I secured the camera/drawer and marked the floor where the edge of the cover went; then I moved the book so that the opposite page was in view and marked the floor.
I tried doing the pages in order, moving the book each time, but that got old and I ended up just doing the even & odds. I manually organized them on the comp by taking each pic from one set and putting it in the right place in the other, then renamed them all at once.
I had a lot of problems with glare from the flash on the pages and ended up using some lamps to light the pages instead of using the flash. It's probably best to do that anyways so you don't have to run your camera or flash batteries down.
btw, you're not an ME major, are you?
Spatula Tzar
2008-09-16, 00:11
It's trivial to write a little script to rename the images. For lighting, I highly suggest using a separate flash, mounted off-camera. The glare happens because the camera is actually the worst place to mount the flash: light is reflected at a low angle back into the lens.
You should be able to buy an old strobe for under $20. Otherwise, stick to regular lamps. A high colour temperature compact fluorescent will look best.
Emag's idea sounds good. Set up the shutter release so you can trigger it with little effort, and without changing your grip on the book.
Do the rest of the students a favour and torrent the finished book. Then put hundreds of cards of the torrent link in the bookstore for people to find. Prices are ridiculous...
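For the renaming question earlier in the thread (one folder to 1.jpg, 3.jpg, 5.jpg and another to 2.jpg, 4.jpg), a minimal Python sketch could look like the following. Nothing here comes from the posters: the function name `rename_to_pages` and its parameters are made up for illustration, and it assumes the camera's file names already sort in the order the pages were shot.

```python
from pathlib import Path


def rename_to_pages(folder, first_page, step=2, ext=".jpg"):
    """Rename images in `folder` to first_page, first_page + step, ... in name order.

    Assumption: the existing file names sort in shooting order.
    """
    folder = Path(folder)
    images = sorted(p for p in folder.iterdir() if p.suffix.lower() == ext)

    # Pass 1: move everything to temporary names, so a target such as 3.jpg
    # can never overwrite a source file that already has that name.
    temp_paths = []
    for index, path in enumerate(images):
        temp = folder / f"__tmp_{index}{ext}"
        path.rename(temp)
        temp_paths.append(temp)

    # Pass 2: give each file its final page number.
    new_names = []
    for index, temp in enumerate(temp_paths):
        target = folder / f"{first_page + index * step}{ext}"
        temp.rename(target)
        new_names.append(target.name)
    return new_names
```

Calling `rename_to_pages("odd", 1)` and then `rename_to_pages("even", 2)` (folder names are placeholders) should leave two folders whose contents can simply be moved into one. Try it on copies first, since renaming is not easily undone.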
KwinnieFuckingBogan
2008-09-16, 03:06
What is the best way to auto rename a folder full of pictures to 1.jpg, 3.jpg, 5.jpg and another folder to 2.jpg, 4.jpg..?
Deja vu is here, and I still don't know a method for interlacing the files like you want to do. I know how you can rename them from DSC1239, DSC1240, etc. into Page 1.jpg, Page 2.jpg, etc.* - but you need them to be sorted into the correct order before you do that anyway.
*[I had an issue when I was ripping music onto my HDD and then burning it all onto one DVD-R, because with each CD you'd get that awful '01 track 01' titling bullshit, and this of course meant that I had to rename every single track because it couldn't cope with two files with the same name (i.e. two tracks off different CDs, both titled '01 track 01'). So I think what I did to get around it was highlight every track in the folder for that CD, right-click on the first one and rename it 'Ensiferum' etc., so that all of those tracks would then automatically be called Ensiferum 1.mp3, Ensiferum 2.mp3, etc.]
Good luck with this. It looks extremely time-consuming and a bit pointless, but good luck.
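On the "sorted into the correct order" point: plain alphabetical sorting puts 10.jpg before 2.jpg, so one way to get camera files into shooting order and then interlace the odd and even sets is a "natural" sort key. This is only a sketch; `natural_key` and `interleave` are illustrative names, and it assumes both sets were shot front to back.

```python
import re


def natural_key(name):
    """Split a name into text and number chunks so 'DSC999' sorts before 'DSC1000'."""
    parts = re.split(r"(\d+)", name)
    # re.split with a capture group alternates text, digits, text, digits, ...
    return [int(p) if i % 2 else p.strip().lower() for i, p in enumerate(parts)]


def interleave(odd_names, even_names):
    """Merge two shot lists into reading order: odd[0], even[0], odd[1], even[1], ..."""
    odd_sorted = sorted(odd_names, key=natural_key)
    even_sorted = sorted(even_names, key=natural_key)
    merged = []
    for i in range(max(len(odd_sorted), len(even_sorted))):
        if i < len(odd_sorted):
            merged.append(odd_sorted[i])
        if i < len(even_sorted):
            merged.append(even_sorted[i])
    return merged
```

The position of each name in the merged list (counting from 1) is then its page number, which is all a bulk rename to Page 1.jpg, Page 2.jpg needs.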
Dragonflame
2008-09-17, 15:07
This might work, but I don't know how high a resolution it will work with.
http://www.scribd.com/doc/277526/How-to-Copy-a-Whiteboard-With-Your-Digital-Camera-or-Camera-Phone-Using-Scan
ArgonPlasma2000
2008-09-17, 15:45
I've been wanting to do this as well, but digitizing books with over 1000 pages will get a little shitty. The first problem is automation, and the second is printing.
Automation is probably something we can figure out.
Printing using pictures results in extremely bad compression, even using jpeg. I saw one textbook that was 160MB when it should have been at most around 5 using pdf text compression.
http://www.gnu.org/software/ocrad/ocrad.html
I think that a microcontroller would handle turning the pages and all that. Turning the page could be done with a small vacuum pump to suck the page and then an arm would pull the page over. I think that a light-proof box should be around the entire rig, and a light source inside. For taking the pictures, a webcam should be used as opposed to a regular camera because it can be triggered by the computer directly and it won't have memory corruption issues.
I could probably even do this with Legos, but I don't have any Lego motors.
Punk_Rocker_22
2008-09-17, 16:48
1 - no reason to print. Read them off your computer, PDA, iPod, etc.
2 - most DSLRs can work in tethered mode (be triggered by the computer). They also take far higher quality photos and can be manually focused, etc. The DOF on most webcams is shit.
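As a rough illustration of tethered shooting, the sketch below drives the camera from Python through the `gphoto2` command-line tool. The thread does not name any tethering software, so this is an assumption: it only works if gphoto2 is installed and the camera supports remote capture over USB. The function names are made up for the example.

```python
import subprocess


def build_capture_command(page_number):
    """Return the gphoto2 command that shoots one frame and saves it as <page>.jpg."""
    return [
        "gphoto2",
        "--capture-image-and-download",
        "--filename", f"{page_number}.jpg",
        "--force-overwrite",
    ]


def shoot_pages(first_page, count, step=2):
    """Wait for Enter before each shot, then trigger the camera from the computer."""
    for i in range(count):
        page = first_page + i * step
        input(f"Position page {page}, then press Enter...")
        # check=True stops the run if gphoto2 reports an error (e.g. no camera found).
        subprocess.run(build_capture_command(page), check=True)
```

With `shoot_pages(1, 200)` the files would come out already numbered 1.jpg, 3.jpg, 5.jpg and so on, which sidesteps the renaming step, and the Enter key could be wired to a foot pedal.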
ArgonPlasma2000
2008-09-17, 17:26
1 - no reason to print. Read them off your computer, PDA, iPod, etc.
2 - most DSLRs can work in tethered mode (be triggered by the computer). They also take far higher quality photos and can be manually focused, etc. The DOF on most webcams is shit.
Just because you have to print something doesn't mean you print it to paper...
And this thing needs to be cheap if it's going to get mass produced. If I had enough money to buy an expensive-ass camera I'd have plenty to buy books with.
Punk_Rocker_22
2008-09-17, 19:17
I got a Canon Rebel XT DSLR for about $500.
I've already saved more than that on textbooks via pirated ebooks.
I plan on saving even more once I start copying books.
Spatula Tzar
2008-09-18, 01:00
Printing using pictures results in extremely bad compression, even using jpeg. I saw one textbook that was 160MB when it should have been at most around 5 using pdf text compression.
But OCR is still terribly inaccurate. It makes me wonder if anyone developed an image compression format designed specifically for glyphs. It would offer very high compression for text, yet still show the original character to prevent misreadings.
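Formats along these lines do exist: as far as I know, JBIG2 and DjVu both store each distinct character shape once and then record where it recurs on the page. The toy Python sketch below shows only that core idea under heavy assumptions: the page is already cut into equal-sized cells, and two glyphs count as the same only when their pixels match exactly (real scans need fuzzy matching). It is not the actual JBIG2 algorithm.

```python
def compress_glyphs(cells):
    """Store each distinct glyph bitmap once, plus one small index per cell."""
    dictionary = []   # unique bitmaps, in order of first appearance
    lookup = {}       # bitmap -> position in dictionary
    indices = []      # one dictionary index per cell on the page
    for bitmap in cells:
        if bitmap not in lookup:
            lookup[bitmap] = len(dictionary)
            dictionary.append(bitmap)
        indices.append(lookup[bitmap])
    return dictionary, indices


def decompress_glyphs(dictionary, indices):
    """Rebuild the page cell by cell; the scanned pixels come back, not a guessed letter."""
    return [dictionary[i] for i in indices]


# Tiny 3x3 bitmaps standing in for scanned characters ('#' = ink).
T = ("###", ".#.", ".#.")
O = ("###", "#.#", "###")
page = [T, O, O, T, T, O]
dictionary, indices = compress_glyphs(page)
```

Here six cells shrink to two stored bitmaps plus six small indices, and because the bitmaps are the scanned shapes themselves, a misrecognised character cannot silently turn into a different letter the way it can with OCR.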
ComradeAsh
2008-09-24, 06:29
How are Google doing it?
KwinnieFuckingBogan
2008-09-24, 06:37
How are Google doing it?
There was a doco about Google on SBS around 3 to 4 weeks ago. They have an expensive and elaborate machine that takes photos, turns pages, etc. It's in a basement or something similar so that the lighting conditions are consistent.