Audiobook Factory
The server at the moment is Ip-172-26-13-40, providing conversion of up to ~50KB of text at a time.
This is a quick hack to convert common ebook, image, and text formats into audiobooks using open source OCR and voice synthesis software. Please don't abuse or overload this test service: a giant 100+ page djvu file will destroy the server unless you specify a subset of a few pages covering a chapter.
Please be patient after pressing "Download", as it takes time to process the input and prepare a file for you. Please do not press the "Download" button repeatedly after submitting a job. This will not make anything go faster, and is likely to bring the server to a grinding halt!
A single line of text translates to about a 15-20KiB MP3 file, but the intermediate waveform held in memory is much bigger, so you can imagine that massive amounts of text make the server cry. Someone recently uploaded a 50KiB Asimov book in plain text, which produced a 5MiB (and 25-minute-long!) MP3. This is the rough upper limit of what I want to see being uploaded. If anyone wants to offer me some uber-hosting I would be very grateful ;D
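As a rough sizing aid, the figures above can be turned into a back-of-envelope estimate before uploading. This is only a sketch: it assumes MP3 size and running time scale linearly with the size of the input text, which the service does not promise, and the names `estimate` and `within_limit` are made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB

# Reference job quoted above: 50 KiB of plain text -> 5 MiB MP3, 25 minutes long.
REF_TEXT_BYTES = 50 * KIB
REF_MP3_BYTES = 5 * MIB
REF_SECONDS = 25 * 60


def estimate(text_bytes):
    """Scale the reference job linearly to an input of text_bytes bytes.

    Linear scaling is an assumption, not a measured property of the service.
    """
    return {
        "mp3_bytes": text_bytes * REF_MP3_BYTES / REF_TEXT_BYTES,
        "seconds": text_bytes * REF_SECONDS / REF_TEXT_BYTES,
    }


def within_limit(text_bytes, limit=REF_TEXT_BYTES):
    """True if the input is no larger than the rough 50 KiB upper limit."""
    return text_bytes <= limit
```

Under that assumption, a 10 KiB text file would come out at roughly 1 MiB of MP3 and about 5 minutes of audio, while anything past 50 KiB should be split into chapters first.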
Current features include:
- Various input formats and sources supported: .doc .rtf .html .pdf .djvu .ps .txt and most kinds of images.
- Various encoded audio outputs.
- Use of completely open source technology: ImageMagick, mencoder, Lame, OGGEnc, GOCR, DjVu-libre, xpdf, Festival, Antiword and many other command-line GNU/Linux tools.
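To give a feel for how such command-line tools can be chained, here is a minimal sketch of a text-to-audio pipeline. It is not the actual backend, whose details are not described here. It assumes `pdftotext` (xpdf), `djvutxt` (DjVu-libre), `text2wave` (Festival), `lame` and `oggenc` are installed, and it leaves out the OCR, .doc and image paths entirely. The function names and the intermediate file naming are hypothetical.

```python
import subprocess
from pathlib import Path


def extract_cmd(src, txt, first=None, last=None):
    """Build the command that extracts plain text from src into txt.

    first/last optionally restrict extraction to a page range, so a
    large document can be converted one chapter at a time.
    """
    ext = Path(src).suffix.lower()
    if ext == ".pdf":
        cmd = ["pdftotext"]
        if first is not None:
            cmd += ["-f", str(first)]  # first page to convert
        if last is not None:
            cmd += ["-l", str(last)]   # last page to convert
        return cmd + [src, txt]
    if ext == ".djvu":
        cmd = ["djvutxt"]
        if first is not None and last is not None:
            # page range option as given in the djvutxt manual page
            cmd.append(f"-page={first}-{last}")
        return cmd + [src, txt]
    raise ValueError(f"unsupported input format in this sketch: {ext}")


def synth_cmd(txt, wav):
    """Festival's text2wave script renders a text file to a WAV file."""
    return ["text2wave", txt, "-o", wav]


def encode_cmd(wav, out):
    """Pick an encoder from the requested output extension."""
    ext = Path(out).suffix.lower()
    if ext == ".mp3":
        return ["lame", wav, out]
    if ext == ".ogg":
        return ["oggenc", wav, "-o", out]
    raise ValueError(f"unsupported output format in this sketch: {ext}")


def pipeline(src, out, first=None, last=None):
    """Return the list of commands that turn src into the audio file out."""
    stem = str(Path(out).with_suffix(""))
    wav = stem + ".wav"
    if Path(src).suffix.lower() == ".txt":
        txt, steps = src, []           # plain text needs no extraction step
    else:
        txt = stem + ".txt"
        steps = [extract_cmd(src, txt, first, last)]
    return steps + [synth_cmd(txt, wav), encode_cmd(wav, out)]


def run(commands):
    """Execute the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, `run(pipeline("book.djvu", "chapter1.mp3", first=10, last=14))` would extract pages 10 to 14, synthesise them and encode the result as an MP3, provided the tools are on the PATH. Building the command lists separately from running them keeps the sketch easy to inspect without touching the server.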
Future features might include:
- Download the plain text file generated by backend processing as an output format option instead of audio.
- Support for more input formats including OpenOffice.org documents and epubs.
- Batch ordering and processing system to let you thrash out whatever volume of data you want!
- Multi-language support, which is problematic with current OCR & text-to-speech technology.
- Automatic language detection and machine translation of source text before audio is generated.
- Ability to choose the sex and nationality of the speaker's voice, and tweak audio properties.