pdfistxt

A useful script if you wish to programmatically manipulate PDF content.

The following script uses the `pdffonts` and `pdftotext` commands to find out whether a PDF contains any text content. I use it in my audiobook creation pipeline to decide whether a PDF requires OCR. On Ubuntu you will need to install the packages `xpdf` and `xpdf-utils` to get this to work. I'm sure other distros have xpdf packages too >:3
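
As a rough illustration of the idea (not the original script), a minimal bash sketch might look like the one below. The function names `count_fonts`, `has_text` and `pdf_is_text` are hypothetical. It assumes that `pdffonts` prints two header lines before its font list and that `pdftotext file -` writes the extracted text to standard output.

```shell
#!/usr/bin/env bash
# Sketch only: guess whether a PDF has extractable text or needs OCR.
# Assumes pdffonts and pdftotext are on the PATH.

# Count the fonts listed in pdffonts output read from stdin.
# Assumption: the first two lines are the column header and a dashed rule.
count_fonts() {
    awk 'NR > 2 && NF { n++ } END { print n + 0 }'
}

# Succeed if stdin contains at least one non-whitespace character.
# Form feeds (page breaks from pdftotext) count as whitespace.
has_text() {
    grep -q '[^[:space:]]'
}

# Return 0 if the PDF at $1 seems to contain text, 1 otherwise.
pdf_is_text() {
    local pdf=$1 fonts
    fonts=$(pdffonts "$pdf" 2>/dev/null | count_fonts)
    # No fonts at all usually means scanned page images only.
    [ "$fonts" -gt 0 ] || return 1
    # Fonts are present, so check that some text really comes out.
    pdftotext "$pdf" - 2>/dev/null | has_text
}

# Only run when a file name is given on the command line.
if [ "$#" -gt 0 ]; then
    if pdf_is_text "$1"; then
        echo "text"
    else
        echo "needs OCR"
        exit 1
    fi
fi
```

Saved as, say, `pdfistxt.sh`, it could be called as `bash pdfistxt.sh book.pdf`, and the exit status can then drive the OCR step of a pipeline. Checking both the fonts and the extracted text is a heuristic: a PDF can embed fonts and still yield nothing useful from `pdftotext`.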

The information provided on this and other pages by me, Matt Oates (mattoates@gmail.com), is under my own personal responsibility and not that of Aberystwyth University or the University of Bristol. Similarly, any opinions expressed are my own and are in no way to be taken as those of either institute.