YouTube Video Feature Indexing

This page is the result of slapping together mplayer, opencv and youtube-dl.

/!\ Page under construction. Functionality depends on page completion and web hosting status.

Not available on this host:

  • java - Executes markov parody text generation.
  • festival - Synthesizes machine spoken waveform from text input.
  • gocr - Converts raster images of text into plain text (OCR) when creating audiobooks.
  • html2txt - Converts HTML to formatted ASCII text.
  • djvu-libre - Manipulates DjVu file input so that we can extract data for OCR or direct voice synthesis.
  • pdf-tools - Manipulates PDF file input so that we can extract data for OCR or direct voice synthesis.
  • pdfistext - Tests whether a PDF contains only scanned images or has text content.
  • imagemagick - Converts supported image formats into an intermediate format (PNM) used in OCR.
  • pstotext - Converts PostScript files into plain ASCII text as if they were being printed.
  • antiword - Converts older Word (<=2003) .doc files to plain text.
  • lame - Encodes synthesized voice waveform to MP3.
  • mencoder - Encodes from one mplayer-supported video format to another.
  • oggenc - Encodes synthesized voice waveform to Ogg Vorbis.
  • unrtf - Converts RTF files to ASCII text.
  • youtube-dl - Downloads a .flv YouTube flash movie, described by a given URL.

Currently available on this host:

  • perl - Scrapes website/wikipedia online sources when creating audiobooks.
  • host - I
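
The two lists above amount to a check of which command-line tools are installed on the host. A minimal sketch of such a check follows, assuming the tools are looked up by executable name on the PATH. The function name check_tools and the TOOLS list are hypothetical, chosen for illustration, and are not part of this page's actual code; some tools may install under a different executable name than the one listed here.

```python
import shutil

# Executable names taken from the lists above (assumption: each tool is
# installed under exactly this name; a package may ship other binaries).
TOOLS = [
    "java", "festival", "gocr", "lame", "mencoder",
    "oggenc", "unrtf", "youtube-dl", "perl", "host",
]


def check_tools(names, which=shutil.which):
    """Split names into (available, missing) using a PATH lookup.

    `which` is injectable so the lookup can be replaced in tests.
    """
    available, missing = [], []
    for name in names:
        if which(name):
            available.append(name)
        else:
            missing.append(name)
    return available, missing


if __name__ == "__main__":
    available, missing = check_tools(TOOLS)
    print("Currently available on this host:")
    for name in available:
        print("  • " + name)
    print("Not available on this host:")
    for name in missing:
        print("  • " + name)
```

Run on the web host, this would print the same two groupings as the lists above, so the page could be regenerated whenever the hosting setup changes.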

Current features include:

  1. Nothing yet. This will be under construction for a while, since an ordering and batch processing system still needs to be coded up!

Future features might include:

The information provided on this and other pages by me, Matt Oates, is under my own personal responsibility and not that of Aberystwyth University or the University of Bristol. Similarly, any opinions expressed are my own and are in no way to be taken as those of either institution.