Posted on: 21st June 2012

Removing Unwanted Characters in Filenames

The problem:

Matt McVicar asked me today how to strip some line noise out of filenames; he was having trouble handling it on the command line. The trick with horrible binary noise in your filenames is to put the name in a $variable and pass that variable, double-quoted, to mv. That way the name is never mangled into ASCII, escaped, or expanded by the shell on its way to mv.
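As a minimal sketch of that trick, assuming a throwaway directory from mktemp and a made-up noisy name (both are illustrative and not part of Matt's actual problem), the name is built once, stored in a variable, and only ever used inside double quotes:

```shell
# Work in a scratch directory so nothing real is touched
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical noisy name: 'report' followed by two control bytes
noisy=$(printf 'report\001\002.txt')
touch -- "$noisy"

# Quoting the variable hands the raw bytes to mv untouched;
# nothing has to be typed or escaped by hand
mv -- "$noisy" "report.txt"
ls
```

The same rename is painful to type directly, because the control bytes would have to be escaped by hand or matched with a wildcard.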

The function below takes a glob pattern like '*.txt' and renames every matching file, in the current working directory or in whatever paths the glob names, to the same name with the unwanted characters stripped.

The solution:

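    function clean_names {
       # Character class of what to keep in the filenames.
       # Inside [^...] the ']' must come first and the '-' last to be literal.
       keep=']a-zA-Z0-9_.;: ()[&#-'
       # $1 is left unquoted on purpose so the shell expands the glob here
       for filename in $1
       do
          # Skip the literal pattern when the glob matched nothing
          [ -e "$filename" ] || continue
          # Only clean the last path component so directory names are left alone
          dir=$(dirname -- "$filename")
          base=$(basename -- "$filename")
          # Strip all the characters apart from those to keep, byte by byte
          newbase=$(printf '%s\n' "$base" | LC_ALL=C sed "s/[^$keep]//g")
          # Nothing to do if the name is already clean or nothing would be left
          if [ "$newbase" = "$base" ] || [ -z "$newbase" ]; then continue; fi
          # Move the original file to the new one
          printf '%s -> %s\n' "$filename" "$dir/$newbase"
          mv -- "$filename" "$dir/$newbase"
       done
    }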


    clean_names '*.txt'
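
To check what a name would turn into before renaming anything, the stripping step can be run on its own. This is only a sketch: preview_clean_name is a hypothetical helper, not part of the function above, and it assumes a keep list written with ']' first and '-' last so that sed treats both as literal characters inside the bracket expression.

```shell
# Hypothetical helper: print the cleaned form of a name without touching any file
preview_clean_name() {
   # ']' first and '-' last so both are literal inside [^...]
   keep=']a-zA-Z0-9_.;: ()[&#-'
   # LC_ALL=C makes sed work byte by byte, so invalid UTF-8 is stripped too
   printf '%s\n' "$1" | LC_ALL=C sed "s/[^$keep]//g"
}

# The control bytes and the '~' are dropped; brackets and spaces are kept
preview_clean_name "$(printf 'my\001 file\002[1]~.txt')"
# prints: my file[1].txt
```

Running the preview over a few awkward names first is a cheap way to spot two files that would collapse to the same cleaned name.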

