Recognise & remove corrupted JPEG files

Shotwell started to crash on my computer. I realised it was due to corrupted JPEG files in my picture directory. I started removing the corrupted files by hand, based on entries in the Shotwell log (.cache/shotwell/shotwell.log). However, after removing several files, I noticed that corrupted JPEGs were spread all over my photos. Luckily, there is a simple way of detecting corrupted JPEG files using the Python Imaging Library (PIL).

  1. First, clear the Shotwell cache. Note that this will remove all your libraries from Shotwell! The paths are relative, so run the command from your home directory.

    rm -r ./.cache/shotwell ./.local/share/shotwell
  2. Install the dependencies.

    sudo easy_install -U PIL Send2Trash
  3. Run the Python script below. It moves all corrupted .jpg and .jpeg files to the system trash.

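      import os, sys
      from PIL import Image
      from send2trash import send2trash

      def is_jpg(filename):
        # True if PIL can open the file and identifies it as a JPEG
        try:
          img = Image.open(filename)
          return img.format == 'JPEG'
        except IOError:
          return False

      # directory to scan: first command-line argument, or the current directory
      cdir = "."
      if len(sys.argv) > 1:
        cdir = sys.argv[1]
      n_files = n_dirs = n_removed = 0
      for root, dirs, fnames in os.walk(cdir):
        sys.stderr.write(" %s\n" % root)
        n_dirs += 1
        dirs.sort()  # visit subdirectories in alphabetical order
        # a list (not a filter object), so len() also works on Python 3
        jpegs = [fn for fn in fnames if fn.lower().endswith((".jpg", ".jpeg"))]
        for fn in jpegs:
          fpath = os.path.join(root, fn)
          if not is_jpg(fpath):
            sys.stderr.write("  %s\n" % fn)
            send2trash(fpath)
            n_removed += 1
        n_files += len(jpegs)

      # parenthesised form works as a statement on Python 2 and a call on Python 3
      print("%s files scanned in %s dirs. %s files removed." % (n_files, n_dirs, n_removed))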
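
One caveat with the script above: as far as I know, Image.open only identifies the file from its header and does not decode the image data, so a file that was cut off partway through may still pass is_jpg. Below is a minimal sketch of a complementary check that uses only the standard library. It assumes that a well-formed JPEG starts with the SOI marker (FF D8) and ends with the EOI marker (FF D9). The helper name has_jpeg_markers is hypothetical and not part of PIL.

```python
import os

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker

def has_jpeg_markers(path):
    """Return True if the file starts with SOI and ends with EOI.

    Hypothetical helper for illustration; it is a heuristic, not a decoder.
    """
    # Too short to hold both two-byte markers.
    if os.path.getsize(path) < 4:
        return False
    with open(path, "rb") as f:
        head = f.read(2)
        f.seek(-2, os.SEEK_END)  # jump to the last two bytes
        tail = f.read(2)
    return head == SOI and tail == EOI
```

Some cameras and tools append extra bytes after the EOI marker, so a failed check is a hint that the file is truncated and not proof. It is probably safer to print such files for manual review than to send them straight to the trash.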