Batch conversion of .xlsx (Microsoft Office) files to .tsv (tab-delimited)

I had to retrieve data from multiple .xlsx files, each with multiple sheets. This can be done manually, but it is a rather time-consuming task. On top of that, Office quotes text fields on export, which is not very convenient for downstream analysis…
I found a handy script, xlsx2tsv.py, that does the job, but it exports only one sheet at a time. So I rewrote xlsx2tsv.py a little to save all sheets from a given .xlsx file into a separate folder. In addition, multiple .xlsx files can be processed at once. My version can be found on GitHub.

xlsx2tsv.py *.xlsx

Batch conversion of images using ImageMagick

Today I needed to convert multiple .pdf files into .tiff images with a specific DPI and LZW compression. It turned out to be very simple with ImageMagick.

# install
sudo apt-get install imagemagick

# convert .pdf to lzw compressed .tiff changing dpi to 300
mkdir -p tiffs
for f in *.pdf; do
  # quote "$f" so file names containing spaces are handled correctly
  echo "$(date) $f"
  convert -density 300 -compress lzw "$f" "tiffs/$f.tiff"
done
date
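
The loop above names each output file.pdf.tiff. If you would rather get file.tiff, here is a minimal sketch of one way to do it. Note that tiff_name is a hypothetical helper introduced just for this example, not part of ImageMagick, and it assumes the same tiffs output folder as above. The loop also skips the literal *.pdf pattern that the shell passes through when no .pdf files are present.

```shell
# Hypothetical helper (not part of ImageMagick):
# maps "my report.pdf" to "tiffs/my report.tiff".
tiff_name() {
  printf 'tiffs/%s.tiff\n' "$(basename "$1" .pdf)"
}

mkdir -p tiffs
for f in *.pdf; do
  # If nothing matches, the pattern stays as the literal "*.pdf"; skip it.
  [ -e "$f" ] || continue
  out=$(tiff_name "$f")
  echo "$(date) $f -> $out"
  # Same options as above: 300 DPI rendering, LZW compression.
  convert -density 300 -compress lzw "$f" "$out"
done
```

Keeping the naming in a small function makes it easy to check on a few file names before running the conversion on a whole folder.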

For more options, have a look at the ImageMagick site.