On handy docker images

Motivated by successful stripping problematic dependencies from Redundans, I have decided to generate smaller Docker image, starting with Alpine Linux image (2Mb / 5Mb after downloading) instead of Ubuntu (49Mb / 122Mb). Previously, I couldn’t really rely on Alpine Linux, because it was impossible to make these problematic dependencies running… But now it’s whole new world of possibilities 😉

There are very few dependencies left, so I have started… (You can find all the commands below).

  1. First, I have check what can be installed from package manager.
    Only Python and Perl.

  2. Then I have checked if any of binaries are working.
    For example, GapCloser is provided as binary. It took me some time to find source code…
    Anyway, none of the binaries worked out of the box. It was expected, as Alpine Linux is super stripped…

  3. I have installed build-base in order to be able to build things.
    Additionally, BWA need zlib-dev.

  4. Alpine Linux doesn’t use standard glibc library, but musl-libc (you can read more about differences between the two), so some programmes (ie. BWA) may be quite reluctant to compile.
    After some hours of trying & thanks to the help of mp15, I have found a solution, not so complicated 🙂

  5. I have realised, that Dockerfile doesn’t like standard BASH brace expansion, that is working otherwise in Docker Alpine console…
    so ls *.{c,h} should be ls *.c *.h

  6. After that, LAST and GapCloser compilation were easy, relatively 😉

Below, you can find the code from Docker file (without RUN commands).

apk add --update --no-cache python perl bash wget build-base zlib-dev
mkdir -p /root/src && cd /root/src && wget http://downloads.sourceforge.net/project/bio-bwa/bwa-0.7.15.tar.bz2 && tar xpfj bwa-0.7.15.tar.bz2 && ln -s bwa-0.7.15 bwa && cd bwa && \
cp kthread.c kthread.c.org && echo "#include <stdint.h>" > kthread.c && cat kthread.c.org >> kthread.c && \
sed -ibak 's/u_int32_t/uint32_t/g' `grep -l u_int32_t *.c *.h` && make && cp bwa /bin/ && \
cd /root/src && wget http://liquidtelecom.dl.sourceforge.net/project/soapdenovo2/GapCloser/src/r6/GapCloser-src-v1.12-r6.tgz && tar xpfz GapCloser-src-v1.12-r6.tgz && ln -s v1.12-r6/ GapCloser && cd GapCloser && make && cp bin/GapCloser /bin/ && \
cd /root/src && wget http://last.cbrc.jp/last-744.zip && unzip last-744.zip && ln -s last-744 last && cd last && make && make install && \
cd /root/src && rm -r last* bwa* GapCloser* v* 

# SSPACE && redundans in /root/srt
cd /root/src && wget -q http://www.baseclear.com/base/download/41SSPACE-STANDARD-3.0_linux-x86_64.tar.gz && tar xpfz 41SSPACE-STANDARD-3.0_linux-x86_64.tar.gz && ln -s SSPACE-STANDARD-3.0_linux-x86_64 SSPACE && wget -O- -q http://cpansearch.perl.org/src/GBARR/perl5.005_03/lib/getopts.pl > SSPACE/dotlib/getopts.pl && \
wget --no-check-certificate -q -O redundans.tgz https://github.com/lpryszcz/redundans/archive/master.tar.gz && tar xpfz redundans.tgz && mv redundans-master redundans && ln -s /root/src/redundans /redundans && rm *gz

apk del wget build-base zlib-dev 
apk add libstdc++

After building & pushing, I have noticed that Alpine-based image is slightly smaller (99Mb), than the one based on Ubuntu (127Mb). Surprisingly, Alpine-based image is larger (273Mb) than Ubuntu-based (244Mb) after downloading. So, I’m afraid all of these hours didn’t really bring any substantial reduction in the image size.

Conclusion?
I was very motivated to build my application on Alpine Linux and expected substantial size reduction. But I’d say that relying on Alpine Linux image doesn’t always pay off in terms of smaller image size, forget about production time… And this I know from my own experience.
But maybe I didn’t something wrong? I’d be really glad for some advices/comments!

Nevertheless, stripping a few dependencies from my application (namely Biopython, numpy & scipy), resulted in much more compact image even using Ubuntu-based image (127Mb vs 191Mb; and 244Mb vs 440Mb after downloading). So I think this is the way to go 🙂