GPU basecalling with MinION ver. 21.06.0

Below I’m only describing the changes relative to the previous tutorial.

Firstly, the location of the guppy binaries has changed. So in order to check the version, you’ll need to execute

/opt/ont/guppy/bin/guppy_basecall_server -v # in my case it’s ver. 5.0.11

Secondly, it seems ONT started distributing guppy as a separate service starting from MinION 21.06.0. Because of that, changing the MinION configuration has no effect on the service itself.

So in order to enable GPU basecalling, you’ll need to override the guppyd service by editing /etc/systemd/system/guppyd.service.d/override.conf to something like this:

ExecStart=/opt/ont/guppy/bin/guppy_basecall_server --log_path /var/log/guppy --config dna_r9.4.1_450bps_fast.cfg --port 5555 --device cuda:all
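Note that a systemd drop-in file needs a [Service] section header, and replacing ExecStart of an existing service requires clearing it first with an empty ExecStart= line. So a complete override.conf would look roughly like this (a sketch based on the line above; adjust the model config and port to your setup):

[Service]
ExecStart=
ExecStart=/opt/ont/guppy/bin/guppy_basecall_server --log_path /var/log/guppy --config dna_r9.4.1_450bps_fast.cfg --port 5555 --device cuda:all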

And naturally reload and restart all relevant services:

sudo systemctl daemon-reload
sudo service guppyd stop && sudo service minknow stop && sudo service minknow start
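To confirm the override took effect and the basecall server grabbed the GPU, you can check the service status and the GPU processes (assuming the service is indeed named guppyd, as the override path suggests):

systemctl status guppyd
nvidia-smi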

GPU basecalling with MinION

A while ago I was struggling with enabling live GPU basecalling in MinKNOW on non-GridION systems. Naturally, ONT doesn’t provide an easy way to use a GPU in your custom machine; otherwise there wouldn’t be much motivation to buy a GridION, right? Still, it turns out you can enable live GPU basecalling in MinKNOW, given you have a GPU with CUDA support in your computer. Below I’ll describe briefly what needs to be done. I’m assuming you have MinKNOW and a GPU with CUDA support already installed.

First of all, make sure you have CUDA version 6+ correctly installed in your system (instructions to install CUDA are here).

nvidia-smi

If you see something like the image below, you are ready to go 🙂
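If you also want to double-check the CUDA toolkit version itself, nvcc reports it:

nvcc --version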

Now you’ll need to get guppy binaries with CUDA support, as those provided with MinKNOW have no GPU support. You can get them from the ONT website. Note that the guppy major and minor versions have to match the version currently used by MinKNOW. You can check this version using:

/opt/ont/minknow/guppy/bin/guppy_basecall_server -v

So, I can install guppy v4.0.x (I chose v4.0.15) with CUDA support as follows (note, you may need to adjust the version in the commands below depending on what you got from the previous command):

mkdir -p ~/src; cd ~/src
# you may need to change the guppy version
wget https://mirror.oxfordnanoportal.com/software/analysis/ont-guppy_4.0.15_linux64.tar.gz
tar xpfz ont-guppy_4.0.15_linux64.tar.gz
mv ont-guppy ont-guppy_4.0.15

Now just link your guppy binaries inside /opt/ont/minknow (again, you may need to adjust the guppy version here):

cd /opt/ont/minknow
sudo mv guppy guppy0
# you may need to change the guppy version
sudo ln -s ~/src/ont-guppy_4.0.15 guppy
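At this point it’s worth confirming that MinKNOW’s guppy path now points to the CUDA-enabled build:

/opt/ont/minknow/guppy/bin/guppy_basecall_server -v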

Then edit /opt/ont/minknow/conf/app_conf (use sudo!) and change the gpu_calling line to true, and also num_threads and ipc_threads to 3 and 2, respectively (you can also define which GPUs you want to enable – by default all available CUDA devices will be used):

    "gpu_calling": true,  
    "gpu_devices": "cuda:all",
    ...
    "num_threads": 3,
    "ipc_threads": 2, 

Finally, close the MinKNOW client (if any is running) and restart the MinKNOW system service:

sudo service minknow stop && sudo killall guppy_basecall_server && sudo service minknow start

Now you should see guppy using the GPU (-x cuda:all), and your GPU will be used if you run sequencing with live basecalling. Note, you can monitor your GPU usage using gpustat or glances.

ps ax | grep guppy_basecall_server

Voila!

Ubuntu with Gnome extensions for productivity

Some time ago I wrote about my KDE configuration. I’ve been using KDE for a while on my personal laptop, but never really got into using KDE on my workstation. I simply find Gnome a much more productive environment in the long term. (Disclaimer: I’m likely very biased here, since I’ve been using Gnome-like desktops for over 10 years now and they seem natural to me. Still, I think KDE is fantastic.)

Gnome3 came with extensions. Those are really cool, but be aware that some extensions may break from release to release. Also, some may have certain incompatibilities. Below I briefly describe which extensions I’m currently using and why (these should work fine with both Ubuntu 18.04 and 20.04):

  • Workspace Matrix arranges your virtual desktops into a 2D grid, and you can easily switch between rows/columns with Ctrl+Alt+arrow keys.
  • Unite (No Title Bar – Forked and PixelSaver have problems with Ubuntu 22.04) gets rid of the window titlebar. That’s very useful on small laptop screens, but I’m also using it with dual monitors at work, as the titlebar is just a waste of space…
  • system-monitor shows details about system usage (CPU, RAM, I/O) right in your system tray.
  • Bing Wallpaper Changer fetches really good wallpapers daily. You can read a bit more about each picture here – it’s a really great resource if you’re looking for some want-to-go places in your vicinity!

On top of that, definitely try:

  • guake is a drop-down terminal. It’s super useful if you need quick access to a terminal across multiple desktops. Funnily enough, I fell in love with yakuake in KDE first and only later learnt that the Gnome version is closer to my ideals 😛
  • glances (I’ve discussed its GPU support earlier) or htop for process viewing
  • screen (or, even better, tmux) for terminal multiplexing. This comes in handy especially if you work remotely a lot.
  • workrave will remind you to take a break once in a while. Try it, it’s really healthy!

If you find installation/updates of packages slow, definitely check out apt-fast:

sudo add-apt-repository ppa:apt-fast/stable 
sudo apt-get update 
sudo apt-get -y install apt-fast
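apt-fast mirrors the apt-get syntax, so afterwards you can use it as a drop-in replacement, e.g.:

sudo apt-fast update
sudo apt-fast -y upgrade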

Finally, I’d recommend isolating windows from individual workspaces, both for the dock and the app-switcher (this will show only windows from the current desktop in the dock and when Alt+TAB / ` is pressed):

gsettings set org.gnome.shell.extensions.dash-to-dock isolate-workspaces true
gsettings set org.gnome.shell.app-switcher current-workspace-only true

The above will result in a desktop similar to this

Do you have any recommendations or Gnome-related tricks?

Edits:

In order to get rid of terminal title-bar, follow this.

To get extensions to work with newer versions of Gnome, edit the config file as explained here.

Monitoring GPU usage

If you (like me) happen to be a performance freak, most likely you are well aware of process viewers like htop. Since I started working with GPU computing, I’ve missed an htop-like tool tailored to monitoring GPU usage. This becomes more of an issue if you’re working with multi-GPU setups.

You can use `nvidia-smi`, which is shipped with the NVIDIA drivers, but it’s not very interactive.
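A poor man’s workaround is to wrap it in watch, which at least refreshes the output periodically:

# refresh nvidia-smi output every second
watch -n1 nvidia-smi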

gpustat provides a nice and interactive view of the running processes and the resources used across your GPUs, but you’ll need to switch between windows if you also want to monitor CPU usage.

pip install -U gpustat
gpustat -i

Some time ago I discovered glances – a really powerful htop replacement. What’s best about glances (at least for me) is that, besides I/O and information from sensors, you can see GPU usage. This is possible thanks to py3nvml.

pip install -U glances py3nvml
glances

At first the glances window may look a bit overwhelming, but after a few uses you’ll likely fall in love with it!

And what’s your favorite GPU process viewer?

Python code profiling and accelerating your calculations with numba

You wrote up your excellent idea as a Python program/module, but you are unsatisfied with its performance. The chances are high that most of us have been there at least once. I was there last week.

I found an excellent method for outlier detection (Extended Isolation Forest). eIF was initially written in Python and later optimised in Cython (using C++). The C++ version is ~40x faster than the vanilla Python version, but it lacks the possibility to save the model (which is crucial for my project). Since adding model saving to the C++ version is rather complicated business, I decided to optimise the Python code instead. Initially I hoped for a ~5-10x speed improvement. The final effect surprised me, as the rewritten Python code was ~40x faster than the initial version, matching the C++ version’s performance!

How is it possible? Speeding up your code isn’t trivial. First you need to find which parts of your code are slow (so-called code profiling). Once you know that, you can start tinkering with the code itself (code optimisation).

Code profiling

Traditionally I’ve been relying on %timeit which reports precise execution time for expressions in Python.

%timeit F3.fit(X)
# 1.25 s ± 792 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

As awesome as %timeit is, it won’t really tell you which parts of your code are time consuming. At least not directly. For that you’ll need something more advanced.

Code profiling became easier thanks to line_profiler. You can install, load and use it in a Jupyter notebook as follows:

# install line_profiler in your system
!pip install line_profiler 
# load the module into current Jupyter notebook
%load_ext line_profiler

# evaluate populate_nodes function of F3.fit program
%lprun -f F3.populate_nodes F3.fit(X)

The example above tells you that although line 134 takes just 11.7 µs per single execution, overall it accounts for 42.5% of the execution time, as it’s executed over 32k times. So starting optimisation of the code from this single line could have a dramatic effect on the overall execution time.

Code optimisation

The first thing I noticed in the original Python code was that, in order to calculate the outlier score, individual samples were streamed through individual trees in the iForest.

        for i in range(len(X_in)):
            h_temp = 0
            for j in range(self.ntrees):
                h_temp += PathFactor(X_in[i],self.Trees[j]).path*1.0            # Compute path length for each point
            Eh = h_temp/self.ntrees                                             # Average of path length travelled by the point in all trees.
            S[i] = 2.0**(-Eh/self.c)                                            # Anomaly Score
        return S

Since those are operations on arrays, a lot of time can be saved if either all samples are processed by individual trees, or individual samples are processed by all trees. Implementing this wasn’t difficult and, combined with cleaning the code of unnecessary variables & classes, resulted in a ~6-7x speed-up.
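To give you the idea, here is a minimal sketch of that change (the names are hypothetical, not the actual eIF API): instead of streaming one sample through one tree at a time, process the whole array through each tree and accumulate the path lengths:

import numpy as np

def anomaly_scores(X, trees, c):
    """Anomaly score per sample, processing all samples through each tree at once."""
    Eh = np.zeros(len(X))
    for tree in trees:
        Eh += tree.path_lengths(X)  # hypothetical vectorised path-length method
    Eh /= len(trees)                # average path length across all trees
    return 2.0 ** (-Eh / c)         # same anomaly-score formula as above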

Speeding up array operations with numba

Further improvements were much more modest and required detailed code profiling. As mentioned above, a single line took 42% of the overall execution time. Upon closer inspection, I realised that calling X.min(axis=0) and X.max(axis=0) was really time-consuming.

x = np.random.random(size=(256, 12))
%timeit x.min(axis=0), x.max(axis=0)
# 15.6 µs ± 43.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Python code can be optimised with numba. For example, calculating the min and max simultaneously using the numba just-in-time compiler results in over 7x faster execution!

import numpy as np
from numba import jit

@jit
def minmax(x):
    """np.min(x, axis=0), np.max(x, axis=0) for 2D array but faster"""
    m, n = len(x), len(x[0])
    mi, ma = np.empty(n), np.empty(n)
    mi[:] = ma[:] = x[0]
    for i in range(1, m):
        for j in range(n):
            if x[i, j]>ma[j]: ma[j] = x[i, j]
            elif x[i, j]<mi[j]: mi[j] = x[i, j]
    return mi, ma

%timeit minmax(x) 
# 2.19 µs ± 4.61 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# make sure the results are the same
np.allclose(minmax(x), (x.min(axis=0), x.max(axis=0)))

Apart from that, there were several other parts that could be optimised with numba. You can have a look at eif_new.py and compare it with the older and C++ versions using this notebook. If you want to know the details, just comment below – I’ll be more than happy to discuss them 🙂

If you’re looking for ways of speeding up array operations, definitely check out numexpr besides numba. The eIF case didn’t really need numexpr optimisations, but it’s a really impressive project and I can imagine many people could benefit from it. So spread the word!
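For illustration, a minimal numexpr sketch (not taken from the eIF code): ne.evaluate() compiles the expression and evaluates it multi-threaded, without allocating full-size temporary arrays for the intermediate results:

import numpy as np
import numexpr as ne

a = np.random.random(10**7)
b = np.random.random(10**7)
# one multi-threaded pass instead of several temporary arrays
res = ne.evaluate("2*a**2 + 3*b + 1")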

How to set up NVIDIA / CUDA for accelerated computation on an Ubuntu 18.04 / Gnome / X11 workstation?

I experienced a bit of difficulty when I tried to enable CUDA on my workstation, mostly system lags while performing CUDA computations. That was because Gnome/Xserver were using the NVIDIA card. I realised you’d be much better off using your integrated graphics card for the system and leaving the NVIDIA GPU only for serious tasks 🙂 Note, this will disable the NVIDIA GPU for GNOME / X11 and also for gaming, so be aware…

Below I’ll describe briefly how I installed the NVIDIA drivers and configured Ubuntu 18.04 with Gnome3 and Xserver for comfortable CUDA computations.

It’s best to install the CUDA toolkit and drivers before you plug in the card, as just plugging the card in may otherwise cause issues with running Ubuntu (it did in my case). In order to install the NVIDIA drivers, just follow the official NVIDIA guide.

Then, after a reboot, plug the card into your computer and select the integrated card as your main card in the BIOS. In my BIOS it was under Advanced > Built-in Device Options > Select Boot card > CPU integrated or Nvidia GPU.

If you experience any problems, uncomment WaylandEnable=false in /etc/gdm3/custom.conf to use X11 for GDM and Gnome. Don’t do that if you plan to use Wayland!
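For reference, the relevant part of /etc/gdm3/custom.conf looks like this once uncommented:

[daemon]
# Uncomment the line below to force the login screen to use Xorg
WaylandEnable=false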

Now make sure your NVIDIA card is plugged in and working.

# show available graphic cards
lspci -k | grep -A 2 -i "VGA"

If you installed the drivers from the NVIDIA website, you may need to restore java:

sudo rm /etc/alternatives/java
jpath=/opt/java/jre1.8.0_211/bin
sudo ln -s $jpath/java /etc/alternatives/java

Make sure to switch to the integrated graphics card using either:

  • nvidia-settings > PRIME Profiles and select Intel (Power Saving Mode) (this should work for both X11 and Wayland)
  • or by editing /etc/X11/xorg.conf to something like this (if you use Wayland, this won’t work!):
Section "Device"
         Identifier "Intel"
         Driver "intel"
 Option "AccelMethod" "uxa"
 EndSection

Reboot your system and make sure Gnome isn’t using NVIDIA GPU (there should be no processes running on your GPU after reboot).

# check processes running on GPU
nvidia-smi

Now, when you run any CUDA computation, your system shouldn’t be affected by high NVIDIA GPU usage.

How to install Java and JavaWS in Linux

Last night I learnt that Oracle changed the Java licensing, which broke all the installation channels typically used in Linux.

Still, you can obtain & install Java and JavaWS manually, free of charge, for personal and development use. This is how to proceed:

  • Download the JRE tarball (jre-8u211 in my case) from the Oracle website and unpack it
cd ~/Download
tar xpfz jre-8u211-linux-x64.tar.gz
  • Move folder to `/opt/java`
sudo mkdir -p /opt/java
sudo mv jre1.8.0_211 /opt/java
  • Remove or rename the symbolic links if they exist
which java
which javaws
ls -la /usr/bin/java*
sudo rm /usr/bin/java*
  • Link new java version
sudo ln -s /opt/java/jre1.8.0_211/bin/java /usr/bin
sudo ln -s /opt/java/jre1.8.0_211/bin/javaws /usr/bin
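You can quickly verify that the new version is picked up:

java -version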

Now everything should work just fine 🙂

Migrate a Google Cloud VM to a new account / project

For a couple of weeks, I’ve been looking for an easy way of migrating a virtual machine from one Google Cloud Platform (GCP) account to another. At first, I wanted to follow an old Medium post, but I found it rather complicated. Therefore, I decided to tinker myself. It turns out you can easily transfer VM images between projects/accounts in three simple steps thanks to the Create image feature, as follows:
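For the command-line inclined, roughly the same three steps can be reproduced with the gcloud CLI (just a sketch; all resource names below are placeholders):

# 1. in the source project: create an image from the VM's boot disk
gcloud compute images create my-image --source-disk=my-vm-disk --source-disk-zone=europe-west1-b

# 2. grant the target account access to the image
gcloud compute images add-iam-policy-binding my-image \
    --member='user:target-account@example.com' --role='roles/compute.imageUser'

# 3. in the target project: create a new VM from that image
gcloud compute instances create my-new-vm --image=my-image --image-project=source-project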

Hope someone will find it useful!

Moving to KDE – is it worth it?

I’m an Ubuntu enthusiast. However, since the introduction of Gnome as the default in Ubuntu, I’ve been experiencing stability issues. I don’t mind rebooting my laptop from time to time, but my workstation is a different story – it often runs for many weeks without a reboot.

After many discussions with my friend, I decided to give KDE a try. I had experimented with KDE years ago and found it not straightforward to use. But apparently, since version 5, it’s possible to customise KDE to look & feel nearly however you like. And I have to admit, I got sucked in after just a few hours. First of all, it’s very stable, quite lightweight and very practical. It’s also pretty – that doesn’t matter much for productivity, but it’s a nice add-on. I fell in love with the drop-down terminal. Setting everything up so that the migration from Gnome was smooth took me a few hours the first time, but it paid off rather quickly, because I’m way more productive than before. That’s how my screen looks, more or less.

If you want to try it, I’d recommend KDE Neon instead of Kubuntu: Neon is developed by the KDE Community, so it’s the purest KDE experience you can get. Below you can find a list of widgets, applications and customisations which made my life easier (again, big thanks to Maciek for helping with the migration!).

Widgets:

  • (Add widgets)
    • system load viewer [set compact view]
    • Global menu
  • (Add widgets > Get new widgets > Download new plasma widgets)
    • event calendar (replace standard clock & calendar)
  • from github
    • https://github.com/jsalatas/plasma-pstate

Applications (installed through Discover)

  • thermal monitor
  • redshift + redshift control
  • latte
  • dropbox
  • netspeed widget

Terminal

  • yakuake (drop-down terminal activated with F12)
  • workrave
sudo apt install yakuake workrave htop
yakuake &
latte-dock &

Tweaks

I’ll try to keep this list up-to-date. Hope someone will find it useful. I did already, while installing Neon on the second and third machines 😉