Ubuntu with Gnome extensions for productivity

Time ago I’ve written about KDE configuration. I’ve been using KDE for a while in my personal laptop, but never really got into using KDE in my workstation. Simply I find Gnome much more productive environment in the long-term. (Disclaimer: likely I’m very biased here since I’ve been using Gnome-like desktops for over 10 years now and those seem natural to me. Still I think KDE is fantastic.)

Gnome3 came with extensions. Those are really cool, but be aware that some extensions may brake from release-to-release. Also some may have certain incompatibilities. Below I’m describing briefly which extensions I’m using currently and why (those should be working fine with both, Ubuntu 18.04 and 20.04):

  • Workspace Matrix will arrange your virtual desktops into 2D grid and you can switch easily between rows/columns with Ctrl+Alt+any arrow.
  • PixelSaver (or even better No Title Bar – Forked for Wayland or multi-display configurations) will get rid of window titlebar. That’s very useful for small laptop screens, but I’m also using it with dual monitor at work as titlebar is just waste of space…
  • system-monitor will show details about system usage (CPU, RAM, I/O) right in your system tray.
  • Bing Wallpaper Changer gets really good wallpapers daily. You can read a bit more about each picture here – it’s really great resource if you’re looking for some want-to-go places in your vicinity!

On top of that, definitely try:

  • guake is a drop-down terminal. It’s super useful if you need to get quick access to the terminal across multiple desktops. Funny enough I’ve felt in love with yakuake in KDE first and learnt Gnome version is closer to my ideals 😛
  • glances (I’ve discussed GPU support earlier) or htop for process viewing
  • screen (or even better tmux) for terminal multiplexing. This comes handy especially if you work remotely a lot.
  • workrave will remind you to have a break once in a while. Try it, it’s really healthy!

If you find installation/updates of packages slow, definitely check apt-fast

sudo add-apt-repository ppa:apt-fast/stable 
sudo apt-get update 
sudo apt-get -y install apt-fast

Finally I’d recommend to isolate windows from individual workspaces, both for the dock and app-switcher (this will show only windows from current desktop in the dock and when Alt+TAB / ` are pressed):

gsettings set org.gnome.shell.extensions.dash-to-dock isolate-workspaces true
gsettings set org.gnome.shell.app-switcher current-workspace-only true

Above we’ll result in desktop similar to this

Do you have any recommendations or Gnome-related tricks?

Monitoring GPU usage

If you (like me) happen to be the performance freak, most likely you are well aware of process viewers like htop. Since I’ve started working with GPU-computing I missed htop-like tool tailored to monitor GPU usage. This is becoming more of an issue if you’re working in multi-GPU setups.

You can use `nvidia-smi` which is shipped with NVIDIA drivers, but it’s not very interactive.

gpustat provide nice and interactive view of the processes running and resources used across your GPUs, but you’ll need to switch between windows if you want to also monitor CPU usage.

pip install -U gpustat
gpustat -i

Some time ago I’ve discovered glances – really powerful htop replacement. What’s best about glances (at least for me) is that beside I/O and information from sensors, you can see GPU usage. This is done thanks to py3nvml.

pip install -U glances py3nvml
glances

At first glances window may look a bit overwhelming, but after a few uses you’ll likely fell in love with it!

And what’s your favorite GPU process viewer?

Python code profiling and accelerating your calculations with numba

You wrote up your excellent idea as Python program/module but you are unsatisfied with its performance. The chances are high most of us have been there at least once. I’ve been there last week.

I found excellent method for outlier detection (Enhanced Isolation Forest). eIF was initially written in Python and later optimised in Cython (using C++). C++ is ~40x faster than vanilla Python version, but it lacks the possibility to save the model (which is crucial for my project). Since adding model saving to C++ version is rather complicated buisness, I’ve decided to optimise Python code. Initially I hoped for ~5-10x speed improvement. The final effect surprised me, as rewritten Python code was ~40x faster than initial version matching C++ version performance!

How is it possible? Speeding up your code isn’t trivial. First you need to find which parts of your code are slow (so-called code profiling). Once you know that, you can start tinkering with the code itself (code optimisation).

Code profiling

Traditionally I’ve been relying on %timeit which reports precise execution time for expressions in Python.

%timeit F3.fit(X)
# 1.25 s ± 792 µs per loop (mean ± std. dev. of 7 runs, 1 l oop each)

As awesome as %timeit is, it won’t really tell you which parts of your code are time consuming. At least not directly. For that you’ll need something more advanced.

Code profiling became easier thanks to line_profiler. You can install, load and use it in Jupyter notebook as follows:

# install line_profiler in your system
!pip install line_profiler 
# load the module into current Jupyter notebook
%load_ext line_profiler

# evaluate populate_nodes function of F3.fit program
%lprun -f F3.populate_nodes F3.fit(X)

The example above tells you that although line 134 takes just 11.7 µs per single execution, overall it takes 42.5% of the execution time as it’s executed over 32k times. So starting optimisation of the code from this single line could have dramatic effect on overall execution time.

Code optimisation

First thing I’ve noticed in the original Python code was that in order to calculate outlier score individual samples were streamed through individual trees in the iForest.

        for i in  range(len(X_in)):
            h_temp = 0
            for j in range(self.ntrees):
                h_temp += PathFactor(X_in[i],self.Trees[j]).path*1.0            # Compute path length for each point
            Eh = h_temp/self.ntrees                                             # Average of path length travelled by the point in all trees.
            S[i] = 2.0**(-Eh/self.c)                                            # Anomaly Score
        return S

Since those are operations on arrays, lots of time can be saved if either all samples are processed by individual trees or if individual samples are processed by all trees. Implementing this wasn’t difficult and, combined with cleaning the code from unnecessary variables & classes, resulted in ~6-7x speed-up.

Speeding array operations with numba

Further improvements were much more mild and required detailed code profiling. As mentioned above, single line took 42% overall execution time. Upon closer inspection, I’ve realised that calling X.min(axis=0) and X.max(axis=0) was really time consuming.

x = np.random.random(size=(256, 12))
%timeit x.min(axis=0), x.max(axis=0)
# 15.6 µs ± 43.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Python code can be optimised with numba. For example calculating min and max simultaneously using numba just-in-time compiler results in over 7x faster execution!

from numba import jit

@jit
def minmax(x):
    """np.min(x, axis=0), np.max(x, axis=0) for 2D array but faster"""
    m, n = len(x), len(x[0])
    mi, ma = np.empty(n), np.empty(n)
    mi[:] = ma[:] = x[0]
    for i in range(1, m):
        for j in range(n):
            if x[i, j]>ma[j]: ma[j] = x[i, j]
            elif x[i, j]<mi[j]: mi[j] = x[i, j]
    return mi, ma

%timeit minmax(x) 
# 2.19 µs ± 4.61 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# make sure the results are the same
np.all([minmax(x), (x.min(axis=0), x.max(axis=0))], axis=0)

Apart from that, there have been several other parts that could be optimised with numba. You can have a look at eif_new.py and compare it with older and C++ version using this notebook. If you want to know details, just comment below – I’ll be more than happy to discuss them 🙂

If you’re looking for ways of speeding up array operations, definitely check numexpr beside numba. eIF case didn’t really need numexpr optimisations, but it’s really impressive project and I can imagine many people could benefit from it. So spread the word!

How to setup NVIDIA / CUDA for accelerated computation in Ubuntu 18.04 / Gnome / X11 workstation?

I’ve experienced a bit of difficulties when I’ve tried to enable CUDA in my workstation. Those were mostly related to system lags while I’ve been performing CUDA computations. That was because Gnome/Xserver were using NVIDIA card. I’ve realised you’d be much better of using your discrete graphic card for the system and leaving NVIDIA GPU only for serious tasks 🙂 Note, this will disable NVIDIA GPU for GNOME / X11 and also for gaming, so be aware…

Below I’ll describe briefly how I’ve installed NVIDIA drivers and configured Ubuntu 18.04 with Gnome3 and Xserver for comfortable CUDA computations.

The best if you install CUDA toolking and drivers before you plug the card, as just plugging the card may cause issues with running Ubuntu otherwise (it did in my case). In order to install NVIDIA drivers, just follow official Nvidia guide

Then after reboot plug the card to your computer and in the BIOS select integrated card as your main card. In my BIOS it was under Advanced > Built-in Device Options > Select Boot card > CPU integrated or Nvidia GPU.

If you experience any problems, uncomment WaylandEnable=false in /etc/gdm3/custom.conf to use X11 for GDM and Gnome. Don’t do that, if you plan to use Wayland!

Now make sure you have Nvidia plugged in and working.

# show available graphic cards
lspci -k | grep -A 2 -i "VGA"

If you installed the drivers from NVIDIA website, you may need to restore java

sudo rm /etc/alternatives/java
jpath=/opt/java/jre1.8.0_211/bin
sudo ln -s $jpath/java /etc/alternatives/java

Make sure to switch to integrated graphics card using either

  • nvidia-settings  > PRIME Profiles and select Intel (Power Saving Mode) (this should work for both, X11 and Wayland)
  • or by editing /etc/X11/xorg.conf to something like that (if you use Wayland, this won’t work!)
Section "Device"
         Identifier "Intel"
         Driver "intel"
 Option "AccelMethod" "uxa"
 EndSection

Reboot your system and make sure Gnome isn’t using NVIDIA GPU (there should be no processes running on your GPU after reboot).

# check processed running on GPU
nvidia-smi

Now, when you run any CUDA computation, your system shouldn’t be affected by high NVIDIA GPU usage.

How to install Java and JavaWS in Linux

Last night I’ve learnt Oracle changed the Java licensing which broke all channels of installation typically used in Linux.

Still, you can obtain & install Java and JavaWS free-of-charge manually for personal and developmental use. This is how to proceed:

cd ~/Download
tar xpfz jre-8u211-linux-x64.tar.gz
  • Move folder to `/opt/java`
sudo mkdir -p /opt/java
sudo mv jre1.8.0_211 /opt/java
  • Remove or rename symbolic links if they exists
which java
which javaws
ls -la /usr/bin/java*
sudo rm /usr/bin/java*
  • Link new java version
sudo ln -s /opt/java/jre1.8.0_211/bin/java /usr/bin
sudo ln -s /opt/java/jre1.8.0_211/bin/javaws /usr/bin

Now everything should work just fine 🙂

Migrate google cloud VM to new account / project

For couple of weeks, I’ve been looking for an easy way of migrating virtual machine from one Google Cloud Platform (GCP) account to another. At first, I wanted to follow an old Medium post, but I’ve found it rather complicated. Therefore, I’ve decided to tinker myself. It turns out you can easily transfer VM images between projects/accounts in three simple steps thanks to Create imagefeature as follows:

Hope someone will find it useful!

Moving to KDE – is it worth it?

I’m Ubuntu enthusiast. However, since Gnome introduction as default in Ubuntu, I’ve been experiencing stability issues. I don’t mind to reboot my laptop from time to time, but my workstation is a different story – often many weeks without reboot.

After many discussions with my friend, I’ve decided to give a try to KDE. I’ve been experimenting with KDE years ago and I found it not straightforward to use. But apparently since version 5 it’s possible to customise KDE to look & feel nearly whatever you like. And I have to admit, I got sucked by it after just a few hours. First of all, it’s very stable, quite lightweight and very practical. It’s also pretty – it doesn’t matter that much for productivity, but it’s nice add-on. I felt in love with drop-down terminal. Setting everything so migration from Gnome was smooth took me a few hours for the first time. But it paid off rather quickly, cause I’m way more productive than before. That’s how my screen looks like more or less.

If you want to try it, I’d recommend trying KDE Neon instead of Kubuntu, as Neon is developed by KDE Community, therefore it’s the purest KDE experience you can get. Below, you can find a list a widgets, applications and customisations which made my life easier (again, big thanks to Maciek for helping with the migration!).

Widgets:

  • (Add widgets)
    • system load viewer [set compact view]
    • Global menu
  • (Add widgets > Get new widgets > Download new plasma widgets)
    • event calendar (replace standard clock & calendar)
  • from github
    • https://github.com/jsalatas/plasma-pstate

Applications (installed through Discover)

  • thermal monitor
  • redshift + redshift control
  • latte
  • dropbox
  • netspeed widget

Terminal

  • yakuake (drop-down terminal activated with F12)
  • workrave
sudo apt install yakuake workrave htop
yakuake &
latte-dock &

Tweaks

I’ll try to keep it up-to-date. Hope someone will find it useful. I did already, while installing Neon on the second and third machine 😉

Investigate & reduce the size of Drupal sqlite3 database

Today while performing regular Drupal update and backup, I’ve realised Drupal sqlite3 database sites/default/files/.ht.sqliteis over 440 Mb! I found it peculiar, as our website isn’t storing that much information and the size grew significantly since last time I’ve looked it up couple of months ago. I’ve decided to investigate what’s eating up so much DB space.

Investigate what’s eating up space within your sqlite3 db

There is super useful program called sqlite3_analyzer. This program analyses your database file and reports what’s actually taking your disk space. You can download it from here (download precompiled sqlite3-tools). Note, under Linux you’ll likely need to install 32bit-libraries ie. under Ubuntu/Debian execute

sudo apt install libc6-i386 lib32stdc++6 lib32gcc1 lib32ncurses5 lib32z1  

Once you have the program, simply execute sqlite3_analyzer DB_NAME | less and the program will produce detailed report about your DB space consumption. For me it looked like that:

Can you spot how much space the actual data is taking? Yes, only 4.7% (20k pages). And what’s taking most of the space? Freelist.

Quick googling taught me, that freelist is simply empty space left after deletes or data moving. You may ask, why isn’t it cleaned up later? You see, having entire database with all tables in one file is very handy, but troublesome. Every time given table is edited, the space that is freed isn’t used, but rather marked as freelist. And those regions get cleaned up only when vacuumcommand is issued. This should happen automatically from time-to-time if auto vacuum is enabled. I couldn’t know why isn’t it working by default with Drupal…

Reduce the size of sqlite3 DB file

Nevertheless, I’ve decided to perform vacuummanually. Of course I’ve backed-up the db, just in case (you should always do that!). But sqlite3 .ht.sqlite vacuum returned Error: no such collation sequence: NOCASE_UTF8. At this point, I though maybe simple DB dump and recovery would solve my problem – after all that’s more or less what happens under the hood when you perform vacuum.

sqlite3 .ht.sqlite.bck .dump > db.sql
sqlite3 .ht.sqlite < db.sql

DB recovered after dump was indeed smaller (16 Mb), but it was missing some tables (sqlite3 .ht.sqlite .tables). Interestingly, when I’ve investigated the schema of the missing tables (sqlite3 .ht.sqlite.bck .schema block_content), I’ve realised that all of those contain NOCASE_UTF8 in table schema. I found that really peculiar! After further googling and rather lengthy reading, I’ve realised NOCASE_UTF8 is invalid in sqlite3, but it can be replaced simply with NOCASE.

Replace DB schema directly on sqlite3 db

In the brave (and firstly stupid I though) attempt, I’ve decided just to replace wrong statements directly on the DB file using sed (sed 's/NOCASE_UTF8/NOCASE/g' .ht.sqlite.bck > .ht.sqlite). As expected, the database file got corrupted. This is because all tables location are stored internally in the same file, so truncating some text from the DB file isn’t the wisest idea as I’ve expected. Then, I’ve decided to replace NOCASE_UTF8, but keeping the same size of the statement after replacement using white spaces. To my surprise it worked & allowed me to reduce the size of DB from 440 to 30 Mb 🙂

sed 's/NOCASE_UTF8/NOCASE     /g' .ht.sqlite.bck > .ht.sqlite
sqlite3 .ht.sqlite vacuum
-rw-rw-r--  1 lpryszcz www-data  32638976 Feb 28 13:57 .ht.sqlite
-rw-rw-r-- 1 lpryszcz www-data 451850240 Feb 28 13:45 .ht.sqlite.bck

Finally, to make sure, that there is no data missing between old and new, reduced DB, you can use sqldiff .ht.sqlite .ht.sqlite.bck. It’ll simply report all SQL command that will transform one DB into another and nothing if DB contain identical information.

Hopefully replacing NOCASE_UTF8 with NOCASE will allow auto vacuum to proceed as expected on the Drupal DB in the future!

EDIT: The db failed after update to drupal v8.7.6

Lately, I’ve updated drupal and discovered this morning the drupal db file to be corrupted Error: no such collation sequence: NOCASE_UTF8. This is because in the latest update, drupal rebuilt table definitions and NOCASE_UTF8 came back which causes sqlite vacuum crashing again. The solution is very simple, just recover your db from backup and remove replace NOCASE_UTF8 with NOCASE .

sed -i.bck 's/NOCASE_UTF8/NOCASE     /g' .ht.sqlite

Create book of abstracts from spreadsheet / google forms

Lately a friend of mine complained about interoperability of abstract submissions from numerous applicants. Having the Book of Abstracts is crucial and we faced similar problem organising #NGSchool events.

Note, you’ll need to be somewhat familiar with LaTeX in order to edit the main.tex file to your liking. If you are not afraid of that, the way to proceed is as follows:

  1. Create google form to collect necessary info, such at this one
  2. Create a new spreadsheet to accumulate responses: Responses > Create new spreadsheet
  3. Download responses spreadsheet as Abstracts.xlsx
  4. Clone abstracts repository
  5. git clone https://github.com/lpryszcz/abstracts.git
    cd abstracts
    # install dependencies
    sudo apt install texlive-base texlive-latex-recommended texlive-fonts-recommended texlive-latex-extra make
    
  6. Edit main.tex to your liking
  7. Copy Abstracts.xlsx to the repository
  8. Create pdf
  9. # prepare abstracts.tex
    ./xls2tex.py
    
    # create main.pdf
    make all
    
    # in the case of problems, just run again this point, but first remove the clutter
    rm main.{aux,blg,log,out,toc,pdf}
    

    You’ll find the abstract book in main.pdf.