Deploying a Helium ETL instance with API

The Helium blockchain ETL is essentially a copy of the Helium blockchain in a relational database. It lets you query the full blockchain history to build applications such as monitoring or tax management tools.

The Helium blockchain API project connects to the Helium ETL database to provide easy access to this data.

I’m currently building a tool to monitor and alert on a fleet of hotspots (stay tuned), and it currently uses the public Helium API. But the public Helium APIs are somewhat overloaded and not designed to support high-rate querying from every application the community develops. So I decided to run my own instance of the ETL and API for my own purposes. This is the experience I’ll describe in this blog post.

Hardware requirement

The blockchain-ETL stores the entire Helium history in a PostgreSQL database… and that has become a huge amount of data: over 1.5 TB today, and growing fast. During the installation process you need to load the data from a snapshot dump (or you may wait a couple of months to get in sync), so you need at least twice the size of the database in free space.

You also need to store the blockchain itself in a node, which is currently around 0.5 TB and growing day after day.

Currently you should plan for about 6 TB of storage to keep the ETL working, and you should think about a scalable solution. The I/O rate is really high, so an NVMe storage solution is recommended.
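The rough arithmetic behind that estimate, using today’s sizes from above (the exact figures keep moving as the chain grows):

```shell
# Back-of-envelope storage budget, in GB, using today's sizes
db=1500        # ETL PostgreSQL database (~1.5 TB)
chain=500      # blockchain data stored by the node (~0.5 TB)
# During the restore you hold both the dump and the loaded database,
# so budget roughly twice the database size.
need=$(( db * 2 + chain ))
echo "strict minimum: ${need} GB; round up to ~6000 GB for growth headroom"
```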

I tested different setups and I’m still working on this:

Server Type    | Config                      | Storage type                  | Monthly Cost | Avg Block Process Duration
Kimsufi        | 4/8 Xeon E3 @3.7GHz, 32GB   | 3x 2TB SATA 7200 RPM, RAID 0  | 20€          | 1.7 minutes
Hetzner AX101  | 16-core Ryzen, 128GB        | 2x 3.8TB NVMe                 | 115€         | 6 seconds
Bare metal     | 4/8 Xeon E3 @3.7GHz, 64GB   | 2x 512GB NVMe                 | NA           | 25 seconds

Hardware performance to ingest Helium blocks into the ETL

The average time between two blocks generated on the chain is one minute. Any setup with an average block processing duration over a minute will never be able to sync. The maximum acceptable processing time per block is more like 30 seconds to make sure the ETL stays in sync.
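A quick sanity check of the figures in the table above: a node only catches up when its per-block processing time stays below the ~60 s block interval.

```shell
# Compare average block processing times (in seconds) against the
# ~60 s average block interval; anything at or above 60 can never sync.
block_interval=60
for t in 102 25 6; do    # Kimsufi (1.7 min), bare metal, Hetzner AX101
  if [ "$t" -lt "$block_interval" ]; then
    echo "${t} s/block: syncs, with $(( block_interval - t )) s of headroom"
  else
    echo "${t} s/block: falls behind the chain"
  fi
done
```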

This documentation is based on the Kimsufi setup; the same steps can be applied on different hardware.

Software requirement

I’m running this software under Linux, and as Helium usually develops against Debian/Ubuntu-based distributions, I chose Ubuntu to reduce the risk of incompatibility. Erlang installation on recent Ubuntu releases has been a problem, so my final choice was Ubuntu 18.04 LTS. This distribution has been deployed on a 256 GB partition of my first hard drive.

Create a RAID0 striped storage

First we need to create a storage volume that benefits from multiple hard drives. My 3x 2TB setup allows a RAID0 with 3 stripes, so all the drives can be used in parallel to improve performance. I’m reaching 3x 80 MB/s writes with 0.5 ms I/O latency during the database load. Compare this with the 80 MB/s and 6 ms I/O latency I experienced in my first configuration, where the layout was linear (all reads/writes hitting a single disk).

To get this configuration, here is the setup:

# create 3 partitions of 1.7TB on each of the 3 HDD
# all are primary /dev/sdX3 just to simplify reading and have
# the same size
[~root] fdisk /dev/sda   
# loop on sdb and sdc

# Create the Raid 0 logical volume
[~root] pvcreate /dev/sda3 /dev/sdb3 /dev/sdc3
[~root] vgcreate space_vg /dev/sda3 /dev/sdb3 /dev/sdc3
[~root] lvcreate --type raid0 -l 100%FREE --stripes 3 -n space_lv space_vg
[~root] mkfs.ext4 /dev/space_vg/space_lv

# Mount the new volume
[~root] blkid /dev/space_vg/space_lv
/dev/space_vg/space_lv: UUID="e99dceb4-96bd-4b96-86fc-983af2f1f19d" TYPE="ext4"
# Edit /etc/fstab and add a line
# UUID=e99dceb4-96bd-4b96-86fc-983af2f1f19d	/space	ext4	defaults	0	0
# Create directory and mount it
[~root] mkdir /space
[~root] mount /space 

Install Postgresql 12

[~root] wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
[~root] echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list
[~root] apt update
[~root] apt -y install postgresql-12 postgresql-client-12 postgis postgresql-12-postgis-3 postgresql-12-postgis-3-scripts

Make sure the postgresql service is down before modifying the configuration

[~root] systemctl stop postgresql

Then you need to modify database access by editing the file /etc/postgresql/12/main/pg_hba.conf:

# change 
local   all             all            peer

# to 
local   all             all            md5

Then you need to edit the file /etc/postgresql/12/main/postgresql.conf and change the following lines:

data_directory = '/space/var/lib/postgresql/12/main'
external_pid_file = '/space/var/run/postgresql/12-main.pid'
unix_socket_directories = '/space/var/run/postgresql'
shared_buffers = 4GB 
work_mem = 4GB
maintenance_work_mem = 4GB
fsync = off
checkpoint_timeout = 120min
max_wal_size = 2GB
max_connections = 500
stats_temp_directory = '/space/var/run/postgresql/12-main.pg_stat_tmp'
random_page_cost = 1.0
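Note that fsync = off trades crash safety for restore speed: a power loss during operation can corrupt the database. A sketch of how you might turn it back on once the snapshot restore is finished (the sed pattern assumes the line reads exactly "fsync = off"):

```shell
# Re-enable fsync after the initial snapshot load, then reload PostgreSQL
sed -i "s/^fsync = off/fsync = on/" /etc/postgresql/12/main/postgresql.conf
systemctl reload postgresql
```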

Now we need to move some of the PostgreSQL directories to match the configuration and use our RAID0 storage:

[~root] mkdir -p /space/var/lib
[~root] mv /var/lib/postgresql /space/var/lib/
[~root] mkdir -p /space/var/run
[~root] mv /var/run/postgresql /space/var/run/
[~root] ln -s /space/var/run/postgresql /var/run/postgresql

Now we can enable / start postgresql

[~root] systemctl enable postgresql
[~root] systemctl start postgresql

Other Software settings and installation

We need to open port 44158 for peer-to-peer communications:

[~root] ufw allow 44158/tcp

We need to extend the open-file limit:

[~root] ulimit -Sn 65535
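ulimit -Sn only raises the limit for the current shell session. To make it survive reboots you can set it system-wide; a sketch for /etc/security/limits.conf (assuming the ETL runs as root):

```
# /etc/security/limits.conf
root    soft    nofile    65535
root    hard    nofile    65535
```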

We need to install the Erlang 22 environment

[~root] wget https://packages.erlang-solutions.com/erlang-solutions_2.0_all.deb
[~root] dpkg -i erlang-solutions_2.0_all.deb
[~root] apt-get update
[~root] apt install esl-erlang=1: cmake libsodium-dev libssl-dev build-essential

We also need to install rust environment

[~root] curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

And some other useful software: unzip will be needed to extract the snapshots from DeWi, and atop is a handy tool to watch I/O and other stats to make sure everything is progressing.

[~root] apt-get install unzip atop

Install Database from DeWi snapshot

You can install from scratch… but it will take a really long time for you to get in sync. You can do this if you want to verify the whole chain on your own because you don’t trust DeWi (it can make sense), but the best way is to use a snapshot. Loading a snapshot is a long operation: on my system it took 16 hours. I’ve seen a 2x mirrored NVMe-based system do it in 6 hours for the December 2021 snapshot.

So first you need to get the database snapshot from the DeWi repository. Make sure you select the latest dump. This is a zip file, currently about 530 GB:

[~root] cd /space
[~root /space] wget
[~root /space] unzip database_snapshot*
[~root /space] rm database_snapshot*

Create the ETL database

[~root] sudo su - postgres
[~postgres] psql
postgres=# CREATE DATABASE etl;
postgres=# quit
[~postgres] exit

Load the database snapshot

[~root] su - postgres
[~postgres] cd /space/root/export/database
[~postgres database] pg_restore -v -d etl -W -j4 -U postgres -Fd .

Now take a break, as this process is really long (16 hours in my case).

Install the DeWi ETL snapshot

Now we need to install the blockchain-etl service, which will listen to the blockchain and update the database block after block. For this we are going to clone the Helium repository. In my opinion, it is better to check out the latest release tag from the GitHub repository; this is what the git checkout line does:

[~root] mkdir /space/helium ; cd /space/helium
[~root helium] git clone https://github.com/helium/blockchain-etl.git
[~root helium] cd blockchain-etl
[~root blockchain-etl] git checkout X.Y.ZZZ
[~root blockchain-etl] cp .env.template .env

Edit the .env file to match your configuration.

The Google API key is not mandatory. If you decide to geocode locations into your database, you need to consider the related cost: each API call is $0.005 and you get a monthly free credit equivalent to $200. The API call rate is high, so you may end up paying when using this option.
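To put those figures in perspective, the $200 monthly credit covers a fixed number of calls; a quick integer computation (amounts scaled to tenths of a cent to stay in integer arithmetic):

```shell
# $0.005 per geocoding call, $200 free credit per month
cost_per_call=5          # tenths of a cent ($0.005)
monthly_credit=200000    # tenths of a cent ($200)
echo "free calls per month: $(( monthly_credit / cost_per_call ))"
```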

Now you can make the blockchain-etl:

[~root blockchain-etl] make release

Now we need to use the DeWi ETL snapshot data. Let’s start by downloading and unzipping it. Make sure you use the same timestamp as for the database snapshot.

[~root space] wget
[~root space] unzip etl_snapshot*
[~root space] rm etl_snapshot*

Then let’s move the data folder from this snapshot to the blockchain-etl

[~root space] mv /space/root/blockchain-etl/_build/dev/rel/blockchain_etl/data /space/helium/blockchain-etl/_build/dev/rel/blockchain_etl/

We can now free up the space used by the DeWi import. Make sure the database import has finished before doing this, in case you execute these steps in parallel.

[~root space] rm -rf /space/root

Once the database import has finished (not before) and all these steps are done, we are ready to run the ETL and start syncing with the blockchain.

[~root space] cd helium/blockchain-etl
[~root blockchain-etl] make start

You can take a look at the syncing progress

[~root blockchain-etl] tail -f _build/dev/rel/blockchain_etl/log/console.log

To focus on block storage progress

[~root blockchain-etl] tail -f _build/dev/rel/blockchain_etl/log/console.log | grep "Stored block"
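To turn that log into a rough sync-speed figure, you can count "Stored block" lines per minute. A minimal awk sketch, assuming each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp (adjust the substr width if your log format differs):

```shell
# Count stored blocks per minute from the console log
grep "Stored block" _build/dev/rel/blockchain_etl/log/console.log \
  | awk '{ count[substr($0, 1, 16)]++ }
         END { for (minute in count) print minute, count[minute], "blocks" }' \
  | sort
```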

Install the API

On top of the database, we are going to install the Helium API so our applications can query it. The Helium APIs live in the blockchain-http project (the blockchain-api project has been archived).

System prerequisites

Currently I’m doing the tests with a 2 vCPU / 4 GB RAM virtual machine. I’ll later run tests using the same machine as the ETL and let you know. The system is Ubuntu 18.04 LTS.

The GitHub project details the Erlang installation; the versions are a bit different from those of the ETL project, but I’m not sure it is a real problem.

[root ~] apt-get install git ca-certificates curl gnupg lsb-release 

# install Docker
[root ~] curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
[root ~] echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] \
  https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
[root ~] apt-get update
[root ~] apt-get install docker-ce docker-ce-cli

# clone the blockchain-http repo
[root ~] git clone https://github.com/helium/blockchain-http.git
[root ~] cd blockchain-http

Optionally, as I do not want to expose my PostgreSQL port publicly or play with iptables, I created a port-forwarding SSH tunnel between the two servers. It runs in the background and requires the API server’s public key to be deployed in the ETL server’s authorized_keys file. The port is forwarded on IP 172.17.0.1 (the default docker0 IP), which allows the containers to reach the remote PostgreSQL server.

[root ~] ssh -fNTM -L 172.17.0.1:5432:localhost:5432 root@<etl-server>

Run the API service

Now we can build the project as a Docker image, then run it:

[root ...http] docker build -t helium/api .
[root ...http] mkdir ~/api_data
[root ...http] docker run -d --init \
  --restart unless-stopped \
  --publish 8080:8080/tcp \
  --name api \
  --add-host=dockerHost:172.17.0.1 \
  --mount type=bind,source=$HOME/api_data,target=/var/data \
  -e DATABASE_RO_URL=postgresql://etl:<Password>@dockerHost:5432/etl \
  -e DATABASE_RW_URL=postgresql://etl:<Password>@dockerHost:5432/etl \
  helium/api

Expose the API

Now we need to expose the API with an nginx server and protect it:

[root ~] apt-get install nginx certbot python-certbot-nginx apache2-utils

Add the following site configuration in the /etc/nginx/sites-enabled directory:

upstream api {
        server localhost:8080;
}

server {
    listen 80;

    gzip on;
    gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript application/json;

    root /var/www/html;

    location / {
       index index.html index.htm;
    }

    location ~ ^/v1 {
        proxy_pass http://api$request_uri;
        proxy_redirect off;
        proxy_set_header        X-Real-IP $remote_addr;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header        Host $http_host;
        auth_basic "Restricted Content";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}

Run the nginx server and create an SSL certificate:

[root ~] systemctl restart nginx
[root ~] systemctl enable nginx
[root ~] certbot --nginx

Now let’s create a user allowed to access the API with a password

[root ~] htpasswd -c /etc/nginx/.htpasswd apiUser

You can now try the API with a browser…


In case you need to change the API rate limit, you can edit the file config/sys.config:

{throttle, #{
            %% how much request time is allowed
            request_time => 20000, %% change this
            %% how many requests are allowed
            request_count => 100, %% change this
            %% over how long
            request_interval => 60000
}}

Test postgresql connection

[root ~] psql -U etl -h localhost -W etl
