Introduction to systemd

Turn your applications into managed and monitored services without the need for external applications.

Introduction

systemd is a powerful (though controversial) suite used to manage and configure Linux systems. Its most widespread use is as the init system: the component that starts processes and services at boot and keeps them under control. Although systemd is capable of managing much more, in this article we will see how to use it to create, start, stop, restart and generally keep our own services healthy and monitored.

We are going to use an application created in Node.js as an example. We could use one of the many process managers installed via npm, such as pm2, forever or supervisor, but these tools are tied to Node.js and their future depends on how well those libraries are maintained. systemd, on the other hand, is a very robust suite that comes as standard in many Linux distributions, and knowing how it works opens up new possibilities.

Another problem is that after rebooting the machine we would need a service or task to run our Node.js-based process manager again. So, we are going to do without these types of tools, although we could combine them and get the best of both worlds.

First we will see how to create units, then how to manage them, and finally how to monitor them.

systemd

systemd works with two main concepts: units and targets.

Units

Units are components (such as services) that should work as independent pieces of software. In our case, a unit would be, for example, MySQL, Redis or our Node.js application.

Unless your Linux distribution is very unusual, you will store your units in the path /etc/systemd/system with a name like app.service. For user services they can also be stored in /home/$USER/.config/systemd/user.

A unit is composed of sections, such as [Unit], [Service] or [Install]. These sections contain directives and here we will see a few of them needed for our task. If you want to see all available sections or their directives in detail, see the official documentation.

In the case of [Unit], we will find metadata that defines the unit, such as Description to add a text description, or Requires, Wants, Before, After, BindsTo or Conflicts that serve to relate our unit with other units.

For example, if our Node.js unit declares Requires=redis.service, systemd will start the Redis unit whenever our unit starts. Note that Requires expresses dependency but not ordering: to guarantee that Redis is started before Node.js, it should be combined with After=redis.service.
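For instance, a unit depending on Redis might declare the following (assuming a redis.service unit exists on the machine):

```ini
[Unit]
Description=Node HTTP service
# Start redis.service whenever this unit starts;
# if Redis fails to start, this unit fails too.
Requires=redis.service
# Ordering: without After=, both units would be started in parallel.
After=redis.service
```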

The [Install] section is in charge of interacting with the targets, something we will see later.

Finally, in [Service] we will indicate the operation of our unit. A few useful directives are:

  • Flow control directives: ExecStart, ExecStartPre, ExecStartPost, ExecReload, ExecStop and ExecStopPost.
  • Failure control directives: Restart (defines the restart policy; its value can be always, on-success, on-failure, on-abnormal, on-abort or on-watchdog), RestartSec (how long to wait before restarting the unit after a failure) and TimeoutSec (how long to wait for the unit to start or stop before considering it failed).
  • Other directives: for example, Environment passes environment variables to our application (we can repeat it as many times as we want), while User and Group set which user and group the process runs as. PIDFile is also useful when we want to associate a pid file with our service.
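As an illustrative sketch (the node user/group and the 5- and 30-second values are arbitrary choices, not requirements), a [Service] section combining several of these directives could look like this:

```ini
[Service]
# Environment variables passed to the process (repeatable)
Environment="NODE_ENV=production"
Environment="MY_PORT=3000"
# Run as an unprivileged user and group
User=node
Group=node
ExecStart=/usr/bin/node /srv/http/app/index.js
# Restart on failure, waiting 5 seconds between attempts
Restart=on-failure
RestartSec=5
# Consider startup/shutdown failed after 30 seconds
TimeoutSec=30
```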

You can check the documentation about these sections as well as their directives in Unit/Install and Service.

Let's get down to business. Our Node.js based service will be defined in a file called app.service with the following content:

app.service
[Unit]
Description=Node HTTP service

[Service]
Environment="MY_PORT=3000"
ExecStart=/usr/bin/node /srv/http/app/index.js
Restart=on-failure

[Install]
WantedBy=multi-user.target

Simple, isn't it?

We could use more directives like RestartSec but in principle we are going to leave those directives with their default values.

As for the flow control directives (such as ExecStart), there is an option that comes in handy very often. Changing ExecStart=/usr/bin/node /srv/http/app/index.js to ExecStart=-/usr/bin/node /srv/http/app/index.js (i.e. prefixing the command with a hyphen) makes systemd ignore a non-zero exit status from the command: the unit will not be considered failed, so there is no restart, no failure report in the logs, and so on. This is especially useful for optional commands whose outcome we do not care about, since any error is silently ignored.
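A sketch of the idea, using a hypothetical optional clean-up command:

```ini
[Service]
# The leading "-" means a non-zero exit status here is ignored:
# if the cache cannot be removed, the unit still starts normally.
ExecStartPre=-/usr/bin/rm -rf /srv/http/app/cache
ExecStart=/usr/bin/node /srv/http/app/index.js
```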

If you are wondering why we have not added After=network.target, it is because the multi-user target already depends on the network connection. Later we will see what targets are and how to enable our service in one (in this case multi-user) so that it is in charge of starting our process automatically after starting/rebooting the system.

Templates

Although we have not used templates for our unit, it is worth mentioning this functionality. As you can guess, its use is perfect for clustering components.

To create a template we will name the file app@.service (adding an at sign). The instances of the template will then be named app@1.service, app@2.service, etc.

The part after the at sign (the instance name) is then available inside the unit through the %i specifier. We could define an environment variable like Environment=LISTEN_PORT=300%i, so that our application receives that variable and we can run several instances of it under different ports (3001, 3002, etc.).
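Putting it together, a template app@.service could look like this sketch:

```ini
[Unit]
Description=Node HTTP service (instance %i)

[Service]
# For app@1.service %i is "1", so this instance listens on port 3001
Environment=LISTEN_PORT=300%i
ExecStart=/usr/bin/node /srv/http/app/index.js
Restart=on-failure

[Install]
WantedBy=multi-user.target
```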

Targets

Targets are used to group units. They can be compared to runlevels in other init systems, although unlike runlevels, a unit can belong to several targets at the same time, and a target can even group other targets. Let's look at some examples to understand this better.

A simple case could be that we have a service that starts with the system and needs a graphical interface to work. In that case, this unit would be part of the target called graphical.target.

Another example would be a service that is in charge of playing music from some online radio and is started at system startup. It makes sense that this unit depends on sound.target and network.target, doesn't it?

In the case of targets grouping targets, the clearest case is multi-user, a target that when executed indicates to our system that it is ready to accept logins from system users. This target depends directly and indirectly on others, such as systemd-networkd.target, swap.target or getty.target, so if we have the multi-user target available, it will be because the others have been successfully started.

When we install a unit in a target what is actually created is a symbolic link (symlink). These links are found in the path /etc/systemd/system/*.target.wants/ (for example /etc/systemd/system/multi-user.target.wants/ in the case of multi-user.target). In the case of using the user path it would be /home/$USER/.config/systemd/user/*.target.wants/.

systemctl

Right about now you may be wondering how to start a unit, enable it, and so on. Welcome to systemctl.

systemctl is a command to manage and control the operation of systemd. In this section we will look at a few general but useful uses.

To be, or not to be

systemctl manages system-wide units, so it normally needs administrator (root) permissions, unless we use the --user parameter, which is useful for managing services that run under our own user.

Start and stop

To start a unit:

systemctl --user start app.service

To stop a unit:

systemctl --user stop app.service

Restart and reload

To restart a unit:

systemctl --user restart app.service

If our application is able to reload its configuration without restarting we can use:

systemctl --user reload app.service

And if we are not sure about it:

systemctl --user reload-or-restart app.service
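For reload to work, the unit needs an ExecReload directive and the application must know how to reload itself. A common pattern is to send a signal; assuming we add ExecReload=/bin/kill -HUP $MAINPID to the [Service] section, the Node.js side could be sketched as:

```javascript
// Track reloads so the rest of the app can react to them
let reloads = 0;

process.on('SIGHUP', () => {
  // Re-read configuration files here without restarting the process
  reloads += 1;
  console.log(`Configuration reloaded (${reloads})`);
});
```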

Enable and disable

In order for our unit to start at system boot (or rather, when the target associated with our unit starts), we must enable the unit using the following command:

systemctl --user enable app.service

It doesn't work!

Enabling a unit does not make it start right now. For that we must also run systemctl --user start app.service, or simply combine both steps with systemctl --user enable --now app.service.

To disable it:

systemctl --user disable app.service

Remember that what this does is to create or delete a symbolic link.

Status

To check the status of our application, we will use:

systemctl --user status app.service
● app.service - Node HTTP service
   Loaded: loaded (/home/$USER/.config/systemd/user/app.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2016-10-06 18:58:00 CEST; 5s ago
 Main PID: 30453 (node)
    Tasks: 6 (limit: 4915)
   Memory: 8.4M
      CPU: 77ms
   CGroup: /user.slice/user-1000.slice/user@1000.service/app.service
           └─30453 /usr/bin/node /srv/http/app/index.js

Oct 06 18:58:00 earth systemd[1]: Started Node HTTP service.
Oct 06 18:58:00 earth node[30453]: Server running at http://127.0.0.1:3000/

We can also check the status of our unit in a more direct way using several commands:

systemctl --user is-active app.service
systemctl --user is-enabled app.service
systemctl --user is-failed app.service

Masking and unmasking

If we need to prevent our unit from being started in any way (either automatically or manually), we can mask it:

systemctl --user mask app.service

This creates a symbolic link from our unit to /dev/null, so it cannot be started.

And to unmask it:

systemctl --user unmask app.service

View, edit and delete units

These operations can be performed separately or via systemctl. To view the contents of a unit, systemctl provides the following command:

systemctl --user cat app.service

And if we want to access lower-level information we do it through:

systemctl --user show app.service

To edit a unit we have edit, although it does not work as you might expect. When we edit a unit we are actually creating a drop-in: a folder associated with the unit, called /etc/systemd/system/app.service.d/ (or /home/$USER/.config/systemd/user/app.service.d/), is created, and inside it a file that applies changes on top of the unit. In this file we can override directives as we wish, add new ones, or reset directives to their initial state.

systemctl --user edit app.service
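As an illustration, systemctl edit could generate a drop-in like this hypothetical override.conf, which changes the port and the restart policy while leaving the rest of the unit untouched:

```ini
# /home/$USER/.config/systemd/user/app.service.d/override.conf
[Service]
Environment="MY_PORT=4000"
Restart=always
```

Note that list-type directives such as ExecStart must first be reset with an empty assignment (ExecStart=) in the drop-in before a new value is given.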

If we want to edit the unit without using drop-ins, we do it as follows:

systemctl --user edit --full app.service

We can delete both drop-ins and the entire unit using the rm command.

If we have made modifications without using systemctl edit, or have deleted something using rm, we must inform systemd of our changes by running this command:

systemctl --user daemon-reload

Instances

If we use templates, we can operate on individual instances as follows:

systemctl --user start app@1.service
systemctl --user start app@2.service

We can also perform operations on several units at once using brace expansion (this depends on our shell):

systemctl --user start app@{1,2,3,4,5}.service
systemctl --user start app@{1..5}.service

Targets

Of course systemctl provides us with commands to perform operations on targets, such as changing the system's default target, setting the machine to a specific target, etc. In our case, we are only going to look at the commands to query the list of targets we have available on our machine.

Summarized version:

systemctl --user list-unit-files --type=target

Detailed version:

systemctl --user list-units --type=target

Using --type we can filter units of other types.

Dependencies

To check the list of dependencies of both our units and targets we will use:

systemctl --user list-dependencies app.service
systemctl --user list-dependencies multi-user.target

journalctl

We already know how to create, start, stop and generally manage our applications using systemd, but this complete suite still has something powerful to offer: journalctl.

journalctl is a command to view the logs of our units (or of the system in general). Let's take a look at a few useful commands to make our day-to-day life easier.

If we simply execute journalctl we will see the logs of our entire system from the beginning of time. The first argument that comes in handy is --utc, which, as you might have guessed, shows the logs with dates and times in UTC.

Root or not root

Remember that reading the full system journal also requires root permissions (or membership in the systemd-journal group). If you are going to monitor system services use sudo, or the --user parameter if you only want to query the logs of a service running under your user.

Filter by machine startup

journalctl is able to segment our logs by system startup. To view the logs since the last machine reboot we will use:

journalctl -b

If we want to see the logs from the previous boot, we will use:

journalctl -b -1

Then it would go -2, -3... and so on. If we want to query how many boots there have been, as well as their position, identifier and date range, we will use:

journalctl --list-boots
-2 ae4450adc26e47c69f943bf54c1ec488 Sun 2016-08-28 08:43:36 CEST—Sat 2016-09-10 23:29:04 CEST
-1 16e0c81dff134320920dc07822ddc4b3 Sat 2016-09-10 23:29:38 CEST—Wed 2016-09-21 18:54:58 CEST
 0 31c4459b64c1449184700bd6cccb09aa Wed 2016-09-21 19:17:47 CEST—Thu 2016-10-06 19:55:08 CEST

Filter by date

To filter by date we will use the --since and --until arguments, specifying the value in YYYY-MM-DD HH:MM:SS format. Example:

journalctl --since "2016-10-01" --until "2016-10-07 01:00"

Skip data

We can omit parts such as the seconds, or the whole time, and they will default to 00.

It also recognizes other relative formats such as yesterday, today or now:

journalctl --since yesterday --until "2 hours ago"

Filter by unit

If we only want to see the records of a particular unit, we will use the -u argument (for unit):

journalctl -u app.service

Other arguments

It is worth noting other arguments, such as -o, which sets the output format and accepts values like json or json-pretty, among others. Example:

journalctl -o json
journalctl -o json-pretty

The -n argument will show us the last N records, 10 by default if we do not specify a number.

journalctl -n
journalctl -n 20
journalctl -n 100

And last but not least, the -f argument follows the log in real time, just as tail -f would:

journalctl -f

Conclusion

We have seen how systemd works in broad strokes: how to create units, how to manage them using systemctl, and how journalctl becomes our powerful ally to monitor and debug the status of our processes and services.

Of course, the potential of this entire suite would be enough to fill a bible. Here we have seen, briefly, the basics needed to start using systemd for our services, avoiding specific tools that are only useful in specific cases.

You can support me so that I can dedicate even more time to writing articles and have resources to create new projects. Thank you!