Bug#852787: gitlab: Doesn't restart services properly on OOM kill
We have a low-memory instance of GitLab; it used to have 2 GB of RAM, now it
has 3 GB.
Sometimes it runs out of memory (usually at 22:xx).
The OOM killer usually kills one of the sidekiq children.
This triggers a restart of sidekiq, but also of the other gitlab services.
However, gitlab-sidekiq is the only one that is started again; the others are
left dead.
Can this somehow be addressed in the systemd service dependencies?
As mentioned by Libor Klepáč, gitlab-sidekiq.service comes back after the OOM
killer strikes because of its "Restart=on-abnormal" directive, but any signal
that terminates the sidekiq process also shuts down gitlab.service, because
gitlab-sidekiq.service is listed as a dependency in the "BindsTo" directive of
gitlab.service.
In our scenario, the Sidekiq MemoryKiller sends a SIGTERM after a 15 min grace
period, or a SIGKILL after an additional 30 sec. Both signals have the same
effect on gitlab.service, which shuts down completely.
A patch that fixes the configuration of gitlab.service and
gitlab-sidekiq.service is attached. We configured systemd not to tear down the
whole gitlab.service when gitlab-sidekiq.service terminates: we removed
gitlab-sidekiq.service from the "BindsTo" directive of gitlab.service and
reference it through the softer "Wants" directive instead.
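For illustration, the dependency change described above might look roughly like
the following fragment of gitlab.service. This is only a sketch and not the
attached patch: the other units named in "BindsTo" and all remaining directives
are left out, and the real unit file may be laid out differently.

```ini
# gitlab.service (fragment only; a sketch, not the attached patch)
[Unit]
# Before: gitlab-sidekiq.service was one of the units named in BindsTo=,
# so gitlab.service was stopped whenever gitlab-sidekiq.service went away.
# After: it is dropped from BindsTo= and pulled in with the weaker Wants=,
# which still starts it together with gitlab.service but does not stop
# gitlab.service when sidekiq terminates.
Wants=gitlab-sidekiq.service
```

After editing unit files, "systemctl daemon-reload" is needed before systemd
picks up the change.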
gitlab-sidekiq.service itself now uses "Restart=always" instead of
"Restart=on-abnormal", so that it is restarted even when it exits cleanly with
code 0 after the SIGTERM from the Sidekiq MemoryKiller.
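As a rough sketch of the restart change (again not the attached patch, and
with every other directive of the unit omitted), the relevant part of
gitlab-sidekiq.service could look like this:

```ini
# gitlab-sidekiq.service (fragment only; a sketch, not the attached patch)
[Service]
# on-abnormal restarts the service only after an unclean signal, a timeout
# or a watchdog event; always also restarts it after a clean exit (code 0).
Restart=always
```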
As recommended by GitLab and used in the Omnibus package, we should use the
Sidekiq MemoryKiller as a measure against the memory leaks (patch attached).
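A possible way to enable the MemoryKiller is sketched below. The variable names
are the ones GitLab documents for the MemoryKiller, to the best of our
knowledge. Passing them via "Environment=" in gitlab-sidekiq.service is an
assumption about how the Debian package could do it, and the memory limit is
only an illustrative value, not a recommendation.

```ini
# gitlab-sidekiq.service (fragment only; hypothetical placement)
[Service]
# A value greater than 0 enables the MemoryKiller; the limit is understood
# to be in kilobytes. Illustrative value, tune it for the machine.
Environment=SIDEKIQ_MEMORY_KILLER_MAX_RSS=1000000
# Grace period in seconds (900 s = the 15 min mentioned above).
Environment=SIDEKIQ_MEMORY_KILLER_GRACE_TIME=900
# Additional wait in seconds before the final signal (the 30 sec above).
Environment=SIDEKIQ_MEMORY_KILLER_SHUTDOWN_WAIT=30
```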