Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Francois Marier-3
Package: lxc
Version: 1:2.0.5-1
Severity: normal

Since I upgraded to a 4.8.0 kernel, I get these messages in my logs:

  Nov  5 18:02:08 hostname PAM-CGFS[6968]: Corrupt /proc/self/cgroup
  Nov  5 18:05:01 hostname PAM-CGFS[8574]: Corrupt /proc/self/cgroup
  Nov  5 18:05:01 hostname PAM-CGFS[8574]: Failed to get list of controllers
  Nov  5 18:05:01 hostname PAM-CGFS[8574]: Corrupt /proc/self/cgroup
  Nov  5 18:10:01 hostname PAM-CGFS[8612]: Corrupt /proc/self/cgroup
  Nov  5 18:10:01 hostname PAM-CGFS[8612]: Failed to get list of controllers

It seems to come from here:

  https://github.com/lxc/lxcfs/blob/master/pam/pam_cgfs.c#L384

Francois

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.8.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=fr_CA.utf8, LC_CTYPE=fr_CA.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages lxc depends on:
ii  init-system-helpers  1.45
ii  libapparmor1         2.10.95-5
ii  libc6                2.24-5
ii  libcap2              1:2.25-1
ii  liblxc1              1:2.0.5-1
ii  libseccomp2          2.3.1-2
ii  libselinux1          2.6-1
ii  lsb-base             9.20161101
ii  python3              3.5.1-4
pn  python3:any          <none>

Versions of packages lxc recommends:
ii  bridge-utils  1.5-9
ii  debootstrap   1.0.86
ii  dirmngr       2.1.15-8
ii  dnsmasq-base  2.76-4
ii  gnupg         2.1.15-8
ii  iptables      1.6.0-4
ii  libpam-cgfs   2.0.4-1
ii  lxcfs         2.0.4-1
ii  openssl       1.1.0b-2
ii  rsync         3.1.2-1
ii  uidmap        1:4.2-3.2

Versions of packages lxc suggests:
ii  apparmor     2.10.95-5
pn  btrfs-tools  <none>
pn  lua5.2       <none>
pn  lvm2         <none>

-- Configuration Files:
/etc/lxc/default.conf changed:
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = virbr0
lxc.network.hwaddr = 00:FF:AA:xx:xx:xx
lxc.network.ipv4 = 0.0.0.0/24


-- no debconf information

Bug#843450: [pkg-lxc-devel] Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Evgeni Golov-2
control: reassign -1 libpam-cgfs

Hi Francois,

On Sun, Nov 06, 2016 at 10:00:35AM -0800, Francois Marier wrote:
> Since I upgraded to a 4.8.0 kernel, I get these messages in my logs:
>
>   Nov  5 18:02:08 hostname PAM-CGFS[6968]: Corrupt /proc/self/cgroup
>   Nov  5 18:05:01 hostname PAM-CGFS[8574]: Corrupt /proc/self/cgroup
>   Nov  5 18:05:01 hostname PAM-CGFS[8574]: Failed to get list of controllers
>   Nov  5 18:05:01 hostname PAM-CGFS[8574]: Corrupt /proc/self/cgroup
>   Nov  5 18:10:01 hostname PAM-CGFS[8612]: Corrupt /proc/self/cgroup
>   Nov  5 18:10:01 hostname PAM-CGFS[8612]: Failed to get list of controllers

Can you please post the outputs of
 cat /proc/1/mountinfo
 cat /proc/self/cgroup

I suspect that with kernel 4.8 systemd mounts the cgroup2 fs instead of
the old cgroup fs, and this breaks libpam-cgfs :(

See also https://github.com/lxc/lxc/issues/1280
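
A rough way to run the check asked for above is to look for mounts of type cgroup2 in the mountinfo output. The following is a minimal Python sketch and not what libpam-cgfs itself does: the function name cgroup2_mounts is hypothetical, and it relies only on the documented mountinfo layout, where the mount point is the fifth field and the filesystem type is the first field after the " - " separator. The sample lines are taken from the output posted later in this thread.

```python
SAMPLE_MOUNTINFO = """\
26 16 0:22 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755
27 26 0:23 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:10 - cgroup2 cgroup rw
33 26 0:29 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,memory
"""


def cgroup2_mounts(mountinfo_text):
    """Return the mount points whose filesystem type is cgroup2."""
    mounts = []
    for line in mountinfo_text.splitlines():
        # Fields before " - " describe the mount; fields after it
        # describe the filesystem (type, source, super options).
        head, sep, tail = line.partition(" - ")
        if not sep:
            continue  # not a well-formed mountinfo line
        head_fields = head.split()
        tail_fields = tail.split()
        if len(head_fields) < 5 or not tail_fields:
            continue
        if tail_fields[0] == "cgroup2":
            mounts.append(head_fields[4])  # fifth field is the mount point
    return mounts


print(cgroup2_mounts(SAMPLE_MOUNTINFO))
```

To check a live system, pass the contents of /proc/1/mountinfo instead of the sample text.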

Bug#843450: [pkg-lxc-devel] Bug#843450: Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Evgeni Golov-2
On Mon, Nov 07, 2016 at 08:17:09AM +0100, Evgeni Golov wrote:

> control: reassign -1 libpam-cgfs
>
> Hi Francois,
>
> On Sun, Nov 06, 2016 at 10:00:35AM -0800, Francois Marier wrote:
> > Since I upgraded to a 4.8.0 kernel, I get these messages in my logs:
> >
> >   Nov  5 18:02:08 hostname PAM-CGFS[6968]: Corrupt /proc/self/cgroup
> >   Nov  5 18:05:01 hostname PAM-CGFS[8574]: Corrupt /proc/self/cgroup
> >   Nov  5 18:05:01 hostname PAM-CGFS[8574]: Failed to get list of controllers
> >   Nov  5 18:05:01 hostname PAM-CGFS[8574]: Corrupt /proc/self/cgroup
> >   Nov  5 18:10:01 hostname PAM-CGFS[8612]: Corrupt /proc/self/cgroup
> >   Nov  5 18:10:01 hostname PAM-CGFS[8612]: Failed to get list of controllers
>
> Can you please post the outputs of
>  cat /proc/1/mountinfo
>  cat /proc/self/cgroup
>
> I suspect that with kernel 4.8 systemd mounts the cgroup2 fs instead of
> the old cgroup fs, and this breaks libpam-cgfs :(
>
> See also https://github.com/lxc/lxc/issues/1280

Can you please try booting with
 systemd.legacy_systemd_cgroup_controller
and see if that fixes the issue for you for now?
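
After rebooting, one way to confirm the parameter actually reached the kernel is to look for it in /proc/cmdline. This is a minimal Python sketch under stated assumptions: the helper name has_legacy_controller_flag is hypothetical, and it only checks that the parameter is present (bare or with a value); it does not interpret the value.

```python
PARAM = "systemd.legacy_systemd_cgroup_controller"


def has_legacy_controller_flag(cmdline):
    """Return True if PARAM appears on the command line, bare or as PARAM=value."""
    return any(
        token == PARAM or token.startswith(PARAM + "=")
        for token in cmdline.split()
    )


# Illustrative command line; on a real system, read /proc/cmdline instead.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet " + PARAM
print(has_legacy_controller_flag(example))
```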

Bug#843450: [pkg-lxc-devel] Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Francois Marier-3
In reply to this post by Evgeni Golov-2
On 2016-11-07 at 08:17:09, Evgeni Golov wrote:
> Can you please post the outputs of
>  cat /proc/1/mountinfo

# cat mountinfo | grep cgroup
26 16 0:22 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755
27 26 0:23 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:10 - cgroup2 cgroup rw
29 26 0:25 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,net_cls,net_prio
30 26 0:26 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,perf_event
31 26 0:27 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,blkio
32 26 0:28 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,cpu,cpuacct
33 26 0:29 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,memory
34 26 0:30 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,pids
35 26 0:31 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,devices
36 26 0:32 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:20 - cgroup cgroup rw,cpuset
37 26 0:33 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,freezer

>  cat /proc/self/cgroup

# cat /proc/self/cgroup
9:freezer:/
8:cpuset:/
7:devices:/user.slice
6:pids:/user.slice/user-1000.slice/[hidden email]
5:memory:/user.slice
4:cpu,cpuacct:/user.slice
3:blkio:/user.slice
2:perf_event:/
1:net_cls,net_prio:/
0::/user.slice/user-1000.slice/[hidden email]/gnome-terminal-server.service

> I suspect that with kernel 4.8 systemd mounts the cgroup2 fs instead of
> the old cgroup fs, and this breaks libpam-cgfs :(
>
> See also https://github.com/lxc/lxc/issues/1280

Indeed that looks exactly like it.

Francois

--
https://fmarier.org/
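
Each line of /proc/self/cgroup has the form hierarchy-ID:controller-list:path, and the cgroup v2 hierarchy shows up as an entry with ID 0 and an empty controller list ("0::/..."). A parser that expects every line to name at least one controller could plausibly treat such a file as corrupt; that is an assumption about the failure mode, not a reading of the pam_cgfs source. The Python sketch below, with hypothetical function names, shows a parse that tolerates the empty field. The sample lines come from output posted elsewhere in this thread.

```python
SAMPLE_CGROUP = """\
4:cpu,cpuacct:/user.slice
2:freezer:/
1:name=systemd:/user.slice/user-1000.slice/session-1.scope
0::/user.slice/user-1000.slice/session-1.scope
"""


def parse_cgroup_file(text):
    """Parse /proc/<pid>/cgroup text into (hierarchy_id, controllers, path) tuples."""
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        # Split on the first two colons only, so a ':' inside the path is kept.
        hierarchy_id, controllers, path = line.split(":", 2)
        controller_list = controllers.split(",") if controllers else []
        entries.append((int(hierarchy_id), controller_list, path))
    return entries


def unified_entries(entries):
    """Return the entries with an empty controller list (the cgroup v2 hierarchy)."""
    return [entry for entry in entries if not entry[1]]


print(unified_entries(parse_cgroup_file(SAMPLE_CGROUP)))
```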

Bug#843450: [pkg-lxc-devel] Bug#843450: Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Francois Marier-3
In reply to this post by Evgeni Golov-2
On 2016-11-07 at 13:00:31, Evgeni Golov wrote:
> Can you please try booting with
>  systemd.legacy_systemd_cgroup_controller
> and see if that fixes the issue for you for now?

It seems to work fine. Thanks for the pointer!

Should there be a versioned conflict on systemd or the kernel? Maybe a note
in NEWS.Debian?

Francois

--
https://fmarier.org/

Bug#843450: [pkg-lxc-devel] Bug#843450: Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Evgeni Golov-2
In reply to this post by Francois Marier-3
Hey,

On Mon, Nov 07, 2016 at 08:55:21AM -0800, Francois Marier wrote:

> On 2016-11-07 at 08:17:09, Evgeni Golov wrote:
> > Can you please post the outputs of
> >  cat /proc/1/mountinfo
>
> # cat mountinfo | grep cgroup
> 26 16 0:22 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755
> 27 26 0:23 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:10 - cgroup2 cgroup rw
> 29 26 0:25 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,net_cls,net_prio
> 30 26 0:26 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,perf_event
> 31 26 0:27 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,blkio
> 32 26 0:28 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,cpu,cpuacct
> 33 26 0:29 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,memory
> 34 26 0:30 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,pids
> 35 26 0:31 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,devices
> 36 26 0:32 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:20 - cgroup cgroup rw,cpuset
> 37 26 0:33 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,freezer
>
> >  cat /proc/self/cgroup
>
> # cat /proc/self/cgroup
> 9:freezer:/
> 8:cpuset:/
> 7:devices:/user.slice
> 6:pids:/user.slice/user-1000.slice/[hidden email]
> 5:memory:/user.slice
> 4:cpu,cpuacct:/user.slice
> 3:blkio:/user.slice
> 2:perf_event:/
> 1:net_cls,net_prio:/
> 0::/user.slice/user-1000.slice/[hidden email]/gnome-terminal-server.service
>
> > I suspect that with kernel 4.8 systemd mounts the cgroup2 fs instead of
> > the old cgroup fs, and this breaks libpam-cgfs :(
> >
> > See also https://github.com/lxc/lxc/issues/1280
>
> Indeed that looks exactly like it.

Can you please try systemd 232-3?
You should not need to pass systemd.legacy_systemd_cgroup_controller anymore as the offending change was reverted.

Greets
Evgeni

Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Kevin Locke
On Sat, 2016-11-12 at 10:30 +0100, Evgeni Golov wrote:
> Can you please try systemd 232-3?
> You should not need to pass systemd.legacy_systemd_cgroup_controller anymore as the offending change was reverted.

I am observing the "PAM-CGFS: Failed to get list of controllers" error
message as well, but not "PAM-CGFS: Corrupt /proc/self/cgroup", on a
system running systemd 233-10, libpam-cgfs 2.0.7-1, and kernel 4.12.

Adding systemd.legacy_systemd_cgroup_controller to the kernel command
line does avoid the error messages.

For reference, I have attached the relevant portion of
/proc/1/mountinfo and /proc/self/cgroup as previously requested, both
booting with and without systemd.legacy_systemd_cgroup_controller.

--
Cheers,      |  [hidden email]    | XMPP: [hidden email]
Kevin        |  https://kevinlocke.name  | IRC:   kevinoid on freenode

26 17 0:22 / /sys/fs/cgroup rw shared:9 - tmpfs tmpfs rw,mode=755
27 26 0:23 / /sys/fs/cgroup/unified rw,nosuid,nodev,noexec,relatime shared:10 - cgroup2 cgroup rw
28 26 0:24 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,xattr,name=systemd
30 26 0:26 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,freezer
31 26 0:27 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,blkio
32 26 0:28 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,cpu,cpuacct
33 26 0:29 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpuset
34 26 0:30 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,net_cls,net_prio
35 26 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,memory
36 26 0:32 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:20 - cgroup cgroup rw,devices
37 26 0:33 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,pids
38 26 0:34 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:22 - cgroup cgroup rw,perf_event

10:perf_event:/
9:pids:/user.slice/user-1000.slice/session-1.scope
8:devices:/user.slice
7:memory:/user.slice
6:net_cls,net_prio:/
5:cpuset:/
4:cpu,cpuacct:/user.slice
3:blkio:/user.slice
2:freezer:/
1:name=systemd:/user.slice/user-1000.slice/session-1.scope
0::/user.slice/user-1000.slice/session-1.scope

26 17 0:22 / /sys/fs/cgroup rw shared:9 - tmpfs tmpfs rw,mode=755
27 26 0:23 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:10 - cgroup cgroup rw,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd
29 26 0:25 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,net_cls,net_prio
30 26 0:26 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,cpu,cpuacct
31 26 0:27 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,cpuset,clone_children
32 26 0:28 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,perf_event
33 26 0:29 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,memory
34 26 0:30 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,pids
35 26 0:31 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,blkio
36 26 0:32 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:20 - cgroup cgroup rw,freezer
37 26 0:33 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,devices

10:devices:/user.slice
9:freezer:/user/kevin/0
8:blkio:/user.slice
7:pids:/user.slice/user-1000.slice/session-1.scope
6:memory:/user/kevin/0
5:perf_event:/
4:cpuset:/
3:cpu,cpuacct:/user.slice
2:net_cls,net_prio:/
1:name=systemd:/user.slice/user-1000.slice/session-1.scope

Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Evgeni Golov-2
Hi,

On Mon, Jul 10, 2017 at 02:25:04PM -0600, Kevin Locke wrote:

> On Sat, 2016-11-12 at 10:30 +0100, Evgeni Golov wrote:
> > Can you please try systemd 232-3?
> > You should not need to pass systemd.legacy_systemd_cgroup_controller anymore as the offending change was reverted.
>
> I am observing the "PAM-CGFS: Failed to get list of controllers" error
> message as well, but not "PAM-CGFS: Corrupt /proc/self/cgroup", on a
> system running systemd 233-10, libpam-cgfs 2.0.7-1, and kernel 4.12.
>
> Adding systemd.legacy_systemd_cgroup_controller to the kernel command
> line does avoid the error messages.
>
> For reference, I have attached the relevant portion of
> /proc/1/mountinfo and /proc/self/cgroup as previously requested, both
> booting with and without systemd.legacy_systemd_cgroup_controller.

Thanks, can you please try the updated lxcfs/libpam-cgfs from https://people.debian.org/~evgeni/tmp/lxcfs/?

Regards
Evgeni

Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Kevin Locke
On Fri, 2017-07-14 at 09:35 +0200, Evgeni Golov wrote:

> On Mon, Jul 10, 2017 at 02:25:04PM -0600, Kevin Locke wrote:
>> On Sat, 2016-11-12 at 10:30 +0100, Evgeni Golov wrote:
>>> Can you please try systemd 232-3?
>>> You should not need to pass systemd.legacy_systemd_cgroup_controller anymore as the offending change was reverted.
>>
>> I am observing the "PAM-CGFS: Failed to get list of controllers" error
>> message as well, but not "PAM-CGFS: Corrupt /proc/self/cgroup", on a
>> system running systemd 233-10, libpam-cgfs 2.0.7-1, and kernel 4.12.
>
> Thanks, can you please try the updated lxcfs/libpam-cgfs from https://people.debian.org/~evgeni/tmp/lxcfs/?

After installing the linked lxcfs and libpam-cgfs packages and
rebooting, there are no error messages from PAM-CGFS in the logs and
my system appears to be working correctly.

I'm not really sure what libpam-cgfs does.  Are there any functional
tests that I should do to ensure it is working as expected?

Thanks!

--
Cheers,      |  [hidden email]    | XMPP: [hidden email]
Kevin        |  https://kevinlocke.name  | IRC:   kevinoid on freenode

Bug#843450: lxc: Corrupt /proc/self/cgroup / Failed to get list of controllers

Evgeni Golov-2
Hi,

On Fri, Jul 14, 2017 at 08:59:14AM -0600, Kevin Locke wrote:

> On Fri, 2017-07-14 at 09:35 +0200, Evgeni Golov wrote:
> > On Mon, Jul 10, 2017 at 02:25:04PM -0600, Kevin Locke wrote:
> >> On Sat, 2016-11-12 at 10:30 +0100, Evgeni Golov wrote:
> >>> Can you please try systemd 232-3?
> >>> You should not need to pass systemd.legacy_systemd_cgroup_controller anymore as the offending change was reverted.
> >>
> >> I am observing the "PAM-CGFS: Failed to get list of controllers" error
> >> message as well, but not "PAM-CGFS: Corrupt /proc/self/cgroup", on a
> >> system running systemd 233-10, libpam-cgfs 2.0.7-1, and kernel 4.12.
> >
> > Thanks, can you please try the updated lxcfs/libpam-cgfs from https://people.debian.org/~evgeni/tmp/lxcfs/?
>
> After installing the linked lxcfs and libpam-cgfs packages and
> rebooting, there are no error messages from PAM-CGFS in the logs and
> my system appears to be working correctly.
>
> I'm not really sure what libpam-cgfs does.  Are there any functional
> tests that I should do to ensure it is working as expected?

libpam-cgfs is needed to let you launch user-owned containers, as it sets up all the cgroups for you.

If you don't use user-owned containers, then you don't need it.

Cheers
Evgeni