Bug#880502: lxc: cannot start container with kernel 4.13.10

Antonio Terceiro
Control: retitle -1 lxc: cannot start container with kernel 4.13.10

On Wed, Nov 01, 2017 at 11:32:31AM -0200, Antonio Terceiro wrote:

> Package: lxc
> Version: 1:2.0.9-3
> Severity: serious
>
> I'm filing this in lxc initially as I don't know exactly where the issue
> is yet. We will probably want to reassign it.
>
> Something other than lxc itself changed recently in unstable which makes
> lxc not able to start a Debian container:
>
> # lxc-start -n autopkgtest-sid-amd64
> lxc-start: lxccontainer.c: wait_on_daemonized_start: 754 Received container state "ABORTING" instead of "RUNNING"
> lxc-start: tools/lxc_start.c: main: 368 The container failed to start.
> lxc-start: tools/lxc_start.c: main: 370 To get more details, run the container in foreground mode.
> lxc-start: tools/lxc_start.c: main: 372 Additional information can be obtained by setting the --logfile and --logpriority options.
> # cat /var/lib/lxc/autopkgtest-sid-amd64/autopkgtest-sid-amd64.log
>       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:220 - If you really want to start this container, set
>       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:221 - lxc.aa_allow_incomplete = 1
>       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:222 - in your container configuration file
>       lxc-start 20171101123914.655 ERROR    lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
>       lxc-start 20171101123914.701 ERROR    lxc_container - lxccontainer.c:wait_on_daemonized_start:754 - Received container state "ABORTING" instead of "RUNNING"
>       lxc-start 20171101123914.701 ERROR    lxc_start - start.c:__lxc_start:1530 - Failed to spawn container "autopkgtest-sid-amd64".
>       lxc-start 20171101123914.701 ERROR    lxc_start_ui - tools/lxc_start.c:main:368 - The container failed to start.
>       lxc-start 20171101123914.701 ERROR    lxc_start_ui - tools/lxc_start.c:main:370 - To get more details, run the container in foreground mode.
>       lxc-start 20171101123914.701 ERROR    lxc_start_ui - tools/lxc_start.c:main:372 - Additional information can be obtained by setting the --logfile and --logpriority options.
>       lxc-start 20171101132533.307 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:220 - If you really want to start this container, set
>       lxc-start 20171101132533.307 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:221 - lxc.aa_allow_incomplete = 1
>       lxc-start 20171101132533.307 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:222 - in your container configuration file
>       lxc-start 20171101132533.307 ERROR    lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
>       lxc-start 20171101132533.373 ERROR    lxc_container - lxccontainer.c:wait_on_daemonized_start:754 - Received container state "ABORTING" instead of "RUNNING"
>       lxc-start 20171101132533.374 ERROR    lxc_start_ui - tools/lxc_start.c:main:368 - The container failed to start.
>       lxc-start 20171101132533.374 ERROR    lxc_start - start.c:__lxc_start:1530 - Failed to spawn container "autopkgtest-sid-amd64".
>       lxc-start 20171101132533.374 ERROR    lxc_start_ui - tools/lxc_start.c:main:370 - To get more details, run the container in foreground mode.
>       lxc-start 20171101132533.374 ERROR    lxc_start_ui - tools/lxc_start.c:main:372 - Additional information can be obtained by setting the --logfile and --logpriority options.
>
>
> This is not happening on testing yet. When I upgrade a testing VM to
> unstable, I can still start the container before a reboot. After a
> reboot, I cannot start a container anymore. Maybe it's related to some
> kernel change?
>
> I'm copying debian-kernel in case someone there can provide some insight.

So, I tried downgrading the kernel to the one in testing, rebooted, and
now I can start containers again. So this is being caused by a change in
the kernel between 4.13.4-2 and 4.13.10-1.

I still need to study the lxc code path that is being triggered to be
able to provide more useful information. Since the issue is definitely
related to apparmor, I am also copying the apparmor team in case they
have any input to provide.

Bug#880502: [pkg-lxc-devel] Bug#880502: lxc: cannot start container with kernel 4.13.10

Evgeni Golov
Ohai,

On Wed, Nov 01, 2017 at 12:00:12PM -0200, Antonio Terceiro wrote:

> >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:220 - If you really want to start this container, set
> >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:221 - lxc.aa_allow_incomplete = 1
> >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:222 - in your container configuration file
>
> So, I tried downgrading the kernel to the one in testing, rebooted, and
> now I can start containers again. So this is being caused by a change in
> the kernel between 4.13.4-2 and 4.13.10-1.
>
> I still need to study the lxc code path that is being triggered to be
> able to provide more useful information. Since the issue is definitely
> related to apparmor, I am also copying the apparmor team in case they
> have any input to provide.

Can you try to set "lxc.aa_allow_incomplete = 1" in your config?
LXC expects Ubuntu's patched kernels when it comes to AppArmor, not the
upstream ones :(

And I think Debian enabled AppArmor by default in the latest kernels.

Evgeni
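
For reference, one way to apply the suggested setting is sketched below. It assumes a privileged container whose configuration file is /var/lib/lxc/<name>/config (the same directory as the log file quoted above); the helper name set_lxc_option is made up for illustration and is not part of LXC.

```shell
#!/bin/sh
# Hypothetical helper: set "KEY = VALUE" in an LXC config file, replacing any
# existing assignment of KEY and appending the line if KEY is absent.
set_lxc_option() {
    config=$1 key=$2 value=$3
    [ -f "$config" ] || return 1
    tmp=$(mktemp) || return 1
    # Keep every line whose left-hand side is not exactly KEY
    # (comments and blank lines are preserved).
    awk -v k="$key" '{
        line = $0
        sub(/^[ \t]+/, "", line)
        split(line, parts, /[ \t]*=/)
        if (parts[1] != k) print
    }' "$config" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file so its ownership and mode are kept.
    cat "$tmp" > "$config" && rm -f "$tmp"
}

# Example (container name taken from the report; adjust as needed):
# set_lxc_option /var/lib/lxc/autopkgtest-sid-amd64/config lxc.aa_allow_incomplete 1
```

The same helper would work for any other single-valued key in the container configuration.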

Bug#880502: [pkg-lxc-devel] Bug#880502: lxc: cannot start container with kernel 4.13.10

Ben Hutchings
On Wed, 2017-11-01 at 15:38 +0100, Evgeni Golov wrote:

> Ohai,
>
> On Wed, Nov 01, 2017 at 12:00:12PM -0200, Antonio Terceiro wrote:
> > >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:220 - If you really want to start this container, set
> > >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:221 - lxc.aa_allow_incomplete = 1
> > >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:222 - in your container configuration file
> >
> > So, I tried downgrading the kernel to the one in testing, rebooted, and
> > now I can start containers again. So this is being caused by a change in
> > the kernel between 4.13.4-2 and 4.13.10-1.
> >
> > I still need to study the lxc code path that is being triggered to be
> > able to provide more useful information. Since the issue is definitely
> > related to apparmor, I am also copying the apparmor team in case they
> > have any input to provide.
>
> Can you try to set "lxc.aa_allow_incomplete = 1" in your config?
> LXC expects Ubuntu's patched kernels when it comes to AppArmor, not the
> upstream ones :(
>
> And I think Debian enabled AppArmor by default in the latest kernels.

Yes, that's the change made in 4.13.10-1.

Ben.

--
Ben Hutchings
Make three consecutive correct guesses and you will be considered an
expert.


Bug#880502: [pkg-lxc-devel] Bug#880502: lxc: cannot start container with kernel 4.13.10

Antonio Terceiro
In reply to this post by Evgeni Golov
On Wed, Nov 01, 2017 at 03:38:23PM +0100, Evgeni Golov wrote:

> Ohai,
>
> On Wed, Nov 01, 2017 at 12:00:12PM -0200, Antonio Terceiro wrote:
> > >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:220 - If you really want to start this container, set
> > >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:221 - lxc.aa_allow_incomplete = 1
> > >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:222 - in your container configuration file
> >
> > So, I tried downgrading the kernel to the one in testing, rebooted, and
> > now I can start containers again. So this is being caused by a change in
> > the kernel between 4.13.4-2 and 4.13.10-1.
> >
> > I still need to study the lxc code path that is being triggered to be
> > able to provide more useful information. Since the issue is definitely
> > related to apparmor, I am also copying the apparmor team in case they
> > have any input to provide.
>
> Can you try to set "lxc.aa_allow_incomplete = 1" in your config?
> LXC expects Ubuntu's patched kernels when it comes to AppArmor, not the
> upstream ones :(
>
> And I think Debian enabled AppArmor by default in the latest kernels.

Didn't help. At least now we have a different error message:

lxc-start 20171102130036.516 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:234 - No such file or directory - failed to change apparmor profile to lxc-container-default-cgns
lxc-start 20171102130036.516 ERROR    lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
lxc-start 20171102130036.564 ERROR    lxc_container - lxccontainer.c:wait_on_daemonized_start:754 - Received container state "ABORTING" instead of "RUNNING"
lxc-start 20171102130036.564 ERROR    lxc_start - start.c:__lxc_start:1530 - Failed to spawn container "test".
lxc-start 20171102130036.564 ERROR    lxc_start_ui - tools/lxc_start.c:main:368 - The container failed to start.
lxc-start 20171102130036.564 ERROR    lxc_start_ui - tools/lxc_start.c:main:370 - To get more details, run the container in foreground mode.
lxc-start 20171102130036.564 ERROR    lxc_start_ui - tools/lxc_start.c:main:372 - Additional information can be obtained by setting the --logfile and --logpriority options.

I guess we will need to fix the apparmor support in lxc to work with the
upstream kernel. :-/

Bug#880502: [pkg-lxc-devel] Bug#880502: lxc: cannot start container with kernel 4.13.10

Antonio Terceiro
Control: severity -1 important

On Thu, Nov 02, 2017 at 11:04:10AM -0200, Antonio Terceiro wrote:

> On Wed, Nov 01, 2017 at 03:38:23PM +0100, Evgeni Golov wrote:
> > Ohai,
> >
> > On Wed, Nov 01, 2017 at 12:00:12PM -0200, Antonio Terceiro wrote:
> > > >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:220 - If you really want to start this container, set
> > > >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:221 - lxc.aa_allow_incomplete = 1
> > > >       lxc-start 20171101123914.655 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:222 - in your container configuration file
> > >
> > > So, I tried downgrading the kernel to the one in testing, rebooted, and
> > > now I can start containers again. So this is being caused by a change in
> > > the kernel between 4.13.4-2 and 4.13.10-1.
> > >
> > > I still need to study the lxc code path that is being triggered to be
> > > able to provide more useful information. Since the issue is definitely
> > > related to apparmor, I am also copying the apparmor team in case they
> > > have any input to provide.
> >
> > Can you try to set "lxc.aa_allow_incomplete = 1" in your config?
> > LXC expects Ubuntu's patched kernels when it comes to AppArmor, not the
> > upstream ones :(
> >
> > And I think Debian enabled AppArmor by default in the latest kernels.
>
> Didn't help. At least now we have a different error message:
>
> lxc-start 20171102130036.516 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:234 - No such file or directory - failed to change apparmor profile to lxc-container-default-cgns
> lxc-start 20171102130036.516 ERROR    lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
> lxc-start 20171102130036.564 ERROR    lxc_container - lxccontainer.c:wait_on_daemonized_start:754 - Received container state "ABORTING" instead of "RUNNING"
> lxc-start 20171102130036.564 ERROR    lxc_start - start.c:__lxc_start:1530 - Failed to spawn container "test".
> lxc-start 20171102130036.564 ERROR    lxc_start_ui - tools/lxc_start.c:main:368 - The container failed to start.
> lxc-start 20171102130036.564 ERROR    lxc_start_ui - tools/lxc_start.c:main:370 - To get more details, run the container in foreground mode.
> lxc-start 20171102130036.564 ERROR    lxc_start_ui - tools/lxc_start.c:main:372 - Additional information can be obtained by setting the --logfile and --logpriority options.
>
> I guess we will need to fix the apparmor support in lxc to work with the
> upstream kernel. :-/

A brief summary of our IRC conversation from earlier.

I can also reproduce this on:

- stable, booting with security=apparmor
- unstable, with the latest upstream code, built from git
- with or without the apparmor package installed

The workaround that works is using the setting in the container
configuration:

lxc.aa_profile = unconfined

which disables apparmor entirely.

I have just uploaded lxc 1:2.0.9-4 setting this for all containers. This
is not the greatest solution, but it's also not worse than the state of
affairs before apparmor was enabled by default in the Debian kernel: it
was already not possible to use lxc with apparmor in Debian.

Bug#880502: [pkg-apparmor] Bug#880502: [pkg-lxc-devel] Bug#880502: lxc: cannot start container with kernel 4.13.10

Christian Boltz
Hello,

seeing the AppArmor denials would be helpful to get this fixed ;-)

Please either
    grep -i apparmor /var/log/syslog
or, if you have auditd installed, check
    /var/log/audit/audit.log

For more details, see https://wiki.debian.org/AppArmor/Debug
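
A small filter along these lines can narrow a log down to AppArmor denials and show which profiles they belong to. It relies on the apparmor="DENIED" marker and the profile="..." field that AppArmor writes into kernel audit records; the function names are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: print AppArmor denial records from a log file.
# Defaults to /var/log/syslog; pass /var/log/audit/audit.log when using auditd.
aa_denials() {
    log=${1:-/var/log/syslog}
    grep 'apparmor="DENIED"' "$log"
}

# Count denials per confined profile, most frequent first.
aa_denials_by_profile() {
    aa_denials "$1" |
        sed -n 's/.*profile="\([^"]*\)".*/\1/p' |
        sort | uniq -c | sort -rn
}

# Example, run as root:
# aa_denials_by_profile /var/log/syslog
```

An empty result would suggest that the failure is not a policy denial at all.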


Regards,

Christian Boltz
--
> Anyway, what does our mission statement say?
"Have a lot of fun..."
[> Per Jessen and Greg KH in opensuse-factory]

Bug#880502: lxc: cannot start container with kernel 4.13.10

Evgeni Golov
Hi,

On Thu, Nov 02, 2017 at 07:09:10PM +0100, Christian Boltz wrote:
> seeing the AppArmor denials would be helpful to get this fixed ;-)

I think the issue is different.

Looking at the LXC log, we see the following:
lxc-start 20171102130036.516 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:234 - No such file or directory - failed to change apparmor profile to lxc-container-default-cgns

And indeed, we see no profiles:
# aa-status
apparmor module is loaded.
0 profiles are loaded.
0 profiles are in enforce mode.
0 profiles are in complain mode.
0 processes have profiles defined.
0 processes are in enforce mode.
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.

I think the issue is that when LXC is installed *before* AppArmor is
enabled, the postinst snippet generated by dh_apparmor [1] is not
registering any profiles. And now that AppArmor is enabled, the profile
is missing and cannot be applied.

This is just a theory, I did not have time to actually reproduce and try
it.

Evgeni

[1]
# Automatically added by dh_apparmor/2.11.1-2
aa_is_enabled() {
   if command aa-enabled >/dev/null 2>&1; then
      # apparmor >= 2.10.95-2
      aa-enabled --quiet 2>/dev/null
   else
      # apparmor << 2.10.95-2
      # (This should be removed once Debian Stretch and Ubuntu 18.04 are out.)
      rc=0
      aa-status --enabled 2>/dev/null || rc=$?
      [ "$rc" = 0 ] || [ "$rc" = 2 ]
   fi
}

if [ "$1" = "configure" ]; then
    APP_PROFILE="/etc/apparmor.d/usr.bin.lxc-start"
    if [ -f "$APP_PROFILE" ]; then
        # Add the local/ include
        LOCAL_APP_PROFILE="/etc/apparmor.d/local/usr.bin.lxc-start"

        test -e "$LOCAL_APP_PROFILE" || {
            tmp=`mktemp`
        cat <<EOM > "$tmp"
# Site-specific additions and overrides for usr.bin.lxc-start.
# For more details, please see /etc/apparmor.d/local/README.
EOM
            mkdir `dirname "$LOCAL_APP_PROFILE"` 2>/dev/null || true
            mv -f "$tmp" "$LOCAL_APP_PROFILE"
            chmod 644 "$LOCAL_APP_PROFILE"
        }

        # Reload the profile, including any abstraction updates
        if aa_is_enabled; then
            apparmor_parser -r -T -W "$APP_PROFILE" || true
        fi
    fi
fi
# End automatically added section
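
The theory above can be checked by hand. The kernel exposes the list of loaded profiles in /sys/kernel/security/apparmor/profiles, one "name (mode)" entry per line, readable by root. The sketch below looks up the profile name from the LXC log in that list. The helper name is made up, and the reload command in the comment assumes that LXC's container profiles are shipped as /etc/apparmor.d/lxc-containers, which should be verified on the system in question.

```shell
#!/bin/sh
# Hypothetical helper: succeed if profile NAME appears in the kernel's list of
# loaded AppArmor profiles. LIST defaults to the securityfs file (root only).
aa_profile_loaded() {
    name=$1
    list=${2:-/sys/kernel/security/apparmor/profiles}
    # Each line looks like "profile-name (enforce)"; strip the mode and
    # compare the remaining name exactly.
    awk -v n="$name" '{ sub(/ \([a-z]+\)$/, ""); if ($0 == n) found = 1 }
                      END { exit !found }' "$list"
}

# Example, run as root:
# aa_profile_loaded lxc-container-default-cgns ||
#     apparmor_parser -r /etc/apparmor.d/lxc-containers   # path is an assumption
```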

Bug#880502: [pkg-apparmor] Bug#880502: lxc: cannot start container with kernel 4.13.10

Felix Geyer
Hi,

On 02.11.2017 20:09, Evgeni Golov wrote:

> Hi,
>
> On Thu, Nov 02, 2017 at 07:09:10PM +0100, Christian Boltz wrote:
>> seeing the AppArmor denials would be helpful to get this fixed ;-)
>
> I think the issue is different.
>
> Looking at the LXC log, we see the following:
> lxc-start 20171102130036.516 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:234 - No such file or directory - failed to change apparmor profile to lxc-container-default-cgns
>
> And indeed, we see no profiles:
> # aa-status
> apparmor module is loaded.
> 0 profiles are loaded.
> 0 profiles are in enforce mode.
> 0 profiles are in complain mode.
> 0 processes have profiles defined.
> 0 processes are in enforce mode.
> 0 processes are in complain mode.
> 0 processes are unconfined but have a profile defined.
>
> I think the issue is that when LXC is installed *before* AppArmor is
> enabled, the postinst snippet generated by dh_apparmor [1] is not
> registering any profiles. And now that AppArmor is enabled, the profile
> is missing and cannot be applied.

There are two issues:

lxc expects mount mediation to be present in AppArmor. This isn't upstream (yet) so it's missing
from the Debian kernel too.
As already mentioned there is a lxc.aa_allow_incomplete setting to ignore this check.
However lxc-apparmor-load doesn't honor this setting and still skips loading profiles.


More fundamentally lxc makes the assumption that the AppArmor userspace tools are available if
AppArmor is active in the kernel.
When starting a container lxc detects that AppArmor is active and tries to transition to a
profile. This fails if the apparmor package hasn't been installed as lxc has no way to load profiles.


To fix this:
- lxc needs to stop checking for AppArmor mount mediation. This might make sense for distros that
ship a kernel with the AppArmor patchset but not for everyone else.
- lxc must allow for the AppArmor userspace tools to be absent. This could be done by checking if
the binaries are present on the system or by checking for ENOENT after aa_change_profile() calls.

Felix
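
The second point can be pictured as a small decision function. This is only a shell sketch of the logic, not LXC code (LXC itself is written in C): it treats AppArmor as usable only when the kernel reports it enabled and a profile loader is installed, and otherwise falls back to running unconfined. The function name and its two overridable arguments are invented for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether a container should be confined.
# $1: file holding the kernel's AppArmor "enabled" flag (Y or N)
# $2: name of the userspace profile loader to look for
choose_confinement() {
    flag_file=${1:-/sys/module/apparmor/parameters/enabled}
    parser=${2:-apparmor_parser}
    # AppArmor not built in, or disabled at boot: nothing to enforce.
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "Y" ] || { echo unconfined; return; }
    # Kernel side is active, but profiles cannot be loaded without the tools.
    command -v "$parser" >/dev/null 2>&1 || { echo unconfined; return; }
    echo confined
}

# Example:
# choose_confinement    # prints "confined" or "unconfined" for this host
```

The other option mentioned, reacting to ENOENT from aa_change_profile(), would catch the same situation at the point of failure instead of probing for binaries up front.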

Bug#880502: [pkg-apparmor] Bug#880502: [pkg-lxc-devel] Bug#880502: lxc: cannot start container with kernel 4.13.10

intrigeri
In reply to this post by Antonio Terceiro
Hi,

Antonio Terceiro:
> The workaround that works is using the setting in the container
> configuration:

> lxc.aa_profile = unconfined

> which disables apparmor entirely.

> I have just uploaded lxc 1:2.0.9-4 setting this for all containers. This
> is not the greatest solution, but it's also not worse than the state of
> affairs before apparmor was enabled by default in the Debian kernel: it
> was already not possible to use lxc with apparmor in Debian.

Fully agreed: top priority is to ensure AppArmor doesn't break things,
so let's disable any profile that is not ready for prime time.

Adding AppArmor confinement where we had none previously can
come later.

Cheers,
--
intrigeri

Bug#880502: [pkg-apparmor] Bug#880502: lxc: cannot start container with kernel 4.13.10

intrigeri
In reply to this post by Felix Geyer
Hi!

Sorry for the delay, I didn't expect AppArmor to be enabled in the
kernel a week ago (I thought I would coordinate this with Ben)
and I was busy with the Reproducible Builds summit this week.

Thanks Felix & Antonio for being on top of things. I'm glad the
immediate RC issue was fixed.

Felix Geyer:
> There are two issues:

> lxc expects mount mediation to be present in AppArmor. This isn't upstream (yet) so it's missing
> from the Debian kernel too.

FYI, mount mediation has been upstream since some time in the 4.14 cycle.
We have it in Debian experimental (Linux 4.14.0-rc7).

But for now I've disabled it on Debian even when running Linux 4.14.
It'll be enabled at some point in the future, not sure when exactly
(#880078).

> More fundamentally lxc makes the assumption that the AppArmor userspace tools are available if
> AppArmor is active in the kernel.
> When starting a container lxc detects that AppArmor is active and tries to transition to a
> profile. This fails if the apparmor package hasn't been installed as lxc has no way to load profiles.

I believe libvirt implements the exact same logic… minus the bug.
This might provide inspiration to whoever wants to fix this bug in
LXC :)

If these bugs are not tracked upstream yet: Felix, you seem to be the
one of us with the best understanding of the problem and you know
AppArmor pretty well, so perhaps you would be the best person to
report them?

Cheers,
--
intrigeri