Bug#925275: 32bit lxc guest on 64bit host has issues on one server but not on the other with identical setup

RA
Package: lxc
Version: 1:2.0.7-2+deb9u2

Hi.

I deployed a minimal setup of the latest Debian 9 (Stretch) 64-bit via the netinst ISO on two KVM servers from different providers. The OS installation was identical on both servers. After that I ran these commands:

apt update
apt upgrade -> everything was already up to date
apt install lxc
lxc-create -n guest -t download -> choose Debian Jessie i386
lxc-start -n guest
lxc-attach -n guest

All is well up to this stage on both servers, as I am successfully dropped into the guest shell:

root@guest:~#
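
For anyone trying to reproduce this without the interactive menu, the steps above can be sketched as a small helper that builds the same commands. This is only a sketch: it assumes the download template's -d/-r/-a options (distribution, release, architecture) are used in place of the interactive choice, and the function name repro_commands is made up for illustration.

```python
import shlex


def repro_commands(name="guest", dist="debian", release="jessie", arch="i386"):
    """Return the container setup commands from the report as argv lists.

    Assumption: the image is selected with the download template's
    -d/-r/-a options instead of the interactive prompt.
    """
    return [
        ["lxc-create", "-n", name, "-t", "download", "--",
         "-d", dist, "-r", release, "-a", arch],
        ["lxc-start", "-n", name],
        ["lxc-attach", "-n", name],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    for argv in repro_commands():
        print(shlex.join(argv))
```

The commands are only printed, not executed, so they can be reviewed before being pasted into a root shell on the host.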

Now if I run any command, such as "ps aux", in the LXC guest shell:

root@guest:~# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.3   4984  3580 ?        Ss   06:47   0:00 /sbin/init
systemd+    37  0.0  0.2   3508  2444 ?        Ss   06:47   0:00 /lib/systemd/systemd-networkd
root        40  0.0  0.2  10336  2528 ?        Ss   06:47   0:00 /lib/systemd/systemd-journald
systemd+    89  0.0  0.1   3040  1976 ?        Ss   06:47   0:00 /lib/systemd/systemd-resolved
root        92  0.0  0.1   3772  2040 pts/3    Ss+  06:47   0:00 /sbin/agetty --noclear --keep-baud pts/3 115200 38400 9600 vt102
root        93  0.0  0.1   3772  1984 pts/2    Ss+  06:47   0:00 /sbin/agetty --noclear --keep-baud pts/2 115200 38400 9600 vt102
root        94  0.0  0.1   3772  1916 pts/1    Ss+  06:47   0:00 /sbin/agetty --noclear --keep-baud pts/1 115200 38400 9600 vt102
root        95  0.0  0.2   3772  2056 pts/0    Ss+  06:47   0:00 /sbin/agetty --noclear --keep-baud pts/0 115200 38400 9600 vt102
root        96  0.0  0.1   3772  2032 console  Ss+  06:47   0:00 /sbin/agetty --noclear --keep-baud console 115200 38400 9600 vt102
root       106  0.0  0.3   5212  3256 pts/2    Ss   06:57   0:00 /bin/bash
root       109  0.0  0.2   4548  2316 pts/2    R+   06:57   0:00 ps aux

I do get the output on both servers. One server stays in the guest shell (as it should), but the other immediately exits to the host ( root@host:~# ) after showing the output. I can see the following lines in the dmesg output of the server that has this issue:

[Fri Mar 22 02:42:17 2019] bash[1544] bad frame in 32bit sigreturn frame:00000000ffd7876c ip:f760a106 sp:ffd78cd0 orax:ffffffffffffffff
[Fri Mar 22 02:42:17 2019]  in libc-2.19.so[f75dc000+16e000]
[Fri Mar 22 02:42:17 2019] bash[1544] bad frame in 32bit sigreturn frame:00000000ffd7876c ip:f760a106 sp:ffd78cd0 orax:ffffffffffffffff
[Fri Mar 22 02:42:17 2019]  in libc-2.19.so[f75dc000+16e000]
[Fri Mar 22 02:42:17 2019] bash[1544] bad frame in 32bit sigreturn frame:00000000ffd7876c ip:f760a106 sp:ffd78cd0 orax:ffffffffffffffff
[Fri Mar 22 02:42:17 2019]  in libc-2.19.so[f75dc000+16e000]
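
To make these log lines a little easier to read, here is a minimal sketch that pulls the hexadecimal fields out of one message and works out where the faulting instruction pointer falls inside libc. It assumes the bracketed pair after the library name is the mapping's start address and size, which is how the kernel usually prints it. The variable names are illustrative only.

```python
import re

# One message pair copied from the dmesg output above (timestamps dropped).
DMESG = ("bash[1544] bad frame in 32bit sigreturn frame:00000000ffd7876c "
         "ip:f760a106 sp:ffd78cd0 orax:ffffffffffffffff")
MAPPING = " in libc-2.19.so[f75dc000+16e000]"

# Collect the key:value hex fields (frame, ip, sp, orax) as integers.
fields = {key: int(value, 16)
          for key, value in re.findall(r"\b(frame|ip|sp|orax):([0-9a-f]+)", DMESG)}

# Assumption: "[start+size]" describes the mapping that contains ip.
match = re.search(r"in (\S+)\[([0-9a-f]+)\+([0-9a-f]+)\]", MAPPING)
name = match.group(1)
base = int(match.group(2), 16)
size = int(match.group(3), 16)

offset = fields["ip"] - base
inside = 0 <= offset < size

if __name__ == "__main__":
    print(f"ip is at {name}+{offset:#x} (inside mapping: {inside})")
```

The resulting offset could be looked up in the guest's libc with a disassembler or debug symbols, but the number alone does not say why the signal frame was rejected.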

As I mentioned before, everything is identical on both servers:

LXC Version: 2.0.7
Kernel: Linux host 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64 GNU/Linux

Could it be due to different CPUs (flags) on the 2 hosts?

cat /proc/cpuinfo on the working setup:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 61
model name      : Intel Core Processor (Broadwell, IBRS)
stepping        : 2
microcode       : 0x1
cpu MHz         : 2394.454
cache size      : 16384 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl xtopology pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single ssbd ibrs ibpb kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips        : 4788.90
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual

cat /proc/cpuinfo on the problematic one:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
stepping        : 4
cpu MHz         : 2799.969
cache size      : 4096 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc nopl pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm kaiser xsaveopt xsavec xsaves
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips        : 5599.93
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
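
Since the two flags lines are long, a set difference makes it easier to see where they differ. The following is a minimal sketch using the two "flags" lines exactly as pasted above; the helper name parse_flags is made up for illustration.

```python
WORKING = ("flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca "
           "cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm "
           "constant_tsc rep_good nopl xtopology pni pclmulqdq ssse3 fma cx16 "
           "pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes "
           "xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single ssbd "
           "ibrs ibpb kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid "
           "rtm xsaveopt arat")
PROBLEMATIC = ("flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge "
               "mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx "
               "pdpe1gb rdtscp lm constant_tsc nopl pni ssse3 cx16 sse4_1 "
               "sse4_2 x2apic popcnt aes hypervisor lahf_lm kaiser xsaveopt "
               "xsavec xsaves")


def parse_flags(line):
    """Return the set of flag names from a /proc/cpuinfo 'flags' line."""
    _, _, value = line.partition(":")
    return set(value.split())


working = parse_flags(WORKING)
problematic = parse_flags(PROBLEMATIC)

only_working = working - problematic
only_problematic = problematic - working

if __name__ == "__main__":
    print("only on working host:    ", " ".join(sorted(only_working)))
    print("only on problematic host:", " ".join(sorted(only_problematic)))
```

One thing the comparison shows is that the problematic host lists xsaveopt, xsavec and xsaves but neither xsave nor avx, while the working host lists xsave and avx. Whether that combination is related to the crash is not established here.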

Thanks.

Pierre-Elliott Bécue-3
On Friday, 22 March 2019 at 03:38:18-0400, RA wrote:

> Could it be due to different CPUs (flags) on the 2 hosts?

Hi,

Thanks for your bug report.

It's actually quite possible that the differences between the two CPU
architectures are the cause of your bug.

I'm Cc-ing LXC developer Christian Brauner to get an idea of the most
appropriate course of action to gather more information on this bug.

With best regards,

--
Pierre-Elliott Bécue
GPG: 9AE0 4D98 6400 E3B6 7528  F493 0D44 2664 1949 74E2
It's far easier to fight for one's principles than to live up to them.
