Use SMP kernel for Alpha (udeb) builds

Frank Scheiner
Dear all,

As per [1] and our recent discussions the generic 4.x kernels seem to no
longer work on Alpha machines which also renders any installer images
using the generic 4.x kernels non-working.

[1]: https://lists.debian.org/debian-alpha/2017/03/msg00007.html

Confirmed on:
* AlphaStation 200 (w/EV4 x 1)
* AlphaStation 255 (w/EV45 x 1)
* Personal Workstation 500au (w/EV56 x 1)
* AlphaServer DS20E (w/EV67 x 2)

Also expected on:
* AXPpci33 (w/LCA4 x 1)
* AlphaStation 500 (w/EV56 x 1)
* AlphaServer DS25 (w/EV68CB x 2)
* AlphaServer ES45 (w/EV68CB x 4)

The following two patches should switch the used kernels to the SMP
version. As:

(1) I don't exactly know how to build images using multiple kernels
(i.e. what happens if $TEMP_KERNEL has multiple kernel names in it,
which seems to be supported according to [2]: will the image creation in
e.g. [3] then run multiple times automatically?) and I don't want to
break things,

[2]:
https://salsa.debian.org/installer-team/debian-installer/blob/master/build/config/dir#L79

[3]:
https://salsa.debian.org/installer-team/debian-installer/blob/master/build/config/alpha/netboot.cfg

(2) I can't find a similar example for another architecture and

(3) the images with the generic kernels are non-working anyhow,

...I just omitted the generic ones for now.

This is sort of a workaround and does not fix the actual problem which
is yet unknown, but I believe getting working installer images is more
important at the moment. With working installer images more people could
get involved and maybe sometime in the future someone has enough time
and effort to invest in fixing the actual problem.

## Patches ##

1.
https://salsa.debian.org/frank-scheiner-guest/linux/commit/865cacfd7722b346629082ab3094b6ad93964095

2.
https://salsa.debian.org/frank-scheiner-guest/debian-installer/commit/7269679bec8bae997ef5ed7619e9f8df2e184134

I think both patches are already enough to produce the needed alpha-smp
udebs and will allow us to produce working installer images (e.g. netboot
images might work instantly and could be an alternative way for Bob to
reinstall his PWS).

What do you think? Is there anything obvious missing?

Cheers,
Frank

Re: Use SMP kernel for Alpha (udeb) builds

John Paul Adrian Glaubitz
Hi Frank!

On 12/4/18 5:38 PM, Frank Scheiner wrote:

> As per [1] and our recent discussions the generic 4.x kernels seem to no longer work on Alpha machines which also renders any installer images using the generic 4.x kernels non-working.
>
> [1]: https://lists.debian.org/debian-alpha/2017/03/msg00007.html
>
> Confirmed on:
> * AlphaStation 200 (w/EV4 x 1)
> * AlphaStation 255 (w/EV45 x 1)
> * Personal Workstation 500au (w/EV56 x 1)
> * AlphaServer DS20E (w/EV67 x 2)
>
> Also expected on:
> * AXPpci33 (w/LCA4 x 1)
> * AlphaStation 500 (w/EV56 x 1)
> * AlphaServer DS25 (w/EV68CB x 2)
> * AlphaServer ES45 (w/EV68CB x 4)

Wow, thanks for the very extensive testing.

> This is sort of a workaround and does not fix the actual problem which is yet unknown, but I believe getting working installer images is more important at the moment. With working installer images more people could get involved and maybe sometime in the future someone has enough time and effort to invest in fixing the actual problem.

I agree.

> ## Patches ##
>
> 1. https://salsa.debian.org/frank-scheiner-guest/linux/commit/865cacfd7722b346629082ab3094b6ad93964095
>
> 2. https://salsa.debian.org/frank-scheiner-guest/debian-installer/commit/7269679bec8bae997ef5ed7619e9f8df2e184134
>
> I think both patches are already enough to produce the needed alpha-smp udebs and will allow to produce working installer images (e.g. netboot images might work instantly and could be an alternative way for Bob to reinstall his PWS).
>
> What do you think? Is there anything obvious missing?

Can you open PRs so that these changes can get merged? I will then build new images.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - [hidden email]
`. `'   Freie Universitaet Berlin - [hidden email]
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

Re: Use SMP kernel for Alpha (udeb) builds

Frank Scheiner
Hi Adrian,

On 12/4/18 17:45, John Paul Adrian Glaubitz wrote:

>> ## Patches ##
>>
>> 1. https://salsa.debian.org/frank-scheiner-guest/linux/commit/865cacfd7722b346629082ab3094b6ad93964095
>>
>> 2. https://salsa.debian.org/frank-scheiner-guest/debian-installer/commit/7269679bec8bae997ef5ed7619e9f8df2e184134
>>
>> I think both patches are already enough to produce the needed alpha-smp udebs and will allow to produce working installer images (e.g. netboot images might work instantly and could be an alternative way for Bob to reinstall his PWS).
>>
>> What do you think? Is there anything obvious missing?
>
> Can you open PRs so that these changes can get merged? I will then build new images.

Sure, created them now:

* First part: https://salsa.debian.org/kernel-team/linux/merge_requests/79

* Second part:
https://salsa.debian.org/installer-team/debian-installer/merge_requests/6

Cheers,
Frank

Re: Use SMP kernel for Alpha (udeb) builds

Bob Tracy
On Tue, Dec 04, 2018 at 07:37:13PM +0100, Frank Scheiner wrote:

> On 12/4/18 17:45, John Paul Adrian Glaubitz wrote:
> > > ## Patches ##
> > >
> > > 1. https://salsa.debian.org/frank-scheiner-guest/linux/commit/865cacfd7722b346629082ab3094b6ad93964095
> > >
> > > 2. https://salsa.debian.org/frank-scheiner-guest/debian-installer/commit/7269679bec8bae997ef5ed7619e9f8df2e184134
> > >
> > > I think both patches are already enough to produce the needed alpha-smp udebs and will allow to produce working installer images (e.g. netboot images might work instantly and could be an alternative way for Bob to reinstall his PWS).
> > >
> > > What do you think? Is there anything obvious missing?
> >
> > Can you open PRs so that these changes can get merged? I will then build new images.
>
> Sure, created them now:
>
> * First part: https://salsa.debian.org/kernel-team/linux/merge_requests/79
>
> * Second part:
> https://salsa.debian.org/installer-team/debian-installer/merge_requests/6

Much appreciated, gentlemen.  Wish I could do more than offer my system up
as a test platform, but so it goes...  I'll be happy to help with determining
the "actual problem which is yet unknown" with the alpha generic kernel, once
my system is back up and running :-).

--Bob

Re: Use SMP kernel for Alpha (udeb) builds

Frank Scheiner
On 12/5/18 07:33, Bob Tracy wrote:
>>> Can you open PRs so that these changes can get merged? I will then build new images.
>>
>> Sure, created them now:
>>
>> * First part: https://salsa.debian.org/kernel-team/linux/merge_requests/79
>>
>> * Second part:
>> https://salsa.debian.org/installer-team/debian-installer/merge_requests/6

@all:
Unfortunately, neither patch was included in the latest (maintainer?)
commits/releases - not that I expected that ;-). I keep rebasing them
(due to constant changes to `debian/changelog`), but would it help if
people (with influence :-)) upvoted these patches with the "thumbs up"
buttons?

>
> Much appreciated, gentlemen.  Wish I could do more than offer my system up
> as a test platform, but so it goes...  I'll be happy to help with determining
> the "actual problem which is yet unknown" with the alpha generic kernel, once
> my system is back up and running :-).

Hey, if someone knows how to use `kernel-wedge` manually, we could build
a netboot image right away, assuming that `kernel-wedge` can use the
existing linux-image-[...]-alpha-smp package to build the needed udebs.
That would not require a rebuild of the linux-image-[...]-alpha-smp
package and save us a lot of time.

Cheers,
Frank

Re: Use SMP kernel for Alpha (udeb) builds

Michael Cree
In reply to this post by Frank Scheiner
On Tue, Dec 04, 2018 at 05:38:51PM +0100, Frank Scheiner wrote:
> As per [1] and our recent discussions the generic 4.x kernels seem to no
> longer work on Alpha machines which also renders any installer images using
> the generic 4.x kernels non-working.

Yes, that was noted some time ago.  A generic kernel does not boot
since about 3.13.  I can't remember why I never attempted bisecting
this back when it was first noted to be a problem, maybe because it
didn't affect me because I normally run my own spun kernels.

> Confirmed on:
> * AlphaStation 200 (w/EV4 x 1)
> * AlphaStation 255 (w/EV45 x 1)
> * Personal Workstation 500au (w/EV56 x 1)
> * AlphaServer DS20E (w/EV67 x 2)

Also on XP1000 so I would presume on any DP264 based machine.

> Also expected on:
> * AlphaServer ES45 (w/EV68CB x 4)

Actually no.  I seem to recall that the generic kernel does boot on
ES45 (Titan).  I can check that at some point when the buildds are
not busy.

I might have a look again to see if we can bisect this problem...

Cheers
Michael.

Re: Use SMP kernel for Alpha (udeb) builds

Frank Scheiner
On 12/7/18 22:06, Michael Cree wrote:
> On Tue, Dec 04, 2018 at 05:38:51PM +0100, Frank Scheiner wrote:
>> As per [1] and our recent discussions the generic 4.x kernels seem to no
>> longer work on Alpha machines which also renders any installer images using
>> the generic 4.x kernels non-working.
>
> Yes, that was noted some time ago.  A generic kernel does not boot
> since about 3.13.

Must be after 3.16 because a Debian 3.16 generic kernel still worked on
my PWS 500au back in 2017 or even earlier.

> I can't remember why I never attempted bisecting
> this back when it was first noted to be a problem, maybe because it
> didn't affect me because I normally run my own spun kernels.

Yes, you mentioned that it doesn't affect non-generic kernels, e.g.
kernels built for specific hardware like DP264.

>
>> Confirmed on:
>> * AlphaStation 200 (w/EV4 x 1)
>> * AlphaStation 255 (w/EV45 x 1)
>> * Personal Workstation 500au (w/EV56 x 1)
>> * AlphaServer DS20E (w/EV67 x 2)
>
> Also on XP1000 so I would presume on any DP264 based machine.
>
>> Also expected on:
>> * AlphaServer ES45 (w/EV68CB x 4)
>
> Actually no.  I seem to recall that the generic kernel does boot on
> ES45 (Titan).

Interesting, maybe I should also give that a try on my DS25.

> I can check that at some point when the buildds are
> not busy.

If you want to avoid a reboot on the buildd machine, I can have a look
with my ES45 on Sunday.

Cheers,
Frank

Re: Use SMP kernel for Alpha (udeb) builds

Michael Cree
On Fri, Dec 07, 2018 at 10:39:58PM +0100, Frank Scheiner wrote:

> On 12/7/18 22:06, Michael Cree wrote:
> > On Tue, Dec 04, 2018 at 05:38:51PM +0100, Frank Scheiner wrote:
> > > As per [1] and our recent discussions the generic 4.x kernels seem to no
> > > longer work on Alpha machines which also renders any installer images using
> > > the generic 4.x kernels non-working.
> >
> > Yes, that was noted some time ago.  A generic kernel does not boot
> > since about 3.13.
>
> Must be after 3.16 because a Debian 3.16 generic kernel still worked on my
> PWS 500au back in 2017 or even earlier.

Confirmed.  3.16 generic boots but 3.18 generic does not on XP1000.

The reason I did not bisect at the time is that there were build errors
in the kernel around 3.17 and 3.18, but I think I now know how to work
around those, so I shall proceed to bisect.

Cheers,
Michael.

Re: Use SMP kernel for Alpha (udeb) builds

Bob Tracy
In reply to this post by Michael Cree
On Sat, Dec 08, 2018 at 10:06:25AM +1300, Michael Cree wrote:
> On Tue, Dec 04, 2018 at 05:38:51PM +0100, Frank Scheiner wrote:
> > As per [1] and our recent discussions the generic 4.x kernels seem to no
> > longer work on Alpha machines which also renders any installer images using
> > the generic 4.x kernels non-working.
>
> Yes, that was noted some time ago.  A generic kernel does not boot
> since about 3.13.  I can't remember why I never attempted bisecting
> this back when it was first noted to be a problem, maybe because it
> didn't affect me because I normally run my own spun kernels.

Ditto on this end.  I figure a first pass at the problem would be to
compare our respective kernel configs against the generic one, just to
get a reading on what code *may* be involved.  I can provide my Miata
config for a 4.14 kernel (and that's about all I can do until I'm back
up and running) if that would be helpful.

Another data point to consider would be the kernel config for the
current (as of the end of November) Gentoo "install-alpha-minimal" image,
which works on Miata at least (modulo the missing Qlogic firmware issue).
The associated kernel is "4.14.65-gentoo", and two variants are present
on the image -- a "generic" one, and one without a "legacy start address".
The "aboot.conf" file has the following comment:

# Some later alphas need a special kernel without legacy start address, most
# notably the DS15A and DS25 workstations as well as the ES45, ES47 and GS
# series of servers.

The Miata boots fine with the "generic" kernel, and panics when I try
the "nolsa" kernel.

Bottom line: I think the way forward will be easier from a Debian
perspective if the Debian installer for alpha includes a >= 4.14 kernel,
because the 4.8 and 4.9 kernels are known to have issues anyway.  An
upgrade would also put alpha closer to being in-sync with the "testing"
distro on Intel/AMD platforms.

--Bob

Re: Use SMP kernel for Alpha (udeb) builds

Frank Scheiner
On 12/8/18 06:58, Bob Tracy wrote:

> On Sat, Dec 08, 2018 at 10:06:25AM +1300, Michael Cree wrote:
>> On Tue, Dec 04, 2018 at 05:38:51PM +0100, Frank Scheiner wrote:
>>> As per [1] and our recent discussions the generic 4.x kernels seem to no
>>> longer work on Alpha machines which also renders any installer images using
>>> the generic 4.x kernels non-working.
>>
>> Yes, that was noted some time ago.  A generic kernel does not boot
>> since about 3.13.  I can't remember why I never attempted bisecting
>> this back when it was first noted to be a problem, maybe because it
>> didn't affect me because I normally run my own spun kernels.
>
> Ditto on this end.  I figure a first pass at the problem would be to
> compare our respective kernel configs against the generic one, just to
> get a reading on what code *may* be involved.  I can provide my Miata
> config for a 4.14 kernel (and that's about all I can do until I'm back
> up and running) if that would be helpful.
>
> Another data point to consider would be the kernel config for the
> current (as of the end of November) Gentoo "install-alpha-minimal" image,
> which works on Miata at least (modulo the missing Qlogic firmware issue).
> The associated kernel is "4.14.65-gentoo", and two variants are present
> on the image -- a "generic" one, and one without a "legacy start address".
> The "aboot.conf" file has the following comment:
>
> # Some later alphas need a special kernel without legacy start address, most
> # notably the DS15A and DS25 workstations as well as the ES45, ES47 and GS
> # series of servers.
>
> The Miata boots fine with the "generic" kernel, and panics when I try
> the "nolsa" kernel.

Is this Gentoo generic installer kernel SMP capable? I believe these
Gentoo kernels have the config included in the kernel image, so it
should be available as `/proc/config.gz` at runtime.
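
For anyone who wants to check this on a running system, here is a minimal Python sketch. It assumes the kernel was built with `CONFIG_IKCONFIG_PROC`, so that `/proc/config.gz` actually exists; the helper names `kconfig_value` and `read_config_gz` are made up for illustration, and the sample text is inline data rather than a real config.

```python
import gzip
import re


def kconfig_value(config_text, option):
    """Return the value of `option` in kernel config text.

    Returns None if the option is absent or only appears as
    '# <option> is not set'.
    """
    match = re.search(rf"^{re.escape(option)}=(.*)$", config_text, re.MULTILINE)
    return match.group(1) if match else None


def read_config_gz(path="/proc/config.gz"):
    """Read a gzip-compressed kernel config as text.

    The default path only exists if the running kernel was built with
    CONFIG_IKCONFIG_PROC (an assumption, not verified for these images).
    """
    with gzip.open(path, "rt") as fh:
        return fh.read()


# Inline sample instead of a real /proc/config.gz:
sample = (
    "CONFIG_SMP=y\n"
    "CONFIG_NR_CPUS=32\n"
    "# CONFIG_ALPHA_LEGACY_START_ADDRESS is not set\n"
)
for opt in ("CONFIG_SMP", "CONFIG_NR_CPUS", "CONFIG_ALPHA_LEGACY_START_ADDRESS"):
    print(opt, "->", kconfig_value(sample, opt))
```

On a live system one would call `kconfig_value(read_config_gz(), "CONFIG_SMP")` instead of using the inline sample.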

>
> Bottom line: I think the way forward will be easier from a Debian
> perspective if the Debian installer for alpha includes a >= 4.14 kernel,
> because the 4.8 and 4.9 kernels are known to have issues anyway.  An
> upgrade would also put alpha closer to being in-sync with the "testing"
> distro on Intel/AMD platforms.

I think the kernel version used on the installers will be the same
version that's available as `linux-image-[...].deb` at the time of
creation, as kernel-wedge creates the udebs from the
`linux-image-[...].deb` IIUIC.

Cheers,
Frank

Re: Use SMP kernel for Alpha (udeb) builds

Bob Tracy
On Sat, Dec 08, 2018 at 11:15:21AM +0100, Frank Scheiner wrote:
> Is this Gentoo generic installer kernel SMP capable? I believe these Gentoo
> kernels have the config included in the kernel image, so available as
> `/proc/config.gz` during runtime, I think.

From the "image.squashfs" file on the Gentoo "install-alpha-minimal"
image, attached is "etc/kernels/kernel-config-alpha-4.14.65-gentoo"
which appears to correspond to the "nolsa" kernel variant.  To your
question about whether SMP is configured, most definitely "yes" with
CONFIG_NR_CPUS=32.

--Bob

Attachment: kernel-config-alpha-4.14.65-gentoo (68K)

Re: Use SMP kernel for Alpha (udeb) builds

Frank Scheiner
On 12/8/18 15:05, Bob Tracy wrote:

> On Sat, Dec 08, 2018 at 11:15:21AM +0100, Frank Scheiner wrote:
>> Is this Gentoo generic installer kernel SMP capable? I believe these Gentoo
>> kernels have the config included in the kernel image, so available as
>> `/proc/config.gz` during runtime, I think.
>
> From the "image.squashfs" file on the Gentoo "install-alpha-minimal"
> image, attached is "etc/kernels/kernel-config-alpha-4.14.65-gentoo"
> which appears to correspond to the "nolsa" kernel variant.  To your
> question about whether SMP is configured, most definitely "yes" with
> CONFIG_NR_CPUS=32.

Thanks for checking. This definitely seems to be an SMP-capable kernel,
as `CONFIG_SMP=y` is also set.

As for `CONFIG_ALPHA_LEGACY_START_ADDRESS`, [1] mentions this is actually
needed only for older boot loaders which hardcoded the kernel start
address. And the Gentoo config shows it as inactive:
`# CONFIG_ALPHA_LEGACY_START_ADDRESS is not set`

[1]: https://cateee.net/lkddb/web-lkddb/ALPHA_LEGACY_START_ADDRESS.html

Interestingly, though, [1] also says that this option depends on
CONFIG_ALPHA_GENERIC, which is actually set (`CONFIG_ALPHA_GENERIC=y`)
in the Gentoo config.

So can we assume `CONFIG_ALPHA_GENERIC=y` also activates
`CONFIG_ALPHA_LEGACY_START_ADDRESS`?

If yes this could correspond to the behaviour of the generic Debian
kernel on my DS25. I just tested a `netabootwrap`ped
`4.18.0-2-alpha-generic` and after aboot emits the "starting kernel
[...]" message nothing happens:

```
 >>>boot
(boot ega0.0.0.5.2 -flags root=/dev/nfs ip=:::::enP2p2s5:dhcp
console=ttyS0,9600n8)

Trying BOOTP boot.

Broadcasting BOOTP Request...
Received BOOTP Packet File Name is: /AC100259
local inet address: 172.16.2.89
remote inet address: 172.16.0.2
TFTP Read File Name: /AC100259
netmask = 255.255.0.0
Server is on same subnet as client.
block number= 0 port_number= 35092
................................................................
................................................................
.................
bootstrap code read in
base = 39c000, image_start = 0, image_bytes = 90b86c(9484396)
initializing HWRPB at 2000
initializing page table at ffff0000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code
aboot: Linux/Alpha SRM bootloader version 1.0_pre20040408
aboot: switching to OSF/1 PALcode version 1.92
aboot: loading initrd (4874860 bytes/9522 blocks) at 0xfffffc00ffb46000
aboot: starting kernel network with arguments root=/dev/nfs
ip=:::::enP2p2s5:dhcp console=ttyS0,9600n8
```

And as [1] says, the SRM firmware of Titan machines is bigger than on
older Alpha machines, so the kernel start address for the generic kernel
might have ended somewhere inside the SRM. I'll check that with my ES45,
too.

The same kernel leads to:
```
CPU 0 booting

(boot ewa0.0.0.3.0 -flags root=/dev/nfs ip=dhcp console=tty1
console=ttyS0,9600n8)

Trying BOOTP boot.

Broadcasting BOOTP Request...
.Received BOOTP Packet File Name is: /AC10020F
local inet address: 172.16.2.15
remote inet address: 172.16.0.2
TFTP Read File Name: /AC10020F
netmask = 255.255.0.0
Server is on same subnet as client.
.................................................................................................................................................
bootstrap code read in
base = 1e6000, image_start = 0, image_bytes = 90b86c
initializing HWRPB at 2000
initializing page table at 1d8000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code
aboot: Linux/Alpha SRM bootloader version 1.0_pre20040408
aboot: switching to OSF/1 PALcode version 1.22
aboot: loading initrd (4874860 bytes/9522 blocks) at 0xfffffc0023b56000
aboot: starting kernel network with arguments root=/dev/nfs ip=dhcp
console=tty1 console=ttyS0,9600n8

halted CPU 0

halt code = 6
double error halt
PC = fffffc000107f868
boot failure
```
...on my PWS 500au. I hence assume the SRM is small enough on this
machine that the kernel start address doesn't end up in the SRM.

The SMP kernel boots without an issue on both machines.

Strangely, though, the kernel configuration files for both
`4.18.0-2-alpha-generic` and `4.18.0-2-alpha-smp` contain:

```
# grep -n CONFIG_ALPHA_GENERIC config-4.18.0-2-alpha-generic \
    config-4.18.0-2-alpha-smp
config-4.18.0-2-alpha-generic:288:CONFIG_ALPHA_GENERIC=y
config-4.18.0-2-alpha-smp:296:CONFIG_ALPHA_GENERIC=y
```

So shouldn't this setting then also imply that
`CONFIG_ALPHA_LEGACY_START_ADDRESS` is active for both kernels (so also
for the SMP kernel)?

But maybe some other active/inactive option in the SMP kernel remedies
the dependent `CONFIG_ALPHA_LEGACY_START_ADDRESS`. A unified diff
between both configurations is attached.
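
For those who prefer an option-by-option comparison over a textual diff, here is a rough Python sketch. The function names `parse_kconfig` and `config_differences` are hypothetical, and the two snippets below are made-up stand-ins, NOT the real Debian configs; with the real files one would read `config-4.18.0-2-alpha-generic` and `config-4.18.0-2-alpha-smp` from disk instead.

```python
def parse_kconfig(text):
    """Map option name -> value; '# CONFIG_X is not set' maps to None."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # Strip the leading '# ' and the trailing ' is not set'.
            options[line[2:-len(" is not set")]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def config_differences(text_a, text_b):
    """Return {option: (value_in_a, value_in_b)} for options that differ.

    An option missing from one file is treated like 'not set' (None).
    """
    a, b = parse_kconfig(text_a), parse_kconfig(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Made-up snippets for illustration only (not the real Debian configs):
generic = "CONFIG_ALPHA_GENERIC=y\nCONFIG_BROKEN_ON_SMP=y\n# CONFIG_SMP is not set\n"
smp = "CONFIG_ALPHA_GENERIC=y\nCONFIG_SMP=y\nCONFIG_NR_CPUS=32\n"

for name, (g, s) in config_differences(generic, smp).items():
    print(f"{name}: generic={g} smp={s}")
```

Options that are identical in both files (here `CONFIG_ALPHA_GENERIC`) are dropped, so only the candidates that could explain the different boot behaviour remain.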

Oh btw, the generic config also has "CONFIG_BROKEN_ON_SMP=y" but I am
not sure what this means. [2] mentions this is sort of attached to
drivers that are unsafe on SMP systems. But then I'd actually expect
that setting to be active for the SMP config.

[2]:
https://lists.kernelnewbies.org/pipermail/kernelnewbies/2014-January/009660.html

Cheers,
Frank

P.S.
Unfortunately my patches aren't merged yet and I already had to rebase
one of them twice today (making a total of four rebases for this one
patch already).

Attachment: config-alpha-generic-alpha-smp.diff (4K)


Re: Use SMP kernel for Alpha (udeb) builds

Michael Cree
In reply to this post by Michael Cree
On Sat, Dec 08, 2018 at 12:01:23PM +1300, Michael Cree wrote:

> On Fri, Dec 07, 2018 at 10:39:58PM +0100, Frank Scheiner wrote:
> > On 12/7/18 22:06, Michael Cree wrote:
> > > On Tue, Dec 04, 2018 at 05:38:51PM +0100, Frank Scheiner wrote:
> > > > As per [1] and our recent discussions the generic 4.x kernels seem to no
> > > > longer work on Alpha machines which also renders any installer images using
> > > > the generic 4.x kernels non-working.
> > >
> > > Yes, that was noted some time ago.  A generic kernel does not boot
> > > since about 3.13.
> >
> > Must be after 3.16 because a Debian 3.16 generic kernel still worked on my
> > PWS 500au back in 2017 or even earlier.
>
> Confirmed.  3.16 generic boots but 3.18 generic does not on XP1000.
>
> The reason I did not bisect at the time is that there were build errors
> in the kernel about 3.17 and 3.18 but I think I now know how to work
> around those so shall proceed to bisect.

Bisection leads to:

dca496451bddea9aa87b7510dc2eb413d1a19dfd is the first bad commit
commit dca496451bddea9aa87b7510dc2eb413d1a19dfd
Author: Tejun Heo <[hidden email]>
Date:   Tue Sep 2 14:46:01 2014 -0400

    percpu: move common parts out of pcpu_[de]populate_chunk()
       

I have no idea why that commit is a problem...

Cheers,
Michael.

Re: Use SMP kernel for Alpha (udeb) builds

Michael Cree
On Sun, Dec 09, 2018 at 07:54:52AM +1300, Michael Cree wrote:

> On Sat, Dec 08, 2018 at 12:01:23PM +1300, Michael Cree wrote:
> > On Fri, Dec 07, 2018 at 10:39:58PM +0100, Frank Scheiner wrote:
> > > On 12/7/18 22:06, Michael Cree wrote:
> > > > On Tue, Dec 04, 2018 at 05:38:51PM +0100, Frank Scheiner wrote:
> > > > > As per [1] and our recent discussions the generic 4.x kernels seem to no
> > > > > longer work on Alpha machines which also renders any installer images using
> > > > > the generic 4.x kernels non-working.
> > > >
> > > > Yes, that was noted some time ago.  A generic kernel does not boot
> > > > since about 3.13.
> > >
> > > Must be after 3.16 because a Debian 3.16 generic kernel still worked on my
> > > PWS 500au back in 2017 or even earlier.
> >
> > Confirmed.  3.16 generic boots but 3.18 generic does not on XP1000.
> >
> > The reason I did not bisect at the time is that there were build errors
> > in the kernel about 3.17 and 3.18 but I think I now know how to work
> > around those so shall proceed to bisect.
>
> Bisection leads to:
>
> dca496451bddea9aa87b7510dc2eb413d1a19dfd is the first bad commit

Actually I am not so sure about that.  It appears that sometimes a
bad kernel can boot, which might have led me astray.  That commit,
after failing once (assuming I did not make a mistake in the
bisection), is now booting...
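
If a bad kernel only fails some of the time, a single successful boot is weak evidence that a commit is good. As a rough sketch of how many repeat boots each bisection step might need: assume (and this is purely an assumption, nothing here was measured) that each boot of a bad kernel fails independently with some fixed probability. The function name `boots_needed` and the example probabilities are invented for illustration.

```python
import math


def boots_needed(fail_probability, max_false_good=0.05):
    """Smallest n such that a bad kernel survives n boots in a row with
    probability <= max_false_good, i.e.
    (1 - fail_probability) ** n <= max_false_good.

    Assumes independent boots with a constant failure probability.
    """
    if not 0.0 < fail_probability <= 1.0:
        raise ValueError("fail_probability must be in (0, 1]")
    if not 0.0 < max_false_good < 1.0:
        raise ValueError("max_false_good must be in (0, 1)")
    if fail_probability == 1.0:
        # A kernel that always fails is exposed by a single boot.
        return 1
    return math.ceil(math.log(max_false_good) / math.log(1.0 - fail_probability))


# Hypothetical failure rates, not measurements from any Alpha machine:
for p in (0.9, 0.5, 0.1):
    print(f"fails {p:.0%} of boots -> boot each step {boots_needed(p)} times")
```

A commit would then be marked good only after that many clean boots in a row, while a single failed boot is enough to mark it bad.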

Cheers,
Michael.

Re: Use SMP kernel for Alpha (udeb) builds

Bob Tracy
In reply to this post by Frank Scheiner
On Sat, Dec 08, 2018 at 07:41:15PM +0100, Frank Scheiner wrote:

> On 12/8/18 15:05, Bob Tracy wrote:
> > From the "image.squashfs" file on the Gentoo "install-alpha-minimal"
> > image, attached is "etc/kernels/kernel-config-alpha-4.14.65-gentoo"
> > which appears to correspond to the "nolsa" kernel variant.  To your
> > question about whether SMP is configured, most definitely "yes" with
> > CONFIG_NR_CPUS=32.
>
> Thanks for checking. This seems to be definitely a SMP capable kernel, as
> `CONFIG_SMP=y` is also set.
>
> About the `CONFIG_ALPHA_LEGACY_START_ADDRESS`, [1] mentions this is actually
> needed for older boot loaders only which hardcoded the kernel start address.
> And the Gentoo config shows it as inactive: `#
> CONFIG_ALPHA_LEGACY_START_ADDRESS is not set`
>
> [1]: https://cateee.net/lkddb/web-lkddb/ALPHA_LEGACY_START_ADDRESS.html
>
> But interesting, [1] also says, that this option depends on
> CONFIG_ALPHA_GENERIC, which is actually set (`CONFIG_ALPHA_GENERIC=y`) in
> the Gentoo config.
>
> So can we assume `CONFIG_ALPHA_GENERIC=y` also activates
> `CONFIG_ALPHA_LEGACY_START_ADDRESS`?

I wouldn't assume so, particularly for the Gentoo kernel source tree to
whatever extent it differs from the kernel.org source tree.

What the dependency is saying is, you can't have the legacy start address
config option force-enabled unless you're building a generic kernel.
Otherwise, the (alpha) processor-specific config options presumably
dictate whether the legacy start address is used.  This is, I think,
why Gentoo includes a generic+lsa kernel and a generic+nolsa kernel in
their install image.

BUT, in your defense, it's possible an unpatched kernel.org source tree
might be doing (or might have done -- this could have been patched upstream)
exactly as you suggest.  I haven't investigated this, because I've never used
the alpha generic kernel except for the initial installation on a system.

Just to be clear, Gentoo's generic kernel *does* have SMP configured, and
*with* the legacy start address enabled should boot just fine on your PWS
as it does on mine.  The kernel version is 4.14(.65).

--Bob

Re: Use SMP kernel for Alpha (udeb) builds

Philippe Mathieu-Daudé
In reply to this post by Frank Scheiner
Hi Frank,

On 12/4/18 5:38 PM, Frank Scheiner wrote:

> Dear all,
>
> As per [1] and our recent discussions the generic 4.x kernels seem to no
> longer work on Alpha machines which also renders any installer images
> using the generic 4.x kernels non-working.
>
> [1]: https://lists.debian.org/debian-alpha/2017/03/msg00007.html
>
> Confirmed on:
> * AlphaStation 200 (w/EV4 x 1)
> * AlphaStation 255 (w/EV45 x 1)
> * Personal Workstation 500au (w/EV56 x 1)
> * AlphaServer DS20E (w/EV67 x 2)
>
> Also expected on:
> * AXPpci33 (w/LCA4 x 1)
> * AlphaStation 500 (w/EV56 x 1)
> * AlphaServer DS25 (w/EV68CB x 2)
> * AlphaServer ES45 (w/EV68CB x 4)
>
> The following two patches should switch the used kernels to the SMP
> version. As:
>
> (1) I don't exactly know how to build images using multiple kernels
> (i.e. what happens if $TEMP_KERNEL has multiple kernel names in it,
> which seems to be supported according to [2], will the image creation in
> e.g. [3] than run multiple times automatically?) and I don't want to
> break things,
>
> [2]:
> https://salsa.debian.org/installer-team/debian-installer/blob/master/build/config/dir#L79
>
>
> [3]:
> https://salsa.debian.org/installer-team/debian-installer/blob/master/build/config/alpha/netboot.cfg
>
>
> (2) I can't find a similar example for another architecture and
>
> (3) the images with the generic kernels are non-working anyhow,
>
> ...I just omitted the generic ones for now.
>
> This is sort of a workaround and does not fix the actual problem which
> is yet unknown, but I believe getting working installer images is more
> important at the moment. With working installer images more people could
> get involved and maybe sometime in the future someone has enough time
> and effort to invest in fixing the actual problem.
>
> ## Patches ##
>
> 1.
> https://salsa.debian.org/frank-scheiner-guest/linux/commit/865cacfd7722b346629082ab3094b6ad93964095
>
>
> 2.
> https://salsa.debian.org/frank-scheiner-guest/debian-installer/commit/7269679bec8bae997ef5ed7619e9f8df2e184134
>
>
> I think both patches are already enough to produce the needed alpha-smp
> udebs and will allow to produce working installer images (e.g. netboot
> images might work instantly and could be an alternative way for Bob to
> reinstall his PWS).
>
> What do you think? Is there anything obvious missing?

FYI, I've added a few tests to QEMU to avoid regressions; one boots
the DP264 machine (not yet merged, the specific test is here):
https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg03082.html

I tested a recent Debian SMP kernel and got:

alpha-softmmu/qemu-system-alpha \
  -kernel vmlinuz-4.18.0-3-alpha-generic \
  -append console=srm -initrd initrd.gz \
  -nographic -net nic -net user -d mmu,unimp \
  -drive file=debian-503-alpha-businesscard.iso,if=ide,media=cdrom
PCI: 00:00:0 class 0300 id 1013:00b8
PCI:   region 0: 10000000
PCI:   region 1: 12000000
PCI: 00:01:0 class 0200 id 8086:100e
PCI:   region 0: 12020000
PCI:   region 1: 0000c000
PCI: 00:02:0 class 0101 id 1095:0646
PCI:   region 0: 0000c040
PCI:   region 1: 0000c048
PCI:   region 3: 0000c04c
[    0.000000] Linux version 4.18.0-3-alpha-generic
([hidden email]) (gcc version 7.3.0 (Debian 7.3.0-30))
#1 Debian 4.18.20-2 (2018-11-23)
[    0.000000] bootconsole [srm0] enabled
[    0.000000] Booting GENERIC on Tsunami variation Clipper using
machine vector Clipper from SRM
[    0.000000] Major Options: MAGIC_SYSRQ
[    0.000000] Command line: console=srm
[    0.000000] memcluster 0, usage 1, start        0, end       14
[    0.000000] memcluster 1, usage 0, start       14, end    16384
[    0.000000] freeing pages 14:2048
[    0.000000] freeing pages 4332:16384
[    0.000000] reserving pages 4332:4333
[    0.000000] Initial ramdisk at: 0x(____ptrval____) (5079886 bytes)
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 16256
[    0.000000] Kernel command line: console=srm
[    0.000000] Dentry cache hash table entries: 16384 (order: 4, 131072
bytes)
[    0.000000] Inode-cache hash table entries: 8192 (order: 3, 65536 bytes)
[    0.000000] Sorting __ex_table...
[    0.000000] Memory: 106304K/131072K available (6642K kernel code,
8709K rwdata, 2080K rodata, 352K init, 393K bss, 24768K reserved, 0K
cma-reserved)
[    0.000000] random: get_random_u64 called from
__kmem_cache_create+0x5c/0x620 with crng_init=0
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] NR_IRQS: 32784
[    0.000000] clocksource: qemu: mask: 0xffffffffffffffff max_cycles:
0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.003906] ------------[ cut here ]------------
[    0.004882] WARNING: CPU: 0 PID: 0 at
/build/linux-kQe68U/linux-4.18.20/init/main.c:650 start_kernel+0x4dc/0x754
[    0.004882] Interrupts were enabled early
[    0.004882] Modules linked in:
[    0.005859] CPU: 0 PID: 0 Comm: swapper Not tainted
4.18.0-3-alpha-generic #1 Debian 4.18.20-2
[    0.006835]        fffffc00018f3dc8 fffffc000216ee70 fffffc000103597c
fffffc0001898ddc
[    0.007812]        fffffc00010359f4 fffffc00018ce1b0 fffffc0002171704
fffffc000216ee70
[    0.007812]        fffffc000216ee70 0000000000000000 000000000000028a
fffffc0001898ddc
[    0.007812]        fffffc0001898ddc 0000000000000000 fffffc000173e371
fffffc00018f3e88
[    0.008789]        fffffc0000000018 fffffc000216ee70 0000000000000000
0000000000000001
[    0.008789]        fffffc00018acab8 0000000000000001 0000000000000000
0000000000000000
[    0.008789] Trace:
[    0.009765] [<fffffc000103597c>] __warn+0x15c/0x180
[    0.009765] [<fffffc00010359f4>] warn_slowpath_fmt+0x54/0x70
[    0.009765] [<fffffc000101001c>] _stext+0x1c/0x20
[    0.009765] [<fffffc0001010000>] _stext+0x0/0x20
[    0.010742]
[    0.010742] ---[ end trace c85a0517f87d04be ]---
[    0.022460] Console: colour VGA+ 80x25
[    0.025390] Calibrating delay loop... 518.32 BogoMIPS (lpj=252928)
[    0.046874] pid_max: default: 32768 minimum: 301
[    0.049804] Security Framework initialized
[    0.050781] Yama: disabled by default; enable with sysctl kernel.yama.*
[    0.059570] AppArmor: AppArmor initialized
[    0.061523] Mount-cache hash table entries: 1024 (order: 0, 8192 bytes)
[    0.061523] Mountpoint-cache hash table entries: 1024 (order: 0, 8192
bytes)
[    0.124999] Performance events: Supported CPU type!
[    0.155273] devtmpfs: initialized
[    0.175781] clocksource: jiffies: mask: 0xffffffff max_cycles:
0xffffffff, max_idle_ns: 1866466235866741 ns
[    0.175781] futex hash table entries: 256 (order: -1, 6144 bytes)
[    0.200195] NET: Registered protocol family 16
[    0.207031] audit: initializing netlink subsys (disabled)
[    0.216796] EISA bus registered
[    0.222656] PCI host bridge to bus 0000:00
[    0.223632] pci_bus 0000:00: root bus resource [io  0x0000-0x1ffffff]
[    0.224609] pci_bus 0000:00: root bus resource [mem
0x00000000-0x3fffffff]
[    0.225585] pci_bus 0000:00: No busn resource found for root bus,
will use [bus 00-ff]
[    0.249999] random: fast init done
[    0.250976] pci: enabling save/restore of SRM state
[    0.256835] pci 0000:00:00.0: BAR 0: assigned [mem
0x0a000000-0x0bffffff pref]
[    0.258788] pci 0000:00:01.0: BAR 6: assigned [mem
0x09000000-0x0903ffff pref]
[    0.258788] pci 0000:00:01.0: BAR 0: assigned [mem 0x09040000-0x0905ffff]
[    0.259765] pci 0000:00:00.0: BAR 6: assigned [mem
0x09060000-0x0906ffff pref]
[    0.260742] pci 0000:00:00.0: BAR 1: assigned [mem 0x09070000-0x09070fff]
[    0.260742] pci 0000:00:01.0: BAR 1: assigned [io  0x8000-0x803f]
[    0.262695] pci 0000:00:02.0: BAR 4: assigned [io  0x8040-0x804f]
[    0.262695] pci 0000:00:02.0: BAR 0: assigned [io  0x8050-0x8057]
[    0.263671] pci 0000:00:02.0: BAR 2: assigned [io  0x8058-0x805f]
[    0.264648] pci 0000:00:02.0: BAR 1: assigned [io  0x8060-0x8063]
[    0.265624] pci 0000:00:02.0: BAR 3: assigned [io  0x8064-0x8067]
[    0.274413] Console: switching to colour VGA+ 80x25
[    0.276367] audit: type=2000 audit(0.209:1): state=initialized
audit_enabled=0 res=1
[    0.320312] pci 0000:00:00.0: vgaarb: setting as boot VGA device
[    0.320312] pci 0000:00:00.0: vgaarb: VGA device added:
decodes=io+mem,owns=io+mem,locks=none
[    0.321288] pci 0000:00:00.0: vgaarb: bridge control possible
[    0.321288] vgaarb: loaded
[    0.324218] pps_core: LinuxPPS API ver. 1 registered
[    0.325195] pps_core: Software ver. 5.3.6 - Copyright 2005-2007
Rodolfo Giometti <[hidden email]>
[    0.325195] PTP clock support registered
[    0.351562] clocksource: Switched to clocksource qemu
[    0.357421] VFS: Disk quotas dquot_6.6.0
[    0.358398] VFS: Dquot-cache hash table entries: 1024 (order 0, 8192
bytes)
[    0.368163] AppArmor: AppArmor Filesystem Enabled
[    0.412109] NET: Registered protocol family 2
[    0.427734] tcp_listen_portaddr_hash hash table entries: 512 (order:
0, 8192 bytes)
[    0.428710] TCP established hash table entries: 1024 (order: 0, 8192
bytes)
[    0.428710] TCP bind hash table entries: 1024 (order: 0, 8192 bytes)
[    0.428710] TCP: Hash tables configured (established 1024 bind 1024)
[    0.432616] UDP hash table entries: 256 (order: 0, 8192 bytes)
[    0.433593] UDP-Lite hash table entries: 256 (order: 0, 8192 bytes)
[    0.439452] NET: Registered protocol family 1
[    0.453124] Unpacking initramfs...
[    0.874999] Freeing initrd memory: 4960K
[    0.876952] Using epoch 2000 for rtc year 18
[    0.881835] platform rtc-alpha: rtc core: registered rtc-alpha as rtc0
[    0.889647] Initialise system trusted keyrings
[    0.893554] workingset: timestamp_bits=46 max_order=14 bucket_order=0
[    0.913085] zbud: loaded
[    1.664061] Key type asymmetric registered
[    1.664061] Asymmetric key parser 'x509' registered
[    1.665038] Block layer SCSI generic (bsg) driver version 0.4 loaded
(major 248)
[    1.666991] io scheduler noop registered
[    1.667967] io scheduler deadline registered
[    1.668944] io scheduler cfq registered (default)
[    1.668944] io scheduler mq-deadline registered
[    1.672850] isapnp: Scanning for PnP cards...
[    2.046873] isapnp: No Plug & Play device found
[    2.048827] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    2.075194] serial8250: ttyS0 at I/O 0x3f8 (irq = 4, base_baud =
115200) is a 16550A
[    2.103514] serial8250: ttyS1 at I/O 0x2f8 (irq = 3, base_baud =
115200) is a 16550A
[    2.107420] Linux agpgart interface v0.103
[    2.120116] serio: i8042 KBD port at 0x60,0x64 irq 1
[    2.121092] serio: i8042 AUX port at 0x60,0x64 irq 12
[    2.125975] mousedev: PS/2 mouse device common for all mice
[    2.127928] ledtrig-cpu: registered to indicate activity on CPUs
[    2.131834] NET: Registered protocol family 10
[    2.264647] input: AT Translated Set 2 keyboard as
/devices/platform/i8042/serio0/input/input0
[...]

Maybe the warning at init/main.c:650 is useful for your real hw?

Regards,

Phil.


Re: Use SMP kernel for Alpha (udeb) builds

Frank Scheiner
Hi Philippe,

On 12/9/18 19:12, Philippe Mathieu-Daudé wrote:
> FYI I've added few tests to QEMU to avoid regressions, one is booting
> the DP264 machine (not yet merged, the specific test is here:)

Wow, I didn't know Alpha emulation was already that good with QEMU!

> https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg03082.html
>
> I tested a recent Debian SMP kernel and got:
>
> alpha-softmmu/qemu-system-alpha \
>    -kernel vmlinuz-4.18.0-3-alpha-generic \

This is the non-SMP kernel, but I assume that is the one you actually meant.

>    -append console=srm -initrd initrd.gz \
>    -nographic -net nic -net user -d mmu,unimp \
>    -drive file=debian-503-alpha-businesscard.iso,if=ide,media=cdrom
> PCI: 00:00:0 class 0300 id 1013:00b8
> PCI:   region 0: 10000000
> PCI:   region 1: 12000000
> PCI: 00:01:0 class 0200 id 8086:100e
> PCI:   region 0: 12020000
> PCI:   region 1: 0000c000
> PCI: 00:02:0 class 0101 id 1095:0646
> PCI:   region 0: 0000c040
> PCI:   region 1: 0000c048
> PCI:   region 3: 0000c04c
> [    0.000000] Linux version 4.18.0-3-alpha-generic
> ([hidden email]) (gcc version 7.3.0 (Debian 7.3.0-30))
> #1 Debian 4.18.20-2 (2018-11-23)
> [    0.000000] bootconsole [srm0] enabled
> [    0.000000] Booting GENERIC on Tsunami variation Clipper using
> machine vector Clipper from SRM
> [...]
> [    0.003906] ------------[ cut here ]------------
> [    0.004882] WARNING: CPU: 0 PID: 0 at
> /build/linux-kQe68U/linux-4.18.20/init/main.c:650 start_kernel+0x4dc/0x754
> [    0.004882] Interrupts were enabled early
> [    0.004882] Modules linked in:
> [    0.005859] CPU: 0 PID: 0 Comm: swapper Not tainted
> 4.18.0-3-alpha-generic #1 Debian 4.18.20-2
> [    0.006835]        fffffc00018f3dc8 fffffc000216ee70 fffffc000103597c
> fffffc0001898ddc
> [    0.007812]        fffffc00010359f4 fffffc00018ce1b0 fffffc0002171704
> fffffc000216ee70
> [    0.007812]        fffffc000216ee70 0000000000000000 000000000000028a
> fffffc0001898ddc
> [    0.007812]        fffffc0001898ddc 0000000000000000 fffffc000173e371
> fffffc00018f3e88
> [    0.008789]        fffffc0000000018 fffffc000216ee70 0000000000000000
> 0000000000000001
> [    0.008789]        fffffc00018acab8 0000000000000001 0000000000000000
> 0000000000000000
> [    0.008789] Trace:
> [    0.009765] [<fffffc000103597c>] __warn+0x15c/0x180
> [    0.009765] [<fffffc00010359f4>] warn_slowpath_fmt+0x54/0x70
> [    0.009765] [<fffffc000101001c>] _stext+0x1c/0x20
> [    0.009765] [<fffffc0001010000>] _stext+0x0/0x20
> [    0.010742]
> [    0.010742] ---[ end trace c85a0517f87d04be ]---
> [...]
> [    2.127928] ledtrig-cpu: registered to indicate activity on CPUs
> [    2.131834] NET: Registered protocol family 10
> [    2.264647] input: AT Translated Set 2 keyboard as
> /devices/platform/i8042/serio0/input/input0
> [...]
>
> Maybe the warning at init/main.c:650 is useful for your real hw?

Maybe Michael and Bob can make something of this. The problem is that
we don't actually get that far on real hardware with the non-SMP
kernel. All machines I have tested so far either (1) fall back to SRM or
(2) seem to hang (DS25/ES45 and most likely other Titan-based systems)
after aboot starts the kernel.
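
For anyone who wants to look at just the warning without wading through the full boot log, here is a minimal Python sketch that pulls every block between a "cut here" line and the matching "end trace" line out of a saved console log. The function name and the shortened sample log are made up for illustration; only the marker lines are taken from the output quoted above.

```python
import re

# Marker lines as they appear in the boot log quoted in this thread.
CUT_HERE = re.compile(r"-+\[ cut here \]-+")
END_TRACE = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def extract_warning_blocks(log_text):
    """Return a list of blocks; each block is the list of lines from a
    'cut here' marker up to and including the next 'end trace' marker."""
    blocks = []
    current = None
    for line in log_text.splitlines():
        if CUT_HERE.search(line):
            current = [line]
        elif current is not None:
            current.append(line)
            if END_TRACE.search(line):
                blocks.append(current)
                current = None
    return blocks


# Shortened sample assembled from lines of the log above.
sample_log = """\
[    0.000000] NR_IRQS: 32784
[    0.003906] ------------[ cut here ]------------
[    0.004882] Interrupts were enabled early
[    0.010742] ---[ end trace c85a0517f87d04be ]---
[    0.022460] Console: colour VGA+ 80x25
"""

blocks = extract_warning_blocks(sample_log)
for block in blocks:
    print("\n".join(block))
```

Because the markers are searched for anywhere in the line, this should also work on a log that carries the `> ` quoting prefix of a mail reply.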

Cheers,
Frank

P.S.
BTW, that Avocado stuff looks interesting. I wonder if it could also
be used to verify (Linux kernel) bootups on real hardware. That would
definitely make testing new kernel versions easier.
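
To make the idea a bit more concrete: independent of Avocado (whose API I have not looked at, so nothing below uses it), the core of such a check could be as simple as looking for expected milestones in a captured serial console log. A rough Python sketch follows; the function name and the choice of milestones are my own assumptions, while the marker strings come from the boot messages quoted in this thread.

```python
def first_missing_milestone(console_log, milestones):
    """Return the first milestone that does not appear (in order) in the
    console log, or None if all of them were seen."""
    position = 0
    for marker in milestones:
        index = console_log.find(marker, position)
        if index < 0:
            return marker
        position = index + len(marker)
    return None


# Milestones picked from the QEMU boot log quoted above (an arbitrary choice).
MILESTONES = ["Linux version", "Unpacking initramfs", "Freeing initrd memory"]

# A console capture that stops right after aboot hands over to the kernel.
hung_after_aboot = "aboot: starting kernel network with arguments root=/dev/nfs\n"

# A shortened capture of a boot that gets through initramfs unpacking.
booted = (
    "[    0.000000] Linux version 4.18.0-3-alpha-generic\n"
    "[    0.453124] Unpacking initramfs...\n"
    "[    0.874999] Freeing initrd memory: 4960K\n"
)

print(first_missing_milestone(hung_after_aboot, MILESTONES))
print(first_missing_milestone(booted, MILESTONES))
```

On real hardware the hard part is of course capturing the console and power-cycling the machine, which this sketch does not address.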


Re: Use SMP kernel for Alpha (udeb) builds

Frank Scheiner
In reply to this post by Bob Tracy
Hi Bob, Michael,

On 12/8/18 21:03, Bob Tracy wrote:

> On Sat, Dec 08, 2018 at 07:41:15PM +0100, Frank Scheiner wrote:
>> On 12/8/18 15:05, Bob Tracy wrote:
>> So can we assume `CONFIG_ALPHA_GENERIC=y` also activates
>> `CONFIG_ALPHA_LEGACY_START_ADDRESS`?
>
> I wouldn't assume so, particularly for the Gentoo kernel source tree to
> whatever extent it differs from the kernel.org source tree.
>
> What the dependency is saying is, you can't have the legacy start address
> config option force-enabled unless you're building a generic kernel.

Thanks for the clarification.

> Otherwise, the (alpha) processor-specific config options presumably
> dictate whether the legacy start address is used.  This is, I think,
> why Gentoo includes a generic+lsa kernel and a generic+nolsa kernel in
> their install image.

Not helpful for our problem, but say, does the generic+nolsa kernel also
boot on your PWS? I'd actually expect it to work.

Because if that lsa is really only needed for older bootloaders (as
mentioned in [1]), using a nolsa kernel on an older Alpha with a current
bootloader shouldn't be a problem.

[1]: https://cateee.net/lkddb/web-lkddb/ALPHA_LEGACY_START_ADDRESS.html

But then I don't understand why Gentoo "today" still needs two different
kernels.
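
To compare kernels on the options discussed in this thread, one could read them from each kernel's config file (Debian installs it as `/boot/config-<version>`; I don't know where Gentoo's install image keeps it). A small Python sketch, with helper names of my own choosing and two made-up config fragments that are not taken from any real kernel package:

```python
OPTIONS_OF_INTEREST = (
    "CONFIG_ALPHA_GENERIC",
    "CONFIG_ALPHA_LEGACY_START_ADDRESS",
    "CONFIG_SMP",
)


def parse_kernel_config(text):
    """Map option name -> value; '# CONFIG_X is not set' lines map to 'n'."""
    options = {}
    suffix = " is not set"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def summarize(text):
    """Report the options of interest; 'absent' means no line mentions it."""
    options = parse_kernel_config(text)
    return {name: options.get(name, "absent") for name in OPTIONS_OF_INTEREST}


# Hypothetical fragments for illustration only.
example_a = """\
CONFIG_ALPHA_GENERIC=y
CONFIG_ALPHA_LEGACY_START_ADDRESS=y
CONFIG_SMP=y
"""
example_b = """\
CONFIG_ALPHA_GENERIC=y
# CONFIG_ALPHA_LEGACY_START_ADDRESS is not set
# CONFIG_SMP is not set
"""

for label, text in (("a", example_a), ("b", example_b)):
    print(label, summarize(text))
```

Running this against the real config files of the Debian generic, Debian SMP and both Gentoo kernels would show at a glance which combinations we have actually tested.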

> Just to be clear, Gentoo's generic kernel *does* have SMP configured, and
> *with* the legacy start address enabled should boot just fine on your PWS
> as it does on mine.

Yes, I expect that, too. But if SMP support really plays a part in our
problem, then the Gentoo kernels (both being SMP capable) cannot provide
"new" information for our problem.

BTW, the patches applied by Gentoo for a slightly newer kernel (4.14.72)
are available on [2].

[2]: https://dev.gentoo.org/~mpagano/genpatches/patches-4.14-72.html

> The kernel version is 4.14(.65).

I didn't have the time yesterday, so I tested the Debian generic kernel
on my ES45 today. It behaves like the DS25, i.e. it seems to hang after
aboot starts the kernel:

As I'm still missing an "MMJ to whatever" adapter, I "copied" this from
a glass console:
```
[...]
bootstrap code read in
base = 2fc000, image_start = 0, image_bytes = 90b86c(9484396)
initializing HWRPB at 2000
initializing page table at ffff0000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code
aboot: Linux/Alpha SRM bootloader version 1.0_pre20040408
aboot: switching to OSF/1 PALcode version 1.92
aboot: loading initrd (4874860 bytes/9522 blocks) at 0xfffffc00ffb46000
aboot: starting kernel network with arguments root=/dev/nfs
ip=:::::eth0:dhcp console=tty1 net.ifnames=0 biosdevname=0
```
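
Since the above was typed off by hand, a quick sanity check of the numbers is cheap. The sketch below (Python) checks that the hexadecimal and decimal image sizes agree and that the initrd block count matches the byte count. The 512-byte block size is my assumption; the transcript does not state it.

```python
# Values copied from the console transcript above.
image_bytes_hex = "90b86c"
image_bytes_dec = 9484396
initrd_bytes = 4874860
initrd_blocks = 9522

BLOCK_SIZE = 512  # assumption: aboot counts 512-byte blocks

# The hex and decimal image sizes should be the same number.
hex_matches = int(image_bytes_hex, 16) == image_bytes_dec

# Round the initrd size up to whole blocks (integer ceiling division).
computed_blocks = (initrd_bytes + BLOCK_SIZE - 1) // BLOCK_SIZE

print(hex_matches, computed_blocks == initrd_blocks)  # prints "True True"
```

Both checks pass, so at least these figures were copied correctly.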

Another thing to note: Pushing (and releasing) the halt button on the
ES45's OCP has no effect afterwards (I haven't yet checked that on my
DS25). Doing the same when the Debian SMP kernel has started and the OS
is running returns me immediately to the SRM prompt. So this mechanism
seems to be broken by loading the Debian generic kernel.

Cheers,
Frank