Re: Bug#913709: boost1.67: intermitent FTBFS on mips64el: build hangs

Giovanni Mascellani-3
Hi,

On 14/11/18 09:56, Emilio Pozuelo Monfort wrote:

> Your package fails to build quite often on mips64el, where the build gets
> killed due to inactivity:
>
> Cannot find class named 'action'
> Cannot find class named 'action'
> Cannot find class named 'file-target'
> Cannot find class named 'generator'
> Cannot find class named 'generator'
> Cannot find class named 'std::bad_cast'
> E: Build killed with signal TERM after 150 minutes of inactivity
>
> This may be due to an actual hang, or to something just taking so long
> that the build gets killed.
Thanks for bringing this up. Actually, I was concerned about the same
thing, but I do not really know what the way forward is here.

The "Cannot find class" messages are harmless: they are produced on all
architectures and are not fatal. It is not a compiler that produces
them, but a documentation postprocessor. So the worst that can happen is
that some internal links in the documentation are broken or ignored.

Looking at [1] and comparing with [2] it seems that the compilation
takes much longer when compiled on a "Loongson 3A" machine than when
compiled on a "Cavium Octeon III" machine. MIPS porters, is this
sensible? The longer compilation time (apparently ~8.5h vs ~4.75h)
triggers the build node timeout. However this is probably a close call,
because version 1.67.0-6 managed to finish even when building on a
weaker machine.

 [1] https://buildd.debian.org/status/logs.php?pkg=boost1.67&arch=mips64el
 [2]
https://wiki.debian.org/MIPSPort?action=show&redirect=mips64el#Build_daemons_.26_porter_boxes

I am not sure what the way forward is here: can larger packages, like
boost, be forced to compile on stronger machines? Or can the timeout be
raised for larger packages? Or is there anything I can do within the
package build script to avoid triggering the timeout? (This last route
seems risky, though, as it might also prevent the timeout from
triggering for a build process that is actually stuck.) In principle
even the current situation can be tolerated to some extent, since it is
enough to reschedule the build until it lands on a strong machine.
However, that does not seem optimal.
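
To make the last option concrete, here is a minimal sketch of a wrapper that keeps the log active while a slow step runs, but still gives up after a fixed total runtime so that a genuinely stuck process is not kept alive forever. The function name `run_with_heartbeat` and all parameter names and defaults are hypothetical; nothing here is taken from the boost packaging or from the buildd software.

```python
import subprocess
import sys
import time


def run_with_heartbeat(cmd, interval=600.0, max_runtime=6 * 3600.0, poll=1.0):
    """Run cmd and print a heartbeat line every `interval` seconds.

    The heartbeat hides the child's silence from an inactivity watchdog,
    so this wrapper enforces its own cap: after `max_runtime` seconds the
    child is terminated. Returns the child's exit code, or None if the
    cap was hit. All defaults are illustrative assumptions.
    """
    start = time.monotonic()
    last_beat = start
    proc = subprocess.Popen(cmd)
    while True:
        try:
            # Returns the exit code as soon as the child finishes.
            return proc.wait(timeout=poll)
        except subprocess.TimeoutExpired:
            pass
        now = time.monotonic()
        if now - start >= max_runtime:
            # Give up: the step is treated as stuck.
            proc.terminate()
            proc.wait()
            return None
        if now - last_beat >= interval:
            sys.stdout.write(f"[heartbeat] still running after {now - start:.0f}s\n")
            sys.stdout.flush()
            last_beat = now
```

The explicit cap is what keeps this from defeating the purpose of the timeout: the wrapper trades the buildd's inactivity limit for a total-runtime limit chosen by the package.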

Thanks, Giovanni.
--
Giovanni Mascellani <[hidden email]>
Postdoc researcher - Université Libre de Bruxelles


Emilio Pozuelo Monfort-4
Hi Giovanni,

On 14/11/2018 10:29, Giovanni Mascellani wrote:

> Hi,
>
> On 14/11/18 09:56, Emilio Pozuelo Monfort wrote:
>> Your package fails to build quite often on mips64el, where the build gets
>> killed due to inactivity:
>>
>> Cannot find class named 'action'
>> Cannot find class named 'action'
>> Cannot find class named 'file-target'
>> Cannot find class named 'generator'
>> Cannot find class named 'generator'
>> Cannot find class named 'std::bad_cast'
>> E: Build killed with signal TERM after 150 minutes of inactivity
>>
>> This may be due to an actual hang, or to something just taking so long
>> that the build gets killed.
>
> Thanks for bringing this up. Actually, I was concerned about the same
> thing, but I do not really know what the way forward is here.
>
> The "Cannot find class" messages are harmless: they are produced on all
> architectures and are not fatal. It is not a compiler that produces
> them, but a documentation postprocessor. So the worst that can happen is
> that some internal links in the documentation are broken or ignored.
>
> Looking at [1] and comparing with [2] it seems that the compilation
> takes much longer when compiled on a "Loongson 3A" machine than when
> compiled on a "Cavium Octeon III" machine. MIPS porters, is this
> sensible? The longer compilation time (apparently ~8.5h vs ~4.75h)
> triggers the build node timeout. However this is probably a close call,
> because version 1.67.0-6 managed to finish even when building on a
> weaker machine.
>
>  [1] https://buildd.debian.org/status/logs.php?pkg=boost1.67&arch=mips64el
>  [2]
> https://wiki.debian.org/MIPSPort?action=show&redirect=mips64el#Build_daemons_.26_porter_boxes
>
> I am not sure what the way forward is here: can larger packages, like
> boost, be forced to compile on stronger machines? Or can the timeout be
> raised for larger packages? Or is there anything I can do within the
> package build script to avoid triggering the timeout? (This last route
> seems risky, though, as it might also prevent the timeout from
> triggering for a build process that is actually stuck.) In principle
> even the current situation can be tolerated to some extent, since it is
> enough to reschedule the build until it lands on a strong machine.
> However, that does not seem optimal.

Yeah, it's not.

Some questions which may help solve this:

- What is happening when the build hangs? Is xsltproc still running, just being
too slow / taking way too long?
- Note that the inactivity timeout gets triggered only if there are no new lines
printed in the timeout period. So can xsltproc or whatever is getting killed be
made more verbose?
- Is this just generating documentation? The -doc package is architecture: all,
can't we just skip building the docs on binary-arch builds?
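
To illustrate the inactivity rule from the second question: what matters is the longest gap between two printed lines, not the total build time. The sketch below is only a model of that rule, not the buildd's actual implementation; the function name and the timestamp representation are assumptions, and the 150-minute default is taken from the log message quoted above.

```python
def killed_for_inactivity(output_times, end_time, timeout=150 * 60):
    """Model an inactivity watchdog.

    output_times: sorted timestamps (seconds since build start) at which
                  the build printed a line.
    end_time:     when the build would finish if left alone.
    Returns the time at which the build would be killed, or None if it
    never stays silent for longer than `timeout` seconds.
    """
    last = 0.0
    for t in list(output_times) + [end_time]:
        if t - last > timeout:
            # The watchdog fires `timeout` seconds after the last line.
            return last + timeout
        last = t
    return None
```

Under this model, an 8.5-hour build that prints a line every hour survives, while a much shorter build with a single three-hour silent step does not.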

Cheers,
Emilio

Giovanni Mascellani-3
Hi,

On 14/11/18 18:16, Emilio Pozuelo Monfort wrote:
> Some questions which may help solve this:
>
> - What is happening when the build hangs? Is xsltproc still running, just being
> too slow / taking way too long?

I believe so, since apparently, even on the slow builders, given enough
time it completes successfully. I haven't checked this first-hand, though.

> - Note that the inactivity timeout gets triggered only if there are no new lines
> printed in the timeout period. So can xsltproc or whatever is getting killed be
> made more verbose?

I can try. But probably the third one is the most important point.

> - Is this just generating documentation? The -doc package is architecture: all,
> can't we just skip building the docs on binary-arch builds?

Fair point. That definitely might be something to fix in the
packaging. I'll look into it. However, I will be out of town and
without my usual computing resources for a few days (more or less
until next Tuesday), so I am not sure I will be able to do this
quickly.
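
As a rough sketch of what skipping the documentation on binary-arch builds amounts to: the arch-specific build should only run the steps its packages need, and documentation generation belongs to the arch-independent build. In a real package this split lives in the `build-arch` and `build-indep` targets of debian/rules; the Python below only models the decision, and the function name and step names are hypothetical.

```python
def plan_build_steps(target):
    """Return the steps a given build target should run.

    'build-arch', 'build-indep' and 'build' mirror the debian/rules
    target names; the step names are made up for illustration.
    """
    arch_steps = ["compile-libraries"]
    indep_steps = ["generate-docs"]  # only needed for the -doc package
    if target == "build-arch":
        return arch_steps
    if target == "build-indep":
        return indep_steps
    if target == "build":
        return arch_steps + indep_steps
    raise ValueError(f"unknown build target: {target!r}")
```

With such a split, an arch-only build on a slow mips64el machine would never reach the documentation step at all.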

Thanks, Giovanni.
--
Giovanni Mascellani <[hidden email]>
Postdoc researcher - Université Libre de Bruxelles

