Resuming discussion on Runtime-Depends [Was: Bug#804624: please improve support for installing foreign packages to chroots and add DPKG_ROOT]


Helmut Grohne
Hi Guillem,

I recently proposed restarting the discussion on Runtime-Depends, and
you asked me to follow up on #804624. Let me first pick up where you
left off and then move on to possible next steps for Runtime-Depends.

On Sun, Apr 17, 2016 at 11:23:26PM +0200, Guillem Jover wrote:

> On Wed, 2016-03-30 at 08:48:45 +0200, Helmut Grohne wrote:
> > I do have an answer to the absence of Maint-Depends now: Also add
> > Runtime-Depends. Then Depends would simply mean both Maint-Depends and
> > Runtime-Depends like Build-Depends means both Build-Depends-Arch and
> > Build-Depends-Indep. I note that even without the rest of the changes,
> > the splitting of Depends would make deity (or at least Don) a little
> > happier.
>
> I'm not sure why it would make deity happier, it would still need to
> satisfy both when installing stuff. Also the rpm equivalent has
> instances for pre and post, and in that scenario Pre-Depends might
> also deserve splitting I guess, which means a myriad of new fields. :/

Not exactly. A core idea of Runtime-Depends is that you'd need them to
use a package, but you'd not need them to run maintainer scripts.
Essentially, that means that it becomes easier to run maintainer
scripts, since any dependency loops that go via Runtime-Depends become
irrelevant and you can untangle the mess without breaking anything.

> If going the full rpm way, this also might imply in many cases duplicate
> information in multiple fields. Say you need package-x in postinst and
> prerm, then we'd need to include it twice in Maint-Depends-Preinst and
> Maint-Depends-Prerm for example. This does not happen currently as you
> list those packages once only in the weakest field necessary. Also I'm
> not sure how rpm scriptlets really work, but in our case our rollback
> mechanism in case of errors involves jumping from post to pre and the
> other way around, so having such fine grained separation does not seem
> worth it to me.

Let me express agreement with the conclusion. Another reason to reach
that conclusion is that these split-up dependencies would often be equal
as different scripts tend to use the same tools in practice.

> Even assuming a simple two-way split between Runtime and Maint
> dependencies has other potential issue, such as triggers which are
> out-of-band (and not always declarative). If the package manager
> frontends allowed to remove packages which are only maintscript
> dependencies then this would be a mess. Another similar case is
> disappearing packages which are also out-of-band events, another package
> might completely replace an existing one w/o the latter having any
> previous knowledge of that fact, and that's file-based so not something
> a frontend can predict. I can imagine that just removing maintscript-only
> dependencies might cause dependency issues.

There certainly are open issues with Maint-Depends. I have no answer on
what dependencies you need to run triggers.

> There are at least two main use cases for this split of the dependencies
> as you've mentioned: to make running the maintscripts from an external
> environment easier, and to be able to remove them in case of generating
> stripped down embedded images.

I am proposing that there is a third major use case. Reduced
dependencies for running maintainer scripts can make computing upgrade
sequences simpler.

Presently, all dependees must be configured before running postinst of
the depender. With the split, only regular Depends (and Maint-Depends)
would need to be configured for running postinst. After running the
script, the depender transitions to a new state until all
Runtime-Depends are met as well, at which point it is considered
configured (or has pending triggers).

> The first one I think is better served by trying to:
>
>   1) remove as many maintscripts as possible, via triggers for example,
>      or simply by making them unnecessary.

Work has progressed significantly on this since you wrote that, but it
is still in-progress and a number of scripts will likely not go away
soon.

>   2) split the installation bootstrap logic into a different
>      maintscript, as described in the InstallBootstrap spec.

No progress has been made on this, but the idea sounds sensible to me.
A little downside I see here is that one needs to maintain parts of the
maintainer scripts twice (which becomes less of an issue as we progress
on 1 and 3).

Indeed, I am wondering whether supporting chrootless upgrades is useful.
The additional maintscript would not support that, but narrowing the use
case also makes the implementation simpler. Unfortunately, I don't have
a good estimate on the maintenance cost of these additional scripts.

>   3) switch to a more declarative way of doing things.

Limited progress has been made here. You probably know more.

> Which I think would be a very welcomed initiative by the project at
> large.

Yes.

> The second depends on how much of a problem this really is. Do we know
> if this would avoid 5 packages or 100, and how much those would weigh
> in terms of space or transitive dependencies? Because depending on the
> size of the problem this becomes a non-argument IMO.

It is difficult to gather concrete numbers here. I guess that it mostly
helps with very small images. Let me give some examples:
 * It seems very likely that debconf could be made non-essential using
   this technique. Very few packages need debconf outside maintainer
   scripts.
 * lighttpd used to have a dependency on perl (not perl-base)
   exclusively for maintainer scripts. This dependency has since been
   removed though.
 * When shrinking embedded images, getting rid of perl-base provides
   noticeable savings. perl is used in a number of maintainer scripts.
   Converting them to not use perl does not appear to be a useful task.
   Converting runtime tools to not use perl seems more manageable. And
   of course, a prerequisite for doing this is getting rid of debconf.
 * udev .hwdb source files take up around 6MB. These are compiled during
   postinst. It would be nice to be able to drop the .hwdb source files
   and the hwdb compiler. Doing so would become possible if udev split
   these out to an extra package in Maint-Depends.

> This one in addition
> of being helped by some of the previous changes, could also be handled as
> simply informational annotations, such as a new field such as:
>
>   Package: core-package
>   Depends: libcore, tool-a, tool-b
>   Maint-Only-Depends: tool-b
>
> Which of course also has the problem of duplicated metadata, but is at
> least really non-intrusive with the dependency solvers and much of our
> tooling.

I guess that we have to do it this way anyway, because we need a
backwards-compatible upgrade path. Simply introducing Maint-Depends and
expecting everyone to honour it is not going to work. So it seems
likely that we'll have to duplicate dependencies to demote them to
maint-only or runtime-only. This is a bit unlike what we do with
Build-Depends-Arch/Indep.

> So I hope you understand that my overall ecosystem complexity alarms
> have all gone up. Which at the same time always feels bad because it
> seems like a reaction against progress (even if that might end up
> being misplaced :) !

Well yes, this is adding complexity of course.

> > dpkg-checkbuilddeps does not lock the external database. Why would dpkg
> > have to?
>
> I don't think this is the right question to counter that argument.
> It's a valid question on its own though. The point is that these are
> two separate universes. The installed system must preserve integrity
> at all costs, when that is lost your running system is broken and it
> might stop running at all, stop booting, etc. If the dependencies
> disappear while you are building, at most you get a broken build. You
> could always retry it and that does not affect the integrity of the
> system as long as you don't try to install broken packages for
> example.

The major use case for chrootless is operating on images. I'd compare it
more to package building than to operating on an installation. The
installation that you don't want to break is the outer one, not the
inner one.

> Is it dangerous to change the package state while building? Certainly!
> And we might want to perhaps run dpkg-checkbuilddeps after the build
> is finished and abort if the deps are not satisfied. This still leaves
> a big window in between where packages might have been removed and
> added back though.

I don't think this is a practical problem to solve (given that everyone
builds in sbuild or pbuilder) and I don't think it needs to be solved
for chrootless either, because we'll contain it in a similar way as
package builds.

> > What is the point in making different assumptions on dpkg and on
> > dpkg-checkbuilddeps? Both construct something external to the current
> > installation.
>
> Because in the dpkg chroot-less case we are still operating on the
> chroot contents so integrity is paramount. But see above.

I tend to disagree that integrity is important when working on chroots.
This can be a trade-off. When you chroot into a crashed system from a
rescue image, don't use chrootless.

> > Also from the point of creating small Debian installations, splitting
> > Depends into pieces would be preferable. rpm already does that:
> >
> > Requires(post): foo
> >
> > This would allow us to strip an essential installation of packages only
> > required for configuring packages.
> >
> > You see, I have a strong preference for allowing arbitrary packages to
> > be required from the outer installation.
>
> This is still very Debian specific, as in it requires the external
> environment to be a Debian system too. Even worse, ISTM it is even
> suite specific! Say you depend on a package conf-a >= 2.0 in suite 1.0
> from the maintscripts, but the external distribution with suite 2.0
> contains conf-a 2:1.0 (even though this would probably even cause
> problems on upgrades). Another actual case would be if the depends
> would be on something like git, which has been different things in
> Debian depending on the suite. And of course different derivatives, or
> distributions based on dpkg but not necessarily transitively on Debian
> do not share the Debian package-version namespace.

The more I've thought about this, the more I am convinced that the
external environment must be a system (or chroot itself) of matching
suite. The major use case is constructing operating system images or
special chroots.

I think it is safe to say that Maint-Depends does not have "obvious"
semantics. There are many questions left unanswered for it with
chrootless and without. It doesn't seem like we're going to make
progress on these soon. A prerequisite seems to be converting more
scripts and tools for DPKG_ROOT. Notably:
 * dpkg-maintscript-helper (wip branch)
 * update-alternatives (wip branch)
 * debconf (no clue)
 * base-files (#824594)
 * debhelper (systemd service starting)
 * add-shell/remove-shell (from debianutils)
 * glibc (#910685)
 * something about shadow

On the other hand the idea of Runtime-Depends seems way more obvious.
Let us initially pretend there was no transition necessary. We'd have
another field called Runtime-Depends with the same syntax as Depends. No
other field would get a Runtime- companion. Since Suggests and
Recommends don't have to be satisfied, they are irrelevant to the
discussion. A runtime variant of Pre-Depends simply doesn't make sense.
Any maintainer script can be run with just Depends installed, even if
Runtime-Depends are not. After scripts have been successfully run, the
package transitions to a new state that indicates it is waiting for
runtime dependencies. Once those are installed, it transitions to
triggers-awaited or installed.
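
To make the proposed transition concrete, here is a minimal sketch in
Python. It is not dpkg code: the state name runtime-awaited and the
functions configure and settle are hypothetical, and only unpacked,
triggers-awaited and installed correspond to existing dpkg states.

```python
from enum import Enum


class State(Enum):
    UNPACKED = "unpacked"
    RUNTIME_AWAITED = "runtime-awaited"  # hypothetical new state
    TRIGGERS_AWAITED = "triggers-awaited"
    INSTALLED = "installed"


def configure(depends_configured: bool) -> State:
    """Run postinst once Depends are configured.

    Runtime-Depends are deliberately not consulted here.
    """
    if not depends_configured:
        return State.UNPACKED
    return State.RUNTIME_AWAITED


def settle(state: State, runtime_depends_configured: bool,
           triggers_pending: bool) -> State:
    """Leave the waiting state once Runtime-Depends are met."""
    if state is not State.RUNTIME_AWAITED or not runtime_depends_configured:
        return state
    return State.TRIGGERS_AWAITED if triggers_pending else State.INSTALLED
```

A package whose Depends are configured moves to the waiting state right
after its postinst has run, and only settle() decides when it counts as
installed.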

In contrast to Maint-Depends, it is relatively simple to come up with
semantics for Runtime-Depends. The semantics are independent of
chrootless. With every stable release, we run into tricky upgrade
scenarios, often caused by dependency cycles. Runtime-Depends could help
break such cycles.
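
The cycle-breaking effect can be sketched as follows, assuming Python
3.9 or later for graphlib. The package names a and b and the function
configuration_order are made up for illustration. A real solver has to
handle alternatives, versions and Pre-Depends as well.

```python
from graphlib import CycleError, TopologicalSorter


def configuration_order(depends):
    """Return an order in which postinst scripts can run, or None on a cycle.

    `depends` maps each package to the set of packages that must be
    configured before its postinst runs.
    """
    try:
        return list(TopologicalSorter(depends).static_order())
    except CycleError:
        return None


# Today: a and b depend on each other, so no clean order exists.
today = {"a": {"b"}, "b": {"a"}}

# With the split: b only needs a at runtime, so that edge no longer
# constrains the order in which maintainer scripts are run.
split = {"a": {"b"}, "b": set()}

print(configuration_order(today))
print(configuration_order(split))
```

In the second graph, b can be configured first and a afterwards; b then
waits until a is configured before it is considered fully installed.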

Unfortunately, simply adding the field is not going to just work. We can
add it, but we cannot simply move dependencies to Runtime-Depends as
that would break backwards-compatibility. Likely the
Runtime-Only-Depends field you suggested earlier for the Maint variant
is the way to get there. Any element of Runtime-Only-Depends is
discarded from Depends if the implementation supports that. If
Runtime-Depends is added at the same time as Runtime-Only-Depends, a
second step could be removing Runtime-Only-Depends to complete the
transition (after multiple stable releases). A possibly faster route
could be requiring a versioned Pre-Depends on dpkg for using
Runtime-Depends.
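
A minimal sketch of that transition rule in Python follows. It assumes
that a clause discarded from Depends is then treated as a runtime
dependency, which the paragraph above implies but does not spell out.
The function names are hypothetical, and clauses are compared as opaque
strings without parsing versions or alternatives.

```python
def split_field(value):
    """Split a comma-separated dependency field into stripped clauses."""
    return [clause.strip() for clause in value.split(",") if clause.strip()]


def effective_fields(stanza, supports_runtime_depends):
    """Return (depends, runtime_depends) as seen by one implementation.

    An old implementation ignores the unknown fields and keeps Depends
    as is. A new one drops every Runtime-Only-Depends clause from
    Depends and (assumption) treats it as a runtime dependency.
    """
    depends = split_field(stanza.get("Depends", ""))
    if not supports_runtime_depends:
        return depends, []
    runtime = split_field(stanza.get("Runtime-Depends", ""))
    runtime_only = split_field(stanza.get("Runtime-Only-Depends", ""))
    depends = [d for d in depends if d not in runtime_only]
    runtime += [r for r in runtime_only if r not in runtime]
    return depends, runtime


# Illustrative stanza, modelled on the Maint-Only-Depends example above.
stanza = {
    "Package": "core-package",
    "Depends": "libcore, tool-a, tool-b",
    "Runtime-Only-Depends": "tool-b",
}

print(effective_fields(stanza, supports_runtime_depends=False))
print(effective_fields(stanza, supports_runtime_depends=True))
```

The old implementation still sees tool-b in Depends, so nothing breaks
for it, while the new one only requires libcore and tool-a before
running maintainer scripts.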

The big question seems to be: Is the projected Runtime-Depends feature
worth the effort on its own (without Maint-Depends)?

Helmut