Concluding "What should happen when maintscripts fail to restart a service?"

Sean Whitton
tag 802501 + wontfix
user [hidden email]
usertag 802501 = normative stalled
thanks

Hello,

In #904558 I asked the T.C. for advice about how to move #802501
forward.  Their ultimate response was to recommend that a working group
of developers come up with some method, other than exiting nonzero, for
a maintscript to indicate that it failed to restart services.  Let me
take this opportunity to thank all those who were involved in #904558.

In this message, I seek to explain my understanding of what the closing
of T.C. bug #904558 means for debian-policy bug #802501, and those
merged with it.  Apologies for the length.  I wanted this general sort
of reasoning to be recorded somewhere for reference in future
discussions.

~ ~ ~

When the Policy Changes Process fails to establish consensus, we have a
few options.  If we think that consensus hasn't been established only
because no-one has volunteered to come up with an adequately detailed
response to the problem uncovered by the filing and discussion of the
bug, and the bug has been open for a while with no evidence of anyone
working on it, we (the Policy Editors) will often just close the bug.
We don't want such things to stick around, clogging up the list of open
issues in a way that's demotivating.  This is the 'obsolete' usertag.

If we think that consensus hasn't been established because there are
good arguments on all sides, but we (the Policy Editors) additionally
think that argument to determine the very best solution is less
important right now than settling on one of the possible solutions
rather than remaining in further discussion, then we can refer the bug
to the T.C. to make a call between the competing options.  This was, I
think, the intended purpose of the 'ctte' usertag, but we haven't been
using it.

Finally, if we don't want to refer the bug to the T.C. -- generally
because it's not important enough -- but we think that closing the bug
would be counterproductive because someone else will just open a new bug
raising the same issue again at some near point in time, we can just
leave the bug open, as a kind of placeholder to hopefully reduce the
number of duplicate bugs filed.  I just added a 'stalled' usertag for
this case.

The 'obsolete', 'ctte' and 'stalled' usertags are meant to be used in
addition to the 'wontfix' tag.

~ ~ ~

In #904558, I did not ask the T.C. to rule on what maintscripts should
do when they fail to restart a service.  Rather, I asked them to weigh
in on the decision between the options described above, given that the
Policy Changes Process had failed to achieve consensus.  However, in the
message closing #904558, the T.C. indicated that they declined to issue
a ruling about what maintscripts should do when they fail to restart a
service.  So the second option described above, corresponding to the
'ctte' usertag, has been taken off the table.

That leaves us with the question of whether to leave #802501 open, in
the absence of the possibility of closing it by having the T.C. make a
call.  Given that this bug has already been filed (at least) twice, I
think it would be best for us to leave it open.  So I'm tagging
wontfix+stalled.

~ ~ ~

In filing #904558, I made an alternative suggestion to the above:

> As a Policy delegate I want to move this issue along, and I can see
> three ways of doing that:
>
> 1. write a patch to explicitly state in Policy that what happens when a
>    service (re)start fails in a maintscript is left up to package
>    maintainer discretion, and close the bugs
> [...]

I no longer think this would be useful enough to have in Policy, but I'd
like to hear from anyone who disagrees.

--
Sean Whitton


Re: Concluding "What should happen when maintscripts fail to restart a service?"

Gunnar Wolf via nm
Hello Sean,

> In #904558 I asked the T.C. for advice about how to move #802501
> forward.  Their ultimate response was to recommend that a working group
> of developers come up with some method, other than exiting nonzero, for
> a maintscript to indicate that it failed to restart services.  Let me
> take this opportunity to thank all those who were involved in #904558.
>
> In this message, I seek to explain my understanding of what the closing
> of T.C. bug #904558 means for debian-policy bug #802501, and those
> merged with it.  Apologies for the length.  I wanted this general sort
> of reasoning to be recorded somewhere for reference in future
> discussions.

Thank you for providing this framing and for helping us (me, at
least!) better understand the circumstances of your bug filing. I
probably should have read #802501 during the #904558
discussion (and it's a very short bug FWIW), but didn't. Understanding
the bug follow-up policy of the Policy Editors makes me more at ease
with what we (TC) decided — We were not the first ones to fail to find
an always-good solution :-|

Now, I would more than welcome these bugs being pushed to the right
areas: To d-devel, or to a new, specialized working group tackling the
issue. Both in the bugs and in our discussions, it is often repeated
(quoting here Sam, from #802501) «as a distribution, I think we should
explicitly encourage people to consider the consequences on
dist-upgrade and other operations». Inconsistently failing is *not*
OK, and nobody implied it that way...

So,

> The 'obsolete', 'ctte' and 'stalled' usertags are meant to be used in
> addition to the 'wontfix' tag.
>
> ~ ~ ~
>
> In #904558, I did not ask the T.C. to rule on what maintscripts should
> do when they fail to restart a service.  Rather, I asked them to weigh
> in on the decision between the options described above, given that the
> Policy Changes Process had failed to achieve consensus.  However, in the
> message closing #904558, the T.C. indicated that they declined to issue
> a ruling about what maintscripts should do when they fail to restart a
> service.  So the second option described above, corresponding to the
> 'ctte' usertag, has been taken off the table.
>
> That leaves us with the question of whether to leave #802501 open, in
> the absence of the possibility of closing it by having the T.C. make a
> call.  Given that this bug has already been filed (at least) twice, I
> think it would be best for us to leave it open.  So I'm tagging
> wontfix+stalled.

I want to interpret this wontfix+stalled, and the TC answer ("The
Technical Committee does not engage in design of new proposals and
policies"), as not meaning that this problem will just lie dormant and
unsolved forever. As Marga said in her mail closing this bug, «While we
recognize that this is a problem worth fixing, this is not something
that we can fix as a body and need the help of the Developers to do
it.»

I want to insist on our recommendation to create a work group of
developers to tackle this issue. Maybe we can start it off as a BoF
session in DC19?


Re: Concluding "What should happen when maintscripts fail to restart a service?"

Sean Whitton
Hello Gunnar,

On Thu 20 Jun 2019 at 11:18am -0500, Gunnar Wolf wrote:

>> ~ ~ ~
>>
>> In #904558, I did not ask the T.C. to rule on what maintscripts should
>> do when they fail to restart a service.  Rather, I asked them to weigh
>> in on the decision between the options described above, given that the
>> Policy Changes Process had failed to achieve consensus.  However, in the
>> message closing #904558, the T.C. indicated that they declined to issue
>> a ruling about what maintscripts should do when they fail to restart a
>> service.  So the second option described above, corresponding to the
>> 'ctte' usertag, has been taken off the table.
>>
>> That leaves us with the question of whether to leave #802501 open, in
>> the absence of the possibility of closing it by having the T.C. make a
>> call.  Given that this bug has already been filed (at least) twice, I
>> think it would be best for us to leave it open.  So I'm tagging
>> wontfix+stalled.
>
> I want to interpret this wontfix+stalled, and the TC answer ("The
> Technical Committee does not engage in design of new proposals and
> policies"), as not meaning that this problem will just lie dormant and
> unsolved forever. As Marga said in her mail closing this bug, «While we
> recognize that this is a problem worth fixing, this is not something
> that we can fix as a body and need the help of the Developers to do
> it.»
>
> I want to insist on our recommendation to create a work group of
> developers to tackle this issue. Maybe we can start it off as a BoF
> session in DC19?

My reading of the conclusion to #904558 is that the recommendation to
form a working group is a recommendation that can be directed only to
the developer body as a whole, not to the Policy process.  That's
because actually implementing in the archive some new mechanism for
maintscripts is a prerequisite to any Policy change requiring packages
to use that new mechanism.  In other words, what the working group would
be tasked with doing is beyond the scope of the Policy process.  We do
design work as part of the Policy process, but we don't write code.

Assuming that the T.C.'s recommendation is the right way to proceed
here, and someone doesn't come up with any other way to unblock things,
the wontfix+stalled status will remain until and unless the working
group actually forms, designs and implements something, and starts using
it in the archive.  There is no role for the Policy process (and thus no
role for the Policy Editors qua Policy Editors) until that occurs.

So, by all means insist on the recommendation, but so far as I can tell
that's something which does not involve either the Policy process or the
T.C., but simply the body of Debian contributors qua contributors.

Stepping back a bit, tagging this bug wontfix+stalled is part of the
broader attempts, in which the Policy Editors are engaged, to more
sharply delineate the boundaries of the Policy process.  We want to get
to the point where the only bugs that we have listed are either
highly actionable, or tagged wontfix.  For a bug to be highly actionable
is for it to be the case that someone with enough time and background
knowledge can sit down, think through the problem, and come up with at
least a first version of a change proposal.

I think that a large number of very-difficult-to-action bugs strongly
discourages people from getting involved.  It makes Policy seem like a
sprawling, unmanageable morass of difficult problems.  That isn't how
things are, because while there are indeed a lot of hard problems, they
are largely independent of each other, and tackling individual
debian-policy bugs really does improve Debian.  However, it is much
harder to see that when half of the open bugs are more than five years
old yet not tagged wontfix.

--
Sean Whitton


Re: Concluding "What should happen when maintscripts fail to restart a service?"

Gunnar Wolf via nm
Sean Whitton wrote [Fri, Jun 21, 2019 at 02:36:05PM +0100]:

> My reading of the conclusion to #904558 is that the recommendation to
> form a working group is a recommendation that can be directed only to
> the developer body as a whole, not to the Policy process.  That's
> because actually implementing in the archive some new mechanism for
> maintscripts is a prerequisite to any Policy change requiring packages
> to use that new mechanism.  In other words, what the working group would
> be tasked with doing is beyond the scope of the Policy process.  We do
> design work as part of the Policy process, but we don't write code.
>
> Assuming that the T.C.'s recommendation is the right way to proceed
> here, and someone doesn't come up with any other way to unblock things,
> the wontfix+stalled status will remain until and unless the working
> group actually forms, designs and implements something, and starts using
> it in the archive.  There is no role for the Policy process (and thus no
> role for the Policy Editors qua Policy Editors) until that occurs.
>
> So, by all means insist on the recommendation, but so far as I can tell
> that's something which does not involve either the Policy process or the
> T.C., but simply the body of Debian contributors qua contributors.

I completely agree with you. My idea to kickstart this at DC19 is not
for the TC and Policy Editors to lead a session, but rather for us (as
individuals) to raise the issue at a BoF, trying to get more eyeballs
(and, more importantly, more brains) on it.

> Stepping back a bit, tagging this bug wontfix+stalled is part of the
> broader attempts, in which the Policy Editors are engaged, to more
> sharply delineate the boundaries of the Policy process.  We want to get
> to the point where the only bugs that we have listed are either
> highly actionable, or tagged wontfix.  For a bug to be highly actionable
> is for it to be the case that someone with enough time and background
> knowledge can sit down, think through the problem, and come up with at
> least a first version of a change proposal.
>
> I think that a large number of very-difficult-to-action bugs strongly
> discourages people from getting involved.  It makes Policy seem like a
> sprawling, unmanageable morass of difficult problems.  That isn't how
> things are, because while there are indeed a lot of hard problems, they
> are largely independent of each other, and tackling individual
> debian-policy bugs really does improve Debian.  However, it is much
> harder to see that when half of the open bugs are more than five years
> old yet not tagged wontfix.

Right. This is a bug where I was quite happy that the TC decided to
declare it outside of its functions, and it is just fitting that it's
outside the scope of Policy as well. We don't have a commonly
implemented practice to document / show / follow. This should go to
the developer body at large.
