tag2upload (git-debpush) service architecture - draft


tag2upload (git-debpush) service architecture - draft

Ian Jackson-2
Hi all.

I wrote this draft design doc / deployment plan for the tag-to-upload
service, perhaps best summarised by Sean like this:

  We designed and implemented a system to make it possible for DDs to
  upload new versions of packages by simply pushing a specially
  formatted git tag to salsa.

  Please see this blog post to learn about how it works:
  https://spwhitton.name/blog/entry/tag2upload/

The server side of this is not running yet and there is some work to
do for that.

We've had a number of peripheral conversations, and informal
internal reviews, but I think it's the stage now to have a public
design review etc.  I'm CCing this to -devel because I just did a
lightning talk demo of the prototype and IME many people are
interested in these kinds of questions.

Right now this document is maintained here:
   https://salsa.debian.org/dgit-team/dgit/tree/wip.tag2upl-draft
but NB that that is a potentially rewinding branch.  (I probably won't
rewind it until it's time to fold it into master at which point I may
just delete it.)

Ian.


TAG-TO-UPLOAD - DEBIAN - DRAFT DESIGN / DEPLOYMENT PLAN
=======================================================

Overall structure and dataflow
------------------------------

 * Uploader (DD or DM) makes signed git tag (containing metadata
   forming instructions to tag2upload service)

 * Uploader pushes said tag to salsa. [1]

 * salsa sends webhook to tag2upload service.

 * tag2upload service
    : provides an HTTPS service accessible to salsa's IP addrs
    : fishes url and tag name out of webhook json
    ! checks that url is basically sane
    - retrieves tag data (git shallow clone)
    ! parses the tag metadata
    ! checks to see if it is relevant
    ! verifies signature
    ! checks to see if signed by DD, or DM for appropriate package
    - obtains relevant git history
    - obtains, if applicable, orig tarball from archive
    - makes source package
    # signs source package and "dgit view" git tag
    - pushes history and both tags to dgit git server
    - uploads source package to archive

 * archive publishes package as normal

[1] In principle other git servers would be possible but it would have
to be restricted to ones where we can either avoid, or stop, them
being used as a channel for a DoS attack against the tag2upload
service.
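
The first two steps of the service (fishing the url and tag name out of the webhook JSON, then checking that the url is basically sane) could be sketched roughly as follows.  This is only an illustration: the field names `object_kind`, `ref` and `project.git_http_url` are assumptions about what salsa's (GitLab's) tag-push webhook sends, and `extract_request` and `ALLOWED_HOST` are hypothetical names, not part of the design.

```python
import json
from urllib.parse import urlsplit

# Assumption: only salsa is an acceptable source of repositories.
ALLOWED_HOST = "salsa.debian.org"


def extract_request(body: bytes):
    """Return (url, tag) from a tag-push webhook body, or None if irrelevant.

    Raises ValueError if the url does not look sane.
    """
    event = json.loads(body)
    # Assumed GitLab payload fields; they are not named in the design doc.
    if event.get("object_kind") != "tag_push":
        return None
    ref = event.get("ref", "")
    if not ref.startswith("refs/tags/"):
        return None
    tag = ref[len("refs/tags/"):]
    url = event.get("project", {}).get("git_http_url", "")

    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname != ALLOWED_HOST:
        raise ValueError(f"refusing to fetch from {url!r}")
    if parts.username or parts.password:
        raise ValueError("credentials embedded in url")
    return url, tag
```

Rejecting everything except https urls on one known host is what keeps the service from being pointed at arbitrary servers, which is the DoS concern raised in footnote [1].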

Service architecture
--------------------

I propose the following architecture for the tag2upload service.

 * Packet filter limiting incoming connections to those from salsa.

 * Conventional webserver offering TLS and using Let's Encrypt.
   (Alternatively, HTTP could be used, but in the future we
   might want to handle embargoed security uploads so let's not.)

 * Web-service-style "application server" written in some scripting
   language listens on a local TCP port, handles HTTP connections
   proxied by the webserver, parses the JSON, and connects to:

 * Trusted service daemon.  Listens on a TCP socket and accepts a
   simple line-based "url tag" protocol.  Checks urls and tags for
   basic syntax and sanity (eg that it has the right protocol and
   host).  Keeps track of incoming requests in a sqlite3 database so
   that execution can be deferred and retried as applicable.  Spawns
   per-request worker children.

 * Request processor.  Trusted.  Does the trusted parts above.

 * Some VM or container or maybe chroot.  Instantiated by request
   processor via adt-virt protocol.  Request processor controls this
   by sending it commands (via the adt-virt facility for this).

 * In the VM, git is used to fetch all the bits and dgit does the
   actual source package generation work.

 * Trusted service daemon needs access to its GPG key which should be
   on a hardware token and not accessible to the VM instances.
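
As an illustration of the trusted service daemon's bookkeeping, here is a minimal sketch of parsing one "url tag" line and recording it in a sqlite3 table so that work can be deferred and retried.  The table layout, the status values, the retry limit and all function names are assumptions made for the example; the design above does not specify them.

```python
import sqlite3


def parse_request_line(line: str):
    """Parse one 'url tag' protocol line and apply basic sanity checks."""
    fields = line.rstrip("\n").split(" ")
    if len(fields) != 2 or not all(fields):
        raise ValueError("expected exactly: url tag")
    url, tag = fields
    # Assumption: right protocol and host means https on salsa.
    if not url.startswith("https://salsa.debian.org/"):
        raise ValueError("url has the wrong protocol or host")
    return url, tag


def open_queue(path=":memory:"):
    """Open (and if necessary create) the request database."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS requests (
               url      TEXT NOT NULL,
               tag      TEXT NOT NULL,
               status   TEXT NOT NULL DEFAULT 'queued',
               attempts INTEGER NOT NULL DEFAULT 0,
               UNIQUE (url, tag))"""
    )
    return db


def enqueue(db, url, tag):
    """Record a request; a repeated webhook for the same tag is ignored."""
    with db:
        db.execute(
            "INSERT OR IGNORE INTO requests (url, tag) VALUES (?, ?)",
            (url, tag),
        )


def next_request(db, max_attempts=5):
    """Return the oldest queued (url, tag) and count the attempt, or None."""
    with db:
        row = db.execute(
            "SELECT rowid, url, tag FROM requests"
            " WHERE status = 'queued' AND attempts < ?"
            " ORDER BY rowid LIMIT 1",
            (max_attempts,),
        ).fetchone()
        if row is None:
            return None
        db.execute(
            "UPDATE requests SET attempts = attempts + 1 WHERE rowid = ?",
            (row[0],),
        )
    return row[1], row[2]


def finish(db, url, tag, status):
    """Record the final disposition of a request (e.g. 'done', 'rejected')."""
    with db:
        db.execute(
            "UPDATE requests SET status = ? WHERE url = ? AND tag = ?",
            (status, url, tag),
        )
```

A request that fails transiently simply stays 'queued' and is handed out again by `next_request` until the attempt limit is reached.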

Privsep
-------

The tag2upload service will have to have a signing key that can upload
source packages to the archive.

We do not want that signing key to be abused.  In particular, even
though it will be in a hardware token, we want to avoid giving
unrestricted access to that key to code which also has a large attack
surface.  Source package construction is very complex, so it is such code.

So there will be a privilege separation arrangement, as described
above.  Different tasks run in different security contexts, using the
markers from the dataflow list above:

    ! is fully trusted and has access to the signing key

    - runs in the discardable VM or container, controlled by `!'

    # is achieved by the `dgit rpush' protocol, where the trusted
      (invoking, signing) part offers a restricted signing oracle to
      the less-trusted (building) part.  The signing oracle will check
      that the files to be signed are roughly in the right form and
      that they name the right source package.  It will construct the
      "dgit view" git tag itself from metadata provided by the
      building part.

    : can run as different unix users or even different VMs or
      something, if desirable
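
To make the "restricted signing oracle" idea concrete, the following sketch shows the kind of check the trusted side might apply before signing a .dsc offered by the building side.  It is not the `dgit rpush' implementation; `check_dsc_for_signing` is a hypothetical name and the exact set of checks is an assumption.

```python
def check_dsc_for_signing(dsc_text, expected_source, expected_version):
    """Refuse to sign unless the .dsc is roughly well formed and names
    the source package and version that the verified tag asked for."""
    if dsc_text.lstrip().startswith("-----BEGIN PGP"):
        raise ValueError("file is already signed")
    if "\n\n" in dsc_text.strip():
        raise ValueError("more than one paragraph")

    fields = {}
    for line in dsc_text.splitlines():
        # Skip continuation lines (they start with whitespace).
        if line[:1] in (" ", "\t") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        name = name.strip().lower()
        if name in fields:
            raise ValueError(f"duplicate field {name!r}")
        fields[name] = value.strip()

    if fields.get("source") != expected_source:
        raise ValueError("wrong source package name")
    if fields.get("version") != expected_version:
        raise ValueError("wrong version")
```

The expected name and version come from the tag the trusted side has already verified, so a compromised builder cannot get a signature on a different package.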

Reproducibility, metadata and auditing
--------------------------------------

The trusted part of the tag2upload service will keep some logs,
particularly of each tag it is told about and what the disposition of
that was, and when it was retried.

Also, it will send the following information to a public mailing list:
  - The tag object data for any tag it decides to process,
     before it passes it to the VM.
  - A report (more or less, a shell transcript)
     of each processing attempt
  - The list will also be the public email address of the
     tag2upload robot's signing key

The generated .dscs will contain these additional fields:

  Git-Tag-Tagger: Firstname Surname <email@address>

      "tagger" line from the git tag converted to deb822 format

  Git-Tag-Info: tag=<tagobjid> fp=<fingerprint> algos=1,8

      <tagobjid> is the git object ID of the tag object
          (if someone wants to find this, it can be found on the
           dgit git server)

      <fingerprint> is the "fingerprint_in_hex" from the VALIDSIG line
      in the gpgv output.  algos is the <pubkey-algo> and <hash-algo>
      (here, 1,8 as examples).
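
As a sketch of how the Git-Tag-Info value could be assembled, the function below picks the fingerprint and the two algorithm numbers out of gpgv's machine-readable status output.  It assumes the field order of the VALIDSIG status line as documented by GnuPG (fingerprint first, pubkey-algo and hash-algo as the seventh and eighth arguments); `git_tag_info` is a hypothetical helper name.

```python
def git_tag_info(tag_objid: str, gpgv_status: str) -> str:
    """Build the Git-Tag-Info value from `gpgv --status-fd` output."""
    for line in gpgv_status.splitlines():
        words = line.split()
        if words[:2] == ["[GNUPG:]", "VALIDSIG"]:
            args = words[2:]
            fingerprint = args[0]
            # Assumed positions, per GnuPG's status line documentation.
            pubkey_algo, hash_algo = args[6], args[7]
            return (f"tag={tag_objid} fp={fingerprint} "
                    f"algos={pubkey_algo},{hash_algo}")
    raise ValueError("no VALIDSIG line in gpgv status output")
```

With pubkey-algo 1 (RSA) and hash-algo 8 (SHA-256) this yields the `algos=1,8` shown in the example field above.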

This additional metadata is necessary to be able to tell by looking at
the .dsc who the original uploader was (which might be different to
the maintainer, in the sponsorship case).  (Programs which use the
uploader signature identity will send mails to the mailing list
mentioned above, until they have been updated.  This is not desirable
but not a blocker for deployment.)

The generated .changes will contain copies of the two .dsc fields
above.

The upload will contain a .source_buildinfo.  This will list the
versions of the software running in the VM, which is primarily what
controls the generated .dsc.

It will also list the versions of dgit-infrastructure and git running
in the trusted part, because the trusted part assembles the tag lines
etc. and interprets the git tag.

Eventually hopefully there will be a mode for sbuild (related to
binary build reproduction), or a suitable script, which can verify a
reproduction attempt.  For now the src:dgit test suite will check that
the upload is reproducible if run again in the same environment.

DoS
---

This service is not very resistant to DoS attacks.  In particular,
sending it bad URLs might stall it (since it has to retry failing
URLs).

So we (i) do not expose it to anyone but salsa and (ii) limit it to
trying to fetch salsa urls.

Making very many tags on salsa would stress this tag2upload service a
bit but not fatally, and it would be a DoS against salsa too.

After signature verification, we are much more vulnerable to DoS.  An
approved signer can get the service to do a lot of work.  That is the
purpose of the service, indeed.

--
Ian Jackson <[hidden email]>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.


Re: tag2upload (git-debpush) service architecture - draft

Bastian Blank
Hi Ian

On Wed, Jul 24, 2019 at 02:56:22AM +0100, Ian Jackson wrote:
> We've had a number of peripheral conversations, and informal
> internal reviews, but I think it's the stage now to have a public
> design review etc.  I'm CCing this to -devel because I just did a
> lightning talk demo of the prototype and IME many people are
> interested in these kinds of questions.

We discussed a bit within the ftp team and several points came up.  The
following describes my interpretation of it:

The archive will need to do the final validation to check if an upload
is accepted.  The uploader's signature would need to be added to the
source package so that its validity can also be checked in the future.  We
already retain all user signatures of source packages in the archive and
such a proposed service must provide the same level of possible
verification.

The signature needs to be collision resistant and needs to be verifiable
with only the stuff included in the source package.  The git object
checksums don't suffice anymore due to SHA1.  And as the world moves
towards SHA3, it will need to have the ability to follow.  The output of
all operations obviously needs to be reproducible to be signed.

I don't know if any of this requires a new dpkg source format to
implement properly.

The service still might need credentials of its own, but no permissions
will be attached to it.  And whatever you do, don't use Perl as
implementation language.

I would like to have such a service.  However it would have been nice
for you to talk about the verification requirements before you ask for a
key and a way to circumvent the archive upload checks and restrictions.

Regards,
Bastian

--
        "We have the right to survive!"
        "Not by killing others."
                -- Deela and Kirk, "Wink of An Eye", stardate 5710.5


Re: tag2upload (git-debpush) service architecture - draft

Rebecca N. Palmer-2
As a way to avoid relying on SHA-1, would it work to have git-debpush
include a longer hash in the tag message, and tag2upload also verify
that hash?


Re: tag2upload (git-debpush) service architecture - draft

Bernd Zeimetz
On 7/27/19 8:16 PM, Rebecca N. Palmer wrote:
> As a way to avoid relying on SHA-1, would it work to have git-debpush
> include a longer hash in the tag message, and tag2upload also verify
> that hash?

what exactly would you create that long hash of?

If we don't trust sha-1, then we might also not be able to trust the
linked list of commits a git tag is pointing to.


--
 Bernd Zeimetz                            Debian GNU/Linux Developer
 http://bzed.de                                http://www.debian.org
 GPG Fingerprint: ECA1 E3F2 8E11 2432 D485  DD95 EB36 171A 6FF9 435F


Re: tag2upload (git-debpush) service architecture - draft

Rebecca N. Palmer-2
On 28/07/2019 10:58, Bernd Zeimetz wrote:
> On 7/27/19 8:16 PM, Rebecca N. Palmer wrote:
>> As a way to avoid relying on SHA-1, would it work to have git-debpush
>> include a longer hash in the tag message, and tag2upload also verify
>> that hash?
>
> what exactly would you create that long hash of?

The signer's local files when they run git-debpush.  (To be decided: how
to define the hash of a directory tree (as opposed to a single file),
i.e. "tar | sha256 like a .dsc" or "what git uses but sha256".)

The hash security is for ensuring that tag2upload is seeing the same
content as the signer did, and not something different an attacker
placed on Salsa.  (If the attacker can get their changes into the
signer's local copy without the signer noticing, we'd have a problem
whatever method the signer uses to upload it.)

This does sort of raise the question of why not prefer "keep .dscs, but
hide them from the user and regenerate tarballs", but this might be
inappropriately reopening an already decided issue.  (I remember it
being suggested before, but not what (if any) response this got.)

(+/=/- are relative to the existing proposal)
+ Security: dak doesn't have to trust dgit-repos-server
  (avoids both weak hashes and potential bugs)
+ Compatibility: finding the signer's name from the .dsc still works
= Uploader only needs to do 'git debpush'
= Doesn't spend uploader's (possibly low/expensive) bandwidth on
uploading what Salsa already has
- Someone would have to implement it
  (if that's me - not in Perl and I'm not a DD or a security specialist)

git-debpush:
     create .dsc # as normal
     create tag # as normal, only needs version number
     sign tag # not strictly required, but since the next step
     # needs a key anyway, good to automate best practice
     sign .dsc
     push tag to Salsa
     upload .dsc to dgit-repos-server # but not its tarballs

dgit-repos-server --tag2upload:
     receive .dsc
     check .dsc signature # do this first to prevent DoS
     # maybe also check the version number to prevent DoS by
     # re-submitting old/non-Debian .dscs
     fetch source from Salsa
     create source package tarballs
     check if these match the .dsc hashes # not strictly required as
     # dak will do it again anyway, but easy
     dput the .dsc+tarballs # as normal

# not sure where .changes fits into this:
# replace ".dsc" by ".dsc+.changes" throughout?
# or have dgit-repos-server create .changes as if it were a buildd?
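
The "check if these match the .dsc hashes" step could look something like the sketch below, which reads the Checksums-Sha256 field of a .dsc and compares each listed size and digest with the regenerated files.  The function names are hypothetical, and signature checking is deliberately left out (it would have happened earlier in the sequence above).

```python
import hashlib
import os


def parse_sha256_entries(dsc_text):
    """Return [(filename, size, sha256hex)] from the Checksums-Sha256 field."""
    entries = []
    in_field = False
    for line in dsc_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
        elif in_field and line.startswith(" "):
            digest, size, name = line.split()
            entries.append((name, int(size), digest))
        else:
            in_field = False
    return entries


def verify_files(dsc_text, directory):
    """Raise ValueError unless every listed file matches its size and hash."""
    entries = parse_sha256_entries(dsc_text)
    if not entries:
        raise ValueError("no Checksums-Sha256 entries found")
    for name, size, digest in entries:
        if os.path.basename(name) != name:
            raise ValueError(f"suspicious filename {name!r}")
        with open(os.path.join(directory, name), "rb") as f:
            data = f.read()
        if len(data) != size or hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"{name} does not match the .dsc")
```

This only succeeds if tarball generation is byte-for-byte reproducible between the uploader's machine and the server, which is the main practical difficulty with this variant.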


Re: tag2upload (git-debpush) service architecture - draft

Ian Jackson-2
Rebecca N. Palmer writes ("Re: tag2upload (git-debpush) service architecture - draft"):
> The signer's local files when they run git-debpush.  (To be decided: how
> to define the hash of a directory tree (as opposed to a single file),
> i.e. "tar | sha256 like a .dsc" or "what git uses but sha256".)

This would of course be possible.  I don't think it's a particularly
good idea though.  What it amounts to is a parallel Merkle tree to the
git one, just with a different data format and a better hash.

The upside is the better hash, but I think our overall risk from the
git SHA-1 problem is (i) still in practice quite low (ii) exists in
all the other places we rely on git already.

The downside is that the tag is no longer just a normal signed git tag
with some easy to construct and easy to understand metadata.  It will
in practice then not be practical to make this tag other than with
git-debpush (or some other special utility with the same code).

Ian.

--
Ian Jackson <[hidden email]>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.


Re: tag2upload (git-debpush) service architecture - draft

Rebecca N. Palmer-2
On 28/07/2019 16:18, Ian Jackson wrote:
> What it amounts to is a parallel Merkle tree to the
> git one, just with a different data format and a better hash.

Not really: it wouldn't need the history tree structure (in Git terms
[0], it would be a tree object not a commit object), and if we use
tar+sha256 [1], it wouldn't need the hash-per-file directory tree
structure either.

> The upside is the better hash, but I think our overall risk from the
> git SHA-1 problem is (i) still in practice quite low

For attacks happening now, I agree (but am not an expert): my intent in
suggesting this was "this is an easy way to have a better hash if we
want it", not to take a side on the question of whether we need it.

This may change, but we have the option of implementing this fix then
(and if it happens suddenly, temporarily disabling tag2upload to give us
time to do so).

> (ii) exists in
> all the other places we rely on git already.

That suggests that working towards requiring the SHA-256 mode of git
(which at least sort of exists since 2.21 [2], but I don't know if it's
usable yet) might be a better use of effort.

[0] https://git-scm.com/book/en/v2/Git-Internals-Git-Objects
[1] needs reproducibility, but simpler than pristine-tar in that we're
only trying to create _a_ reproducible tarball (not match one created by
upstream) and don't need to compress it (as it can be deleted after
hashing - unfortunately tar doesn't obviously have a write-to-stdout
option to allow tar | sha256).  Reproducible builds suggests tar
--sort=name --owner=0 --group=0 --numeric-owner.
[2]
https://github.com/git/git/blob/master/Documentation/technical/hash-function-transition.txt
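
One way to get a "tar + sha256" style hash of a directory tree without keeping the tarball is sketched below using Python's tarfile module in streaming mode, normalising the same metadata that the tar options in [1] normalise.  This is an illustration under assumptions: the entry ordering, the mode normalisation, the exclusion of `.git` and the name `tree_sha256` are all choices made for the example, and the result is not meant to match any particular `tar` invocation byte for byte.  (For what it is worth, GNU tar itself can write to standard output with `-f -`.)

```python
import hashlib
import os
import tarfile


class _HashSink:
    """File-like object that hashes everything written to it."""

    def __init__(self):
        self.sha256 = hashlib.sha256()

    def write(self, data):
        self.sha256.update(data)
        return len(data)


def _normalise(info):
    """Drop metadata that differs between machines."""
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    info.mtime = 0
    # Keep only "executable or not", as git does (assumption).
    info.mode = 0o755 if info.isdir() or info.mode & 0o100 else 0o644
    return info


def tree_sha256(root, exclude=(".git",)):
    """Return the SHA-256 of a deterministic, uncompressed tar of root."""
    sink = _HashSink()
    with tarfile.open(fileobj=sink, mode="w|",
                      format=tarfile.GNU_FORMAT) as tar:
        for dirpath, dirnames, filenames in os.walk(root):
            # Sorting in place also fixes the order of the walk itself.
            dirnames[:] = sorted(d for d in dirnames if d not in exclude)
            names = dirnames + [f for f in filenames if f not in exclude]
            for name in sorted(names):
                path = os.path.join(dirpath, name)
                tar.add(path, arcname=os.path.relpath(path, root),
                        recursive=False, filter=_normalise)
    return sink.sha256.hexdigest()
```

Both sides would have to run the same normalisation for the hashes to agree, so in practice the exact rules would need to be written down as part of the tag format.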


Re: tag2upload (git-debpush) service architecture - draft

Sean Whitton
Hello,

On Sun 28 Jul 2019 at 07:05pm +01, Rebecca N. Palmer wrote:

> On 28/07/2019 16:18, Ian Jackson wrote:
>> What it amounts to is a parallel Merkle tree to the
>> git one, just with a different data format and a better hash.
>
> Not really: it wouldn't need the history tree structure (in Git terms
> [0], it would be a tree object not a commit object), and if we use
> tar+sha256 [1], it wouldn't need the hash-per-file directory tree
> structure either.

When I read your first e-mail what I thought you had in mind was just
this -- having git-debpush compute a stronger hash of the tree object
and add that to the tag metadata, ignoring commit objects.

But now I'm struggling to understand the relevance of your discussion of
having git-debpush create a .dsc in your second e-mail, if what you're
actually talking about is hashing a git tree object.

(As an aside, if what you want is to hide .dsc creation from the user
but still do it on their machine and upload it, `dgit push-source` is
already available.)

On Sun 28 Jul 2019 at 04:18pm +01, Ian Jackson wrote:

> The downside is that the tag is no longer just a normal signed git tag
> with some easy to construct and easy to understand metadata.  It will
> in practice then not be practical to make this tag other than with
> git-debpush (or some other special utility with the same code).

This is a downside, but it's not a permanent one -- it goes away if git
switches away from SHA-1, which perhaps it is reasonable to expect
eventually.

It would be good to hear responses to Rebecca's suggestion from those
who disagree that it is okay to rely on SHA-1 in the particular way that
git-debpush/tag2upload does.

--
Sean Whitton


Re: tag2upload (git-debpush) service architecture - draft

Rebecca N. Palmer-2
On 28/07/2019 20:01, Sean Whitton wrote:
> When I read your first e-mail what I thought you had in mind was just
> this -- having git-debpush compute a stronger hash of the tree object
> and add that to the tag metadata, ignoring commit objects.

Of the files in the signer's repository, not of an actual tree object
(since the second is a list of file/subtree SHA-1 hashes).

> But now I'm struggling to understand the relevance of your discussion of
> having git-debpush create a .dsc in your second e-mail, if what you're
> actually talking about is hashing a git tree object.

"Tag with sha256" and "hidden .dsc" are two alternative options: the
first is a narrowly targeted fix for the SHA-1 issue, the second a
bigger redesign.

> (As an aside, if what you want is to hide .dsc creation from the user
> but still do it on their machine and upload it, `dgit push-source` is
> already available.)

That doesn't push to salsa [0].  However, I agree that it otherwise does
solve the problem of "not making the uploader think about how Debian
source packages work", without requiring a server-side component.

This does still "waste" the uploader's bandwidth on tarballs, but I
don't know if that's an issue in practice.  For most packages [1] it is
a much smaller data volume than the downloads needed to keep an
up-to-date sid for building/testing the package.

[0] https://sources.debian.org/src/dgit/9.6/dgit-maint-gbp.7.pod/#L117
[1] Rough numbers: ~80% of .orig.tar.*z are <1MB, ~97% <10MB; a gcc
update is a ~30MB download.


Re: tag2upload (git-debpush) service architecture - draft

Bernd Zeimetz
In reply to this post by Bernd Zeimetz


On 7/27/19 8:16 PM, Rebecca N. Palmer wrote:
> As a way to avoid relying on SHA-1, would it work to have git-debpush
> include a longer hash in the tag message, and tag2upload also verify
> that hash?
>
The other idea would be to convince git upstream to use something
better than sha1 - and after a bit of searching, I found

https://github.com/git/git/blob/master/Documentation/technical/hash-function-transition.txt

- Git v2.13.0 and later use a hardened sha-1 implementation by
default, which isn't vulnerable to the SHAttered attack.
Still sha-1, though.

- there is a plan to support sha256.

Googling a bit more found

https://stackoverflow.com/questions/28159071/why-doesnt-git-use-more-modern-sha

which gives some insight on the (plans for) implementation.


So I think the best thing to do is to get sha256 working in git and
force the usage of sha256 if you want to sign a tag for upload.



--
 Bernd Zeimetz                            Debian GNU/Linux Developer
 http://bzed.de                                http://www.debian.org
 GPG Fingerprint: ECA1 E3F2 8E11 2432 D485  DD95 EB36 171A 6FF9 435F


Re: tag2upload (git-debpush) service architecture - draft

Marco d'Itri
On Jul 28, Bernd Zeimetz <[hidden email]> wrote:

> So I think the best thing to do is to get sha256 working in git and
> force the usage of sha256 if you want to sign a tag for upload.
This cannot be a goal for this project, since git upstream will
apparently need a few more years for the transition to sha-256 to happen.

--
ciao,
Marco


Re: tag2upload (git-debpush) service architecture - draft

Ansgar Burchardt-8
In reply to this post by Bernd Zeimetz
Bernd Zeimetz writes:
> On 7/27/19 8:16 PM, Rebecca N. Palmer wrote:
>> As a way to avoid relying on SHA-1, would it work to have git-debpush
>> include a longer hash in the tag message, and tag2upload also verify
>> that hash?
>>
> The other idea would be to convince git upstream to use something
> better than sha1 - and after a bit of searching, I found
[...]
> So I think the best thing to do is to get sha256 working in git and
> force the usage of sha256 if you want to sign a tag for upload.

That will take quite a while; we would probably need a version of git
supporting that in stable.

There are also other issues, for example:

 - Such a service would bypass various sanity checks on the archive
   side, including various permission checks.

 - Such a service would need to properly validate the PGP signature.
   The archive really shouldn't rely on a third-party service for this.
   (In particular the service in question here doesn't do that as far as
   I can tell.)

Ansgar


Re: tag2upload (git-debpush) service architecture - draft

Sean Whitton
In reply to this post by Rebecca N. Palmer-2
Hello,

On Sun 28 Jul 2019 at 09:55PM +01, Rebecca N. Palmer wrote:

> On 28/07/2019 20:01, Sean Whitton wrote:
>> When I read your first e-mail what I thought you had in mind was just
>> this -- having git-debpush compute a stronger hash of the tree object
>> and add that to the tag metadata, ignoring commit objects.
>
> Of the files in the signer's repository, not of an actual tree object
> (since the second is a list of file/subtree SHA-1 hashes).

Ah, right.

>> But now I'm struggling to understand the relevance of your discussion of
>> having git-debpush create a .dsc in your second e-mail, if what you're
>> actually talking about is hashing a git tree object.
>
> "Tag with sha256" and "hidden .dsc" are two alternative options: the
> first is a narrowly targeted fix for the SHA-1 issue, the second a
> bigger redesign.

Okay.

--
Sean Whitton


Re: tag2upload (git-debpush) service architecture - draft

Bastian Blank
In reply to this post by Rebecca N. Palmer-2
On Sun, Jul 28, 2019 at 07:05:49PM +0100, Rebecca N. Palmer wrote:
> That suggests that working towards requiring the SHA-256 mode of git (which
> at least sort of exists since 2.21 [2], but I don't know if it's usable yet)
> might be a better use of effort.

Please keep in mind that the archive needs to verify this.  How do you
intend to provide the required information within the existing source
package structure?

> [1] needs reproducibility, but simpler than pristine-tar in that we're only
> trying to create _a_ reproducible tarball (not match one created by
> upstream) and don't need to compress it (as it can be deleted after hashing
> - unfortunately tar doesn't obviously have a write-to-stdout option to allow
> tar | sha256).  Reproducible builds suggests tar --sort=name --owner=0
> --group=0 --numeric-owner.

For now "git archive" with tar output seems to be reproducible from
jessie (2.1.4) to sid (2.23 rc).

Another idea, however we would need to trust some decompressors:

The hypothetical tool creates a complete .dsc file with the names and
checksums of the uncompressed files.  The user signed .dsc is put into
the tag.

The tag2upload service creates the .changes files with the names and
checksums of the compressed files.  It is then signed by the upload
tool.

Accepting a package with dak would look more like this:
- Verify signature on .changes.
- Check for source-only (forced by the upload tool flag).
- Check checksums of included files.
- Verify signature of .dsc.
- Check ACL against user signature on .dsc.
- Decompress (this poses a DoS threat!).
- Check checksums of included decompressed files.
- Either:
  - accept compressed files as is.
  - re-compress (also DoS, due to large files), calculate new checksums,
    accept.
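
The DoS risk in the "Decompress" step is usually handled by decompressing incrementally and giving up once a size limit is exceeded.  The sketch below does this for xz-compressed data while computing the checksum of the uncompressed stream; the function name, the 64 KiB step and the idea of a fixed byte limit are assumptions for illustration, not anything dak does today.

```python
import hashlib
import lzma


def sha256_of_decompressed(chunks, limit):
    """Hash the decompressed form of an xz stream given as an iterable of
    byte chunks, refusing to produce more than `limit` bytes of output."""
    decomp = lzma.LZMADecompressor()
    digest = hashlib.sha256()
    total = 0
    for chunk in chunks:
        data = chunk
        while True:
            # Produce at most 64 KiB of output per call.
            out = decomp.decompress(data, 64 * 1024)
            total += len(out)
            if total > limit:
                raise ValueError("decompressed size exceeds limit")
            digest.update(out)
            if decomp.eof or decomp.needs_input:
                break
            data = b""  # drain output still buffered in the decompressor
        if decomp.eof:
            break
    if not decomp.eof:
        raise ValueError("compressed stream is truncated")
    return digest.hexdigest()
```

Because output is capped per call, a small, highly compressible upload cannot make the checker allocate or hash an unbounded amount of data.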

Due to the implicit compression of files listed in .dsc, I would say
this is a new source format.

Regards,
Bastian

--
A little suffering is good for the soul.
                -- Kirk, "The Corbomite Maneuver", stardate 1514.0


Re: tag2upload (git-debpush) service architecture - draft

Rebecca N. Palmer-2
There are at least 2 questions being debated here, and at least 5
proposed solutions, and they are frequently being confused.

The questions:

(1-trust) Is it acceptable in principle for the archive to trust a
tag2upload service?  (i.e. have tag2upload rather than dak be
responsible for checking the tag signature)

(2-hash) If yes, is it acceptable for tag2upload to rely on SHA-1?

The solutions ('git debpush' is the part running on the uploader's
system, 'tag2upload' is the part running on a server):

(a-sha1resign) git debpush pushes a special signed tag "please upload
this commit" (i.e. identified by sha1).  tag2upload creates a source
package from this, signs it with its own key, and dputs it.  (Ian's
original, [A])

(b-sha256resign)  As (a) except the tag also includes a sha256. [B+D]

(c-scriptedstatusquo) git debpush becomes an automated way to do what is
currently recommended, i.e. it creates and pushes a signed git tag (to
salsa and to dgit), creates tarballs, creates and signs .dsc+.changes,
dputs .dsc+.changes+tarball(s).  (This might be as simple as "dgit
push-source && git push --all --follow-tags" [C], but I haven't tested
that.)  tag2upload doesn't need to exist.

(d-tarballrecreator) git debpush creates and pushes a signed git tag,
creates and signs .dsc+.changes, and sends them (but _not_ the tarballs
they refer to) to tag2upload.  tag2upload creates the tarball(s) from
the git repo, and dputs the .dsc+.changes+tarball(s). [B+D]

(e-modifydak) Add at least some git-upload-related functionality to dak
itself, instead of a separate tag2upload service.  (This is more of a
family of solutions than a single option: the specific variant [E]
proposed by Bastian is close to (d), but the equivalent of (b) could
also be done this way.)

Table of advantages and disadvantages (+=better, -=worse, .=slightly
worse, compared to doing nothing):

abcde
Uploader's convenience:
+++++ Only need to know/type 'git debpush'
++ ++ Doesn't waste bandwidth on tarballs
Security:
--  - (1-trust) Requires trusting the new code
-     (2-hash) Relies on SHA-1
Implementation difficulty:
 -.-- Code doesn't already exist
 . -. Needs reproducible tarballs (d) or equivalent (b+e)
-- -  Requires (somewhere to run a) new service
    - Requires changes to dak
--  ? Breaks "get sponsor name from .dsc" tools
abcde

On 30/07/2019 16:54, Bastian Blank wrote:
> On Sun, Jul 28, 2019 at 07:05:49PM +0100, Rebecca N. Palmer wrote:
>> That suggests that working towards requiring the SHA-256 mode of git (which
>> at least sort of exists since 2.21 [2], but I don't know if it's usable yet)
>> might be a better use of effort.
>
> Please keep in mind that the archive needs to verify this.  How do you
> intend to provide the required information within the existing source
> package structure?

We don't: this is only trying to fix (2-hash), while you evidently
object to (1-trust).

Also, as hinted at by Marco, the SHA-256 mode of git doesn't work yet:

(with git 1:2.23.0~rc0-1; the config lines are from [0])
$ cat .git/config
[core]
         repositoryFormatVersion = 1
[extensions]
         objectFormat = sha256
         compatObjectFormat = sha1
[core]
         filemode = true
         bare = false
         logallrefupdates = true
$ git log
fatal: unknown repository extensions found:
         objectformat
         compatobjectformat


> Another idea, [...]  I would say
> this is a new source format.

I agree that implementing the whole of your proposal would require
modifying dak.  (I see it as "implement (some of) tag2upload inside
dak".)  This potentially has similar security implications to having dak
trust tag2upload: lower risk as it would be under the established
package/maintainers/sysadmins for such sensitive code, but higher impact
if gaining control of dak is worse/easier to hide than just being able
to upload.

However, it has two elements that could be useful for a
(d-tarballrecreator) scheme with current dak.  (They would then need to
be .dsc+.changes not just .dsc, as .dsc and .changes must be signed by
the same key [1].)

> a complete .dsc file with the names and
> checksums of the uncompressed files.

Not compressing the tarballs may make reproducibility easier.

> The user signed .dsc is put into
> the tag.

This would allow the git repo to be the only communication channel from
git debpush to tag2upload.  (As in (a/b-sha*resign), but I don't know if
this matters.)

[A] https://lists.debian.org/debian-devel/2019/07/msg00501.html
[B+D] https://lists.debian.org/debian-devel/2019/07/msg00596.html
[C] https://lists.debian.org/debian-devel/2019/07/msg00601.html
[E] https://lists.debian.org/debian-devel/2019/07/msg00641.html
[0]
https://sources.debian.org/src/git/1:2.22.0-1/Documentation/technical/hash-function-transition.txt/#L125
[1] https://salsa.debian.org/ftp-team/dak/blob/master/daklib/checks.py#L157


Re: tag2upload (git-debpush) service architecture - draft

Ian Jackson-2
In reply to this post by Bastian Blank
Bastian Blank writes ("Re: tag2upload (git-debpush) service architecture - draft"):
> We discussed a bit within the ftp team and several points came up.  The
> following describes my interpretation of it:
>
> The archive will need to do the final validation to check if an upload
> is accepted.  The uploader's signature would need to be added to the
> source package so that its validity can also be checked in the future.  We
> already retain all user signatures of source packages in the archive and
> such a proposed service must provide the same level of possible
> verification.

I can certainly include a copy of the git signed tag object.  This
would require a modest change to dak to accept the new filename.  Can
you please tell me what filename would be good ?

> The signature needs to be collision resistant and needs to be verifiable
> with only the stuff included into the source package.  The git object
> checksums don't suffice anymore due to SHA1.  And as the world moves
> towards SHA3, it will need to have the ability to follow.  The output of
> all operations obviously needs to be reproducible to be signed.

The git signed tag object has a signature which is verifiable without
relying on the git object hash system.  The tag text directly contains
the source package name, and version, and intended upload target.
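
As a rough illustration of this, a signed tag object's content is the tag
text (headers and message) followed by an ASCII-armored signature, and the
signature covers the text that precedes it.  The Python sketch below splits
the two parts.  The example tag, its object id and the placeholder signature
body are made up for illustration, and splitting at the last marker line is
an assumption of this sketch.

```python
from typing import Tuple

PGP_BEGIN = b"-----BEGIN PGP SIGNATURE-----"


def split_signed_tag(raw: bytes) -> Tuple[bytes, bytes]:
    """Split raw tag object content into (signed payload, armored signature)."""
    # Look for the last signature marker that starts a line.
    idx = raw.rfind(b"\n" + PGP_BEGIN)
    if idx < 0:
        raise ValueError("no PGP signature found in tag")
    cut = idx + 1  # keep the newline that ends the payload
    return raw[:cut], raw[cut:]


# Hypothetical tag content; every value here is a placeholder.
EXAMPLE_TAG = (
    b"object 0123456789abcdef0123456789abcdef01234567\n"
    b"type commit\n"
    b"tag debian/1.0-1\n"
    b"tagger A Uploader <uploader@example.org> 1564588800 +0100\n"
    b"\n"
    b"hello release 1.0-1 for unstable\n"
    b"-----BEGIN PGP SIGNATURE-----\n"
    b"\n"
    b"(placeholder, not a real signature)\n"
    b"-----END PGP SIGNATURE-----\n"
)

payload, signature = split_signed_tag(EXAMPLE_TAG)
```

The payload is what an OpenPGP implementation would check against the
signature, treated as a detached signature.  No git object hash has to be
recomputed for that check.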

> I don't know if any of this requires a new dpkg source format to
> implement properly.

I don't think so.

Ian.

--
Ian Jackson <[hidden email]>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.

Re: tag2upload (git-debpush) service architecture - draft

Ian Jackson-2
In reply to this post by Bastian Blank
Bastian Blank writes ("Re: tag2upload (git-debpush) service architecture - draft"):
> The hypothetical tool creates a complete .dsc file with the names and
> checksums of the uncompressed files.  The user signed .dsc is put into
> the tag.

This tool is almost exactly "dgit" and therefore already exists.  It
does parallel publication in the archive (.dsc) and git (signed tags).

The point of the tag2upload exercise is to move the .dsc generation
from the uploader's computer to a central service, because .dsc
generation is complicated, slow, and inconvenient.  So generating the
.dsc on the user's system defeats the object of the exercise.

Ian.

--
Ian Jackson <[hidden email]>   These opinions are my own.

Re: tag2upload (git-debpush) service architecture - draft

Ian Jackson-2
In reply to this post by Bernd Zeimetz
Ansgar writes ("Re: tag2upload (git-debpush) service architecture - draft"):
> There are also other issues, for example:
>
>  - Such a service would bypass various sanity checks on the archive
>    side, including various permission checks.

What permission checks are bypassed ?  The current service does expect
to perform the DD/DM check on behalf of the archive.  But that is
straightforward.

>  - Such a service would need to properly validate the PGP signature.
>    The archive really shouldn't rely on a third-party service for this.
>    (In particular the service in question here doesn't do that as far as
>    I can tell.)

My prototype already validates the PGP signature on the signed tag it
uses as its input and instructions.  That seemed obviously essential
to me even for a demo.  (Particularly as, without that check, a malicious
salsa could in theory subvert the machinery even in the demo.)

I had the code for that and the DM/DD permission check already,
because they were needed for the dgit git server, which already has
a permissions implementation equivalent to that of the archive (and
using the DAM-supplied data files for that purpose).
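
The shape of such a DD/DM check can be sketched in Python as follows.  This
is only an illustration, not the dgit server's code: the `Keyrings` class,
the `may_upload` function, the fingerprints and the in-memory layout are all
hypothetical, and the real format of the DAM-supplied data files is not
shown here.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Keyrings:
    """Hypothetical in-memory view of the permission data."""
    # Fingerprints of keys belonging to DDs, who may upload any package.
    dd_fingerprints: Set[str] = field(default_factory=set)
    # DM fingerprint -> source packages that DM may upload.
    dm_allowed: Dict[str, Set[str]] = field(default_factory=dict)


def may_upload(keyrings: Keyrings, fingerprint: str, package: str) -> bool:
    """Return True if the signing key is a DD's, or a DM's permitted for package."""
    if fingerprint in keyrings.dd_fingerprints:
        return True
    return package in keyrings.dm_allowed.get(fingerprint, set())


# Made-up fingerprints and package names, for illustration only.
example = Keyrings(
    dd_fingerprints={"AAAA1111"},
    dm_allowed={"BBBB2222": {"hello"}},
)

dd_ok = may_upload(example, "AAAA1111", "anything")
dm_ok = may_upload(example, "BBBB2222", "hello")
dm_other = may_upload(example, "BBBB2222", "other-package")
unknown = may_upload(example, "CCCC3333", "hello")
```

This check only makes sense after the signature on the tag has been
verified, since the fingerprint must come from a validated signature.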

Perhaps I have misunderstood what you mean by "validate the PGP
signature".

Ian.

--
Ian Jackson <[hidden email]>   These opinions are my own.

Re: tag2upload (git-debpush) service architecture - draft

Bastian Blank
In reply to this post by Ian Jackson-2
Hi Ian

On Wed, Jul 31, 2019 at 05:08:51PM +0100, Ian Jackson wrote:
> Bastian Blank writes ("Re: tag2upload (git-debpush) service architecture - draft"):
> > The hypothetical tool creates a complete .dsc file with the names and
> > checksums of the uncompressed files.  The user signed .dsc is put into
> > the tag.
> The point of the tag2upload exercise is to move the .dsc generation
> from the uploader's computer to a central service, because .dsc
> generation is complicated, slow, and inconvenient.  So generating the
> .dsc on the user's system defeats the object of the exercise.

One last time:  The user has to certify his upload in a way the archive
can verify.

Now it is EOD from me.

Regards,
Bastian

--
All your people must learn before you can reach for the stars.
                -- Kirk, "The Gamesters of Triskelion", stardate 3259.2

Re: tag2upload (git-debpush) service architecture - draft

Sean Whitton
In reply to this post by Rebecca N. Palmer-2
Hello,

On Wed 31 Jul 2019 at 07:53AM +01, Rebecca N. Palmer wrote:

> (c-scriptedstatusquo) git debpush becomes an automated way to do what is
> currently recommended, i.e. it creates and pushes a signed git tag (to
> salsa and to dgit), creates tarballs, creates and signs .dsc+.changes,
> dputs .dsc+.changes+tarball(s).  (This might be as simple as "dgit
> push-source && git push --all --follow-tags" [C], but I haven't tested
> that.)  tag2upload doesn't need to exist.

Just fyi, it is indeed as simple as those two commands.  However, when
there are errors, it is quite a bit harder to understand what's going on
than it is with git-debpush/tag2upload, basically because there are
.dscs involved.

(I don't think we'd want to make git-debpush a wrapper for that because
it is not a pure git command, so shouldn't be in the git-* namespace.)

--
Sean Whitton
