Adding support for LZIP to dpkg, using that instead of xz, archive wide

Thomas Goirand-3
Dear friends,

I've been using xz compression for a long time, but I see a big defect
which is today pushing me to turn it off for the .orig.tar file. The
issue is that depending on the version of xz-utils, it produces a
different output.

We use "git archive" within the PKG OpenStack team to generate this
tarball (which is more or less the same as pristine-tar, except we use
upstream tags rather than a pristine-tar branch). The fact that xz
produces a different result makes it not reproducible. As a consequence,
it is very hard for us to use this system across distributions (ie: use
that in both Debian and Ubuntu, or in Sid & Jessie). We need consistency.

As a friend puts it:

"This is a fundamental problem/defect with xz. This (and a lot of other
such defects, e.g. non-robustness of xz archives that easily lead to
file corruption etc) are the reason that there is lzip (and which is why
gnu.org has, on a technical basis, decided that lzip is official
gzip-successor for gnu software releases when they come in tarballs).

So it'd be super nice to have LZIP support in dpkg, and use that instead
of xz, archive wide.

Your thoughts everyone? Is there any reason why we wouldn't do that?

Cheers,

Thomas Goirand (zigo)


--
To UNSUBSCRIBE, email to [hidden email]
with a subject of "unsubscribe". Trouble? Contact [hidden email]
Archive: https://lists.debian.org/557BE879.5060505@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Paul Wise via nm
On Sat, Jun 13, 2015 at 4:23 PM, Thomas Goirand wrote:

> Is there any reason why we wouldn't do that?

It was already rejected by the dpkg maintainers twice.

https://bugs.debian.org/600094
https://bugs.debian.org/556960

--
bye,
pabs

https://wiki.debian.org/PaulWise


--
Archive: https://lists.debian.org/CAKTje6F51gA_R0oCCgxbS=V5nssBiEM5-iunVtSQu7j8Qf8KbA@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Thomas Goirand-3
On 06/13/2015 10:55 AM, Paul Wise wrote:

> On Sat, Jun 13, 2015 at 4:23 PM, Thomas Goirand wrote:
>> I've been using xz compression for a long time, but I see a big defect
>> which is today pushing me to turn it off for the .orig.tar file. The
>> issue is that depending on the version of xz-utils, it produces a
>> different output.
>>
>> We use "git archive" within the PKG OpenStack team to generate this
>> tarball (which is more or less the same as pristine-tar, except we use
>> upstream tags rather than a pristine-tar branch). The fact that xz
>> produces a different result makes it not reproducible. As a
>> consequence, it is very hard for us to use this system across
>> distributions (ie: use that in both Debian and Ubuntu, or in Sid &
>> Jessie). We need consistency.
>>
>> As a friend puts it:
>>
>> "This is a fundamental problem/defect with xz. This (and a lot of
>> other such defects, e.g. non-robustness of xz archives that easily
>> lead to file corruption etc) are the reason that there is lzip (and
>> which is why gnu.org has, on a technical basis, decided that lzip is
>> official gzip-successor for gnu software releases when they come in
>> tarballs).
>>
>> So it'd be super nice to have LZIP support in dpkg, and use that
>> instead of xz, archive wide.
>>
>> Your thoughts everyone? Is there any reason why we wouldn't do that?
>>
>> Cheers,
>>
>> Thomas Goirand (zigo)
>
> It was already rejected by the dpkg maintainers twice.
>
> https://bugs.debian.org/600094
> https://bugs.debian.org/556960

Reading these bugs, am I right that the archive already supports lzip
for the orig.tar file? Because that's my issue: I don't really mind if
we use xz for the compression of the .deb files, but I need consistency
when generating the orig.tar.

Though, I had a try, and it doesn't look like dpkg-source -x supports
the .lz format unfortunately.

Now, regarding the fact that the maintainer closed the bugs, I see two
issues with the way he did it.

1/ First, he cites the fact that lzip isn't popular enough as the only
reason (did I miss another argument?). Well, it's backed by the GNU
project as the successor of gzip, and I believe Debian is influential
enough that we may not have to care about popularity. Also, a wise
technical choice of this kind shouldn't be driven by a popularity contest.

2/ Guillem wrote "that's at the maintainer's discretion" (ie: to close
the bug). Well, here, the whole of Debian is depending on this kind of
decision, so I don't agree that this decision is only at the discretion
of the maintainer.

Therefore, I'm tempted to raise this to the technical committee (putting
their list as Cc). Does anyone see a reason why I am mistaken here?

Cheers,

Thomas Goirand (zigo)


--
Archive: https://lists.debian.org/557CB7ED.3060709@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Paul Wise via nm
On Sun, Jun 14, 2015 at 7:08 AM, Thomas Goirand wrote:

> Reading these bugs, am I right that the archive already supports lzip
> for the orig.tar file?

AFAICT, there is no mention of .lz or lzip in the dak source code.

--
bye,
pabs

https://wiki.debian.org/PaulWise


--
Archive: https://lists.debian.org/CAKTje6FgOyHXEu70i1zHmAOD0Ty9wZo=aH-+WZsnrNN1rW8v=w@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Guillem Jover
In reply to this post by Thomas Goirand-3
Hi,

On Sun, 2015-06-14 at 01:08:29 +0200, Thomas Goirand wrote:
> On 06/13/2015 10:55 AM, Paul Wise wrote:
> > On Sat, Jun 13, 2015 at 4:23 PM, Thomas Goirand wrote:
> >> I've been using xz compression for a long time, but I see a big defect
> >> which is today pushing me to turn it off for the .orig.tar file. The
> >> issue is that depending on the version of xz-utils, it produces a
> >> different output.

Well if you want reproducible output, then use the same tool version.
That's the equivalent of expecting that using a different gcc version
will give you the same output.

As long as the bitstream is compatible with previous versions, I don't
see it as a problem, and I'd expect such changes to be beneficial,
because say, they might allow making the encoder faster, or compress
better, etc.
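
A minimal sketch of this point, assuming Python's standard lzma module
(a wrapper around liblzma) as a stand-in for the xz command-line tool,
and using two presets as a proxy for two encoder versions. The payload
is made up. The compressed bytes differ, yet both streams decode to the
same data:

```python
import hashlib
import lzma

# Made-up payload standing in for an orig.tar; any bytes would do.
payload = b"".join(b"line %d of a pretend source tree\n" % i for i in range(20000))

# Two encoder settings, used here only as a proxy for two xz-utils
# versions whose encoders make different choices.
stream_a = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=1)
stream_b = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=6)


def sha256(data):
    return hashlib.sha256(data).hexdigest()


compressed_differs = sha256(stream_a) != sha256(stream_b)
payload_matches = lzma.decompress(stream_a) == lzma.decompress(stream_b) == payload

print("compressed bytes differ:", compressed_differs)
print("decoded payloads match: ", payload_matches)
```

Comparing a checksum of the decoded tar stream, rather than of the .xz
file, is therefore one way to tell whether two tarballs are equivalent.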

> >> We use "git archive" within the PKG OpenStack team to generate this
> >> tarball (which is more or less the same as pristine-tar, except we use
> >> upstream tags rather than a pristine-tar branch). The fact that xz
> >> produces a different result makes it not reproducible. As a
> >> consequence, it is very hard for us to use this system across
> >> distributions (ie: use that in both Debian and Ubuntu, or in Sid &
> >> Jessie). We need consistency.

If you generate it once, as part of the release process, why do you
need to generate it on different systems with different versions? And
how does that have anything to do with what gets packaged in Debian?
For Debian you only need to generate it once, why would you want to
generate it anew every time you build a new Debian revision instead
of just reusing the same tarball that is on the archive, if you don't
keep source tarball releases around?

> >> As a friend puts it:
> >>
> >> "This is a fundamental problem/defect with xz. This (and a lot of
> >> other such defects, e.g. non-robustness of xz archives that easily
> >> lead to file corruption etc) are the reason that there is lzip (and
> >> which is why gnu.org has, on a technical basis, decided that lzip is
> >> official gzip-successor for gnu software releases when they come in
> >> tarballs).

TBH this smells like FUD. For example, I've never heard of corruption in
.xz files due to non-robustness; I'd expect such corruption to come from
external forces, and the integrity checks to either detect it or not. In
any case .xz supports CRC32, CRC64 and SHA-256 for integrity checks, while
.lz only supports CRC32. Moreover, lzip was created to overcome limitations
in the .lzma format; .xz came later and fixed the limitations of the .lzma
format too.

(And I could probably switch dpkg-deb's .xz integrity check to CRC64,
given that's the xz-utils command-line tool default.)
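
The three .xz check types can be tried out with a short sketch, assuming
Python's standard lzma module rather than dpkg-deb or the xz tool itself.
The sample data is made up, and availability of each check depends on how
liblzma was built, hence the is_check_supported() guard:

```python
import lzma

# Made-up sample data.
data = b"".join(b"record %d\n" % i for i in range(5000))

# Check types defined by the .xz container format.
checks = {
    "CRC32": lzma.CHECK_CRC32,
    "CRC64": lzma.CHECK_CRC64,
    "SHA-256": lzma.CHECK_SHA256,
}

sizes = {}
for name, check in checks.items():
    if not lzma.is_check_supported(check):
        continue
    blob = lzma.compress(data, format=lzma.FORMAT_XZ, check=check)
    assert lzma.decompress(blob) == data  # round trip must be lossless
    sizes[name] = len(blob)
    print(f"{name}: {len(blob)} bytes")

# Flip one byte in the middle of a CRC64-protected stream; the decoder
# is expected to reject the damaged input.
damaged = bytearray(lzma.compress(data, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC64))
damaged[len(damaged) // 2] ^= 0xFF
try:
    lzma.decompress(bytes(damaged))
    corruption_detected = False
except lzma.LZMAError:
    corruption_detected = True
print("corruption detected:", corruption_detected)
```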

Also many GNU projects do not release lzip tarballs, but do release bzip
or xz ones and there are very few that exclusively release lzip tarballs.
If that's the equivalent of bazaar being the official GNU VCS that most
of the GNU projects do not use, well…

Actually, where is the gnu.org decision documented? I see it neither in
the GCS, nor in the “Information for Maintainers of GNU Software”, nor on
the ftp.gnu.org site. And automake still defaults to dist-gz in the latest
git.

  <http://www.gnu.org/prep/standards/>
  <http://www.gnu.org/prep/maintain/>

> >> So it'd be super nice to have LZIP support in dpkg, and use that
> >> instead of xz, archive wide.
> >>
> >> Your thoughts everyone? Is there any reason why we wouldn't do that?

Yes, replacing xz with lzip on .deb or .dsc packages does not make any
sense. Adding lzip support for source packages *might* make some sense, as
I pointed out in the bug report. But doing so does have a very high cost:

  <https://wiki.debian.org/Teams/Dpkg/FAQ#Q:_Can_we_add_support_for_new_compressors_for_.dsc_packages.3F>

Whenever adding a new compressor is considered, all surrounding tools need
to be modified to support it as well:

  <https://wiki.debian.org/Teams/Dpkg/DebSupport>
  <https://wiki.debian.org/Teams/Dpkg/DscSupport>

That's a non-zero amount of work and time, and that does not take into
account external tools and users. It would also not be usable until the
next stable release. Also notice that for example there are still tools
that do not support data.tar.xz in .deb, which has been the default for
a while, which should give you an idea of what it takes.

Adding a new compressor that does not bring any significant benefit in
compression ratio, speed or container format, that is neither widely
used nor widely available on many systems, just for the benefit of very
few packages that might be releasing in other formats as well, or that
can be easily recompressed, still does not seem worth it, no.

I've yet to see an actual convincing argument why this would be worth
the effort and trouble.

Not to mention that I was the first to consider .lz when we
evaluated adding .xz support in dpkg back in 2009.

  <https://lists.debian.org/debian-dpkg/2009/10/msg00029.html>

> > It was already rejected by the dpkg maintainers twice.
> >
> > https://bugs.debian.org/600094
> > https://bugs.debian.org/556960
>
> Reading these bugs, am I right that the archive already supports lzip
> for the orig.tar file? Because that's my issue: I don't really mind if
> we use xz for the compression of the .deb files, but I need consistency
> when generating the orig.tar.

Nothing in the .deb/.dsc tooling supports lzip AFAIK. The archive does
not even support the .lzma format.

> Now, regarding the fact that the maintainer closed the bugs, I see two
> issues with the way he did it.

First, that was a bug report from *2009/2010*. I think I was clear in
my mail that I was open to reconsider if things changed in the future.

> 1/ First, he cites the fact that lzip isn't popular enough as the only
> reason (did I miss another argument?). Well, it's backed by the GNU
> project as the successor of gzip, and I believe Debian is influential
> enough that we may not have to care about popularity. Also, a wise
> technical choice of this kind shouldn't be driven by a popularity contest.

No, that's the summary that Antonio wrote. It's not the only reason
I gave in that mail, although it is a significant one, given its
implications (see the FAQ entry above):

 * There's already .xz support (as one of the lzma variants), .lzma is
   now deprecated for .deb compression.
 * I'd rather have consistency between source and binary compressors.
 * For source packages high usage might be a more important reason to
   _accept_ lzip (given that we've got an equivalent or better lzma format
   with .xz), than low usage for a _reject_ (if we didn't have .xz).

Compressor formats are subject to network-effects like many other
file formats. In this case I think .xz "won" both because it was the
"official" successor from .lzma, and because it is superior to .lz.

Depending on the context, availability and usage (or popularity if you
will) are quite important aspects when deciding whether to support such
formats. In other cases, you really want to support more formats, for
example in a GUI archiving program, or in something like automake.
Discounting this as a simple matter of "fashion" is not helpful.

> 2/ Guillem wrote "that's at the maintainer's discretion" (ie: to close
> the bug). Well, here, the whole of Debian is depending on this kind of
> decision, so I don't agree that this decision is only at the discretion
> of the maintainer.

That was exclusively related to whether to keep a wishlist+wontfix report
open or closed. And of course the logical next step is instead to force
the issue through the ctte… while I've only seen lzip upstream and one
other person clamoring for lzip support, and no other discussions in
debian-devel over this, since 2010.

> Therefore, I'm tempted to raise this to the technical committee (putting
> their list as Cc). Does anyone see a reason why I am mistaken here?

*Sigh* and yes…

Regards,
Guillem


--
Archive: https://lists.debian.org/20150614034559.GA10559@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Thomas Goirand-3
Guillem,

First, thanks for your reply and for taking the time to address every
point. This really is helpful.

While I believe all of your arguments are correct, I am still not
convinced about reproducibility, which is my main issue here. Could
you please reply to that point, and that one only? I've removed from the
quote everything that doesn't concern it, because I feel it is a
distraction in this thread.

On 06/14/2015 05:46 AM, Guillem Jover wrote:

> Hi,
>
> On Sun, 2015-06-14 at 01:08:29 +0200, Thomas Goirand wrote:
>> On 06/13/2015 10:55 AM, Paul Wise wrote:
>>> On Sat, Jun 13, 2015 at 4:23 PM, Thomas Goirand wrote:
>>>> I've been using xz compression for a long time, but I see a big defect
>>>> which is today pushing me to turn it off for the .orig.tar file. The
>>>> issue is that depending on the version of xz-utils, it produces a
>>>> different output.
>
> Well if you want reproducible output, then use the same tool version.

That's not possible: Jessie, Sid and Trusty don't have the same version,
and we need to generate the orig.tar file in all of them. The
contributors for the Debian OpenStack packaging are mostly using Ubuntu,
and they need to keep a workflow with the orig.tar file in the Git
repository.

I did tell them to just get the file from the Debian archive, but it
doesn't work. One of the reasons is that sometimes they upload first to
Ubuntu, and then I upload to Debian, and we end up with different
orig.tar.xz files, meaning it's hard for them to sync back with Debian.
I would like this to not be an issue anymore.

> That's the equivalent of expecting that using a different gcc version
> will give you the same output.

I fail to see what gcc and a lossless compressor have in common.

> As long as the bitstream is compatible with previous versions, I don't
> see it as a problem

I just explained the problem: I can't use xz in a pristine-tar-like
workflow, because it wouldn't reproduce the same output. And I'd like to
use something better than the 20-year-old gzip.

>>>> We use "git archive" within the PKG OpenStack team to generate this
>>>> tarball (which is more or less the same as pristine-tar, except we use
>>>> upstream tags rather than a pristine-tar branch). The fact that xz
>>>> produces a different result makes it not reproducible. As a
>>>> consequence, it is very hard for us to use this system across
>>>> distributions (ie: use that in both Debian and Ubuntu, or in Sid &
>>>> Jessie). We need consistency.
>
> If you generate it once, as part of the release process, why do you
> need to generate it on different systems with different versions?

Because I'd like the Git repository to contain it, without the need to
pick-up the file from the Debian archive. And to be exact: that's mostly
a need from contributors, I could live with the issue, but they can't.
This is mostly a need expressed by Ubuntu/Canonical server team working
with me on OpenStack packaging on Alioth.

> And how does that have anything to do with what gets packaged in Debian?
> For Debian you only need to generate it once, why would you want to
> generate it anew every time you build a new Debian revision instead
> of just reusing the same tarball that is on the archive, if you don't
> keep source tarball releases around?

See above. It's a pristine-tar like workflow. Your question is
equivalent to: "why do people use pristine-tar?". The answer is: because
it's convenient to just use git, without having to look into the Debian
archive. And by the way, xz wouldn't be usable with pristine-tar for the
same reason.

>>>> So it'd be super nice to have LZIP support in dpkg, and use that
>>>> instead of xz, archive wide.
>>>>
>>>> Your thoughts everyone? Is there any reason why we wouldn't do that?
>
> Yes, replacing xz with lzip on .deb or .dsc packages does not make any
> sense.

That isn't what I care about. I only care about the orig.tar file here.

> Adding lzip support for source packages *might* make some sense, as
> I pointed out in the bug report. But doing so does have a very high cost:
>
>   <https://wiki.debian.org/Teams/Dpkg/FAQ#Q:_Can_we_add_support_for_new_compressors_for_.dsc_packages.3F>

I do understand the cost. But there's a valid reason. If you believe
there's something better than lz, with the same properties, and that we
had support for it, I'd happily adopt it. It is just that xz doesn't
work right now, and most likely will break again in future versions of
xz-utils.

> Whenever adding a new compressor is considered, all surrounding tools need
> to be modified to support it as well:
>
>   <https://wiki.debian.org/Teams/Dpkg/DebSupport>
>   <https://wiki.debian.org/Teams/Dpkg/DscSupport>
>
> That's a non-zero amount of work and time, and that does not take into
> account external tools and users. It would also not be usable until the
> next stable release. Also notice that for example there are still tools
> that do not support data.tar.xz in .deb, which has been the default for
> a while, which should give you an idea of what it takes.
>
> Adding a new compressor that does not bring any significant benefit in
> compression ratio, speed or container format, that is neither widely
> used nor widely available on many systems, just for the benefit of very
> few packages that might be releasing in other formats as well, or that
> can be easily recompressed, still does not seem worth it, no.

Well, xz can't be used for pristine-tar, and gzip is old and doesn't
compress as well. This alone is IMO a good reason.

Thomas Goirand (zigo)

P.S.: I'd prefer a consensus here over a CTTE bug.


--
Archive: https://lists.debian.org/557D6EB8.4090502@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Felipe Sateler-2
On Sun, 14 Jun 2015 14:08:24 +0200, Thomas Goirand wrote:

> And by the way, xz wouldn't be usable with pristine-tar for the same
> reason.

Ehm. pristine-xz(1) would beg to disagree.

In the multimedia team, we use it for over 40 packages (where upstream
provides an xz file of course).

I guess you should have a script that does git archive ; pristine-tar
commit.

--
Saludos,
Felipe Sateler


--
Archive: https://lists.debian.org/mlk1sh$v1b$1@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Vincent Lefevre-10
In reply to this post by Guillem Jover
I'm currently using xz for my own files, but...

On 2015-06-14 05:46:00 +0200, Guillem Jover wrote:

> On Sun, 2015-06-14 at 01:08:29 +0200, Thomas Goirand wrote:
> > On 06/13/2015 10:55 AM, Paul Wise wrote:
> > > On Sat, Jun 13, 2015 at 4:23 PM, Thomas Goirand wrote:
> > >> As a friend puts it:
> > >>
> > >> "This is a fundamental problem/defect with xz. This (and a lot of
> > >> other such defects, e.g. non-robustness of xz archives that easily
> > >> lead to file corruption etc) are the reason that there is lzip (and
> > >> which is why gnu.org has, on a technical basis, decided that lzip is
> > >> official gzip-successor for gnu software releases when they come in
> > >> tarballs).
>
> TBH this smells like FUD. For example, I've never heard of corruption in
> .xz files due to non-robustness; I'd expect such corruption to come from
> external forces, and the integrity checks to either detect it or not.

xz-utils (4.999.9beta-1) experimental; urgency=low

  [ Jonathan Nieder ]
  * New upstream release.
     - Fix a data corruption in the compression code. (Closes: #544872)
[...]

But of course, this is old, and any compression software can have the
same kind of bug (possibly unless proved formally).

However lzip compresses better, sometimes much better:

-rw-r----- 1 vinc17 vinc17   822474 2015-04-26 00:45:51 mail.log.lz
-rw-r----- 1 vinc17 vinc17   915544 2015-04-26 00:45:51 mail.log.xz

(this example is a postfix mail log) and uses much less memory for
compression:

$ sh -c 'ulimit -v 200000; lzip -9 < mail.log > /dev/null'
$ sh -c 'ulimit -v 800000; xz -9 < mail.log > /dev/null'
xz: (stdin): Cannot allocate memory
$ sh -c 'ulimit -v 800000; xz -9 < /dev/null > /dev/null'
xz: (stdin): Cannot allocate memory

Note: see the 200000 for lzip and 800000 for xz.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


--
Archive: https://lists.debian.org/20150614144821.GB746@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Don Armstrong
In reply to this post by Thomas Goirand-3
On Sun, 14 Jun 2015, Thomas Goirand wrote:
> Therefore, I'm tempted to raise this to the technical committee
> (putting their list as Cc). Does anyone see a reason why I am
> mistaken here?

Does a patch exist which can enable lz for orig.tar?

Otherwise, I guess some of us could be involved to help clarify
communication, but anyone can do that, really.

--
Don Armstrong                      http://www.donarmstrong.com



--
Archive: https://lists.debian.org/20150614151027.GB5611@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Guillem Jover
In reply to this post by Vincent Lefevre-10
Hi!

On Sun, 2015-06-14 at 16:48:21 +0200, Vincent Lefevre wrote:

> On 2015-06-14 05:46:00 +0200, Guillem Jover wrote:
> > On Sun, 2015-06-14 at 01:08:29 +0200, Thomas Goirand wrote:
> > > On 06/13/2015 10:55 AM, Paul Wise wrote:
> > > > On Sat, Jun 13, 2015 at 4:23 PM, Thomas Goirand wrote:
> > > >> As a friend puts it:
> > > >>
> > > >> "This is a fundamental problem/defect with xz. This (and a lot of
> > > >> other such defects, e.g. non-robustness of xz archives that easily
> > > >> lead to file corruption etc) are the reason that there is lzip (and
> > > >> which is why gnu.org has, on a technical basis, decided that lzip is
> > > >> official gzip-successor for gnu software releases when they come in
> > > >> tarballs).
> >
> > TBH this smells like FUD. For example, I've never heard of corruption in
> > .xz files due to non-robustness; I'd expect such corruption to come from
> > external forces, and the integrity checks to either detect it or not.
>
> xz-utils (4.999.9beta-1) experimental; urgency=low
>
>   [ Jonathan Nieder ]
>   * New upstream release.
>      - Fix a data corruption in the compression code. (Closes: #544872)
> [...]
>
> But of course, this is old,

Yes, that was even before dpkg started to use xz-utils to handle .xz
files.

> and any compression software can have the
> same kind of bug (possibly unless proved formally).

And in any case I don't see how this is a "fundamental problem" with
the format, this is simply just a bug in a beta version, although an
unfortunate one.

> However lzip compresses better, sometimes much better:
>
> -rw-r----- 1 vinc17 vinc17   822474 2015-04-26 00:45:51 mail.log.lz
> -rw-r----- 1 vinc17 vinc17   915544 2015-04-26 00:45:51 mail.log.xz

Oh, interesting, this didn't use to be the case when we added .xz
support to dpkg.

> (this example is a postfix mail log) and uses much less memory for
> compression:
>
> $ sh -c 'ulimit -v 200000; lzip -9 < mail.log > /dev/null'
> $ sh -c 'ulimit -v 800000; xz -9 < mail.log > /dev/null'
> xz: (stdin): Cannot allocate memory
> $ sh -c 'ulimit -v 800000; xz -9 < /dev/null > /dev/null'
> xz: (stdin): Cannot allocate memory
>
> Note: see the 200000 for lzip and 800000 for xz.

The preset levels do not match between lzip and xz. For example for -9, xz
uses a dictionary size of 64 MiB, while lzip uses 32 MiB. Other parameters
are also probably quite different. In addition lzip seems to be
substantially slower (at least) when compressing compared to xz using the
same preset levels. With a small pdf file it took more than twice the time:

,---
$ cat Posix_1003.1e-990310.pdf >/dev/null
$ ls -la Posix_1003.1e-990310.pdf
-rw-r----- 1 guillem guillem 3486116 Feb 20 16:43 Posix_1003.1e-990310.pdf
$ /usr/bin/time xz -9k Posix_1003.1e-990310.pdf
1.24user 0.07system 0:01.31elapsed 99%CPU (0avgtext+0avgdata 98748maxresident)k
0inputs+3520outputs (0major+24291minor)pagefaults 0swaps
$ rm -f Posix_1003.1e-990310.pdf.xz
$ /usr/bin/time xz -9k Posix_1003.1e-990310.pdf
1.25user 0.06system 0:01.31elapsed 99%CPU (0avgtext+0avgdata 98952maxresident)k
0inputs+3520outputs (0major+24295minor)pagefaults 0swaps
$ ls -la Posix_1003.1e-990310.pdf.xz
-rw-r----- 1 guillem guillem 1801372 Feb 20 16:43 Posix_1003.1e-990310.pdf.xz
$ rm -f Posix_1003.1e-990310.pdf.xz
#
$ /usr/bin/time lzip -9k Posix_1003.1e-990310.pdf
2.93user 0.02system 0:02.96elapsed 99%CPU (0avgtext+0avgdata 37628maxresident)k
0inputs+3520outputs (0major+8957minor)pagefaults 0swaps
$ rm -f Posix_1003.1e-990310.pdf.lz
$ /usr/bin/time lzip -9k Posix_1003.1e-990310.pdf
2.94user 0.03system 0:02.98elapsed 99%CPU (0avgtext+0avgdata 37576maxresident)k
0inputs+3520outputs (0major+8955minor)pagefaults 0swaps
$ ls -la Posix_1003.1e-990310.pdf.lz
-rw-r----- 1 guillem guillem 1798338 Feb 20 16:43 Posix_1003.1e-990310.pdf.lz
$ rm -f Posix_1003.1e-990310.pdf.lz
`---

With the linux sources:

,---
$ cat linux_4.0.4.orig.tar >/dev/null
$ ls -la linux_4.0.4.orig.tar
-rw-r--r-- 1 guillem guillem 582932480 May 26 20:15 linux_4.0.4.orig.tar
$ /usr/bin/time lzip -k9 linux_4.0.4.orig.tar
619.52user 1.27system 10:21.95elapsed 99%CPU (0avgtext+0avgdata 363168maxresident)k
24inputs+156680outputs (0major+90387minor)pagefaults 0swaps
$ ls -la linux_4.0.4.orig.tar.lz
-rw-r--r-- 1 guillem guillem 80218126 May 26 20:15 linux_4.0.4.orig.tar.lz
$ rm -f linux_4.0.4.orig.tar.lz
$ /usr/bin/time lzip -k9 linux_4.0.4.orig.tar
618.94user 1.10system 10:21.02elapsed 99%CPU (0avgtext+0avgdata 363180maxresident)k
8inputs+156680outputs (0major+90389minor)pagefaults 0swaps
$ rm -f linux_4.0.4.orig.tar.lz
#
$ /usr/bin/time xz -k9 linux_4.0.4.orig.tar
514.76user 1.53system 8:37.22elapsed 99%CPU (0avgtext+0avgdata 691428maxresident)k
176inputs+156656outputs (1major+172417minor)pagefaults 0swaps
$ ls -la linux_4.0.4.orig.tar.xz
-rw-r--r-- 1 guillem guillem 80205900 May 26 20:15 linux_4.0.4.orig.tar.xz
$ rm -f linux_4.0.4.orig.tar.xz
$ /usr/bin/time xz -k9 linux_4.0.4.orig.tar
515.96user 1.62system 8:38.60elapsed 99%CPU (0avgtext+0avgdata 691328maxresident)k
56inputs+156656outputs (0major+172413minor)pagefaults 0swaps
$ rm -f linux_4.0.4.orig.tar.xz
`---

So the comparison does not seem entirely fair. And it seems to me to be
a matter of tradeoffs?
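
One way to make such a comparison more even is to keep the preset but
override the dictionary size. A minimal sketch of the xz side only,
assuming Python's standard lzma module as a stand-in for the xz tool
(the standard library has no lzip binding); the helper name and the
sample input are made up:

```python
import lzma
import time

MiB = 1024 * 1024


def xz_compress(data, preset, dict_size=None):
    """Compress to .xz; dict_size, when given, overrides the preset's value."""
    lzma2 = {"id": lzma.FILTER_LZMA2, "preset": preset}
    if dict_size is not None:
        lzma2["dict_size"] = dict_size
    start = time.perf_counter()
    blob = lzma.compress(data, format=lzma.FORMAT_XZ, filters=[lzma2])
    return blob, time.perf_counter() - start


# Made-up log-like input; a real comparison would read an actual file.
sample = b"".join(
    b"Jun 14 16:48:%02d host postfix/smtpd[%d]: connect\n" % (i % 60, i)
    for i in range(50000)
)

# Preset 6 (8 MiB dictionary) keeps memory use modest for this demo. For the
# -9 case discussed above, the calls would be xz_compress(sample, 9) and
# xz_compress(sample, 9, dict_size=32 * MiB).
default_blob, default_secs = xz_compress(sample, 6)
halved_blob, halved_secs = xz_compress(sample, 6, dict_size=4 * MiB)

for label, blob, secs in (
    ("preset default dict", default_blob, default_secs),
    ("halved dict", halved_blob, halved_secs),
):
    print(f"{label}: {len(blob)} bytes in {secs:.2f}s")
```

Other encoder parameters would still differ between the two tools, so
this only narrows the gap rather than making the presets equivalent.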

Thanks,
Guillem


--
Archive: https://lists.debian.org/20150615030446.GA22199@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Thomas Goirand-3
In reply to this post by Felipe Sateler-2
On 06/14/2015 04:08 PM, Felipe Sateler wrote:

> On Sun, 14 Jun 2015 14:08:24 +0200, Thomas Goirand wrote:
>
>> And by the way, xz wouldn't be usable with pristine-tar for the same
>> reason.
>
> Ehm. pristine-xz(1) would beg to disagree.
>
> In the multimedia team, we use it for over 40 packages (where upstream
> provides an xz file of course).
>
> I guess you should have a script that does git archive ; pristine-tar
> commit.

Did you try using the same pristine-tar xz thing but with a different
version of xz-utils, for example the one in Trusty vs the one in Sid?

Cheers,

Thomas Goirand (zigo)


--
Archive: https://lists.debian.org/557E87B6.9020307@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Thomas Goirand-3
In reply to this post by Guillem Jover
On 06/15/2015 05:04 AM, Guillem Jover wrote:
> In addition lzip seems to be
> substantially slower (at least) when compressing compared to xz using the
> same preset levels.

I understand that some may care about it, but as for me, I couldn't care
less about the time taken for compressing. What I need is
reproducibility. Right now, I'm switching back to .gz, which is
disappointing.
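
For the .gz fallback, one detail matters for reproducibility: the gzip
header carries a modification time, so it has to be pinned. A minimal
sketch, assuming Python's standard gzip module (3.8 or later for the
mtime argument) rather than the gzip command, with a made-up payload.
This alone does not guarantee byte-identical output across different
zlib versions:

```python
import gzip
import hashlib

# Made-up payload standing in for the tar stream from "git archive".
tar_bytes = b"pretend tar stream\n" * 4096


def reproducible_gzip(data):
    # mtime=0 pins the header timestamp that would otherwise vary per run.
    return gzip.compress(data, compresslevel=9, mtime=0)


first = reproducible_gzip(tar_bytes)
second = reproducible_gzip(tar_bytes)

same_digest = hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest()
print("identical output on repeated runs:", same_digest)
```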

Thomas


--
Archive: https://lists.debian.org/557E8A0A.2070808@...

Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Thomas Goirand-3
In reply to this post by Don Armstrong
On 06/14/2015 05:10 PM, Don Armstrong wrote:
> On Sun, 14 Jun 2015, Thomas Goirand wrote:
>> Therefore, I'm tempted to raise this to the technical committee
>> (putting their list as Cc). Does anyone see a reason why I am
>> mistaken here?
>
> Does a patch exist which can enable lz for orig.tar?

Isn't that what these are doing?

https://bugs.debian.org/600094
https://bugs.debian.org/556960

> Otherwise, I guess some of us could be involved to help clarify
> communication, but anyone can do that, really.

I'm not at a stage where I want to involve the CTTE right now. I still
would prefer to gather opinions and see where it goes.

Thomas Goirand (zigo)



Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Jonathan Dowland
In reply to this post by Thomas Goirand-3
On Sun, Jun 14, 2015 at 01:08:29AM +0200, Thomas Goirand wrote:
> Therefore, I'm tempted to raise this to the technical committee (putting
> their list as Cc). Does anyone see a reason why I am mistaking here?

Well, both bugs are over 5 years old. It would probably be wise to have a
more recent dialogue with the maintainer before considering the tech-ctte.



Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Aron Xu-3
In reply to this post by Thomas Goirand-3
On Mon, Jun 15, 2015 at 5:04 PM, Thomas Goirand <[hidden email]> wrote:

> On 06/14/2015 05:10 PM, Don Armstrong wrote:
>> On Sun, 14 Jun 2015, Thomas Goirand wrote:
>>> Therefore, I'm tempted to raise this to the technical committee
>>> (putting their list as Cc). Does anyone see a reason why I am
>>> mistaking here?
>>
>> Does a patch exist which can enable lz for orig.tar?
>
> Isn't it what this is doing?
>
> https://bugs.debian.org/600094
> https://bugs.debian.org/556960
>
>> Otherwise, I guess some of us could be involved to help clarify
>> communication, but anyone can do that, really.
>
> I'm not at a stage where I want to involve the CTTE right now. I still
> would prefer to gather opinions and see where it goes.
>

I don't hold a view on whether we want lz support in dpkg/dak, but it
would be a pity if we really involved the CTTE for such an issue. To me,
it's sort of abusing the escalation process if every individual
developer raises an issue and seeks an overruling whenever their opinion
is NACKed by the maintainer of the relevant part... But this is just my
own, secret, POV; please don't start a flame war over it.

Thanks,
Aron



Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Sam Hartman-3
>>>>> "Aron" == Aron Xu <[hidden email]> writes:

    Aron> I don't hold a view on whether we want lz support in dpkg/dak,
    Aron> but it could be a pity if we really involve CTTE for such an
    Aron> issue. To me, it's sorta abusing the escalation process if
    Aron> every individual developer raise an issue and seek for
    Aron> overruling when his opinion were n'acked by the maintainer of
    Aron> relevant part... But this is just my own, secret, POV, please
    Aron> don't start a flame war for it.

Hi. Speaking as a member of the TC, I'd really like to encourage people
not to think of coming to the TC only to get stuff overruled. We can be
a resource for helping people understand why communication is not
working and helping people understand each other. If someone feels that
their input was not heard, I'd rather they ask for help with that than
let lingering frustrations build up. Not being heard is different from
not being agreed with. When I'm not heard, I'm typically frustrated
because I don't have confidence that the people making the decision
considered what I was saying, or because the rationale for the
decision doesn't address some objection I raised.
That tends to cause me to think about whether I should spend my time
elsewhere.
In contrast, I end up not being agreed with all the time. I can
understand the tradeoffs people are making, and can value the decision
and process even when it is not one I'd make.

Now, it sounds like involving the TC is premature here on either ground,
but we can help do more than just overrule stuff.



Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Felipe Sateler-2
In reply to this post by Thomas Goirand-3
On Mon, 15 Jun 2015 10:07:18 +0200, Thomas Goirand wrote:

> On 06/14/2015 04:08 PM, Felipe Sateler wrote:
>> On Sun, 14 Jun 2015 14:08:24 +0200, Thomas Goirand wrote:
>>
>>> And by the way, xz wouldn't be usable with pristine-tar for the same
>>> reason.
>>
>> Ehm. pristine-xz(1) would beg to disagree.
>>
>> In the multimedia team, we use it for over 40 packages (where upstream
>> provides an xz file of course).
>>
>> I guess you should have a script that does git archive ; pristine-tar
>> commit.
>
> Did you try using the same pristine-tar xz thing but with a different
> version of xz-utils, for example the one in Trusty vs the one in Sid?

I have just checked out a tarball checked in on 2013-05-20 using current
sid. SHAs match. I do not have a trusty system to check.

--
Saludos,
Felipe Sateler
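
The check described above (regenerating a tarball and confirming that the SHAs match) can be sketched in Python. This is only an illustration, not the script anyone in the thread used: the function names `sha256_of` and `same_tarball` are hypothetical, and SHA-256 is an assumption, since the message does not say which SHA variant was compared.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large tarballs do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_tarball(first: Path, second: Path) -> bool:
    """True when both files have the same SHA-256 digest."""
    return sha256_of(first) == sha256_of(second)
```

Running `same_tarball` on a tarball regenerated on one release and the copy regenerated on another would show whether the two xz-utils versions produced byte-identical output for that particular file; it says nothing about other inputs or other versions.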



Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Marco d'Itri
In reply to this post by Thomas Goirand-3
On Jun 15, Thomas Goirand <[hidden email]> wrote:

> I'm not at a stage where I want to involve the CTTE right now. I still
> would prefer to gather opinions and see where it goes.

My opinion is that you have not proved either that lz is widely used or
that it is "better" than xz.

--
ciao,
Marco


Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

riku voipio-7
In reply to this post by Guillem Jover
On Monday, June 15, 2015 6:04:46 AM EEST, Guillem Jover wrote:
> So the comparison does not seem entirely fair. And it seems to me to be
> a matter of tradeoffs?

Since both lzip and xz are implementations of the same LZMA algorithm, it seems
lzip is just parameterized differently. For some use cases, like plaintext mail
log files, it seems better, but there are probably other places where the
tradeoff gives worse results.

I don't mind adding yet another compressor for source packages. But I
really hope this doesn't become part of binary debs: it would be
unnecessary bloat to have dpkg depend on two different LZMA
implementations.

Riku
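
The point about one algorithm family being parameterized differently can be illustrated with Python's standard `lzma` module. This is a rough sketch under stated assumptions: the standard library does not read or write the lzip container, so the `.xz` format at two presets and the legacy `.lzma` ("alone") format stand in for "different parameters and containers around LZMA". The function name `compress_variants` and the sample payload are hypothetical.

```python
import lzma
from typing import Dict


def compress_variants(data: bytes) -> Dict[str, bytes]:
    """Compress the same bytes with three LZMA-family settings."""
    return {
        # .xz container, fastest preset (small dictionary).
        "xz-0": lzma.compress(data, format=lzma.FORMAT_XZ, preset=0),
        # .xz container, strongest preset with the "extreme" flag.
        "xz-9e": lzma.compress(
            data, format=lzma.FORMAT_XZ, preset=9 | lzma.PRESET_EXTREME
        ),
        # Legacy .lzma container (stand-in here; this is not lzip).
        "lzma-alone": lzma.compress(data, format=lzma.FORMAT_ALONE, preset=6),
    }


if __name__ == "__main__":
    payload = b"Debian source package payload\n" * 2000
    for name, blob in compress_variants(payload).items():
        # Every variant decompresses back to the original bytes.
        assert lzma.decompress(blob) == payload
        print(name, len(blob))
```

All three outputs decode to the same data while differing byte for byte, which is why checksums of compressed tarballs only stay stable when the container and encoder settings are held constant.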



Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide

Thomas Goirand-3
In reply to this post by Jonathan Dowland
On 06/15/2015 11:15 AM, Jonathan Dowland wrote:
> On Sun, Jun 14, 2015 at 01:08:29AM +0200, Thomas Goirand wrote:
>> Therefore, I'm tempted to raise this to the technical committee (putting
>> their list as Cc). Does anyone see a reason why I am mistaking here?
>
> Well, both bugs are over 5 years old. It would be probably wise to have a
> more modern dialogue with the maintainer before considering the tech-ctte.

Which is what I'm doing right now in this thread.

Thomas

