apt-get may accept inconsistent data


Stefan Tichy-3
Hi,

the problem may be the result of proxy usage or even improper proxy
configuration, but apt-get should complain if something is wrong.

Etch is installed on the system and "apt-get update" did fetch
Release and Release.gpg from security.debian.org (modification date
2008-05-02). The packages file is still an old version from 2008-04-27.

"OK   http://security.debian.org etch/updates/main Packages"
is part of the output. Just OK, not fetched, but this is no reason
to assume that some error has occurred.

Did I miss something, should I check apt config or is this a bug?

Kind regards

--
Stefan Tichy   ( dlist at pi4tel dot de )


--
To UNSUBSCRIBE, email to [hidden email]
with a subject of "unsubscribe". Trouble? Contact [hidden email]


Re: apt-get may accept inconsistent data

Goswin von Brederlow-2
Stefan Tichy <[hidden email]> writes:

> Hi,
>
> the problem may be the result of proxy usage or even improper proxy
> configuration, but apt-get should complain if something is wrong.
>
> Etch is installed on the system and "apt-get update" did fetch
> Release and Release.gpg from security.debian.org (modification date
> 2008-05-02). The packages file is still an old version from 2008-04-27.
>
> "OK   http://security.debian.org etch/updates/main Packages"
> is part of the output. Just OK, not fetched, but this is no reason
> to assume that some error has occurred.
>
> Did I miss something, should I check apt config or is this a bug?
>
> Kind regards

Does it complain about the md5sum/size of the file?

The Release file could have been regenerated, e.g. for a contrib
package, without main changing. So the date does not necessarily mean
anything.

MfG
        Goswin



Re: apt-get may accept inconsistent data

Stefan Tichy-3
On Sat, May 03, 2008 at 10:17:00PM +0200, Goswin von Brederlow wrote:
> Does it complain about the md5sum/size of the file?

No, it seems to be perfectly satisfied. No error message, exit
status 0.


> The Release file could have been regenerated, e.g. for a contrib
> package, without main changing. So the date does not necessarily mean
> anything.

So there is no reason to be surprised if just Release and
Release.gpg were fetched and updated.

But the MD5SUM of the packages file is not listed in the Release
file (probably it was listed in some old version).

I tried 'Acquire::Pdiffs "false";' but that does not help.

If I remove proxy configuration from apt.conf, apt-get fetches
the new packages file. The problem probably does not occur if there
is no (squid) proxy, but apt-get should complain if anything goes
wrong for whatever reason.



--
Stefan Tichy   ( dlist at pi4tel dot de )



Re: apt-get may accept inconsistent data

Goswin von Brederlow-2
Stefan Tichy <[hidden email]> writes:

> On Sat, May 03, 2008 at 10:17:00PM +0200, Goswin von Brederlow wrote:
>> Does it complain about the md5sum/size of the file?
>
> No, it seems to be perfectly satisfied. No error message, exit
> status 0.
>
>
>> The Release file could have been regenerated, e.g. for a contrib
>> package, without main changing. So the date does not necessarily mean
>> anything.
>
> So there is no reason to be surprised if just Release and
> Release.gpg were fetched and updated.
>
> But the MD5SUM of the packages file is not listed in the Release
> file (probably it was listed in some old version).

That also triggers an error from apt-get. I just had that 2 days ago
when I messed up a locally generated Release file.

What file are you talking about?

> I tried 'Acquire::Pdiffs "false";' but that does not help.
>
> If I remove proxy configuration from apt.conf, apt-get fetches
> the new packages file. The problem probably does not occur if there
> is no (squid) proxy, but apt-get should complain if anything goes
> wrong for whatever reason.

Does the file actually differ?

Could you strace apt-get and see what the http method sends to and
receives from squid and apt-get?

MfG
        Goswin



Re: apt-get may accept inconsistent data

Stefan Tichy-3
On Sun, May 04, 2008 at 06:50:35PM +0200, Goswin von Brederlow wrote:
> Does the file actually differ?

security.debian.org_dists_etch_updates_main_binary-i386_Packages

Yes, it has been modified.


> Could you strace apt-get and see what the http method sends to and
> receives from squid and apt-get?

tcpdump and wireshark did help.


apt-get sends a http GET request for Packages.bz2. Part of this
request is this information:

  Cache-Control: max-age=0
  If-Modified-Since: Sun, 27 Apr 2008 09:15:01 GMT


Squid response:

  HTTP/1.0 304 Not Modified
  Date: Sun, 04 May 2008 16:34:28 GMT
  Server: Apache/2.2.3 (Debian)
  ....
  Cache: HIT from servername
  ...
  Proxy-Connection: close

The proxy has the current version (2008-05-02) in the cache.


Squid 3.0.PRE5-5 seems to be responsible for this problem, but
IMHO apt-get should be able to recognize it.


--
Stefan Tichy   ( dlist at pi4tel dot de )



Re: apt-get may accept inconsistent data

Goswin von Brederlow-2
Stefan Tichy <[hidden email]> writes:

> On Sun, May 04, 2008 at 06:50:35PM +0200, Goswin von Brederlow wrote:
>> Does the file actually differ?
>
> security.debian.org_dists_etch_updates_main_binary-i386_Packages
>
> Yes, it has been modified.

I meant which Release file, because the etch security one does have the
md5sums of Packages in it.

>> Could you strace apt-get and see what the http method sends to and
>> receives from squid and apt-get?
>
> tcpdump and wireshark did help.
>
>
> apt-get sends a http GET request for Packages.bz2. Part of this
> request is this information:
>
>   Cache-Control: max-age=0
>   If-Modified-Since: Sun, 27 Apr 2008 09:15:01 GMT
>
>
> Squid response:
>
>   HTTP/1.0 304 Not Modified
>   Date: Sun, 04 May 2008 16:34:28 GMT
>   Server: Apache/2.2.3 (Debian)
>   ....
>   Cache: HIT from servername
>   ...
>   Proxy-Connection: close
>
> The proxy has the current version (2008-05-02) in the cache.
>
>
> Squid 3.0.PRE5-5 seems to be responsible for this problem, but
> IMHO apt-get should be able to recognize it.

So squid is to blame for apt not getting the new one.

But you are right. There is something wrong here that is not squid's
fault:

Apt-get should not even send an "If-Modified-Since" query, imho. After
fetching the Release file it already knows with near certainty whether
the local file is current or not. It should check the checksums of the
local file and then either keep it or fetch it. Asking
If-Modified-Since can only lead to triggering a bug like the squid
one.
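
A minimal sketch of that check, in Python rather than apt's own code.
It assumes the usual Release layout (an "MD5Sum:" field followed by
indented "md5 size path" lines); the function names parse_md5sums and
local_file_is_current are made up for illustration and are not part
of apt.

```python
import hashlib
import os


def parse_md5sums(release_text):
    """Return {relative_path: (md5_hex, size)} from the MD5Sum block."""
    entries = {}
    in_md5_block = False
    for line in release_text.splitlines():
        if line.startswith("MD5Sum:"):
            in_md5_block = True
            continue
        if in_md5_block:
            # Entries are indented; a non-indented line starts the next field.
            if not line.startswith(" "):
                in_md5_block = False
                continue
            md5_hex, size, path = line.split()
            entries[path] = (md5_hex, int(size))
    return entries


def local_file_is_current(local_path, expected_md5, expected_size):
    """True if the file on disk matches the size and MD5 listed in Release."""
    if not os.path.exists(local_path):
        return False
    # The size check is cheap and avoids hashing obviously stale files.
    if os.path.getsize(local_path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(local_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5
```

With something like this, the decision "keep or fetch" depends only on
the signed Release file and the local data, not on what a proxy
answers.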

MfG
        Goswin



Re: apt-get may accept inconsistent data

Bernd Eckenfels
In article <[hidden email]> you wrote:
> Apt-get should not even send an "If-Modified-Since" query, imho. After
> fetching the Release file it already knows with near certainty whether
> the local file is current or not. It should check the checksums of the
> local file and then either keep it or fetch it. Asking
> If-Modified-Since can only lead to triggering a bug like the squid
> one.

It would be possible to base the If-Modified-Since not on the file time but
on a date header inside the file. But in that case the mirrors would have to
react reasonably well to that.

Gruss
Bernd



Re: apt-get may accept inconsistent data

Goswin von Brederlow-2
Bernd Eckenfels <[hidden email]> writes:

> In article <[hidden email]> you wrote:
>> Apt-get should not even send an "If-Modified-Since" query, imho. After
>> fetching the Release file it already knows with near certainty whether
>> the local file is current or not. It should check the checksums of the
>> local file and then either keep it or fetch it. Asking
>> If-Modified-Since can only lead to triggering a bug like the squid
>> one.
>
> It would be possible to base the If-Modified-Since not on the file time but
> on a date header inside the file. But in that case the mirrors would have to
> react reasonably well to that.
>
> Gruss
> Bernd

What would that gain you? Either the file already has the right
checksum (don't fetch) or not (fetch).

The only case where an If-Modified-Since would help is when the mirror is
broken and the Packages file actually does not match. Without it, apt-get
would keep trying to fetch the file on every invocation even if it isn't new.

I'm not even sure I wouldn't want that in case the file was corrupted
during transfer or locally.

MfG
        Goswin



Re: apt-get may accept inconsistent data

Stefan Tichy-3
In reply to this post by Goswin von Brederlow-2
On Mon, May 05, 2008 at 01:03:33AM +0200, Goswin von Brederlow wrote:
> I meant which Release file, because the etch security one does have the
> md5sums of Packages in it.

This has been modified too and the md5sum listed for the packages
file has changed.

> > apt-get sends a http GET request for Packages.bz2. Part of this
> > request is this information:
> >
> >   Cache-Control: max-age=0  If-Modified-Since: Sun, 27 Apr 2008 09:15:01 GMT

Cache-Control: max-age=0
If-Modified-Since: Sun, 27 Apr 2008 09:15:01 GMT


If apt-get sent this instead, squid would work as expected:

Cache-Control: must-revalidate
If-Modified-Since: Sun, 27 Apr 2008 09:15:01 GMT


A possible workaround is to run "apt-get update" with an additional option:

apt-get update -o "Acquire::http::No-Cache=True"


> Apt-get should not even send an "If-Modified-Since" query, imho. After
> fetching the Release file it already knows with near certainty whether
> the local file is current or not. It should check the checksums of the
> local file and then either keep it or fetch it. Asking
> If-Modified-Since can only lead to triggering a bug like the squid
> one.

You are right. There is no need to ask any proxy and to rely on the
answer, because apt-get should be able to find out what to do.


--
Stefan Tichy   ( dlist at pi4tel dot de )



Re: apt-get may accept inconsistent data

Bjørn Mork-2
Stefan Tichy <[hidden email]> writes:

> On Mon, May 05, 2008 at 01:03:33AM +0200, Goswin von Brederlow wrote:
>> I meant which Release file, because the etch security one does have the
>> md5sums of Packages in it.
>
> This has been modified too and the md5sum listed for the packages
> file has changed.
>
>> > apt-get sends a http GET request for Packages.bz2. Part of this
>> > request is this information:
>> >
>> >   Cache-Control: max-age=0  If-Modified-Since: Sun, 27 Apr 2008 09:15:01 GMT
>
> Cache-Control: max-age=0
> If-Modified-Since: Sun, 27 Apr 2008 09:15:01 GMT
>
>
> If apt-get would send this instead, squid would work as expected:
>
> Cache-Control: must-revalidate

must-revalidate is only valid in a server response.  See RFC 2616 section
14.9.  Using "Cache-Control: max-age=0" is the correct way for a client
to force cache revalidation.  This sounds like a squid bug.  You may
work around it, but let's just face it: If you accept a buggy proxy,
then there is no way to ensure valid content.



Bjørn
--
I mean, your narrow-mindedness is matched only by your
narrow-mindedness



Re: apt-get may accept inconsistent data

Cameron Dale-2
In reply to this post by Goswin von Brederlow-2
On 5/4/08, Goswin von Brederlow <[hidden email]> wrote:
>  But you are right. There is something wrong here that is not squid's
>  fault:
>
>  Apt-get should not even send an "If-Modified-Since" query, imho. After
>  fetching the Release file it already knows with near certainty whether
>  the local file is current or not. It should check the checksums of the
>  local file and then either keep it or fetch it. Asking
>  If-Modified-Since can only lead to triggering a bug like the squid
>  one.

Having just implemented something like this in my apt-p2p program, I
can tell you that this is definitely possible. But, in doing it I
learned why I think apt does not use this method, which may be some
combination of these issues:

1) apt doesn't store much state between runs, including not storing
the hashes of downloaded files

2) there's no guarantee that a file is unchanged when apt is run again

3) getting an HTTP 304 response may be faster than hashing a 20 MB
file, especially considering that a request may need to be sent after
finding an out of date hash

4) apt downloads compressed Packages files, but only stores the
uncompressed ones

None of these issues are insurmountable of course, but the issue is
more complicated than it at first seems.

Cameron



Re: apt-get may accept inconsistent data

Goswin von Brederlow-2
"Cameron Dale" <[hidden email]> writes:

> On 5/4/08, Goswin von Brederlow <[hidden email]> wrote:
>>  But you are right. There is something wrong here that is not squid's
>>  fault:
>>
>>  Apt-get should not even send an "If-Modified-Since" query, imho. After
>>  fetching the Release file it already knows with near certainty whether
>>  the local file is current or not. It should check the checksums of the
>>  local file and then either keep it or fetch it. Asking
>>  If-Modified-Since can only lead to triggering a bug like the squid
>>  one.
>
> Having just implemented something like this in my apt-p2p program, I
> can tell you that this is definitely possible. But, in doing it I
> learned why I think apt does not use this method, which may be some
> combination of these issues:
>
> 1) apt doesn't store much state between runs, including not storing
> the hashes of downloaded files

We aren't even talking about state between runs here, but let's keep
that in mind.

If apt-get fetches a new Release file, maybe it should flag the
existing files as "old". It could compare the old and new Release files
to see if any files remained the same and keep them or recheck their
checksums. But it should validate the Packages/Sources files against
the Release file at least once if it fetches a new one. Due to the
squid bug it never does.
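
Comparing the two Release files could look roughly like this Python
sketch. It assumes the checksum entries of the old and the new Release
file have already been parsed into dicts of path -> (md5, size); the
function name files_needing_recheck and the sample values are invented
for illustration.

```python
def files_needing_recheck(old_entries, new_entries):
    """Return paths that are new or whose listed checksum/size changed."""
    return sorted(
        path
        for path, listed in new_entries.items()
        if old_entries.get(path) != listed
    )


# Hypothetical entries: only main changed between the two Release files.
old_release = {
    "main/binary-i386/Packages": ("aaa", 10),
    "contrib/binary-i386/Packages": ("bbb", 5),
}
new_release = {
    "main/binary-i386/Packages": ("ccc", 12),
    "contrib/binary-i386/Packages": ("bbb", 5),
}
stale = files_needing_recheck(old_release, new_release)
```

Everything in "stale" would be flagged as "old" and either refetched
or checksummed again; files with unchanged entries could be kept.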

> 2) there's no guarantee that a file is unchanged when apt is run again

So maybe it should always be checked on update? If the time for this
is really that valuable make it an option defaulting to on.

> 3) getting an HTTP 304 response may be faster than hashing a 20 MB
> file, especially considering that a request may need to be sent after
> finding an out of date hash

It may be faster but it is not authoritative. Also, on 99.9% of all systems
the time to checksum 20MB is negligible. And on the others it is probably
insignificant compared to a following apt-get upgrade call.

> 4) apt downloads compressed Packages files, but only stores the
> uncompressed ones

It also downloads diff files nowadays.

I wonder if the user had diff files deactivated. Or does apt-get check
via HTTP if the Packages file needs updating before fetching the diffs?

> None of these issues are insurmountable of course, but the issue is
> more complicated than it at first seems.
>
> Cameron

MfG
        Goswin



Re: apt-get may accept inconsistent data

Cameron Dale-2
On 5/7/08, Goswin von Brederlow <[hidden email]> wrote:
> "Cameron Dale" <[hidden email]> writes:
>  > 3) getting an HTTP 304 response may be faster than hashing a 20 MB
>  > file, especially considering that a request may need to be sent after
>  > finding an out of date hash
>
> It may be faster but it is not authoritative. Also, on 99.9% of all systems
>  the time to checksum 20MB is negligible. And on the others it is probably
>  insignificant compared to a following apt-get upgrade call.

It should be authoritative, the only reason it's not would be a broken
proxy, which isn't really apt's or the mirror's fault.

For the record, on a reasonably fast machine:

$ time sha1sum \
       ftp.us.debian.org_debian_dists_unstable_main_binary-amd64_Packages
cff59b58caf8b870f9514bf907a365b262b6a9bc
ftp.us.debian.org_debian_dists_unstable_main_binary-amd64_Packages

real    0m0.901s
user    0m0.288s
sys     0m0.076s

That's longer than a 304 request would take to come back, and if the
hash is old then a download request would have to be sent anyway,
whereas the original request would return the new file immediately.
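
The same measurement can be taken from a script, roughly as in the
following Python sketch. The helper names sha1_of_file and timed_sha1
are made up for illustration, and the elapsed time will depend heavily
on the machine and on whether the file is already in the page cache.

```python
import hashlib
import time


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large Packages file is not read at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def timed_sha1(path):
    """Return (hex_digest, elapsed_seconds) for hashing the given file."""
    start = time.perf_counter()
    hex_digest = sha1_of_file(path)
    return hex_digest, time.perf_counter() - start
```

Calling timed_sha1() on the Packages file under /var/lib/apt/lists/
gives a number that can be compared against the round-trip time of a
304 response on the same system.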

Cameron



Re: apt-get may accept inconsistent data

Stefan Tichy-3
In reply to this post by Cameron Dale-2
On Tue, May 06, 2008 at 10:20:55AM -0700, Cameron Dale wrote:
>
> 2) there's no guarantee that a file is unchanged when apt is run again

Currently apt-get just uses the modification timestamp of the
packages file to build the "If-Modified-Since" query.
apt-get relies on an unmodified /var/lib/apt/lists/
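
As an illustration of that dependency, here is a small Python sketch
(not apt's actual code) that derives the conditional request headers
from nothing but the file's modification time. The function names are
hypothetical; the header values follow the request captured earlier in
this thread.

```python
import os
from email.utils import formatdate


def if_modified_since_value(timestamp):
    """Format a POSIX timestamp as an HTTP date in GMT."""
    return formatdate(timestamp, usegmt=True)


def conditional_headers(local_path):
    """Headers for a conditional GET, based only on the file's mtime."""
    mtime = os.path.getmtime(local_path)
    return {
        "Cache-Control": "max-age=0",
        "If-Modified-Since": if_modified_since_value(mtime),
    }
```

Anything that changes the timestamp of the file (or its content
without changing the timestamp) changes what the server or proxy is
asked, without the checksum in the Release file ever being consulted.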


--
Stefan Tichy   ( dlist at pi4tel dot de )



Re: apt-get may accept inconsistent data

Goswin von Brederlow-2
In reply to this post by Cameron Dale-2
"Cameron Dale" <[hidden email]> writes:

> On 5/7/08, Goswin von Brederlow <[hidden email]> wrote:
>> "Cameron Dale" <[hidden email]> writes:
>>  > 3) getting an HTTP 304 response may be faster than hashing a 20 MB
>>  > file, especially considering that a request may need to be sent after
>>  > finding an out of date hash
>>
>> It may be faster but it is not authoritative. Also, on 99.9% of all systems
>>  the time to checksum 20MB is negligible. And on the others it is probably
>>  insignificant compared to a following apt-get upgrade call.
>
> It should be authoritative, the only reason it's not would be a broken
> proxy, which isn't really apt's or the mirror's fault.

Or the timestamp on the mirror is wrong, on any mirror along the
mirror path. Or there is a man-in-the-middle attack going on.

Security-wise, HTTP cannot be trusted.

MfG
        Goswin

