Bug#404143: Fans unreliable under load, permanent memory leak

Maximilian Attems-3
hello,

On Fri, 22 Dec 2006, Marc 'HE' Brockschmidt wrote:

> [hidden email] writes:
> > I'm more than willing to help test a kernel package, but I'll be on
> > [VAC] from 2006-12-23 to 2007-01-03 inclusive.  So, please do not
> > release Etch just now :)
>
> I have ordered an nx6325, which should arrive directly after
> Christmas. I would also be happy to test a fixed kernel. Due to this
> being an overheating problem, I would prefer if you could provide kernel
> images, so that I don't have to compile it.
>
> Marc
> --
> BOFH #34:
> (l)user error

could you please send in the output of:
dmidecode
acpidump

thanks

--
maks


--
To UNSUBSCRIBE, email to [hidden email]
with a subject of "unsubscribe". Trouble? Contact [hidden email]

Matthew Garrett
In reply to this post by Maximilian Attems-3
A couple of observations:

* This bug will not cause hardware damage. The hard thermal cutoff
temperature is well below the temperature at which actual damage will
occur.

* It's not clear that the vendor DSDT is broken. It's an unusual
interpretation of the spec, but not necessarily an invalid one - sadly,
the ACPI specification is not entirely clear on every point.

The patch is /probably/ safe, and we've been shipping it in Ubuntu. On
the other hand, previous versions did cause problems on certain other
items of hardware. It's not clear what the best option is, but it's
certainly not a regression over Sarge.

--
Matthew Garrett | [hidden email]


Bug#404891: marked as done (Fans unreliable under load, permanent memory leak)

Debian Bug Tracking System
In reply to this post by Ludovic Brenta-2
Your message dated Sun, 18 Mar 2007 19:50:57 +0100
with message-id <[hidden email]>
and subject line Patch committed to CVS, bug fixed
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)


Package: linux-image-2.6.18-3-amd64
Version: 2.6.18-7
Severity: grave
Justification: hardware overheating hazard; requires periodic reboots

(This is not the same bug as #400488 (upstream #7122))

This bug affects several amd64 notebooks from HP, notably the nx6125
and the nx6325; there may be other affected machines as well.

Kernel team, please apply the patches for
http://bugzilla.kernel.org/show_bug.cgi?id=5534

This bug is there merely to remind the kernel team not to release etch
without the patches :) However I'm not sure which upstream version of
linux, if any, contains the patches in the (long) trail of comments.
So, it might be necessary to wait for a few days until the patches
arrive in Linus' tree.

Symptoms:
- under load, the fans fail to turn on when the temperature reaches
  and then exceeds the normal threshold, which is 58°C.
- there is a permanent memory leak in the kernel, even when the system
  is idle.  The leak is visible by looking at
  $ grep Slab: /proc/meminfo         and
  $ grep Acpi-State /proc/slabinfo
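
The two grep commands above can also be scripted. What follows is only a minimal sketch in Python, not part of the original report: the function names slab_kb, acpi_state_objects and sample are made up for illustration, and it assumes the usual /proc/slabinfo layout in which the second column is the number of active objects. Reading /proc/slabinfo normally requires root.

```python
import re


def slab_kb(meminfo_text):
    """Return the 'Slab:' value in kB from /proc/meminfo text, or None."""
    match = re.search(r"^Slab:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def acpi_state_objects(slabinfo_text):
    """Return the active object count of the Acpi-State cache, or None.

    Assumes the slabinfo row layout: name, active_objs, num_objs, objsize, ...
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "Acpi-State":
            return int(fields[1])
    return None


def sample():
    """Read both files once; hypothetical helper, typically needs root."""
    with open("/proc/meminfo") as f:
        slab = slab_kb(f.read())
    with open("/proc/slabinfo") as f:
        objs = acpi_state_objects(f.read())
    return slab, objs
```

Calling sample() from an hourly job and logging the result with a timestamp would give the same kind of data as the grep commands.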

Workaround:
- if overheating, shut down the computer and let it cool down; or
  let it shut itself down to prevent a fire hazard.
- if the only problem is the memory leak, reboot.

Consequence: linux-image-2.6.18-3-amd64 (=2.6.18-7) is unsuitable for
release.

The memory leak is described at:

http://www.mail-archive.com/linux-acpi@.../msg03119.html

Today I had to reboot my HP Compaq nx6325 because the kernel was
eating 1.8 GB of the 1.9 GB of RAM in the system, after about 9
days of uptime.  Then I started an hourly cron job to monitor
/proc/meminfo and /proc/slabinfo as described above:

2006-06-21T20:06:10: Slab:            30296 kB
2006-17-21T20:17:01: Slab:            37756 kB
2006-17-21T21:17:01: Slab:            48116 kB
2006-17-21T22:17:01: Slab:            55764 kB
2006-17-21T23:17:01: Slab:            69904 kB
-- Reboot with acpi=noirq: only one CPU found --
2006-24-21T23:24:10: Slab:            10444 kB
-- Reboot with pci=noacpi: only one CPU found --
2006-30-21T23:30:26: Slab:             9676 kB
2006-30-21T23:30:26: Acpi-State             0      0     80   48    1 : tunables  120   60    8 : slabdata      0      0      0
-- Reboot with no options: OK, both CPUs found --
2006-34-21T23:34:23: Slab:            10584 kB
2006-34-21T23:34:23: Acpi-State             0      0     80   48    1 : tunables  120   60    8 : slabdata      0      0      0
2006-17-22T00:17:01: Slab:            15424 kB
2006-17-22T00:17:01: Acpi-State         23088  23088     80   48    1 : tunables  120   60    8 : slabdata    481    481      0
2006-17-22T01:17:01: Slab:            29956 kB
2006-17-22T01:17:01: Acpi-State         59136  59136     80   48    1 : tunables  120   60    8 : slabdata   1232   1232      0
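
As a rough illustration of the growth rate, the last two samples above can be compared. This is only a sketch: it assumes the two samples are one hour apart, as the hourly cron schedule implies, and it takes the 80 in the Acpi-State rows to be the object size in bytes. The helper name growth_per_hour is made up for this example.

```python
def growth_per_hour(first, second, hours=1.0):
    """Average growth between two samples taken `hours` apart."""
    return (second - first) / hours


# Object size in bytes, assumed from the fourth column of the Acpi-State rows.
OBJ_SIZE_BYTES = 80

# Values copied from the 00:17:01 and 01:17:01 samples in the log above.
slab_growth_kb = growth_per_hour(15424, 29956)
obj_growth = growth_per_hour(23088, 59136)

# Memory held by the new Acpi-State objects alone, in kB.
acpi_growth_kb = obj_growth * OBJ_SIZE_BYTES / 1024

print(f"Slab growth:       {slab_growth_kb:.0f} kB/hour")
print(f"Acpi-State growth: {obj_growth:.0f} objects/hour "
      f"({acpi_growth_kb:.2f} kB/hour)")
```

Under these assumptions the slab total grows by 14532 kB in that hour and the Acpi-State cache gains 36048 objects.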

I'm more than willing to help test a kernel package, but I'll be on
[VAC] from 2006-12-23 to 2007-01-03 inclusive.  So, please do not
release Etch just now :)

--
Ludovic Brenta.




I've committed Steve's fix for this bug, which includes the description of the
issue. Furthermore, I've updated his patch to reflect some additional models
(mentioned in the kernel's bugzilla but not in Debian's) and to point also to
the other ACPI issue (Kernel's #7122 and Debian's #400488).

Regards

Javier
