Bug#928497: nvidia-persistenced: Error in nvidia-persistenced source (postinst)

Marcelo "Elppans" Klumpp
Package: nvidia-persistenced
Version: 418.56
Severity: important

Dear Maintainer,

-- System Information:
Debian Release: 9.9
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.0-9-amd64 (SMP w/1 CPU core)
Locale: LANG=pt_BR.UTF-8, LC_CTYPE=pt_BR.UTF-8 (charmap=UTF-8), LANGUAGE=pt_BR:pt:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages nvidia-persistenced depends on:
ii  libc6                                2.24-11+deb9u4
pn  libnvidia-cfg1 | libnvidia-cfg1-any  <none>

nvidia-persistenced recommends no packages.

nvidia-persistenced suggests no packages.


Hello...
There is an ambiguous configuration error in the file "debian/nvidia-persistenced.postinst":

This code:

#!/bin/sh
set -e

case "$1" in
  'configure')
     if ! getent passwd nvpd >/dev/null; then
       # Create ad-hoc system user/group
       adduser --system --group \
               --home /var/run/nvpd/ \
               --gecos 'nVidia Persistence daemon' \
               --no-create-home \
               nvpd
     fi
     ;;
esac

#DEBHELPER#

This sets the "home" folder to /var/run/nvpd/ (--home), but the --no-create-home option at the end of the call says DO NOT CREATE the folder.

If you want to create a folder, how can you NOT CREATE the folder you are creating?

Because of this, when installing the package, the following error occurs and the service does not start:

Alert: The home directory /var/run/nvpd/ you specified can not be accessed: No such file or directory

Alert: The home directory /var/run/nvpd/ you specified can not be accessed: No such file or directory
Adding system user 'nvpd' (UID 118) ...
Adding new group 'nvpd' (GID 124) ...
Adding new user 'nvpd' (UID 118) with group 'nvpd' ...
Not creating personal directory '/var/run/nvpd/'.
Created symlink /etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service → /lib/systemd/system/nvidia-persistenced.service.
Job for nvidia-persistenced.service failed because the control process exited with error code.
See "systemctl status nvidia-persistenced.service" and "journalctl -xe" for details.

To correct this, the code should be modified to:

#!/bin/sh
set -e

if [ "$1" = "configure" ]; then
     if ! getent passwd nvpd >/dev/null; then
       # Create ad-hoc system user/group
       adduser --system --group \
               --home /var/run/nvpd/ \
               --gecos 'NVIDIA Persistence Daemon' \
               --no-create-home \
               nvpd
     fi
fi

#DEBHELPER#

Bug#928497: nvidia-persistenced: Error in nvidia-persistenced source (postinst)

Andreas Beckmann-4
Control: tag -1 moreinfo

On 2019-05-06 08:01, Marcelo "Elppans" Klumpp wrote:
> There is an ambiguous configuration error in the file "debian/nvidia-persistenced.postinst":

> This sets the "home" folder to /var/run/nvpd/ (--home), but the --no-create-home option at the end of the call says DO NOT CREATE the folder.
>
> If you want to create a folder, how can you NOT CREATE the folder you are creating?

/var/run is a volatile path, anything needed there has to be created at runtime, not during installation.
And the home directory of nvpd is clearly *not* needed.
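
To illustrate the point about volatile paths: anything that does need a directory under /var/run would have to create it when it starts, not at install time. A minimal sketch follows. The helper name ensure_runtime_dir and the path in the usage note are assumptions for illustration only, not something the package is claimed to do.

```shell
#!/bin/sh
# Sketch: create a directory under a volatile path at start time.
# ensure_runtime_dir is a hypothetical helper, not part of the package.
ensure_runtime_dir() {
    dir="$1"
    mode="${2:-0755}"
    # mkdir -p also succeeds when the directory already exists
    mkdir -p "$dir" && chmod "$mode" "$dir"
}
```

A start script run as root could call it as `ensure_runtime_dir /var/run/example 0755` before launching the daemon.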

> Because of this, when installing the package, the following error occurs and the service does not start:
>
> Alert: The home directory /var/run/nvpd/ you specified can not be accessed: No such file or directory
>
> Alert: The home directory /var/run/nvpd/ you specified can not be accessed: No such file or directory

This is just noisy.

> Adding system user 'nvpd' (UID 118) ...
> Adding new group 'nvpd' (GID 124) ...
> Adding new user 'nvpd' (UID 118) with group 'nvpd' ...
> Not creating personal directory '/var/run/nvpd/'.
> Created symlink /etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service → /lib/systemd/system/nvidia-persistenced.service.

> Job for nvidia-persistenced.service failed because the control process exited with error code.
> See "systemctl status nvidia-persistenced.service" and "journalctl -xe" for details.

What is the actual error encountered while starting the service?

I can reproduce this (on a systemv system) with a (deliberately)
mismatching kernel module loaded:

Preparing to unpack .../nvidia-persistenced_418.56-1_amd64.deb ...
Unpacking nvidia-persistenced (418.56-1) over (390.87-1~deb9u1) ...
Setting up nvidia-persistenced (418.56-1) ...
Warning: The home dir /var/run/nvpd/ you specified can't be accessed: No such file or directory
Adding system user `nvpd' (UID 133) ...
Adding new group `nvpd' (GID 144) ...
Adding new user `nvpd' (UID 133) with group `nvpd' ...
Not creating home directory `/var/run/nvpd/'.
Stopping NVIDIA Persistence Daemon
Starting NVIDIA Persistence Daemon
nvidia-persistenced failed to initialize. Check syslog for more details.
invoke-rc.d: initscript nvidia-persistenced, action "restart" failed.
dpkg: error processing package nvidia-persistenced (--configure):
 installed nvidia-persistenced package post-installation script subprocess returned error exit status 1

in syslog I have:

nvidia-persistenced: Started (7837)
nvidia-persistenced: Failed to query NVIDIA devices. Please ensure that the NVIDIA device files (/dev/nvidia*) exist, and that user 133 has read and write permissions for those files.
nvidia-persistenced: Shutdown (7837)
kernel: [12558015.162992] NVRM: API mismatch: the client has the version 410.104, but
kernel: [12558015.162992] NVRM: this kernel module has the version 396.54.  Please
kernel: [12558015.162992] NVRM: make sure that this kernel module and all NVIDIA driver
kernel: [12558015.162992] NVRM: components have the same version.

With the correct kernel module loaded, nvidia-persistenced runs fine.
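
For anyone checking their own logs for this kind of mismatch, here is a rough sketch that extracts the two version numbers from an NVRM "API mismatch" message. The function name nvrm_mismatch_versions is hypothetical, and the patterns assume the message wording shown in the syslog excerpt above.

```shell
#!/bin/sh
# Sketch: pull the two version numbers out of an NVRM "API mismatch" message.
# nvrm_mismatch_versions is a hypothetical helper; it reads log text on stdin.
nvrm_mismatch_versions() {
    sed -n \
        -e 's/.*the client has the version \([0-9.]*[0-9]\).*/client=\1/p' \
        -e 's/.*kernel module has the version \([0-9.]*[0-9]\).*/module=\1/p'
}
```

It can be fed from the kernel log, for example `dmesg | nvrm_mismatch_versions`. If it prints two different versions, the loaded kernel module and the userspace components do not match.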


Andreas


Bug#928497: nvidia-persistenced: Error in nvidia-persistenced source (postinst)

Daniel Reichelt-3
In reply to this post by Marcelo "Elppans" Klumpp
Hi Andreas,


> I can reproduce this (on a systemv system) with a (deliberately)
> mismatching kernel module loaded:

this also happens on systems w/o nvidia hardware, i.e. when the module
isn't loaded at all. That situation produces syslog entries similar to
yours, except that the messages prefixed NVRM are missing, of course.

------8<----------
May  7 23:37:08 testhost nvidia-persistenced: Started (19591)
May  7 23:37:08 testhost nvidia-persistenced: Failed to query NVIDIA
devices. Please ensure that the NVIDIA device files (/dev/nvidia*)
exist, and that user 145 has read and write permissions for those files.
May  7 23:37:08 testhost nvidia-persistenced: Shutdown (19591)
May  7 23:37:58 testhost nvidia-persistenced: Started (19744)
May  7 23:37:58 testhost nvidia-persistenced: Failed to query NVIDIA
devices. Please ensure that the NVIDIA device files (/dev/nvidia*)
exist, and that user 145 has read and write permissions for those files.
May  7 23:37:58 testhost nvidia-persistenced: Shutdown (19744)
May  7 23:38:04 testhost nvidia-persistenced: Started (19823)
May  7 23:38:04 testhost nvidia-persistenced: Failed to query NVIDIA
devices. Please ensure that the NVIDIA device files (/dev/nvidia*)
exist, and that user 145 has read and write permissions for those files.
May  7 23:38:04 testhost nvidia-persistenced: Shutdown (19823)
------>8----------
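
A quick way to tell this case apart is to check whether any /dev/nvidia* entries exist at all. The sketch below does only that. The function name has_nvidia_nodes is hypothetical, and it does not check permissions for the daemon user.

```shell
#!/bin/sh
# Sketch: report whether any nvidia* entries exist in a device directory.
# has_nvidia_nodes is a hypothetical helper; the directory defaults to /dev.
has_nvidia_nodes() {
    devdir="${1:-/dev}"
    for node in "$devdir"/nvidia*; do
        # If the glob matched nothing, the pattern stays literal and -e fails
        [ -e "$node" ] && return 0
    done
    return 1
}
```

For example: `has_nvidia_nodes || echo "no NVIDIA device files found"`.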


Cheers
Daniel



Bug#928497: Acknowledgement (nvidia-persistenced: Error in nvidia-persistenced source (postinst))

Marcelo "Elppans" Klumpp
In reply to this post by Marcelo "Elppans" Klumpp
Using Debian Stretch plus the sid source repository, with all the build dependencies satisfied, the build completes without problems.
Installing the resulting packages also works, and the modules work correctly.
However, my first build of the unmodified nvidia-persistenced package had an error that prevents "nvidia-persistenced.service" from starting.

Build command:

sudo apt-get -b source nvidia-driver nvidia-settings nvidia-modprobe nvidia-settings nvidia-support glx-alternative-nvidia init-system-helpers nvidia-persistenced

Installation output via apt (I created a local repository to make this easier):

Note: I tested the installation on a VM, since my actual machine already has a working driver installed.

...
Configuring nvidia-persistenced (418.56-1) ...
Alert: The home directory /var/run/nvpd/ you specified can not be accessed: No such file or directory
Adding system user 'nvpd' (UID 115) ...
Adding new group 'nvpd' (GID 118) ...
Adding new user 'nvpd' (UID 115) with group 'nvpd' ...
Not creating personal directory '/var/run/nvpd/'.
Created symlink /etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service → /lib/systemd/system/nvidia-persistenced.service.
Job for nvidia-persistenced.service failed because the control process exited with error code.
See "systemctl status nvidia-persistenced.service" and "journalctl -xe" for details.
...

Checking service status:


stretch@debian:/usr/src/nv$ systemctl status nvidia-persistenced.service
● nvidia-persistenced.service - NVIDIA Persistence Daemon
   Loaded: loaded (/lib/systemd/system/nvidia-persistenced.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2019-05-08 01:59:51 -03; 7min ago

Verifying further details:

stretch@debian:/usr/src/nv$ sudo journalctl -xe
-- Defined-By: systemd
-- 
-- The nvidia-persistenced.service unit is starting.
mai 08 01:59:51 debian nvidia-persistenced[19166]: Started (19166)
mai 08 01:59:51 debian nvidia-persistenced[19166]: Failed to open libnvidia-cfg.so.1: libnvidia-cfg.so.1: cannot open shared object file: No such file or directory
mai 08 01:59:51 debian nvidia-persistenced[19165]: nvidia-persistenced failed to initialize. Check syslog for more details.
mai 08 01:59:51 debian nvidia-persistenced[19166]: Shutdown (19166)
mai 08 01:59:51 debian systemd[1]: nvidia-persistenced.service: Control process exited, code=exited status=1
mai 08 01:59:51 debian systemd[1]: Failed to start NVIDIA Persistence Daemon.
-- Subject: Unit nvidia-persistenced.service failed
-- Defined-By: systemd
-- 
-- The nvidia-persistenced.service unit failed.
-- 
-- The result is failed.
mai 08 01:59:51 debian systemd[1]: nvidia-persistenced.service: Unit entered failed state.
mai 08 01:59:51 debian systemd[1]: nvidia-persistenced.service: Failed with result 'exit-code'.
mai 08 02:00:24 debian systemd[1]: Reloading.
mai 08 02:00:24 debian systemd[1]: apt-daily-upgrade.timer: Adding 58min 40.128838s random time.
mai 08 02:00:24 debian systemd[1]: anacron.timer: Adding 4min 57.391958s random time.
mai 08 02:00:24 debian systemd[1]: apt-daily.timer: Adding 1h 8min 57.545202s random time.
mai 08 02:00:34 debian sudo[17446]: pam_unix(sudo:session): session closed for user root
mai 08 02:00:37 debian PackageKit[860]: get-updates transaction /228_becdbacd from uid 1000 finished with success after 916ms
mai 08 02:00:38 debian PackageKit[860]: get-updates transaction /229_eaccaccc from uid 1000 finished with success after 913ms
mai 08 02:01:29 debian sshd[1441]: pam_unix(sshd:session): session closed for user stretch
mai 08 02:01:29 debian systemd-logind[351]: Removed session 5.
-- Subject: Session 5 has been terminated
-- Defined-By: systemd

However, checking shows that the libnvidia-cfg1 package is installed:

stretch@debian:~$ dpkg -l libnvidia-cfg1* | grep ^ii
ii  libnvidia-cfg1:amd64 418.56-2     amd64        NVIDIA binary OpenGL/GLX configuration library
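
An installed package does not by itself show that the dynamic linker can find the library. A complementary check, sketched here on the assumption that `ldconfig -p` output is available, is to look for the library name in the linker cache listing. The function name lib_in_listing is hypothetical.

```shell
#!/bin/sh
# Sketch: check whether a shared library name appears in a linker cache listing.
# lib_in_listing is a hypothetical helper; it reads `ldconfig -p` style text on stdin.
lib_in_listing() {
    # -F: fixed string, -q: only set the exit status.
    # The trailing " (" keeps libfoo.so.1 from matching libfoo.so.10.
    grep -q -F "$1 ("
}
```

Usage: `ldconfig -p | lib_in_listing libnvidia-cfg.so.1 && echo found`.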

Rebuilding manually after removing the line containing "--no-create-home \" from the source files:

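rm -rf nvidia-persistenced*.deb
cd "$(ls -d */ | grep nvidia-persistenced)"
# Delete every line containing --no-create-home from the files that have it
grep -rl -e '--no-create-home' . | xargs sed -i '/--no-create-home/d'
cd -
# $BDIR is the local repository directory (its value is not given in this message);
# tee runs under sudo so that the write itself is privileged, not just gzip
dpkg-scanpackages -m -t deb . | gzip -c | sudo tee "$BDIR"Packages.gz >/dev/null
sudo apt-get update
sudo apt-get install --reinstall nvidia-persistenced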

...
Configuring nvidia-persistenced (418.56-1) ...
Job for nvidia-persistenced.service failed because the control process exited with error code.
See "systemctl status nvidia-persistenced.service" and "journalctl -xe" for details.
...

Note 2: This error occurs because the VM has no NVIDIA card, but note that the following messages no longer appear:

Alert: The home directory /var/run/nvpd/ you specified can not be accessed: No such file or directory
Not creating personal directory '/var/run/nvpd/'.

On my actual machine, the NVIDIA driver that I built and installed reports the following:

dpkg -l nvidia-persistenced* | grep ^ii
ii  nvidia-persistenced 418.56-1     amd64        daemon to maintain persistent software state in the NVIDIA driver

systemctl status nvidia-persistenced.service
● nvidia-persistenced.service - NVIDIA Persistence Daemon
   Loaded: loaded (/lib/systemd/system/nvidia-persistenced.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-05-08 06:18:48 -03; 2h 54min left
  Process: 4967 ExecStart=/usr/bin/nvidia-persistenced --user nvpd (code=exited, status=0/SUCCESS)
 Main PID: 4974 (nvidia-persiste)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/nvidia-persistenced.service
           └─4974 /usr/bin/nvidia-persistenced --user nvpd

sudo systemctl restart nvidia-persistenced.service
sudo journalctl -xe
-- Subject: Unit nvidia-persistenced.service being turned off
-- Defined-By: systemd
-- 
-- The nvidia-persistenced.service unit is being turned off.
mai 08 03:28:57 DarkElven systemd[1]: Stopped NVIDIA Persistence Daemon.
-- Subject: Unit nvidia-persistenced.service completed the shutdown
-- Defined-By: systemd
-- 
-- The nvidia-persistenced.service unit completed the shutdown
mai 08 03:28:57 DarkElven systemd[1]: Starting NVIDIA Persistence Daemon...
-- Subject: Unit nvidia-persistenced.service being initiated
-- Defined-By: systemd
-- 
-- The nvidia-persistenced.service unit is being started.
mai 08 03:28:57 DarkElven nvidia-persistenced[9105]: Started (9105)
mai 08 03:28:57 DarkElven systemd[1]: Started NVIDIA Persistence Daemon.
-- Subject: Unit nvidia-persistenced.service completed the startup
-- Defined-By: systemd
-- 
-- The nvidia-persistenced.service unit completed the startup


While testing the builds, I noticed that the repositories were updated on "2019-05-06 23:23". The build in which I ran into the missing "/var/run/nvpd/" folder was made on "2019-05-05", and today, when I did a new build without modifying anything, it worked correctly.

I noticed that in my first build, "nvidia-persistenced.service" specified the folder in "ExecStart", which is why the error occurred. Now it is no longer specified:

[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target

[Service]
Type=forking
ExecStart=/usr/bin/nvidia-persistenced --user nvpd
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced

[Install]
WantedBy=multi-user.target

The newly built package now works normally, which means that the error I reported is already fixed. However, care is needed when choosing which package versions to build from, so that the result is as stable as possible.

On Mon, May 6, 2019 at 03:03, Debian Bug Tracking System <[hidden email]> wrote:
Thank you for filing a new Bug report with Debian.

You can follow progress on this Bug here: 928497: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=928497.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

As you requested using X-Debbugs-CC, your message was also forwarded to
  [hidden email]
(after having been given a Bug report number, if it did not have one).

Your message has been sent to the package maintainer(s):
 Debian NVIDIA Maintainers <[hidden email]>

If you wish to submit further information on this problem, please
send it to [hidden email].

Please do not send mail to [hidden email] unless you wish
to report a problem with the Bug-tracking system.

--
928497: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=928497
Debian Bug Tracking System
Contact [hidden email] with problems


--
Regards,
Marcelo Klumpp
Support Analyst - GNU/Linux