Bug#930365: CUDA 10.1 Update 1 is now available

Graham Inggs
Source: nvidia-cuda-toolkit
Version: 9.2.148-6
Severity: wishlist

Hi Maintainers

CUDA 10.1 Update 1 (10.1.168) was released at the end of May, 2019.  The
minimum NVIDIA driver version remains at 418.39 and support is added for
Clang 8.

As per the release notes [1]:
"CUDA 10.1 Update 1 is a minor update that is binary compatible with
CUDA 10.1. This release will work with all versions of the R418 NVIDIA
driver."

Regards
Graham


[1]
https://docs.nvidia.com/cuda/archive/10.1/cuda-toolkit-release-notes/index.html

Andreas Beckmann
On 11/06/2019 15.36, Graham Inggs wrote:
> CUDA 10.1 Update 1 (10.1.168) was released at the end of May, 2019.  The
> minimum NVIDIA driver version remains at 418.39 and support is added for
> Clang 8.

Nice. I'll look into this next month (after buster is released).


Andreas

Graham Inggs
Hi all

On 2019/06/12 14:23, Andreas Beckmann wrote:
> Nice. I'll look into this next month (after buster is released).

I was keen to try the new version to see if it solved a problem with
p2pBandwidthLatencyTest locking up on a machine with two TITAN X GPUs,
so I went ahead and pushed a 10.1update1-ginggs branch [1] to git.  The
problem ended up being with the 418 driver, and upgrading to 430 seems
to have solved it.

As of now, my packaging attempt builds on amd64 and ppc64el, and I'll be
running some tests on it next week.

There are still some Lintian warnings and errors, and I'd like some
suggestions on which should be fixed and which can be overridden, please.

E: nsight-systems: binary-or-shlib-defines-rpath usr/lib/nsight-systems/Host-x86_64/libicudata.so.56 /home/qt/icu_install/lib
W: nsight-systems: shared-lib-without-dependency-information usr/lib/nsight-systems/Host-x86_64/libicudata.so.56
E: nsight-systems: binary-or-shlib-defines-rpath usr/lib/nsight-systems/Host-x86_64/libicui18n.so.56 /home/qt/icu_install/lib
E: nsight-systems: binary-or-shlib-defines-rpath usr/lib/nsight-systems/Host-x86_64/libicuuc.so.56 /home/qt/icu_install/lib
W: nsight-systems: binary-without-manpage usr/bin/nsight-sys

W: nvidia-nsight: jar-not-in-usr-share usr/lib/nvidia-nsight/configuration/org.eclipse.osgi/201/data/1519573780/content.jar
W: nvidia-nsight: jar-not-in-usr-share usr/lib/nvidia-nsight/configuration/org.eclipse.osgi/210/data/listener_1925729951/artifacts.jar
W: nvidia-nsight: jar-not-in-usr-share usr/lib/nvidia-nsight/configuration/org.eclipse.osgi/210/data/listener_1925729951/content.jar
W: nvidia-nsight: jar-not-in-usr-share usr/lib/nvidia-nsight/configuration/org.eclipse.osgi/74/0/.cp/lib/Tidy.jar
W: nvidia-nsight: jar-not-in-usr-share usr/lib/nvidia-nsight/configuration/org.eclipse.osgi/74/0/.cp/lib/commons-cli-1.0.jar

W: nsight-compute: binary-without-manpage usr/bin/nv-nsight-cu
W: nsight-compute: binary-without-manpage usr/bin/nv-nsight-cu-cli

W: nvidia-cuda-toolkit: binary-without-manpage usr/bin/bin2c
W: nvidia-cuda-toolkit: binary-without-manpage usr/bin/cudafe++
W: nvidia-cuda-toolkit: binary-without-manpage usr/bin/fatbinary
W: nvidia-cuda-toolkit: binary-without-manpage usr/bin/gpu-library-advisor
W: nvidia-cuda-toolkit: binary-without-manpage usr/bin/nvlink
W: nvidia-cuda-toolkit: binary-without-manpage usr/bin/ptxas
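
[Editorial note: for the binary-or-shlib-defines-rpath errors, one
conventional approach (a sketch only, assuming chrpath is added to
Build-Depends and that these files land under debian/nsight-systems/)
would be to strip the baked-in RPATH during the install step:]

```make
# Hypothetical debian/rules fragment: remove the /home/qt/icu_install/lib
# RPATH that upstream baked into the bundled ICU libraries.
override_dh_auto_install:
	dh_auto_install
	chrpath --delete \
		debian/nsight-systems/usr/lib/nsight-systems/Host-x86_64/libicu*.so.56
```

[The binary-without-manpage and jar-not-in-usr-share warnings, by
contrast, are often candidates for Lintian overrides when they concern
bundled upstream blobs.]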

Regards
Graham


[1]
https://salsa.debian.org/nvidia-team/nvidia-cuda-toolkit/tree/10.1update1-ginggs

Andreas Beckmann
On 14/06/2019 17.36, Graham Inggs wrote:
> Hi all
>
> On 2019/06/12 14:23, Andreas Beckmann wrote:
>> Nice. I'll look into this next month (after buster is released).

I think I'll upload 10.1 to sid first, do the transition there (do you
remember any problems from Ubuntu?) and look at 10.1u1 afterwards (is
there anything needing NEW?).
That way we should quickly have 10.1 in buster-backports.

From looking at the diff some time ago I remember you added a new
variable for the driver version. Why?

Andreas

Graham Inggs
Hi Andreas

On Sun, 7 Jul 2019 at 11:16, Andreas Beckmann <[hidden email]> wrote:
> I think I'll upload 10.1 to sid first, do the transition there (do you
> remember any problems from Ubuntu?)

No problems; most were no-change rebuilds (aka binNMUs), except
caffe-contrib [1] and starpu-contrib [2] where I switched from GCC 7
to GCC 8 as well.

> and look at 10.1u1 afterwards (is
> there anything needing NEW?).

10.1 -> 10.1u1 does not need to go through NEW \o/

> From looking at the diff some time ago I remember you added a new
> variable for the driver version. Why?

The minimum driver version and the version included in the download
filename are now different [3].
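
[Editorial note: in other words, the packaging now has to track two
numbers; a hypothetical debian/rules sketch (the variable names and the
second value are illustrative, not taken from the actual packaging):]

```make
# Minimum driver version the toolkit requires at runtime
# (unchanged between 10.1 and 10.1u1):
NVIDIA_DRIVER_VERSION   = 418.39
# Driver version embedded in upstream's .run download filename,
# which now differs from the minimum (value illustrative only):
NVIDIA_DOWNLOAD_VERSION = 418.67
```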

We've been using 10.1u1 extensively on a machine with 2 x TITAN X
cards for the past three weeks without problems.

Regards
Graham


[1] https://launchpad.net/ubuntu/+source/caffe-contrib/1.0.0+git20180821.99bd997-2ubuntu1
[2] https://launchpad.net/ubuntu/+source/starpu-contrib/1.2.6+dfsg-6ubuntu1
[3] https://salsa.debian.org/nvidia-team/nvidia-cuda-toolkit/commit/76d284740b9492e2663ac8e00656fbb2e3a79820#c59424ec57b7338d3d2365e0adc073efc3baf91c_12_12

Andreas Beckmann
On 07/07/2019 11.38, Graham Inggs wrote:
>> From looking at the diff some time ago I remember you added a new
>> variable for the driver version. Why?
>
> The minimum driver version and the version included in the download
> filename are now different [3].

Wouldn't it be sufficient to relax the driver version in the
dependencies to the major driver version (no minor digits) instead of
tracking two versions?
That may permit using certain outdated beta driver versions (which
were probably never uploaded to Debian non-free), a combination not
supported upstream, but one that will probably still work.
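
[Editorial note: concretely, the proposed relaxation would look
something like this in a dependency field (a hypothetical sketch; the
package name in the real packaging may differ, and the annotations are
not valid control-file syntax):]

```
Depends: nvidia-driver (>= 418.39)   <- tracking the exact minimum
Depends: nvidia-driver (>= 418)      <- relaxed to the major series
```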


Andreas

Graham Inggs
On Sun, 7 Jul 2019 at 12:26, Andreas Beckmann <[hidden email]> wrote:
> Wouldn't it be sufficient if we relax the driver version in the
> dependencies to the mayor driver version (no digits) instead of tracking
> two versions?

I don't know.  From the bits I've read, it sounds like Nvidia wants to
release the CUDA toolkit more frequently and improve compatibility.
So it wouldn't surprise me if, in the next update, the minimum version
remained 418.39 but the download included a 430-series driver.

I haven't got my head around the CUDA Compatibility section yet, but
the CUDA Application Compatibility Support Matrix [1] seems to
indicate drivers >= 396.26 and >= 384.111 will be compatible with CUDA
10.1.  So perhaps we will need to reconsider how the dependencies in
our packages are expressed.


[1] https://docs.nvidia.com/deploy/cuda-compatibility/index.html#cuda-application-compatibility__table-cuda-application-support-matrix
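
[Editorial note: whichever thresholds end up encoded (418.39 for R418,
or 396.26 / 384.111 via the compatibility platform), the underlying
operation is a two-component version comparison; a minimal sketch in
plain shell (hypothetical helper, assumes simple X.Y version strings):]

```shell
# Compare two driver versions of the form MAJOR.MINOR numerically;
# succeeds (exit status 0) when the first is at least the second.
driver_ge() {
    maj1=${1%%.*}; min1=${1#*.}
    maj2=${2%%.*}; min2=${2#*.}
    [ "$maj1" -gt "$maj2" ] ||
        { [ "$maj1" -eq "$maj2" ] && [ "$min1" -ge "$min2" ]; }
}

driver_ge 430.26 418.39 && echo "430.26 ok for CUDA 10.1"
driver_ge 390.87 396.26 || echo "390.87 too old"
```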

Andreas Beckmann
On 14/06/2019 17.36, Graham Inggs wrote:
> As of now, my packaging attempt builds on amd64 and ppc64el, and I'll be
> running some tests on it next week.

I've now uploaded it to experimental.
Please check whether all packages that Build-Depend on
nvidia-cuda-toolkit build with 10.1U1 - we recently had a spurious
FTBFS on ppc64el while testing the 10.1 transition.


Andreas

Graham Inggs
Hi Andreas

On Sat, 17 Aug 2019 at 22:37, Andreas Beckmann <[hidden email]> wrote:
> I've now uploaded it to experimental.
> Please check whether all packages that Build-Depend on
> nvidia-cuda-toolkit build with 10.1U1 - we recently had a spurious
> FTBFS on ppc64el while testing the 10.1 transition.

Thanks for the upload!

I sync'd 10.1.168-1 into Ubuntu and autopkgtests of caffe-contrib,
libgpuarray and starpu-contrib were successful.  The package has
already migrated.

I then did test rebuilds of caffe-contrib, eztrace-contrib,
gr-fosphor, hwloc-contrib, nvtop, pycuda and starpu-contrib in an
Ubuntu PPA.  All were successful except nvtop on ppc64el (where it has
never built before) and caffe-contrib on ppc64el.

The caffe-contrib build on ppc64el emitted the warning and errors below:

In file included from /<<BUILDDIR>>/caffe-contrib-1.0.0+git20180821.99bd997/src/caffe/util/math_functions.cu:1:
/usr/include/math_functions.h:54:2: warning: #warning "math_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
 #warning "math_functions.h is an internal header file and must not be used directly.  This file will be removed in a future CUDA release.  Please use cuda_runtime_api.h or cuda_runtime.h instead."
  ^~~~~~~
/usr/include/boost/config/detail/suffix.hpp(510): error: identifier
"__ieee128" is undefined

/usr/include/boost/config/detail/suffix.hpp(510): error: identifier
"__ieee128" is undefined

/usr/include/boost/config/detail/suffix.hpp(510): error: identifier
"__ieee128" is undefined

The same warning does appear in the amd64 build as well.  I'll try to
reproduce on plummer.debian.org.

Regards
Graham

Graham Inggs
On 2019/08/20 09:57, Graham Inggs wrote:
> The same warning does appear in the amd64 build as well.  I'll try to
> reproduce on plummer.debian.org.

I tried, but couldn't figure out how to enable contrib or non-free on a
porterbox.  Is it even possible?

Anyway, it seems that caffe-contrib has never been built on ppc64el in
Debian, so this shouldn't be a blocker.