h5py and hdf5-mpi


h5py and hdf5-mpi

Drew Parsons
We need to change h5py to support hdf5-mpi.  h5py is somewhat crippled
as serial-only.

We could just do it straight away in python3-h5py.  Is there much point
having h5py support both hdf5-serial and hdf5-mpi?  Perhaps there is, in
which case we need to set up multiple builds and use alternatives to set
the preferred h5py.

A related question, is there much point setting up support for
hdf5-mpich as well as hdf5-openmpi?  Increasing build and
package-alternatives complexity, but once it's done once to distinguish
hdf5-serial from hdf5-mpi, it's not that much more work to also split
hdf5-mpi between hdf5-mpich and hdf5-openmpi.

Drew


Re: h5py and hdf5-mpi

Mo Zhou
Hi Drew,

thanks for the commits to h5py.

On 2019-08-12 03:10, Drew Parsons wrote:
> We need to change h5py to support hdf5-mpi.  h5py is somewhat crippled
> as serial-only.

I didn't even notice that since my use case for hdf5 is light-weight.
(training data is fully loaded from hdf5 into memory)

> We could just do it straight away in python3-h5py.  Is there much
> point having h5py support both hdf5-serial and hdf5-mpi?  Perhaps
> there is, in which case we need to set up multiple builds and use
> alternatives to set the preferred h5py.

In fact I don't know. Maybe @ghisvail could answer?

> A related question, is there much point setting up support for
> hdf5-mpich as well as hdf5-openmpi?  Increasing build and
> package-alternatives complexity, but once it's done once to
> distinguish hdf5-serial from hdf5-mpi, it's not that much more work to
> also split hdf5-mpi between hdf5-mpich and hdf5-openmpi.

My personal opinion is to just choose a reasonable default,
unless users shout for more.

Compiling every possible configuration will eventually make
the science team's maintenance burden unmanageable. h5py is not
like the BLAS64/BLAS flavours, which are clearly needed by some
portion of scientific users.


Re: h5py and hdf5-mpi

ghisvail
Le lun. 12 août 2019 à 17:04, Mo Zhou <[hidden email]> a écrit :

>
> Hi Drew,
>
> thanks for the commits to h5py.
>
> On 2019-08-12 03:10, Drew Parsons wrote:
> > We need to change h5py to support hdf5-mpi.  h5py is somewhat crippled
> > as serial-only.
>
> I didn't even notice that since my use case for hdf5 is light-weight.
> (training data is fully loaded from hdf5 into memory)

Same here. My use case for h5py is for storing medical images and raw
data, all of which usually fit into a single workstation.

> > We could just do it straight away in python3-h5py.  Is there much
> > point having h5py support both hdf5-serial and hdf5-mpi?  Perhaps
> > there is, in which case we need to set up multiple builds and use
> > alternatives to set the preferred h5py.
>
> In fact I don't know. Maybe @ghisvail could answer?

I can't answer this question since I have never used the parallel
builds of HDF5 and h5py.

Are we really sure alternatives are appropriate for this particular use case?

Python has got other means for injecting alternative dependencies such
as PYTHONPATH and virtualenvs.

> > A related question, is there much point setting up support for
> > hdf5-mpich as well as hdf5-openmpi?  Increasing build and
> > package-alternatives complexity, but once it's done once to
> > distinguish hdf5-serial from hdf5-mpi, it's not that much more work to
> > also split hdf5-mpi between hdf5-mpich and hdf5-openmpi.
>
> My personal opinion is to just choose a reasonable default,
> unless users shouted for that.

Same here.

We can't cater to every use case in the scientific community, so the
best we can do is choose something sensible with the data points we
have (if any) and reconsider later based on user feedback.

> Compiling every possible configuration will eventually make
> the science team maintenance burden notorious. h5py is not
> like BLAS64/BLAS-flavours which are clearly needed by some
> portion of scientific users.

There is also the question of long-term maintainability. For HPC
builds, people will build their stack from source anyway for maximum
performance on their dedicated hardware. That was the case back when I
used to work for a university. I don't think targeting these users is
worth the trouble compared to research staff who want to prototype or
deploy something quickly on their respective workstation or laptop
where resources are more constrained. That's the background I am
coming from personally, hence MPI was never considered at the
time.

Your mileage may vary of course, and I welcome (and value) your opinions.

Please let me know.


Re: h5py and hdf5-mpi

Steffen Möller
Hello,

On 12.08.19 18:15, Ghislain Vaillant wrote:

> Le lun. 12 août 2019 à 17:04, Mo Zhou <[hidden email]> a écrit :
>> Hi Drew,
>>
>> thanks for the commits to h5py.
>>
>> On 2019-08-12 03:10, Drew Parsons wrote:
>>> We need to change h5py to support hdf5-mpi.  h5py is somewhat crippled
>>> as serial-only.
>> I didn't even notice that since my use case for hdf5 is light-weight.
>> (training data is fully loaded from hdf5 into memory)
> Same here. My use case for h5py is for storing medical images and raw
> data, all of which usually fit into a single workstation.
>
>>> We could just do it straight away in python3-h5py.  Is there much
>>> point having h5py support both hdf5-serial and hdf5-mpi?  Perhaps
>>> there is, in which case we need to set up multiple builds and use
>>> alternatives to set the preferred h5py.
>> In fact I don't know. Maybe @ghisvail could answer?
> I can't answer this question since I have never used the parallel
> builds of HDF5 and h5py.
>
> Are we really sure alternatives are appropriate for this particular use case?
>
> Python has got other means for injecting alternative dependencies such
> as PYTHONPATH and virtualenvs.
>
>>> A related question, is there much point setting up support for
>>> hdf5-mpich as well as hdf5-openmpi?  Increasing build and
>>> package-alternatives complexity, but once it's done once to
>>> distinguish hdf5-serial from hdf5-mpi, it's not that much more work to
>>> also split hdf5-mpi between hdf5-mpich and hdf5-openmpi.
>> My personal opinion is to just choose a reasonable default,
>> unless users shouted for that.
> Same here.
>
> We can't catter to every use case in the scientific community, so the
> best we can do is choose something sensible with the data point we
> have got (if any) and later reconsider with users feedback.
>
>> Compiling every possible configuration will eventually make
>> the science team maintenance burden notorious. h5py is not
>> like BLAS64/BLAS-flavours which are clearly needed by some
>> portion of scientific users.
> There is also the question of long-term maintainability. For HPC
> builds, people will build their stack from source anyway for maximum
> performance on their dedicated hardware. That was the case back when I
> used to work for a university. I don't think targeting these users is
> worth the trouble compared to research staff who want to prototype or
> deploy something quickly on their respective workstation or laptop
> where resources are more constrained. That's the background I am
> coming from personally, hence why MPI never was considered at the
> time.
>
> Your mileage may vary of course, and I welcome (and value) your opinions.
>
> Please let me know.

There are a few data formats in bioinformatics now depending on hdf5, and
h5py is used a lot. My main concern is that the user should not need to
configure anything, like a set of hostnames. And nothing should stall
while waiting to contact a server. MPI needs to be completely
transparent, and then I would very much like to see it.

For packaging I would prefer it all to be as simple as possible, so not
dragging in MPI would be nice, i.e. I would like to see a -serial
package that provides hdf5. As long as the two different flavours of MPI
cannot be used in mixed setups, I suggest having hdf5-openmpi and also
hdf5-mpich, if you still have the energy left.

How do autotests work for MPI?

Cheers,

Steffen


Re: h5py and hdf5-mpi

Drew Parsons
In reply to this post by Drew Parsons
On 2019-08-13 00:15, Ghislain Vaillant wrote:

> Le lun. 12 août 2019 à 17:04, Mo Zhou <[hidden email]> a écrit :
>>
>>
>> On 2019-08-12 03:10, Drew Parsons wrote:
>> > We need to change h5py to support hdf5-mpi.  h5py is somewhat crippled
>> > as serial-only.
>>
>> I didn't even notice that since my use case for hdf5 is light-weight.
>> (training data is fully loaded from hdf5 into memory)
>
> Same here. My use case for h5py is for storing medical images and raw
> data, all of which usually fit into a single workstation.

Reasonable to keep the hdf5-serial version then.

It sounds like your use case is post-processing of data. The use case I
have in mind is using h5py during computation, e.g. supplementing
FEniCS jobs (FEniCS itself uses hdf5-mpi in its C++ backend library),
for instance in cluster calculations or cloud computing.


>> > We could just do it straight away in python3-h5py.  Is there much
>> > point having h5py support both hdf5-serial and hdf5-mpi?  Perhaps
>> > there is, in which case we need to set up multiple builds and use
>> > alternatives to set the preferred h5py.
...
> Are we really sure alternatives are appropriate for this particular use
> case?
>
> Python has got other means for injecting alternative dependencies such
> as PYTHONPATH and virtualenvs.

PYTHONPATH is a solution for individual users to override the system
default, but it's policy not to rely on env variables for the system
installation.  I'm not familiar with virtualenvs; I gather it's also a
user mechanism to override the default configuration.

So the question is which h5py (which hdf5) 'import h5py' should work
with by default.  For cloud computing installations, hdf5-mpi makes
sense. Even workstations are mostly multi-CPU these days.

It's not so hard to set up alternatives links to point
/usr/lib/python3/dist-packages/h5py at h5py-serial or h5py-mpi.  I've
done it for the real and complex variants of petsc4py.  For additional
entertainment, h5py-serial and h5py-mpi could be installed alongside each
other in the normal python modules directory (as importable names they
would need underscores, e.g. h5py_serial and h5py_mpi), so one could
consider importing them directly. I think that operates with the effect
of 'import h5py_serial as h5py', but it might not be robust. A robust
approach would place the h5py-serial and h5py-mpi directories elsewhere,
where a user's PYTHONPATH could specify them independently of the
default.
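
For illustration, the alternatives mechanism for that would look
something like this (a sketch only; the h5py-serial and h5py-mpi paths
are hypothetical names for the two builds, and the priorities are
arbitrary):

   # register the two builds as alternatives for the h5py module path
   update-alternatives --install /usr/lib/python3/dist-packages/h5py h5py \
       /usr/lib/python3/dist-packages/h5py-serial 50
   update-alternatives --install /usr/lib/python3/dist-packages/h5py h5py \
       /usr/lib/python3/dist-packages/h5py-mpi 60
   # the admin can then switch the default explicitly
   update-alternatives --config h5py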


>> > A related question, is there much point setting up support for
>> > hdf5-mpich as well as hdf5-openmpi?  Increasing build and
>> > package-alternatives complexity, but once it's done once to
>> > distinguish hdf5-serial from hdf5-mpi, it's not that much more work to
>> > also split hdf5-mpi between hdf5-mpich and hdf5-openmpi.
>>
>> My personal opinion is to just choose a reasonable default,
>> unless users shouted for that.
>
> Same here.
>
> We can't catter to every use case in the scientific community, so the
> best we can do is choose something sensible with the data point we
> have got (if any) and later reconsider with users feedback.

True, supporting the alternative MPI is not our highest priority,
though I often find our upstream developers cursing at openmpi (they do
that every two months or so in different upstream projects).  We can
consider mpich a "wishlist" issue.  As you point out, it takes more
resources to support, and our time is limited.

Drew


Re: h5py and hdf5-mpi

Drew Parsons-3
In reply to this post by Steffen Möller
On 2019-08-13 03:51, Steffen Möller wrote:

> Hello,
>
>
> There are a few data formats in bioinformatics now depending on hdf5
> and
> h5py is used a lot. My main concern is that the user should not need to
> configure anything, like a set of hostnames. And there should not be
> anything stalling since it waiting for contacting a server. MPI needs
> to
> be completely transparent and then I would very much like to see it.

MPI is generally good that way.  The program runs directly as a simple
serial program if you run it on its own, so in that sense it should be
transparent to the user (i.e. you won't know it's MPI-enabled unless you
know to look for it).  A multi-CPU job is launched by running the
program with mpirun (or mpiexec).

e.g. in the context of python and h5py, if you run
   python3 -c 'import h5py'
then the job runs as a serial job, regardless of whether h5py is built
for hdf5-serial or hdf5-mpi.

If you want to run on 4 cpus, you launch the same program with
   mpirun -n 4 python3 -c 'import h5py'

Then if h5py is available with hdf5-mpi, it handles hdf5 as a
multiprocessor job.  If h5py here is built with hdf5-serial, then it
runs the same serial job 4 times at the same time.

To reiterate, having h5py-mpi available will be transparent to a user
interacting with hdf5 as a serial library. It doesn't break serial use,
it just provides the capability to also run multi-CPU jobs.
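
As a concrete sketch of what the MPI build enables, the h5py
documentation shows parallel access through the 'mpio' driver together
with mpi4py, roughly along these lines (the file name here is
arbitrary):

   # sketch: each MPI rank writes its own element of a shared dataset
   # run with e.g.: mpirun -n 4 python3 parallel_write.py
   from mpi4py import MPI
   import h5py

   comm = MPI.COMM_WORLD
   # the mpio driver is only available when h5py is built against hdf5-mpi
   with h5py.File('parallel_test.h5', 'w', driver='mpio', comm=comm) as f:
       dset = f.create_dataset('ranks', (comm.size,), dtype='i')
       dset[comm.rank] = comm.rank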


> How do autotests work for MPI?

We simply configure the test script to invoke the same tests using
mpirun.
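
As a sketch, the debian/tests script then amounts to little more than:

   #!/bin/sh
   # run the upstream test suite under mpirun (sketch; rank count is arbitrary)
   set -e
   mpirun -n 2 python3 -c "import h5py.tests; h5py.tests.run_tests(verbosity=2)"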

Drew


Re: h5py and hdf5-mpi

Alastair McKinstry-3

On 13/08/2019 05:01, Drew Parsons wrote:

> On 2019-08-13 03:51, Steffen Möller wrote:
>> Hello,
>>
>>
>> There are a few data formats in bioinformatics now depending on hdf5 and
>> h5py is used a lot. My main concern is that the user should not need to
>> configure anything, like a set of hostnames. And there should not be
>> anything stalling since it waiting for contacting a server. MPI needs to
>> be completely transparent and then I would very much like to see it.
>
> MPI is generally good that way.  The programs runs directly as a
> simple serial program if you run it on its own, so in that sense it
> should be transparent to the user (i.e. you won't know its mpi-enabled
> unless you know to look for it).  A multicpu job is launched via
> running the program with mpirun (or mpiexec).
>
> e.g. in the context of python and h5py, if you run
>   python3 -c 'import h5py'
> then the job runs as a serial job, regardless of whether h5py is built
> for hdf5-serial or hdf5-mpi.
>
> If you want to run on 4 cpus, you launch the same program with
>   mpirun -n 4 python3 -c 'import h5py'
>
> Then if h5py is available with hdf5-mpi, it handles hdf5 as a
> multiprocessor job.  If h5py here is built with hdf5-serial, then it
> runs the same serial job 4 times at the same time.
>
> To reiterate, having h5py-mpi available will be transparent to a user
> interacting with hdf as a serial library. It doesn't break serial use,
> it just provides the capability to also run multicpu jobs.
>
I'd go with this policy in general: codes available as both serial and
MPI should probably be shipped MPI by default.

The main reasons not to do so are normally "it drags in MPI" and "it's
painful to build", but those are arguments against an end user having to
build all the software; the advantage of Debian is that the stack is
available for free :-).  Typically space for the MPI libraries is not an
issue.

At the moment the main exception is NetCDF: serial and parallel NetCDF
have orthogonal features. The MPI version provides parallelism, but only
the serial version provides compression with I/O (because I/O writes
happen on byte ranges via POSIX). This is changing though (not sure of
the timetable); in the future a parallel version with full features is
expected.

>
>> How do autotests work for MPI?
>
> We simply configure the test script to invoke the same tests using
> mpirun.
>
This is a bigger issue.  We have test suites that test MPI features
without checking MPI processor counts (e.g. the Magics/Metview code). One
workaround is to enable oversubscription so the tests can run
(inefficiently), though suites that use MPI should really detect such
cases and disable those tests if the resources are not found. We will
always have features in our codes that our build/test systems aren't
capable of testing: e.g. PMIx is designed to scale to > 100,000 cores. We
can't test that :-)
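
For OpenMPI, the oversubscription workaround can be applied on the
mpirun command line or through an MCA parameter, e.g. (illustrative
only):

   mpirun --oversubscribe -n 4 python3 -c 'import h5py'
   # or, equivalently for OpenMPI:
   OMPI_MCA_rmaps_base_oversubscribe=true mpirun -n 4 python3 -c 'import h5py'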
> Drew
>
Alastair


--
Alastair McKinstry, <[hidden email]>, <[hidden email]>, https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.


Re: h5py and hdf5-mpi

Drew Parsons
On 2019-08-13 18:47, Alastair McKinstry wrote:
>>
> I'd go with this policy in general:  codes available as both serial
> and mpi should probably be shipped mpi by default.

This is certainly the simplest approach.  It's a two-line edit:
change Build-Depends: libhdf5-dev to libhdf5-mpi-dev,
and add --mpi to the h5py configure step.

In principle nothing else needs to change.
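
Concretely, it would amount to something like this (a sketch only, not
the actual packaging diff; the exact debian/rules hook depends on how
the build is wired up):

   # debian/control (sketch): Build-Depends: ..., libhdf5-mpi-dev, python3-mpi4py
   # debian/rules (sketch): enable the MPI build before compiling
   # (recipe lines must be tab-indented)
   override_dh_auto_configure:
           python3 setup.py configure --mpi
           dh_auto_configure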

I suggest we try this first and monitor how it performs for our serial
users.  If it proves to be causing problems then we can proceed with the
alternatives option of providing both h5py-serial and h5py-mpi (with or
without mpich).

Shall we go ahead with Alastair's "minimal" change now, or should we
discuss further?

Drew


Re: h5py and hdf5-mpi

Mo Zhou
On 2019-08-14 03:59, Drew Parsons wrote:
> Shall we go ahead with Alastair's "minimal" change now, or should we
> discuss further?

If the MPI build can correctly work for the serial use case,
I vote for the "minimal" change.

I roughly went through the h5py documentation and found
nothing special about the --mpi build.


Re: h5py and hdf5-mpi

Steffen Möller
In reply to this post by Drew Parsons-3

On 13.08.19 06:01, Drew Parsons wrote:

> On 2019-08-13 03:51, Steffen Möller wrote:
>> Hello,
>>
>>
>> There are a few data formats in bioinformatics now depending on hdf5 and
>> h5py is used a lot. My main concern is that the user should not need to
>> configure anything, like a set of hostnames. And there should not be
>> anything stalling since it waiting for contacting a server. MPI needs to
>> be completely transparent and then I would very much like to see it.
>
> MPI is generally good that way.  The programs runs directly as a
> simple serial program if you run it on its own, so in that sense it
> should be transparent to the user (i.e. you won't know its mpi-enabled
> unless you know to look for it).  A multicpu job is launched via
> running the program with mpirun (or mpiexec).
>
> e.g. in the context of python and h5py, if you run
>   python3 -c 'import h5py'
> then the job runs as a serial job, regardless of whether h5py is built
> for hdf5-serial or hdf5-mpi.
>
> If you want to run on 4 cpus, you launch the same program with
>   mpirun -n 4 python3 -c 'import h5py'
>
> Then if h5py is available with hdf5-mpi, it handles hdf5 as a
> multiprocessor job.  If h5py here is built with hdf5-serial, then it
> runs the same serial job 4 times at the same time.
>
> To reiterate, having h5py-mpi available will be transparent to a user
> interacting with hdf as a serial library. It doesn't break serial use,
> it just provides the capability to also run multicpu jobs.


This sounds like an omission, not a feature, then. Please go for it.


>> How do autotests work for MPI?
> We simply configure the test script to invoke the same tests using
> mpirun.

I am somewhat uncertain that Debian needs to be the instance testing
this. But given all the hiccups that can be introduced by
parallelization, it would be good to test it. And Debian should then take
some pride in it and announce that.

Does Debian have any mechanism to indicate that a piece of software can
run in parallel? I am thinking of all the automation that now controls
workflows, like toil and/or cwl, or the testing of reverse dependencies
on some buildd. These can check for the presence of a binary but don't
immediately know whether they should start it with mpirun.

Best,

Steffen



Testing parallel execution Re: h5py and hdf5-mpi

Steffen Möller
In reply to this post by Alastair McKinstry-3

>>
>>> How do autotests work for MPI?
>>
>> We simply configure the test script to invoke the same tests using
>> mpirun.
>>
> This is a bigger issue.  We have test suites that test MPI features
> without checking MPI processor counts (eg the Magics /Metview code).
> One workaround is to enable oversubscribe to allow the test to work
> (inefficiently), though the suites that use MPI should really detect
> and disable such tests if resources are not found. We will always have
> features in our codes that our build/test systems aren't capable of
> testing: eg. pmix is designed to work scalably to > 100,000 cores. We
> can't test that :-)

Maybe the testing for many cores does not need to happen at upload time.
And maybe the testing of behaviour in parallel environments does not need
to be performed on all platforms, just one. There could then be a
service Debian provides, analogously to reproducible builds etc., that
performs testing in parallel environments. The unknown limit on
available cores is something the users of
better-than-what-Debian-decides-to-afford infrastructure can address
themselves. The uploader of a package, or the build daemons, would just
invoke the parallel run on a single node. Personally, I would like to see
multiple tests, say consecutively on 1, 2, 4, 8, 16, 32, 64, 128, 256
nodes, stopping when there is no more speedup. How many packages would
reach beyond 32?
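
That kind of sweep could be as simple as the following sketch
(bench_h5py.py is a hypothetical benchmark script):

   # run the same benchmark at increasing rank counts and compare wall times;
   # stop once the timing no longer improves
   for n in 1 2 4 8 16 32; do
       echo "== $n ranks =="
       /usr/bin/time -f "%e s" mpirun -n "$n" python3 bench_h5py.py
   done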

There are quite a few packages in our distro that are multithreaded, i.e.
that don't need MPI. Today we don't test their performance in parallel
either. But we should; we just don't have a systematic way to do so yet.
I could also imagine that such testing in parallel environments would
help glue our distro to upstream developers a bit more. Maybe this is
something to discuss together with the cloud team, who know how to spawn
an arbitrary number of nodes quickly? And maybe reach out to
phoronix.com and/or their openbenchmarking.org?

Steffen


Re: h5py and hdf5-mpi

Drew Parsons-3
In reply to this post by Steffen Möller
On 2019-08-14 18:05, Steffen Möller wrote:
> On 13.08.19 06:01, Drew Parsons wrote:
>>
>> To reiterate, having h5py-mpi available will be transparent to a user
>> interacting with hdf as a serial library. It doesn't break serial use,
>> it just provides the capability to also run multicpu jobs.
>
>
> This sounds like an omission not to feature, then. Please go for it.
>

h5py 2.9.0-3 will migrate to testing in a day or two, we can proceed
with the mpi then.


>>> How do autotests work for MPI?
>> We simply configure the test script to invoke the same tests using
>> mpirun.
>
> I am somewhat uncertain that Debian needs to be the instance testing
> this. But given all the hick-ups that are possibly introduced by
> parallelization - would be good to test it. And Debian should then take
> some pride in it and announce that.

Once we've got MPI activated in h5py, we can check whether the
parallelisation does in fact improve your own workflow. Most laptops and
desktops come with at least 4 CPUs these days, even mobile phones.  Do
you deal with GB-size hdf5 datasets, data for which access time is
noticeable?  Ideally your data handling will speed up according to the
number of CPUs added.

I don't think switching on mpi in h5py is itself such a big deal.  But
if we can demonstrate that it measurably improves performance for a real
workflow, then that is worth crowing about.

> Does Debian have any mechanisms to indicate that a software can run in
> parallel? I am thinking about all the automation that now controls
> workflows - like toil and/or cwl - or the testing of reverse
> dependencies on some buildd. These can check for the presence for a
> binary but don't immediately know if they should start it with mpirun.

No specific mechanism, since normally we know whether the program is
intended to be MPI-enabled or not.

But at the level of the package, we can look at dependencies, e.g.
   apt-cache depends python3-scipy | grep mpi
   apt-cache depends python3-dolfin | grep mpi

At the level of a given library or executable, objdump can be helpful,
e.g.
   objdump -p /usr/lib/x86_64-linux-gnu/libsuperlu.so | grep mpi
   objdump -p /usr/lib/x86_64-linux-gnu/libsuperlu_dist.so | grep mpi

For autopkgtest, it's our own tests so we already know if the program is
compiled with mpi or not. It wouldn't really make sense for the scripts
in debian/tests to check whether the program being tested was compiled
with mpi.

Drew


Re: h5py and hdf5-mpi

James Tocknell-2
The main difference between h5py built against a serial version of
HDF5 and the MPI version is that h5py built against MPI HDF5 can use
the MPI-specific tooling (such as collective IO). You can use a
serial h5py with MPI and there are no problems other than the constraints
that MPI imposes on any IO, so the big difference would be writing to
files, not reading them. I'd suggest switching on the MPI
support and checking back in a year to see if there have been any
major issues (the MPI tooling is currently not tested that well,
so there may be bugs), and making a longer-term call then.
Last time I looked at the state of h5py and HDF5 across the different
distros, most only packaged one version of HDF5 (either serial or
MPI), and RedHat (and derivatives) used their module system (which handled
moving between different backends, and between different installs of
h5py). I'd keep one version of h5py in the archive and choose an HDF5
to build against, rather than try playing with alternatives or
anything.
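
For reference, the collective IO James mentions looks roughly like this
in h5py when it is built against an MPI-enabled HDF5 (a sketch; the file
and dataset names are arbitrary):

   # sketch: collective write, only meaningful with an MPI build of h5py
   from mpi4py import MPI
   import h5py

   comm = MPI.COMM_WORLD
   with h5py.File('data.h5', 'w', driver='mpio', comm=comm) as f:
       dset = f.create_dataset('x', (comm.size * 10,), dtype='f')
       with dset.collective:  # all ranks participate in one collective write
           dset[comm.rank * 10:(comm.rank + 1) * 10] = comm.rank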

James (one of the h5py devs)

On Wed, 14 Aug 2019 at 20:47, Drew Parsons <[hidden email]> wrote:

>
> On 2019-08-14 18:05, Steffen Möller wrote:
> > On 13.08.19 06:01, Drew Parsons wrote:
> >>
> >> To reiterate, having h5py-mpi available will be transparent to a user
> >> interacting with hdf as a serial library. It doesn't break serial use,
> >> it just provides the capability to also run multicpu jobs.
> >
> >
> > This sounds like an omission not to feature, then. Please go for it.
> >
>
> h5py 2.9.0-3 will migrate to testing in a day or two, we can proceed
> with the mpi then.
>
>
> >>> How do autotests work for MPI?
> >> We simply configure the test script to invoke the same tests using
> >> mpirun.
> >
> > I am somewhat uncertain that Debian needs to be the instance testing
> > this. But given all the hick-ups that are possibly introduced by
> > parallelization - would be good to test it. And Debian should then take
> > some pride in it and announce that.
>
> Once we've got mpi activated in h5py, we can check whether the
> parallelisation does in fact improve your own workflow. Even on a laptop
> or desktop, most come with at least 4 cpus these days. Even mobile
> phones.  Do you deal with GB-size hdf5 datasets, data for which access
> time is noticeable?  Ideally your data handling will speed up according
> to the number of cpus added.
>
> I don't think switching on mpi in h5py is itself such a big deal.  But
> if we can demonstrate that it measurably improves performance for a real
> workflow, then that is worth crowing about.
>
> > Does Debian have any mechanisms to indicate that a software can run in
> > parallel? I am thinking about all the automation that now controls
> > workflows - like toil and/or cwl - or the testing of reverse
> > dependencies on some buildd. These can check for the presence for a
> > binary but don't immediately know if they should start it with mpirun.
>
> No specific mechanism, since normally we known if the program is
> intended to be mpi enabled or not.
>
> But at the level of the package, we can look at dependencies, e.g.
>    apt-cache depends python3-scipy | grep mpi
>    apt-cache depends python3-dolfin | grep mpi
>
> At the level of a given library or executable, objdump can be helpful,
> e.g.
>    objdump -p /usr/lib/x86_64-linux-gnu/libsuperlu.so | grep mpi
>    objdump -p /usr/lib/x86_64-linux-gnu/libsuperlu_dist.so | grep mpi
>
> For autopkgtest, it's our own tests so we already know if the program is
> compiled with mpi or not. It wouldn't really make sense for the scripts
> in debian/tests to check whether the program being tested was compiled
> with mpi.
>
> Drew
>


--
Don't send me files in proprietary formats (.doc(x), .xls, .ppt etc.).
It isn't good enough for Tim Berners-Lee, and it isn't good enough for
me either. For more information visit
http://www.gnu.org/philosophy/no-word-attachments.html.

Truly great madness cannot be achieved without significant intelligence.
 - Henrik Tikkanen

If you're not messing with your sanity, you're not having fun.
 - James Tocknell

In theory, there is no difference between theory and practice; In
practice, there is.


Re: h5py and hdf5-mpi

Drew Parsons-3
On 2019-08-14 20:46, James Tocknell wrote:

> The main difference between h5py build against a serial version of
> HDF5 vs the mpi version is that h5py built against mpi HDF5 can use
> the mpi-specific tooling (such as collective IO) - you can use a
> serial h5py with MPI and there's no problems other than constraints
> that MPI imposes on any IO, so the big difference would be writing to
> files not reading reading them. I'd suggest switching on the MPI
> support and checking back in a year to see if there have been any
> major issues (the MPI tooling is currently not tested that well
> currently, so there may be bugs), and making a longer term call then.
> Last time I looked at the state of h5py and HDF5 across the different
> distros, most only packaged one version of HDF5 (either serial or
> mpi), and RedHat (and derivs) used their module system (which handled
> moving between different backends, and with different installs of
> h5py). I'd keep one version of h5py in the archive, and chose a HDF5
> to build against, rather than try playing with alternatives or
> anything.
>
> James (one of the h5py devs)


Thanks James.  Sounds like just building against libhdf5-mpi is the
thing to do.  Hopefully no dire bugs.

Drew


Re: h5py and hdf5-mpi

Drew Parsons-3
On 2019-08-15 15:02, Drew Parsons wrote:

> On 2019-08-14 20:46, James Tocknell wrote:
>> The main difference between h5py build against a serial version of
>> HDF5 vs the mpi version is that h5py built against mpi HDF5 can use
>> the mpi-specific tooling (such as collective IO) - you can use a
>> serial h5py with MPI and there's no problems other than constraints
>> that MPI imposes on any IO, so the big difference would be writing to
>> files not reading reading them. I'd suggest switching on the MPI
>> support and checking back in a year to see if there have been any
>> major issues (the MPI tooling is currently not tested that well
>> currently, so there may be bugs), and making a longer term call then.
>> Last time I looked at the state of h5py and HDF5 across the different
>> distros, most only packaged one version of HDF5 (either serial or
>> mpi), and RedHat (and derivs) used their module system (which handled
>> moving between different backends, and with different installs of
>> h5py). I'd keep one version of h5py in the archive, and chose a HDF5
>> to build against, rather than try playing with alternatives or
>> anything.
>>
>> James (one of the h5py devs)
>
>
> Thanks James.  Sounds like just building against libhdf5-mpi is the
> the thing to do.  Hopefully no dire bugs.


I've uploaded h5py 2.9.0-5 to unstable.

There was an error in tests with mpi:
$ mpirun -n 2 python3 -c "import h5py.tests;
h5py.tests.run_tests(verbosity=2)"
...
test_close_multiple_default_driver (h5py.tests.old.test_file.TestClose)
... ok
test_close_multiple_mpio_driver (h5py.tests.old.test_file.TestClose)
MPIO driver and options ... mca_fbtl_posix_pwritev: error in writev:Bad
file descriptor


It happens because mktemp() is used to generate a test file name, but
the name is not shared across processes. So a different name was given to
the h5py File object on each process, which confuses the File when it's
set up for mpio.
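
The general shape of the fix is to create the name on one rank and share
it with the others, e.g. (a sketch of the idea, not the actual upstream
patch):

   # sketch: generate the temporary file name on rank 0 and broadcast it,
   # so every process opens the same file with the mpio driver
   from mpi4py import MPI
   import tempfile

   comm = MPI.COMM_WORLD
   fname = tempfile.mktemp(suffix='.h5') if comm.rank == 0 else None
   fname = comm.bcast(fname, root=0)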

I reported with a patch at https://github.com/h5py/h5py/issues/1285

Drew


RE: h5py and hdf5-mpi

PICCA Frederic-Emmanuel
Hello,

Since this upload, two of my packages have autopkgtest regressions (pyfai and silx).

It seems that the python2.7-dbg tests failed with:

  File "/usr/lib/python2.7/dist-packages/silx/io/__init__.py", line 37, in <module>
    from .utils import open  # pylint:disable=redefined-builtin
  File "/usr/lib/python2.7/dist-packages/silx/io/utils.py", line 44, in <module>
    import h5py
  File "/usr/lib/python2.7/dist-packages/h5py/__init__.py", line 26, in <module>
    from . import _errors
ImportError: cannot import name _errors

Did you change something in the debug package?


I also looked at your autopkgtest; you deactivated the python3-dbg tests.
Since my packages also test the -dbg implementation, it is an issue not to test the one from h5py.
We also should not hide problems from our users.

From my experience, even if these packages are not used a lot, they can trigger real bugs.
So testing them improves the overall quality of the software.

Can you explain what the issue is with python3-dbg?

I got bitten by a numpy bug myself, only visible from the -dbg python interpreter, like this one:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=933056


Cheers


Frederic

RE: h5py and hdf5-mpi

PICCA Frederic-Emmanuel
It seems that mpi4py.MPI is not available in the -dbg packages.
 

======================================================================
ERROR: testOverwrite (XRFBatchFitOutputTest.testXRFBatchFitOutput)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/PyMca5/tests/XRFBatchFitOutputTest.py", line 77, in testOverwrite
    outbuffer = self._initOutBuffer(**self.saveall)
  File "/usr/lib/python3/dist-packages/PyMca5/tests/XRFBatchFitOutputTest.py", line 205, in _initOutBuffer
    from PyMca5.PyMcaPhysics.xrf.XRFBatchFitOutput import OutputBuffer
  File "/usr/lib/python3/dist-packages/PyMca5/PyMcaPhysics/xrf/XRFBatchFitOutput.py", line 47, in <module>
    from PyMca5.PyMcaIO import NexusUtils
  File "/usr/lib/python3/dist-packages/PyMca5/PyMcaIO/NexusUtils.py", line 34, in <module>
    import h5py
  File "/usr/lib/python3/dist-packages/h5py/__init__.py", line 26, in <module>
    from . import _errors
  File "MPI.pxd", line 28, in init h5py._errors
ModuleNotFoundError: No module named 'mpi4py.MPI'


RE: h5py and hdf5-mpi

PICCA Frederic-Emmanuel
Ok, I found your work on mpi4py, thanks a lot.


I have just one question: you have Suggests: python-numpy-dbg.
I am wondering if this should not be a Depends.
If the code imports numpy, you need to have python-numpy-dbg installed.


Re: h5py and hdf5-mpi

Drew Parsons
On 2019-08-21 00:06, PICCA Frederic-Emmanuel wrote:
> Ok, I found your work on mpi4py, thanks a lot.

That's right. I've hidden the dbg tests for h5py temporarily while
waiting for the new python3-mpi4py-dbg packages to get processed.
Processing can often be swift if it's an adjustment to an existing
package.  The h5py dbg tests work with it; hopefully it will restore your
packages to good order too, otherwise we'll need to dig deeper.


>
> I have just one interrogation, you Suggests: python-numpy-dbg.
> I am wondering if this should not be a Depends.
> if the code import numpy, ou need to have pyton-numpy-dbg installed.

You mean mpi4py here, right, not h5py?  That may well be the case.  I was
hoping to make h5py Recommends: python3-mpi4py, but had to make it
Depends: for that kind of reason. I'll test the mpi4py packages with
numpy deinstalled to check. I'll wait till the new package is accepted;
it will need a source-only upload anyway to migrate to testing.
Drew


Re: h5py and hdf5-mpi

Drew Parsons
On 2019-08-21 03:10, Drew Parsons wrote:
> On 2019-08-21 00:06, PICCA Frederic-Emmanuel wrote:
>> Ok, I found your work on mpi4py, thanks a lot.
>
> That's right. I've hidden the dbg tests for h5py temporarily waiting
> for the new python3-mpi4py-dbg packages to get processed.  Often they
> can be swift processing if it's an adjustment to an existing package.
> h5py dbg tests work with it, hopefully it will restore your packages
> back to good order too otherwise we'll need to dig deeper.

That said, it looks like the pyfai failure is not due to the dbg build
of mpi4py. silx yes, but pyfai has a different error. Are you able to
double check pyfai tests manually?

xmds2 also has complaints that might not be resolved by mpi4py-dbg.
cc:ing Rafael. Rafael, I've built h5py 2.9.0-6 with MPI support. It
doesn't work with python3-dbg since that needs mpi4py built with -dbg
(it's in the NEW queue). But the xmds2 tests seem to have a different
complaint not related to python-dbg.

Drew