Aborting all plan about deep learning frameworks.

Aborting all plan about deep learning frameworks.

Mo Zhou
Hi Science Team,

Sorry for the depressing mail subject. Some of you may remember my
past endeavors to introduce deep learning software such as
TensorFlow to Debian. But I cannot bear maintaining them anymore.

Indeed my research field is computer vision and deep learning,
and indeed I'm still running PyTorch for experiments as I write
this email. But all these experiments are powered by tons of
non-free software and data, including pre-trained neural networks
and performance library blobs. In fact, if someone asks me how
to install TensorFlow/PyTorch on Debian, my answer is definitely
either "use Anaconda" or "use the upstream wheels", because those
don't lack performance or commercial support. In contrast, my
motivation to package them was really to "have fun" and
"learn something". Now I'm tired of packaging such rapidly
developing software, and want to offload that burden from myself.
I don't regret aborting the whole plan.

Anyway, my interest in Debian-related work keeps changing.
This email doesn't mean that I'm fading out from this team, but
that I'm moving on to work on some more useful and valuable
packages. This mail is a little bit sad, but I feel relieved by
throwing them away.

P.S. The deeper I dig into Debian's bugs, the more buggy
high-popcon packages emerge... Obviously fixing important
stuff is more valuable than introducing new toys.

Happy hacking!

Re: Aborting all plan about deep learning frameworks.

Stephen Sinclair
It's true that this field is rapidly moving these days and it's hard
to keep up with upstream releases.  My interest in taking the Keras
and Lasagne packages was mainly to help provide stable and well-tested
targets for these APIs, as well as to learn about packaging.  Two
things that surprised me shortly after I adopted them were Theano
going EOL and Keras getting folded into TensorFlow upstream.  Meanwhile,
you are right that the vast majority of users will always install via
pip, in order to have the latest version.  However, it also means that
Theano will no longer be *changing*, and as long as Keras still passes
tests using it as a backend I see no reason to remove it.  In fact the
Keras Debian package can provide a nice way to install and use Keras
without a TensorFlow dependency which can be a great thing for
developers who want to integrate neural networks into their software
in a stable manner.
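
For reference, the backend switch described here is just a
configuration knob in multi-backend Keras (the versions this thread
is about): the backend is read from ~/.keras/keras.json or the
KERAS_BACKEND environment variable. A minimal sketch, assuming such
a multi-backend Keras and Theano are installed:

```python
# Multi-backend Keras picks its compute engine at import time from the
# KERAS_BACKEND environment variable (falling back to ~/.keras/keras.json),
# so it can run on Theano with no TensorFlow installed at all.
import os

# Must be set before `import keras`, or it has no effect.
os.environ["KERAS_BACKEND"] = "theano"   # instead of the default "tensorflow"

# import keras   # would now report: Using Theano backend.
```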

In that sense I personally think that the goal of packaging DL
software in Debian should focus around stability.  While researchers
will most often use the latest and greatest releases and features,
the speed at which releases arrive also discourages building
software on top of these engines: a moving-target API is hard to
build against.  So I think there is in fact some benefit that can be
derived from providing packages of LTS versions, and eventually to
package software which uses them and can be relied on as "available"
via apt-get and not represent a moving target.  When engines like
TensorFlow get updates, the Debian ecosystem can provide a great way
to automatically check that dependent software is not broken, *before*
updates get deployed to end users, and I think this can be a huge
benefit provided by this community.

We will undoubtedly start to see more software coming with built-in
ML-powered parts in the next years and it would be great to be able to
provide a stable and slower-moving platform in which these can be
developed and maintained.


Steve


On Sun, Nov 4, 2018 at 5:00 PM Mo Zhou <[hidden email]> wrote:

> [...]

Re: Aborting all plan about deep learning frameworks.

Mo Zhou
On Tue, Nov 06, 2018 at 12:18:48PM +0100, Stephen Sinclair wrote:

> It's true that this field is rapidly moving these days and it's hard
> to keep up with upstream releases.  My interest in taking the Keras
> and Lasagne packages was mainly to help provide stable and well-tested
> targets for these APIs, as well as to learn about packaging.  Two
> things that surprised me shortly after I adopted them were Theano
> going EOL and Keras getting folded into TensorFlow upstream.  Meanwhile,
> you are right that the vast majority of users will always install via
> pip, in order to have the latest version.  However, it also means that
> Theano will no longer be *changing*, and as long as Keras still passes
> tests using it as a backend I see no reason to remove it.  In fact the
> Keras Debian package can provide a nice way to install and use Keras
> without a TensorFlow dependency which can be a great thing for
> developers who want to integrate neural networks into their software
> in a stable manner.

The status of Caffe is similar to Theano's -- it's very stable,
nearly frozen. Stable code such as Theano and Caffe is easy to
deal with. But old frameworks are really old: they lack easy-to-use
automatic differentiation and dynamic computation graphs.
TensorFlow sucked before it started to support dynamic graphs.
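
To make concrete what the old frameworks are missing: reverse-mode
automatic differentiation over a graph that is built while ordinary
code runs. The toy Var class below is purely illustrative (it is not
any framework's API), but it shows the mechanism in a few lines:

```python
# Toy reverse-mode automatic differentiation over a dynamically built
# graph -- the facility newer frameworks provide and the frozen ones
# largely lack.  Hypothetical minimal API, scalars only.

class Var:
    def __init__(self, value, parents=()):
        self.value = value        # forward value
        self.parents = parents    # (parent Var, local gradient) pairs
        self.grad = 0.0           # accumulated d(output)/d(self)

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self, seed=1.0):
        # Chain rule, propagated backwards through the graph that the
        # forward pass built on the fly.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x = Var(3.0)
y = Var(4.0)
z = x * y + x      # graph is defined by running ordinary Python code
z.backward()
print(z.value, x.grad, y.grad)   # 15.0, dz/dx = y + 1 = 5.0, dz/dy = x = 3.0
```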

In Debian's context, our work on fundamental libraries such as
intel-mkl and mkl-dnn can benefit more people. I believe my
intel-mkl work benefits a large number of packages wherever there
is linear algebra. Such common libraries are what I'm presently
taking care of.
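
As an illustration of how broadly an optimized BLAS pays off
(assuming a NumPy built against the system BLAS, which is how Debian
ships it): every package whose linear algebra goes through NumPy ends
up in the same BLAS routines, so swapping in a faster implementation
speeds them all up with no code changes.

```python
# NumPy forwards matrix products to whatever BLAS it was linked against
# (reference BLAS, OpenBLAS, MKL, ...); applications never call the BLAS
# directly, so replacing the library is transparent to them.
import numpy as np

a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
c = a @ b            # dispatched to the BLAS gemm routine

print(c.shape)       # (2, 4)
np.show_config()     # shows which BLAS/LAPACK this NumPy build uses
```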

PyTorch/Caffe2 1.0 will be released soon, and I'm still interested
in giving it a try. If upstream has written a friendly build
system, I'm willing to maintain a CPU version. A CPU version of
PyTorch/Caffe2 makes full sense for Debian because of its
automatic differentiation functionality.
 

> In that sense I personally think that the goal of packaging DL
> software in Debian should focus around stability.  While researchers
> will most often use the latest and greatest releases and features,
> the speed at which releases arrive also discourages building
> software on top of these engines: a moving-target API is hard to
> build against.  So I think there is in fact some benefit that can be
> derived from providing packages of LTS versions, and eventually to
> package software which uses them and can be relied on as "available"
> via apt-get and not represent a moving target.  When engines like
> TensorFlow get updates, the Debian ecosystem can provide a great way
> to automatically check that dependent software is not broken, *before*
> updates get deployed to end users, and I think this can be a huge
> benefit provided by this community.

Most state-of-the-art frameworks, including TensorFlow and
PyTorch, are becoming more and more stable. Actually, the
TensorFlow 1.x series is effectively an LTS branch.

I'm not a fan of TF. My initial TensorFlow packaging was totally
for fun -- to see whether I could manually compile that stuff
without Bazel. My build system still needs some work to make the
whole source package useful. The internal shared object for Python
can already be compiled by my build system, but the Python
interface generation part is still buggy. TF is useless to me;
it's more like a CPU benchmark in terms of compiling C++.

As for applications... Several months ago I raised a discussion
on -devel about software freedom and deep learning applications.
There are still unsolved, important problems.

My attitude towards those applications is "wait and see".

> We will undoubtedly start to see more software coming with built-in
> ML-powered parts in the next years and it would be great to be able to
> provide a stable and slower-moving platform in which these can be
> developed and maintained.
 
I agree, but Debian will need more manpower for them. IIRC there
are only 4 people who care about this field, including you.