Copyright concerns regarding Seafile

Jan-Henrik Haukeland

We ask Debian to consider removing, and ceasing to distribute, the Seafile packages [1] due to copyright concerns.

Background:
-----------------

Seafile is an open-source Dropbox clone created by a team from China. Around 2013 they needed MySQL and PostgreSQL support and started using our open-source database connection pool library, libzdb [2].

In 2014 a push was made to include Seafile in Debian, and a discussion about copyright concerns in Seafile started on GitHub [3]. Libzdb played a role in this discussion, and one of the results was that Seafile removed its dependency on libzdb in 2016 and stated that “we completely replaced libzdb with our own code.” [4] Seafile has since been included in Debian [1].

Concern:
------------

We later discovered that the code that replaced libzdb is mostly a copy of libzdb's code and structures. This stands in contrast to the statement “we completely replaced libzdb with our own code.” [4]

Libzdb is licensed under GPLv3. Copying and modifying GPL code is perfectly fine as long as the original copyright notice and license are kept. Unfortunately, this is not what the Seafile team did. Instead they copied code from libzdb, removed the copyright notice, claimed the code as their own and relicensed it under another license.

Evidence:
-------------

To do a side-by-side comparison I’m going to use Seafile’s fork of libzdb on GitHub [6], taken at version 2.11.1, on which they based the new code that they claim as their own [5]. The comparison is against the same version in our repository on Bitbucket [7].

Libzdb is a database connection pool library which consists of 4 major components: ConnectionPool, Connection, ResultSet and PreparedStatement. Forward-declared data structures are used to abstract the concrete database implementation. These components, together with their method association, are quite unique for a _C_ database connection pool library as far as I know. The Seafile "rewrite" uses the exact same components and method association between components with modest renaming, mostly by going from camel case to snake case.
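
For readers unfamiliar with the technique, abstracting a concrete database behind a forward-declared type can be sketched roughly as below. This is a generic illustration only: every name in it (db_conn, db_ops, dummy_ping, db_conn_dummy, db_conn_ping) is hypothetical, and the code is taken from neither libzdb nor Seafile.

```c
#include <assert.h>
#include <stddef.h>

/* Callers only ever see this incomplete (forward-declared) type. */
typedef struct db_conn db_conn;

/* Table of backend operations; each database would supply its own. */
typedef struct {
    const char *name;
    int (*ping)(void *impl);
} db_ops;

/* The full definition would normally live in a private source file. */
struct db_conn {
    const db_ops *ops;   /* which backend handles this connection */
    void *impl;          /* backend-specific state, opaque here */
};

/* A stand-in backend so the sketch needs no real database. */
static int dummy_ping(void *impl) {
    (void)impl;
    return 1;
}

static const db_ops dummy_ops = { "dummy", dummy_ping };

/* Return a connection bound to the stand-in backend. */
db_conn *db_conn_dummy(void) {
    static db_conn conn = { &dummy_ops, NULL };
    return &conn;
}

/* Generic entry point: dispatch through the operations table. */
int db_conn_ping(db_conn *conn) {
    assert(conn != NULL);
    return conn->ops->ping(conn->impl);
}
```

The pattern itself is common in C; the comparison that follows concerns how closely the component and method layout match.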

I’m going to limit the comparison somewhat for brevity, but it should be enough to demonstrate the copyright concern. The full comparison can be done by comparing [5] and [7].


1. The Connection Pool has two significant methods:

- Get a connection from the pool

a:libzdb: https://bitbucket.org/tildeslash/libzdb/src/2958e023fcee44f313e6d3f3592b02cc06783e0f/src/db/ConnectionPool.c#lines-314

a:seafile: https://github.com/haiwen/seafile-server/blob/9f30eedc467bf5938ff57e24cee3a5b473e72314/common/db-wrapper/db-wrapper.c#L180

- And return a connection to the pool

b:libzdb: https://bitbucket.org/tildeslash/libzdb/src/2958e023fcee44f313e6d3f3592b02cc06783e0f/src/db/ConnectionPool.c#lines-345

b:seafile: https://github.com/haiwen/seafile-server/blob/9f30eedc467bf5938ff57e24cee3a5b473e72314/common/db-wrapper/db-wrapper.c#L220

Apart from Seafile using a GLib array and libzdb using its own vector module, the above demonstrates copying of code with the same logic, method names and variable names. Libzdb’s Connection_setAvailable is equivalent to their conn->is_available = FALSE; and our LOCK macro is just pthread_mutex_lock. That is, the same code and logic, just expanded and inlined.
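
For readers who do not want to follow the links, the general get/return pattern under discussion can be sketched as follows. This is a minimal sketch under stated assumptions, not the code of either project: the fixed-size array, POOL_SIZE, and the names pool_t, conn_t, pool_init, pool_get and pool_return are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not code from libzdb or Seafile. */
#define POOL_SIZE 4

typedef struct {
    bool is_available;   /* true while the connection sits idle in the pool */
} conn_t;

typedef struct {
    pthread_mutex_t mutex;      /* guards the availability flags */
    conn_t conns[POOL_SIZE];
} pool_t;

void pool_init(pool_t *pool) {
    assert(pool != NULL);
    pthread_mutex_init(&pool->mutex, NULL);
    for (size_t i = 0; i < POOL_SIZE; i++)
        pool->conns[i].is_available = true;
}

/* Hand out the first idle connection, or NULL if all are in use. */
conn_t *pool_get(pool_t *pool) {
    conn_t *found = NULL;
    pthread_mutex_lock(&pool->mutex);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool->conns[i].is_available) {
            pool->conns[i].is_available = false;
            found = &pool->conns[i];
            break;
        }
    }
    pthread_mutex_unlock(&pool->mutex);
    return found;
}

/* Mark a connection as idle again so another caller can take it. */
void pool_return(pool_t *pool, conn_t *conn) {
    pthread_mutex_lock(&pool->mutex);
    conn->is_available = true;
    pthread_mutex_unlock(&pool->mutex);
}
```

A pool of this shape is not unusual in itself; the point made above is about matching names, logic and structure across the two linked implementations.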


2. Connection

In libzdb a Connection has three significant methods: Connection_execute, Connection_executeQuery and Connection_prepareStatement. Seafile has the same methods implemented in the same way:

a:libzdb: https://bitbucket.org/tildeslash/libzdb/src/2958e023fcee44f313e6d3f3592b02cc06783e0f/src/db/Connection.c#lines-308

a:seafile: https://github.com/haiwen/seafile-server/blob/9f30eedc467bf5938ff57e24cee3a5b473e72314/common/db-wrapper/db-wrapper.c#L258

b:libzdb: https://bitbucket.org/tildeslash/libzdb/src/2958e023fcee44f313e6d3f3592b02cc06783e0f/src/db/Connection.c#lines-323

b:seafile: https://github.com/haiwen/seafile-server/blob/9f30eedc467bf5938ff57e24cee3a5b473e72314/common/db-wrapper/db-wrapper.c#L350

Seafile has not copied all methods from libzdb’s Connection, but Connection_ping is there, as well as Connection_beginTransaction, Connection_rollback and Connection_commit:

c:libzdb: https://bitbucket.org/tildeslash/libzdb/src/2958e023fcee44f313e6d3f3592b02cc06783e0f/src/db/Connection.c#lines-228

c:seafile: https://github.com/haiwen/seafile-server/blob/9f30eedc467bf5938ff57e24cee3a5b473e72314/common/db-wrapper/db-wrapper.c#L241

What is special about our transaction code in libzdb is that we keep a counter called “isInTransaction” which Seafile has as “in_transaction”.

d:libzdb: https://bitbucket.org/tildeslash/libzdb/src/2958e023fcee44f313e6d3f3592b02cc06783e0f/src/db/Connection.c#lines-252

d:seafile: https://github.com/haiwen/seafile-server/blob/9f30eedc467bf5938ff57e24cee3a5b473e72314/common/db-wrapper/db-wrapper.c#L424
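
To make the distinction concrete, a transaction state kept as a counter rather than a boolean flag might look like the sketch below. How the two projects actually update the counter can only be seen in the linked sources; this sketch assumes an increment on begin and a reset to zero on commit or rollback, and the names tx_conn, tx_begin, tx_commit, tx_rollback and tx_is_active are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type and function names; not code from libzdb or Seafile. */
typedef struct {
    int in_transaction;  /* a counter rather than a boolean flag */
} tx_conn;

/* Assumption: every begin bumps the counter. */
void tx_begin(tx_conn *c) {
    assert(c != NULL);
    c->in_transaction++;
}

/* Assumption: commit ends the transaction and resets the counter. */
void tx_commit(tx_conn *c) {
    assert(c != NULL);
    c->in_transaction = 0;
}

/* Assumption: rollback resets the counter in the same way. */
void tx_rollback(tx_conn *c) {
    assert(c != NULL);
    c->in_transaction = 0;
}

/* A connection is considered inside a transaction while the counter is positive. */
bool tx_is_active(const tx_conn *c) {
    assert(c != NULL);
    return c->in_transaction > 0;
}
```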


3. ResultSet and PreparedStatement are also clearly copied from libzdb. We see ResultSet_next, ResultSet_getString, ResultSet_getInt etc. and PreparedStatement_setString, PreparedStatement_setInt etc. Also PreparedStatement_executeQuery is faithfully copied:

a:libzdb: https://bitbucket.org/tildeslash/libzdb/src/2958e023fcee44f313e6d3f3592b02cc06783e0f/src/db/PreparedStatement.c#lines-122

a:seafile: https://github.com/haiwen/seafile-server/blob/9f30eedc467bf5938ff57e24cee3a5b473e72314/common/db-wrapper/db-wrapper.c#L392


4. Concrete Database implementations.

When it comes to the concrete database implementations for SQLite, MySQL and PostgreSQL, the same copying is repeated. For example, MysqlResultSet_new:

a:libzdb: https://bitbucket.org/tildeslash/libzdb/src/2958e023fcee44f313e6d3f3592b02cc06783e0f/src/db/mysql/MysqlResultSet.c#lines-102

a:seafile: https://github.com/haiwen/seafile-server/blob/9f30eedc467bf5938ff57e24cee3a5b473e72314/common/db-wrapper/mysql-db-ops.c#L189

and the special way we ensure column field capacity in MySQL, where they, very tellingly, have even copied our comment:

b:libzdb: https://bitbucket.org/tildeslash/libzdb/src/2958e023fcee44f313e6d3f3592b02cc06783e0f/src/db/mysql/MysqlResultSet.c#lines-84

b:seafile: https://github.com/haiwen/seafile-server/blob/9f30eedc467bf5938ff57e24cee3a5b473e72314/common/db-wrapper/mysql-db-ops.c#L277


Summary:
--------------

The evidence above demonstrates that there are reasons to be concerned about the Seafile team's insubstantial dealings in open-source, and that the Seafile team is, for all practical purposes, committing copyright infringement and violating the GPL terms. It is unclear to me whether the Seafile server is part of Debian, or whether it is downloaded separately or during the install process so that Debian only distributes the client part of Seafile. If the latter is the case, I still hope that Debian will make a stand and not distribute Seafile packages as long as there are copyright concerns associated with the Seafile software.

Best regards
—  
Jan-Henrik Haukeland
https://tildeslash.com/ 


1. https://packages.debian.org/search?keywords=seafile
2. https://www.tildeslash.com/libzdb/
3. https://github.com/haiwen/seafile/issues/666
4. https://github.com/haiwen/seafile/issues/666#issuecomment-260232869
5. https://github.com/haiwen/seafile-server/tree/master/common/db-wrapper
6. Seafile’s fork of libzdb https://github.com/haiwen/libzdb
7. Our libzdb repository: https://bitbucket.org/tildeslash/libzdb/src/release-2-11-1/

Re: Copyright concerns regarding Seafile

Mihai Moldovan
Hi

Since no one has answered so far, I feel free to chime in.


* On 5/12/19 9:39 PM, Jan-Henrik Haukeland wrote:
>
> We ask Debian to consider removing and stop distributing Seafile packages [1]
> due to copyright concerns. [...]

First of all, thank you for your in-depth analysis and bringing that issue to
the Debian project's attention!

Note that this list does not have any legal leverage, however. Most people
subscribed to it are just software developers (most of whom are more deeply
involved in Debian) discussing licensing (and related things), but not actual
lawyers.

In case of license violations, the proper procedure is to file a bug report
against the source package(s) in question. Package maintainers will handle that
and request package removal by ftpmaster - the latter of which have the final
say in what the archive is made up of (as far as I know).


> Summary:
> --------------
>
> The evidence above demonstrate that there are reasons to be concerned about
> the Seafile team's insubstantial dealings in open-source and that the Seafile
> team for all practical purposes are conducting copyright infringement and
> violating the GPL terms.

I have only skimmed the provided examples, but I would generally agree. It's not
a blatant, mindless copy of your code, though, which makes things a bit
complicated. Most of the referenced functions are rather short. Seafile's DB
interface also isn't uncommon for C code that tries to provide a common
interface with multiple implementations (i.e., structures with function pointers
and forward-declaration). After all, there's only so much you can do to simulate
inheritance in a language that doesn't know such concepts natively.

This said, I do see a very strong similarity in the code's interface and - more
importantly - smaller details like the counter. The question whether interfaces
are actually even copyrightable or not is a pretty heated one (cf. Google vs.
Oracle), so I'm wary of taking that into account too much. With all the other
details, though, it does sound quite unlikely that this is just another, very
similar reimplementation of the interface they already used in the Seafile
server code.


> It is unclear to me if the Seafile server is part of Debian or if it is
> downloaded separately or during the install process and that Debian is only
> distributing the client part of Seafile.

Now on to the good news. Debian has so far neither shipped the client nor the
server in any proper release. The Seafile client is part of buster (current
testing branch, although frozen and expected to be released soonish),
stretch-backports (an optional repository) and unstable/sid.

The timing is good. I'm not a Debian maintainer/DD, but this sounds like
something worthy of a release critical status that may result in the packages
being evicted from the distribution BEFORE they are packaged as part of a proper
release.

The other good news is that until now, only the client is part of Debian, which,
as you have also mentioned, should not be affected by that issue.


> If the latter is the case, I still hope that Debian will make a stand and not
> distribute Seafile packages as long as there are copyright concerns
> associated with the Seafile Software.

Again, please file a bug report. In the worst case, it'll just cause the
maintainer a bit of bureaucratic work and be dismissed.

It luckily doesn't sound like the issue *actually* affects any packages in the
Debian archive, but it generally shows upstream's questionable copyright and
license handling. Personally, I'd feel bad maintaining a package that may end up
being problematic if audited (since... what other surprises might be lingering
in the client?) Trust is a factor, after all. I'd rather remove an untrustworthy
package than end up with a surprise. But that's just my very own, personal opinion.



Mihai


Re: Copyright concerns regarding Seafile

Jan-Henrik Haukeland

> On 13 May 2019, at 18:28, Mihai Moldovan <[hidden email]> wrote:
>
> Note that this list does not have any legal leverage, however.
*
> In case of license violations, the proper procedure is to file a bug report
> against the source package(s) in question. Package maintainers will handle that
> and request package removal by ftpmaster - the latter of which have the final
> say in what the archive is made up of (as far as I know).

Thanks for describing the proper way to report this. I’ve gone ahead and filed a bug report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=928975

Jan-Henrik

Re: Copyright concerns regarding Seafile

Moritz Schlarb-2
In reply to this post by Jan-Henrik Haukeland
Dear all,

as maintainer of the Seafile client packages (libsearpc, seafile and
seafile-client), I would like to thank Jan-Henrik for bringing this to
our attention.

There have already been such findings in the past, regarding some code
taken from git, and the discussion regarding libzdb in the past, as you
mentioned. I remember discussing the problems regarding linking to
OpenSSL, too.

However, all of the database related code is *only* contained in the
Seafile server implementation (https://github.com/haiwen/seafile-server,
RFP at #865830) and not in the Seafile client implementation
(https://github.com/haiwen/seafile) that I have packaged for Debian.

I disagree that this should serve as a reason for *not* including the
client packages in the next Debian release.

What do others think about that?

I will however forward these findings to the developers at Seafile Ltd
and ask them for a proper resolution.

Best regards,
--
Moritz Schlarb

Re: Copyright concerns regarding Seafile

Andrej Shadura-2
Hi,

On Wed, 15 May 2019 at 12:10, Moritz Schlarb <[hidden email]> wrote:
> However, all of the database related code is *only* contained in the
> Seafile server implementation (https://github.com/haiwen/seafile-server,
> RFP at #865830) and not in the Seafile client implementation
> (https://github.com/haiwen/seafile) that I have packaged for Debian.
>
> I disagree that this should serve as a reason for *not* including the
> client packages in the next Debian release.

I fully agree. Since the client doesn’t include the code in question,
it’s out of scope of the issue, so there is no reason to remove it
from Debian.

--
Cheers,
  Andrej

Re: Copyright concerns regarding Seafile

Ian Jackson-2
In reply to this post by Jan-Henrik Haukeland
Jan-Henrik Haukeland writes ("Copyright concerns regarding Seafile"):
> Libzdb is licensed under GPLv3. Copying and modifying GPL code is
> perfectly fine as long as the original copyright notice and license
> are kept. Unfortunately, this is not what the Seafile team
> did. Instead they copied code from libzdb, removed the copyright
> notice, claimed the code as their own and re-license it under
> another license.

That is very clearly Not OK.

However, I am puzzled by something.  AFAICT from github seafile-server
claims to be AGPL3-only.  You are talking about a licence conflict
with libzdb which is GPL3+.

But GPL3+ and AGPL3 are compatible.  So why did the seafile developers
feel the need to engage in this subterfuge ?

Ian.

Re: Copyright concerns regarding Seafile

Ian Jackson-2
In reply to this post by Andrej Shadura-2
Andrej Shadura writes ("Re: Copyright concerns regarding Seafile"):
> On Wed, 15 May 2019 at 12:10, Moritz Schlarb <[hidden email]> wrote:
> I fully agree. Since the client doesn’t include the code in question,
> it’s out of scope of the issue, so there is no reason to remove it
> from Debian.

I am very uncomfortable with having code in Debian whose upstream
authors appear to have plagiarised some other people's software, and
then obfuscated it, in order to evade copyright licensing.  Who knows
what other misleading practices they have engaged in, or may do in the
future ?

As a project, we do not have the resources to fully audit all the code
we ingest from upstreams and redistribute to our users.  We must rely
on trust.  That depends on the upstream being trustworthy.

Ian.

--
Ian Jackson <[hidden email]>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.

Re: Copyright concerns regarding Seafile

Jan-Henrik Haukeland
In reply to this post by Ian Jackson-2

> On 30 May 2019, at 14:48, Ian Jackson <[hidden email]> wrote:
>
> However, I am puzzled by something.  AFAICT from github seafile-server
> claims to be AGPL3-only.  You are talking about a licence conflict
> with libzdb which is GPL3+.
>
> But GPL3+ and AGPL3 are compatible.  So why did the seafile developers
> feel the need to engage in this subterfuge ?

I think it started with this issue on GitHub [1]. The Seafile team had copied code from Git, which (reluctantly) forced them to license Seafile under GPLv2. This caused a problem with using libzdb, which is only available under GPLv3. Another point is that Seafile has a proprietary professional edition server based upon the open-source server code [2]. I think this was the motivation for trying to remove libzdb and the Git code from seafile-server.

As far as I know there is no MIT/BSD-licensed C library with libzdb’s features that they could use instead. So copying the libzdb code with some shuffling and some obfuscation was probably seen as a good idea.

Jan-Henrik

1. https://github.com/haiwen/seafile/issues/666
2. https://www.seafile.com/en/product/private_server/