Using Cloudflare as CDN for debian.org website

Using Cloudflare as CDN for debian.org website

Bagas Sanjaya

Dear debian.org webmasters,

A CDN (Content Delivery Network) is a service that distributes a website's content to servers across the world. This way, when a visitor in Singapore (for example) visits a website hosted in Europe, the website's assets are served from a CDN server in or near Singapore instead.

The benefits of using CDN are (quoting from wpbeginner.com):

  • speed
  • crash resistance
  • UX (user experience) and SEO improvements

Cloudflare is one of the leading CDN providers in the world. Besides providing CDN service, it has DDoS protection, which can be handy for security.

Does the debian.org website currently use a CDN? Either way, is Cloudflare's CDN suitable for the debian.org website?

One interesting feature of Cloudflare is the reCAPTCHA challenge shown every time a visitor accesses a website that has this feature enabled. Before the visitor can reach the website, there is a reCAPTCHA challenge which the visitor must answer in order to get access to the website's web property. Usually, the visitor must click the "I'm not a robot" checkbox and select squares which contain a particular object (in most cases related to road infrastructure in the USA). On how to prevent the challenge, Cloudflare says on the challenge page:

If you are on a personal connection, like at home, you can run an anti-virus scan on your device to make sure it is not infected with malware.

If you are at an office or shared network, you can ask the network administrator to run a scan across the network looking for misconfigured or infected devices.

Another way to prevent getting this page in the future is to use Privacy Pass browser extension

However, the visitor's IP address is also displayed on the challenge page. Is it OK for the debian.org website to use the reCAPTCHA challenge feature (which is available in the Pro plan and higher) when using Cloudflare as its CDN?

Regards, Bagas

Re: Using Cloudflare as CDN for debian.org website

Wouter Verhelst
On Sat, Jun 15, 2019 at 07:14:58PM +0700, Bagas Sanjaya wrote:
> Dear debian.org webmasters,
>
> CDN (Content Delivery Network) is a service that distribute website's content
> into different servers across the world. This way, when a visitor in Singapore
> (for example) visit a website which hosted in Europe, the website assets will
> be served from CDN server in Singapore or nearby instead.

The Debian.org website is already mirrored to several machines, which in fact
implements a CDN avant la lettre.

[...]
> One of interesting feature of Cloudflare is reCAPTCHA challenge every time
> visitor access website which have this feature enabled.

Please let's not go there...

--
To the thief who stole my anti-depressants: I hope you're happy

  -- seen somewhere on the Internet on a photo of a billboard

Re: Using Cloudflare as CDN for debian.org website

Bagas Sanjaya


On 15/06/19 23.03, Wouter Verhelst wrote:
> On Sat, Jun 15, 2019 at 07:14:58PM +0700, Bagas Sanjaya wrote:
>> Dear debian.org webmasters,
>>
>> CDN (Content Delivery Network) is a service that distribute website's content
>> into different servers across the world. This way, when a visitor in Singapore
>> (for example) visit a website which hosted in Europe, the website assets will
>> be served from CDN server in Singapore or nearby instead.
>
> The Debian.org website is already mirrored to several machines, which in fact
> implements a CDN avant la lettre.
>
> [...]
>> One of interesting feature of Cloudflare is reCAPTCHA challenge every time
>> visitor access website which have this feature enabled.
>
> Please let's not go there...

> The Debian.org website is already mirrored to several machines, which in fact
> implements a CDN avant la lettre.

Does the debian.org website's CDN have DDoS protection like Cloudflare has?

> Please let's not go there...

What is your position on Cloudflare's reCAPTCHA challenge feature that I mentioned earlier? Here are the pros and cons of the feature:

Pros:

  • You can ensure that (almost) all visitors are legitimate (human), since they must pass the challenge.

Cons:

  • The reCAPTCHA itself can be tricky to pass.
  • Cloudflare's recommendations for preventing the challenge can be difficult or impossible to implement. On office/shared networks, users have to contact the network administrator to scan the network for infected/misconfigured devices, which can take a long time.
  • Also, Cloudflare endorses Firefox with the statement "Another way to prevent getting this page in the future is to use Privacy Pass browser extension". This can force users of Chrome and other browsers to switch to Firefox just to get past the challenge.
  • Displaying the visitor's IP address on the challenge page is disrespectful of their privacy and can cause data leakage.

Re: Using Cloudflare as CDN for debian.org website

Steve McIntyre
On Sun, Jun 16, 2019 at 05:50:33AM +0700, Bagas Sanjaya wrote:
>
>What is your position regarding Cloudflare's reCAPTCHA challenge feature I
>mentioned earlier? Here are pros and cons of the feature:

Why on earth would we want to make it harder for people to read the
website?

>Pros:
>
>  • You can assure that (almost) all visitors are legitimate (human), since
>    they must pass the challenge.

Why do you think that's a feature we care about?

--
Steve McIntyre, Cambridge, UK.                                [hidden email]
< Aardvark> I dislike C++ to start with. C++11 just seems to be
            handing rope-creating factories for users to hang multiple
            instances of themselves.

Re: Using Cloudflare as CDN for debian.org website

Bagas Sanjaya


On 16/06/19 06.26, Steve McIntyre wrote:
> On Sun, Jun 16, 2019 at 05:50:33AM +0700, Bagas Sanjaya wrote:
>> What is your position regarding Cloudflare's reCAPTCHA challenge feature I
>> mentioned earlier? Here are pros and cons of the feature:
>
> Why on earth would we want to make it harder for people to read the
> website?
>
>> Pros:
>>
>>  • You can assure that (almost) all visitors are legitimate (human), since
>>    they must pass the challenge.
>
> Why do you think that's a feature we care about?

In fact, several websites (such as Capezio, ProMods Map, and WHEELS.ca) have the reCAPTCHA challenge enabled. Those sites just want to make sure that all visitors are human. Feel free to visit those sites for your consideration.

> Why on earth would we want to make it harder for people to read the
> website?

Because a website with the reCAPTCHA challenge feature may violate the "secure it without overdoing it" principle; that is, its webmasters tighten security so much that the website becomes harder to access.

Re: Using Cloudflare as CDN for debian.org website

Carsten Schoenert
On 16.06.19 at 01:59, Bagas Sanjaya wrote:

>>> Pros:
>>>
>>>   • You can assure that (almost) all visitors are legitimate (human), since
>>>     they must pass the challenge.
>> Why do you think that's a feature we care about?
>>
> In fact, several websites (such as Capezio <capezio.com>, ProMods Map
> <promods.net>, and WHEELS.ca <wheels.ca>) have reCAPTCHA challenge
> enabled. Those sites just want to make sure that all visitors are
> humans. Feel free to visit those sites above for your consideration.

That does not really answer Steve's question. At least I can see
no point that is of interest to Debian.

Of course Cloudflare has reasons why people should use their service. I
just see no real reason why Debian should.

Do we have a problem with users being unable to reach the Debian
websites on www.d.o?

  Not as far as I know.

Do we have a problem with spammers posting their stuff somewhere on
www.d.o?

  No, as it's basically a set of static HTML pages and some CGI-based
  search services.

Do we have to treat SEO as a critical thing for Debian?

  I don't think so. Yes, we can improve some SEO-related things, but it's
  not bad as hell. And SEO depends on how you organize the websites, not
  on a CDN.

>> Why on earth would we want to make it harder for people to read the
>> website?
>
> Because the website with reCAPTCHA challenge feature may violate "secure
> it without overdoing it" principle, that is, their webmasters tighten
> website security so much as to make their website harder to access.

This is not a real problem for the Debian websites, as they are purely
informational and basically text only. We have no dynamic content on
www.d.o that we need to protect.

It makes no sense to use external services if there is no need to.

--
Regards
Carsten Schoenert

Re: Using Cloudflare as CDN for debian.org website

Rich Kulawiec
In reply to this post by Bagas Sanjaya
On Sat, Jun 15, 2019 at 07:14:58PM +0700, Bagas Sanjaya wrote:
> Cloudflare is one of leading CDN provider in the world.

Cloudflare is also one of the leading hosts for abusers: they're scumbags
who actively support spam, phishing, and much worse.

> One of interesting feature of Cloudflare is reCAPTCHA challenge every time

Captchas were quite thoroughly defeated years ago.  They're only used now
by ignorant newbies who haven't been paying attention and don't know any better.

--rsk

Re: Using Cloudflare as CDN for debian.org website

Tomas Pospisek ML
In reply to this post by Bagas Sanjaya
On 16.06.19 at 00:50, Bagas Sanjaya wrote:
> Here are pros and cons of the feature:
> [..]

Cons:
* you are forcing your users to spend their limited human lifetime
training Google's artificial intelligence.
* reCaptcha's dataset is closed source. Google can and will use it as
they wish. They can also use the dataset/the AI against humans or
humanity itself.
* it's publicly documented that Google was participating in US military
drone AI programs [1]. I believe at this point it is not possible to
rule out from public sources that the reCaptcha dataset is *already*
being used to kill people (?).

*t

[1]
https://theintercept.com/2018/05/31/google-leaked-emails-drone-ai-pentagon-lucrative/

Re: Re: Using Cloudflare as CDN for debian.org website

Bagas Sanjaya
In reply to this post by Carsten Schoenert

> Have we a problem that users can't reach the Debian websites on www.d.o?
>
>   Not as far I know.

Having a security check before users can access debian.org (www.d.o), in this case reCAPTCHA, can also present a problem: if a user can't pass the check, she may quickly leave www.d.o and tell others to avoid Debian because www.d.o is difficult to access, and thus we lose her.

For someone who explores www.d.o just to see whether Debian suits her, giving her only temporary access to www.d.o's web property can affect her: she might have to answer the reCAPTCHA again once the grace period for accessing www.d.o is up.

BTW, what is a web property in the case above?

> Have we to mind that SEO is a critical thing for Debian.
>
>   I don't think so. Yes we can improve some things SEO related, but it's
>   not bad as hell. And SEO is related how you organize the websites, not
>   related to CDN.

In the case of Capezio, the site is ranked #1 if we google Capezio. I haven't checked the site yet, but it may be well organized.

Performance and security are crucial to SEO. Search engines tend to rank well-performing and secure websites higher.

Consider the case where someone on an office/shared network tries to access websites which have the reCAPTCHA challenge enabled. When he encounters the challenge page, he asks his network administrator to perform a scan across the entire network looking for infected and/or misconfigured devices. If the network administrator fulfills his request, there will presumably be a long downtime due to device scanning. If not, he will always have to complete the reCAPTCHA every time he accesses the website.

There is also a consequence of this feature: if a user browsing a website in Chrome encounters the reCAPTCHA challenge, she may migrate to Firefox just to get the Privacy Pass extension in order to get rid of the challenge. If all Chrome/Chromium users do the same, Firefox will become the most used browser, thanks to Cloudflare's endorsement.

Re: Re: Using Cloudflare as CDN for debian.org website

Bagas Sanjaya
In reply to this post by Tomas Pospisek ML

> * reCaptcha's dataset is closed source. Google can and will use it as
> they wish. They can also use the dataset/the AI against humans or
> humanity itself.

AFAIK, reCAPTCHA's dataset is mostly based on road infrastructure in the USA (streets, buildings, cars, signs, etc.). Those who live in the USA will easily recognize them, but what about someone who hasn't mastered English well and doesn't know the road infrastructure used? He may leave the website just because he can't pass the security check.

Re: Re: Using Cloudflare as CDN for debian.org website

Bagas Sanjaya
In reply to this post by Carsten Schoenert

> That is not really answering the question from Steve. At least I can see
> no point that is in interest for Debian.

Carsten and Steve, we're talking about the performance and security of debian.org if we use Cloudflare's CDN, especially regarding the reCAPTCHA challenge page.

Re: Using Cloudflare as CDN for debian.org website

Carsten Schoenert
On 17.06.19 at 14:43, Bagas Sanjaya wrote:
>> That is not really answering the question from Steve. At least I can see
>> no point that is in interest for Debian.
> Carsten and Steve, we're talking about performance and security of
> debian.org if we use Cloudflare CDN, especially regarding to reCAPTCHA
> challenge page.

Well, no matter how long I think about it, I see absolutely no gain and
no really big pros if we started to use Cloudflare's services. It would
simply solve no real problems we currently have. But it would bring us
one or more new external dependencies, and in particular we would lose
control over some data.

--
Regards
Carsten Schoenert

Re: Using Cloudflare as CDN for debian.org website

Christian Kastner-3
In reply to this post by Bagas Sanjaya

On 2019-06-17 14:43, Bagas Sanjaya wrote:
> Carsten and Steve, we're talking about performance
It has been pointed out that performance is currently not an issue (see
Wouter's reply for details).

> and security of debian.org

With "security", you seem to mean "ensuring that a visitor is human". It
has been pointed out that there is currently no desire for that.

> if we use Cloudflare CDN, especially regarding to reCAPTCHA challenge page

Quoting Wikipedia [1]:

> The reCAPTCHA code is also heavily obfuscated and reverse-engineering
> attempts demonstrated that it collects enormous amounts of personal
> data, in line with Google user tracking and fingerprinting practices.
> Usage of reCAPTCHA, since acquisition of Google, is subject to Google's
> general privacy policy, which essentially requires the user to consent
> to collection of vast amounts of personal data in order to use websites
> protected by reCAPTCHA.

Given that
 * the one Pro you mentioned is currently of no value to Debian, and
 * the N Cons mentioned in this thread would come at a great cost to
   Debian and especially the visitors of Debian.org,
it should be evident why reCAPTCHA is out of the question at the moment.

[1] https://en.wikipedia.org/wiki/ReCAPTCHA#Criticism

Regards,
Christian

Re: Re: Using Cloudflare as CDN for debian.org website

Wouter Verhelst
In reply to this post by Bagas Sanjaya
On Mon, Jun 17, 2019 at 03:03:47PM +0700, Bagas Sanjaya wrote:
>     * reCaptcha's dataset is closed source. Google can and will use it as
>     they wish. They can also use the dataset/the AI against humans or
>     humanity itself.
>
> AFAIK, reCAPTCHA's dataset mostly based on road infrastructure in USA (streets,
> buildings, cars, signs, etc). Those who lived in USA will easily recognize
> them, but what is the case for someone who can't master English well and
> doesn't know about road infrastructure used? He may leave the website just
> because he can't pass the security check.

How on earth do you consider that to even remotely be an advantage? We
*want* the website to be usable to everyone, not just to human beings.

--
To the thief who stole my anti-depressants: I hope you're happy

  -- seen somewhere on the Internet on a photo of a billboard

Re: Re: Re: Using Cloudflare as CDN for debian.org website

Bagas Sanjaya

> How on earth do you consider that to even remotely be an advantage? We
> *want* the website to be usable to everyone, not just to human beings.

So, if we want to make debian.org usable and accessible to everyone, even allowing bots (search engine indexers and web archivers), while still benefiting from Cloudflare's CDN, we shouldn't use the reCAPTCHA challenge page. Cloudflare's free tier will be enough, since that tier comes without the challenge page.

Re: Re: Re: Using Cloudflare as CDN for debian.org website

Joerg Jaspert
On 15439 March 1977, Bagas Sanjaya wrote:

>> How on earth do you consider that to even remotely be an advantage? We
>> *want* the website to be usable to everyone, not just to human beings.
> So, if we want to make debian.org usable and accessible to everyone,
> even allowing bots (search engine indexers and web archivers), while
> still being benefit from Cloudflare CDN, don't use reCAPTCHA challenge
> page. Free tier from Cloudflare will be enough, since the tier comes
> without the challenge page.

And still, a CDN like cloudflare gives us nothing that we need, and
doesn't solve the existing problems our web part has, so why should we
switch to it?

--
bye, Joerg

Re: Using Cloudflare as CDN for debian.org website

MJ Ray-2
In reply to this post by Bagas Sanjaya
On Sat, 15 Jun 2019 19:14:58 +0700
Bagas Sanjaya <[hidden email]> wrote:

> [...] Is it OK for debian.org website to use reCAPTCHA challenge
> feature (which is available in Pro plan and higher) when using
> Cloudflare as CDN?

It's never OK for anyone to use recaptcha because it discriminates
against people with disabilities. If I never have to endure one of
those audiovisual tests again to gain access to something, I would be
very very happy. I already know my eyesight and hearing are defective,
thank you very much. I use technology such as unusual display filters
and colour identifier apps to overcome that and that should not result
in me being insulted by a computer as not qualifying as human.

Hope that explains,
--

MJR http://mjr.towers.org.uk/
Member of http://www.software.coop/ (but this email is my personal view
only)