Traffic shaping on Debian


Traffic shaping on debian

Aleksey-2
Hi guys!

I have a Debian box acting as a router and need a tool to perform
traffic shaping based on source/destination IPs, interfaces, etc. I have
tried the default tc, but it uses a lot of resources: 600 Mbps without
shaping passes through at about 3% CPU load, while the same 600 Mbps
with shaping (tc using HTB on the egress interface) consumes something
like 40% CPU.

Could someone recommend a tool that does such shaping with minimal
resource consumption? I've searched the web and found a module named
nf-hishape, but I couldn't find any reasonable number of articles about
it, nor any manuals, so I guess it's not very popular (if it's still
alive at all).

Any help would be appreciated.

Thanks in advance.

--
With kind regards,
Aleksey


Re: Traffic shaping on debian

Mihamina Rakotomandimby-8

On 05/27/2016 02:40 PM, Aleksey wrote:
> Hi guys!
>
> I have a debian box acting as a router and need a tool to perform
> traffic shaping based on source/destination IPs, interfaces, etc. I
> have tried the default tc, however, it uses plenty of resources, e.g.
> 600 mbps without shaping flows through with 3% cpu load and the same
> 600mbps with shaping (tc using htb on egress interface) consumes
> something like 40% cpu.
>


Could you share some snippets of your configuration? I mean the script
you use to invoke tc and so on...


Re: Traffic shaping on debian

Dmitry Sinina
In reply to this post by Aleksey-2
On 05/27/2016 02:40 PM, Aleksey wrote:

> Hi guys!
>
> I have a debian box acting as a router and need a tool to perform traffic shaping based on source/destination IPs, interfaces, etc. I have tried the default tc, however, it uses plenty of resources, e.g. 600 mbps without shaping flows through with 3% cpu load and the same 600mbps with shaping (tc
> using htb on egress interface) consumes something like 40% cpu.
>
> Probably someone could advise some kind of a tool to do such shaping with minimum resources consumed - I've searched through the web and found a module named nf-hishape, however, I didn't manage to find some reasonably high number of articles about it as well as no manuals and so on - I guess it's
> not very popular (if it's actually alive).
>
> Any help would be appreciated.
>
> Thanks in advance.
>
Hi.

It seems you are using a flat list of filters. How many filters do you have?
Have you tried hash tables for traffic classification?
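
For reference, a hashed u32 layout typically looks something like the following
(the table id, addresses and bucket are purely illustrative):

# create a 256-bucket u32 hash table for per-host lookups
tc filter add dev eth1 parent 1:0 prio 5 handle 2: protocol ip u32 divisor 256
# hash on the last octet of the destination address (offset 16 in the IP header)
tc filter add dev eth1 parent 1:0 prio 5 protocol ip u32 ht 800:: \
    match ip dst 10.0.0.0/8 hashkey mask 0x000000ff at 16 link 2:
# per-host rules then sit in their own bucket, e.g. 10.0.0.5 in bucket 5
tc filter add dev eth1 parent 1:0 prio 5 protocol ip u32 ht 2:5: \
    match ip dst 10.0.0.5/32 flowid 1:20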


Re: Traffic shaping on debian

Aleksey-2
On 2016-05-27 14:48, Dmitry Sinina wrote:

> On 05/27/2016 02:40 PM, Aleksey wrote:
>> Hi guys!
>>
>> I have a debian box acting as a router and need a tool to perform
>> traffic shaping based on source/destination IPs, interfaces, etc. I
>> have tried the default tc, however, it uses plenty of resources, e.g.
>> 600 mbps without shaping flows through with 3% cpu load and the same
>> 600mbps with shaping (tc
>> using htb on egress interface) consumes something like 40% cpu.
>>
>> Probably someone could advise some kind of a tool to do such shaping
>> with minimum resources consumed - I've searched through the web and
>> found a module named nf-hishape, however, I didn't manage to find some
>> reasonably high number of articles about it as well as no manuals and
>> so on - I guess it's
>> not very popular (if it's actually alive).
>>
>> Any help would be appreciated.
>>
>> Thanks in advance.
>>
> Hi.
>
> Seems you use flat list of filters. How many filters you have?
> Did you try hash tables for traffic classification?

Hi.

Actually, I haven't configured anything on my production router yet; I
have only run tests in a lab environment. The configuration was pretty
simple:

# root HTB; unclassified traffic falls into the default class 1:30
tc qdisc add dev eth1 root handle 1: htb default 30
tc class add dev eth1 parent 1: classid 1:1 htb rate 1000mbps ceil 1000mbps
tc class add dev eth1 parent 1:1 classid 1:10 htb rate 3mbps ceil 5mbps
tc class add dev eth1 parent 1:1 classid 1:20 htb rate 5mbps ceil 7mbps
tc class add dev eth1 parent 1:1 classid 1:30 htb rate 1mbps ceil 1000mbps
# SFQ leaf qdiscs for per-flow fairness inside each class
tc qdisc add dev eth1 parent 1:10 handle 10:0 sfq perturb 10
tc qdisc add dev eth1 parent 1:20 handle 20:0 sfq perturb 10
tc qdisc add dev eth1 parent 1:30 handle 30:0 sfq perturb 10
# classify HTTPS and HTTP by destination port
tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 match ip dport 443 0xffff flowid 1:20
tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 match ip dport 80 0xffff flowid 1:10

After applying this I pushed some traffic through the lab box using
iperf. When testing on ports 80/443 (limited to low bandwidth), the CPU
load was fine; however, when I pushed unrestricted traffic (the 1000mbps
default class) I noticed high CPU usage. I also tried setting up filters
based on fwmark, but the result was the same. In case it matters, I'm
running Debian 7 with a 3.16 kernel installed from wheezy-backports.
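
Purely as an illustration, such a lab run might look like the following (the
address, port and stream count are placeholders, not the actual test
parameters):

# on the receiving lab host
iperf -s -p 5001
# on the sending host: four parallel TCP streams for 60 seconds through the router
iperf -c 192.0.2.10 -p 5001 -P 4 -t 60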

If any additional info (firewall config, etc.) is needed, please ask.

--
With kind regards,
Aleksey


Re: Traffic shaping on debian

lxP
Hi,

On 2016-05-27 15:50, Aleksey wrote:
> tc class add dev eth1 parent 1:1 classid 1:30 htb rate 1mbps ceil 1000mbps

I have never measured the CPU usage, but I have also noticed that htb ceil does
not perform as I would expect. I could never push the full ceil bandwidth
through, even when there is no other traffic.
I would suggest trying a higher htb rate (e.g. 988mbit) and rerunning your
experiments.
I have started to avoid htb ceil in general and switch to fair-queuing qdiscs
like fq_codel, drr, sfq and so on whenever possible. However, that might not
directly fit your needs.
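
For example, the two experiments above might look like this (the figures follow
the 988mbit suggestion; the second command is a separate, alternative test that
replaces the whole htb setup):

# raise the guaranteed rate of the default class close to the ceiling
tc class change dev eth1 parent 1:1 classid 1:30 htb rate 988mbit ceil 1000mbit
# or drop htb entirely and let fq_codel do per-flow fairness on the interface
tc qdisc replace dev eth1 root fq_codel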

Best regards,
lxP


Re: Traffic shaping on debian

lxP
For a few years now I have been searching for a good solution for services of
different priorities, so I am really interested in how you would do that.

On 2016-05-28 09:49, Ruben Wisniewski wrote:

> fq_codel should be used in any case. If you have more than one service, use a
> fq_codel per service and shape each of them with a hard limit. Above all the
> service queues, add a root fq_codel and shape it to 92% of the total available
> bandwidth.
>
> Shaping a service below the physical bandwidth it needs is mostly unintended,
> or the result of misunderstanding the difference between the average and peak
> bandwidth of network applications, and it has cost me a huge amount of time at
> work.
>
> A good start point:
> https://wiki.gentoo.org/wiki/Traffic_shaping
>
>
> Best regards Ruben

I didn't fully understand which qdisc hierarchy you are suggesting. It sounds
to me as if you are adding fq_codel below an fq_codel queue, which is
impossible, isn't it?

htb (hard limit)
- ?
-- fq_codel (service A)
-- fq_codel (service B)
-- fq_codel (service C)

You could remove the question mark entirely and put the fq_codel queues
directly below htb with a fixed hard limit, as in the sketch below.
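
A rough sketch of that flat layout (the interface name and rates are
placeholders; each class without an explicit ceil is hard-limited to its rate):

# hard overall limit, then one fq_codel per service class
tc qdisc add dev eth0 root handle 1: htb default 30
tc class add dev eth0 parent 1: classid 1:1 htb rate 900mbit ceil 900mbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 300mbit   # service A
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 300mbit   # service B
tc class add dev eth0 parent 1:1 classid 1:30 htb rate 300mbit   # service C
tc qdisc add dev eth0 parent 1:10 fq_codel
tc qdisc add dev eth0 parent 1:20 fq_codel
tc qdisc add dev eth0 parent 1:30 fq_codel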
However, in most cases you don't want a hard limit, but rather a rough
priority between the services. If the link is unsaturated, each service should
be able to use arbitrary bandwidth; as it approaches saturation, the services
should converge towards a specified percentage of the total bandwidth.
The obvious solution would be to use "ceil", but as mentioned before it doesn't
perform well.
You could put drr in place of the question mark, which performs well, but it
treats all services with equal priority.
You could put prio there instead, but if service A uses all the bandwidth,
services B and C will starve.
Does anyone have a solution to that problem?

Best regards,
lxP


Re: Traffic shaping on debian

Martin Kraus-3
In reply to this post by Aleksey-2
On Fri, May 27, 2016 at 04:50:55PM +0300, Aleksey wrote:

> Practically, I haven't done any configuration on my production router - I
> have performed tests in lab environment. Configuration was pretty simple:
>
> tc qdisc add dev eth1 root handle 1: htb default 30
> tc class add dev eth1 parent 1: classid 1:1 htb rate 1000mbps ceil 1000mbps
> tc class add dev eth1 parent 1:1 classid 1:10 htb rate 3mbps ceil 5mbps
> tc class add dev eth1 parent 1:1 classid 1:20 htb rate 5mbps ceil 7mbps
> tc class add dev eth1 parent 1:1 classid 1:30 htb rate 1mbps ceil 1000mbps
> tc qdisc add dev eth1 parent 1:10 handle 10:0 sfq perturb 10
> tc qdisc add dev eth1 parent 1:20 handle 20:0 sfq perturb 10
> tc qdisc add dev eth1 parent 1:30 handle 30:0 sfq perturb 10
> tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32  match ip dport 443
> 0xffff flowid 1:20
> tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32  match ip dport 80
> 0xffff flowid 1:10

I'd assume the problem is that when you bind htb directly to the root of a
device, you basically lose the multiqueue capability of the Ethernet card,
because all packets must pass through a single queue before they are dispatched
to the card's multiple hardware queues.
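
A quick way to check this on the box (the interface name is illustrative):

# how many hardware TX queues the NIC exposes
ethtool -l eth1
# with no custom qdisc configured, a multiqueue NIC normally shows an mq root
tc qdisc show dev eth1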
mk


Re: Traffic shaping on debian

Aleksey-2
On 2016-05-28 18:16, Martin Kraus wrote:

> On Fri, May 27, 2016 at 04:50:55PM +0300, Aleksey wrote:
>> Practically, I haven't done any configuration on my production router
>> - I
>> have performed tests in lab environment. Configuration was pretty
>> simple:
>>
>> tc qdisc add dev eth1 root handle 1: htb default 30
>> tc class add dev eth1 parent 1: classid 1:1 htb rate 1000mbps ceil
>> 1000mbps
>> tc class add dev eth1 parent 1:1 classid 1:10 htb rate 3mbps ceil
>> 5mbps
>> tc class add dev eth1 parent 1:1 classid 1:20 htb rate 5mbps ceil
>> 7mbps
>> tc class add dev eth1 parent 1:1 classid 1:30 htb rate 1mbps ceil
>> 1000mbps
>> tc qdisc add dev eth1 parent 1:10 handle 10:0 sfq perturb 10
>> tc qdisc add dev eth1 parent 1:20 handle 20:0 sfq perturb 10
>> tc qdisc add dev eth1 parent 1:30 handle 30:0 sfq perturb 10
>> tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32  match ip
>> dport 443
>> 0xffff flowid 1:20
>> tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32  match ip
>> dport 80
>> 0xffff flowid 1:10
>
> I'd assume the problem is that when you bind htb directly to the root
> of a
> device you basically loose the multiqueue capability of an ethernet
> card
> because all packets must end in a single queue from which they are
> dispatched
> to the multiple queues of an ethernet card.
> mk

Hi.

I have also noticed that all the load stays on one CPU core; it is not
distributed across the available cores. How can this be avoided?


To lxP: I'll rerun the tests as you suggested and report back with the results.

--
With kind regards,
Aleksey


Re: Traffic shaping on debian

Martin Kraus-3
On Mon, May 30, 2016 at 01:55:51PM +0300, Aleksey wrote:
> I have also noticed that all the load is on one CPU core it is not
> distributed to all available cores. And how can this be avoided?

There is a qdisc called mq which creates a class for each hardware queue on
the attached Ethernet card. You can bind other qdiscs (such as htb) to each of
these classes, but this will not let you shape a single type of traffic going
out over all the hardware queues.

It might be possible to run multiple htb qdiscs and use filters to steer each
type of traffic to a selected hardware queue. This has other adverse effects
(such as not being able to borrow unused bandwidth between the hardware
queues), and there may still be lock contention between the cores for each such
queue, so it might not even work better.
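
A rough sketch of that layout for a NIC with at least two TX queues (handles
and rates are illustrative; as noted above, the per-queue htb instances cannot
borrow bandwidth from each other):

tc qdisc add dev eth1 root handle 1: mq
# one independent htb per hardware-queue class (1:1, 1:2, ...)
tc qdisc add dev eth1 parent 1:1 handle 10: htb default 10
tc class add dev eth1 parent 10: classid 10:10 htb rate 5gbit ceil 5gbit
tc qdisc add dev eth1 parent 1:2 handle 20: htb default 10
tc class add dev eth1 parent 20: classid 20:10 htb rate 5gbit ceil 5gbit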

At 1 Gbit the CPU can probably handle it, so there is no need to do any of
this. If you have a 10 Gbit+ connection, then this probably isn't the right
place to do shaping anyway; it should be done closer to the source.

It depends on what you're trying to accomplish.

regards
Martin


Re: Traffic shaping on debian

Aleksey-2
On 2016-05-30 18:34, Martin Kraus wrote:

> On Mon, May 30, 2016 at 01:55:51PM +0300, Aleksey wrote:
>> I have also noticed that all the load is on one CPU core it is not
>> distributed to all available cores. And how can this be avoided?
>
> There is a qdisc called mq which creates a class for each hardware
> queue on
> the attached ethernet card. You can bind other qdiscs (such as htb) to
> each of
> these classes but this will not allow you to shape traffic for a single
> type going out over all the hardware queues.
>
> It might be possible to have multiple htb qdiscs and use filters to
> send
> each type of traffic to a selected hardware queue. This has other
> adverse
> effects (such as not being able to borrow unused bandwidth among the hw
> queues) and there still might be lock contention among the cores for
> each such
> queue so it might not even work better.
>
> If you are at 1 Gbit speed the cpu can probably handle it so there is
> no need
> to do any of this. If you have a 10Gbit+ connection then this probably
> isn't
> the correct place to do shaping anyway and should be done closer to the
> source.
>
> It depends on what you're trying to accomplish.
>
> regards
> Martin

So, yes, I have 10G uplinks. The main goal is to shape traffic from certain
hosts separately towards destinations reachable through the local internet
exchange and towards all other destinations (the rest of the world). The local
IX is connected to one interface of my Debian box, and worldwide traffic flows
through the other. The simplest way to achieve this, in my opinion, was to
apply egress qdiscs on those interfaces, along with the filters and classes,
so that the traffic would be shaped the way I need. The problem with shaping
closer to the source is that I wouldn't be able to classify the traffic on the
switches: it's not just one or a couple of destinations, it's something like
30k destinations reachable through the local IX.
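
One hypothetical way to avoid tens of thousands of individual u32 filters would
be to keep the IX prefixes in an ipset, mark matching traffic, and classify on
the mark (the set name, prefix and mark value are made up for illustration):

# keep the ~30k IX prefixes in a set and mark forwarded traffic that matches
ipset create ix-prefixes hash:net
ipset add ix-prefixes 192.0.2.0/24   # one entry per IX prefix
iptables -t mangle -A FORWARD -o eth1 -m set --match-set ix-prefixes dst -j MARK --set-mark 10
# a single fw filter then classifies on the mark instead of thousands of u32 rules
tc filter add dev eth1 parent 1:0 protocol ip prio 1 handle 10 fw flowid 1:10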

Perhaps you could point me to a better option.

P.S. To lxP: increasing the rate on the default htb class didn't help.
CPU usage may have dropped a couple of percent (not sure, really), but the
difference is definitely not significant.

--
With kind regards,
Aleksey


Re: Traffic shaping on debian

Raúl Alexis Betancor Santana
 

> So, yes, I have 10G uplinks. The main goal is to be able to shape
> traffic from certain hosts to the destinations that are reachable
> through local internet exchange and to all other destinations (world).
> Local IX is connected to one interface of my debian box and worldwide
> traffic flows through the another. The simpliest way to achieve this,
> for my opinion, was to apply egress qdiscs on there interfaces and apply
> filters and classes there also, so it would effectively shape as I need.
> The problem with shaping closer to the source is that I wouldn't be able
> to classify the traffic on switches - it's not only one or a couple of
> destinations, it's something like 30k destinations available through
> local IX.
>
> Probably you could point me to a better option.
>
> P.S. to lxP - increasing rate on the default htb class didn't help -
> probably, CPU usage could drop a couple percents lower (not sure,
> really) but is is definitely not significant.
>
> --
> With kind regards,
> Aleksey

If you are trying to shape 10G links with a Debian box, then on top of the
routing itself you will have to do a lot of tuning on the system.

Search for 'high performance Linux routing' on Google; you will find plenty of
articles explaining the caveats you will face.
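
Typical knobs those articles cover include, purely as examples (values are
illustrative and NIC-dependent):

# larger NIC ring buffers
ethtool -G eth1 rx 4096 tx 4096
# spread receive processing over more cores (RPS CPU bitmask)
echo f > /sys/class/net/eth1/queues/rx-0/rps_cpus
# let the kernel queue more packets per core before dropping
sysctl -w net.core.netdev_max_backlog=250000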

When going beyond 1G links, it's better to use dedicated hardware for routing
and shaping, IMHO.

Best regards