Sarge SMTP Performance

Lars Roland
Hi all

I just moved my company's anti-virus/anti-spam mail gateway from
Redhat 7.3 with kernel 2.4.20-24.7smp to Debian Sarge with kernel
2.6.8.1-i686-SMP. I had hoped that this transition would lead to
better performance (new version of Perl, better drivers in the kernel
and so on), but performance has instead dropped by about 30%.

Here is my setup.

Hardware:
IBM 335 with dual 2.4 GHz Xeon, 1 GB RAM and one 10,000 RPM SCSI disk.


Software:

Minimal Debian Sarge (that is, I have turned all unnecessary services off).
Kernel 2.6.8.1-SMP.
ReiserFS on all partitions (except /boot).
Qmail MTA configured with 70 incoming connections.
ClamAV running as a daemon.
10 Spamassassin daemons (spamd)
Qmail-scanner.

On my old Redhat system the hardware could scan around 60,000 emails
per hour with an average scan time of 5.6 seconds (including time from
both ClamAV and Spamassassin) and an average load of 23.7.

My new Sarge installation on the same hardware scans 40,000 emails per
hour with an average scan time of 4.8 seconds, but with a load average
of 57.8.
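
As a rough sanity check on these figures, here is a minimal Python sketch that applies Little's law (messages in flight = arrival rate x time in system) to the rounded numbers quoted above. The function name is made up for illustration, and the result is only an order-of-magnitude estimate, since the inputs are averages over different periods.

```python
# Back-of-the-envelope check using Little's law: L = lambda * W.
# Inputs are the rounded throughput and scan-time figures from the post.

def concurrent_scans(mails_per_hour: float, avg_scan_seconds: float) -> float:
    """Average number of messages being scanned at any instant."""
    arrival_rate = mails_per_hour / 3600.0  # messages per second
    return arrival_rate * avg_scan_seconds

redhat = concurrent_scans(60_000, 5.6)
sarge = concurrent_scans(40_000, 4.8)
drop = 1 - 40_000 / 60_000  # relative throughput loss

print(f"Redhat: {redhat:.1f} scans in flight")   # 93.3
print(f"Sarge:  {sarge:.1f} scans in flight")    # 53.3
print(f"Throughput drop: {drop:.0%}")            # 33%
```

The computed drop of roughly a third is in line with the "about 30%" mentioned at the top of the post.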

Interestingly, if I time the internal handling of the email, Sarge
seems to win (the numbers below are calculated from 4 days of mail
flow, about 3.9 million emails):

1) Spam scanning is about 18% faster than on the old Redhat system.
2) Perl handling of the email is about 12% faster.
3) ClamAV is scanning 8% faster.

On the downside, Sarge gets beaten in the following categories:

1) Unpacking email and attachments with Ripmime and the unpackers
(unzip, unrar...). This procedure used to average 0.075 seconds on my
old Redhat system; now the average is about 2 seconds. (Note that this
can drop if I renice the parent process responsible for calling the
unpackers, but then other things start to take up time, usually
spamassassin.)

2) The number of connections that time out on the SMTP service is 30%
higher than on the Redhat system.

These numbers lead me to think that the system cannot handle as many
emails as before, either because it simply does not handle enough
connections (i.e. the connections time out on the SMTP port before
even getting to the scanners) or because filesystem performance has
dropped. To pursue this idea I have tried the following:

1) Changing the file system to XFS and to ext3.
2) Running ReiserFS with notail, nodiratime and noatime.
3) Renicing qmail-smtpd so it gets higher priority than spamd (hoping
that this would lead to more connections getting handled).
4) Changing the I/O scheduler to deadline (elevator=deadline).
5) Changing the kernel to 2.4.27-i686-SMP.
6) Turning the firewall (iptables) completely off.
7) Tuning TCP performance in accordance with the Linux TCP Tuning
Guide (http://www-didc.lbl.gov/TCP-tuning/linux.html).
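
When trying changes like item 4, it helps to confirm which I/O scheduler is actually active. The following is a minimal Python sketch, assuming the kernel exposes /sys/block/<device>/queue/scheduler with the active scheduler shown in square brackets; that file may not be present on every kernel mentioned in this thread. The function names and the "sda" default are illustrative only.

```python
from pathlib import Path

def active_scheduler(text: str) -> str:
    """Return the scheduler marked with [brackets] in a sysfs scheduler line."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active scheduler marked in {text!r}")

def read_scheduler(device: str = "sda", sysfs_root: str = "/sys") -> str:
    """Read and parse the scheduler file for one block device (assumed path)."""
    path = Path(sysfs_root) / "block" / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())

# Example with a sample line rather than a live system:
print(active_scheduler("noop anticipatory [deadline] cfq"))  # deadline
```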

None of it has worked. And yes, I do get 60,000 incoming connections
per hour; most of them just seem to time out and get handled by the
next MX in my DNS.

Note that the DNS server I use is the same as the one used by the old
Redhat system, and name resolution performs equally on both systems.

To see if the server could take the load on its own I have tried
changing my MX to only contain this one server. This made the load
jump to 98.9, and the server eventually died with around 55 defunct
perl processes floating around. My old Redhat server could handle
being the only mail server just fine (with a load average around
28.5).

So as it is now I am a bit baffled by the slowness of Sarge, because
all the other systems I have converted to Sarge and kernel 2.6
(database servers, web servers, name servers...) have run
significantly faster.

So my question is this: does anyone know of any limitation in Sarge
(such as a default limit on incoming connections, not that I have ever
heard of such a thing) that would cause my system to degrade in the
way that it has? When I telnet to port 25 I simply do not get a
connection fast enough (most attempts time out), which leads me to
suspect that something is wrong.
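
The manual telnet test can be made repeatable with a small script. This is a minimal Python sketch, not a benchmark tool: it times how long one TCP connect and the first read of the SMTP greeting take. The function names and the default timeout are assumptions chosen for illustration.

```python
import socket
import time

def time_smtp_banner(host: str, port: int = 25, timeout: float = 30.0):
    """Return (connect_seconds, banner_seconds, banner) for one connection.

    banner_seconds is measured from the start of the connect attempt
    until the first chunk of the greeting has been read.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        connected = time.monotonic()
        banner = sock.recv(512).decode("ascii", errors="replace").strip()
        done = time.monotonic()
    return connected - start, done - start, banner

def sample(host: str, attempts: int = 10, port: int = 25):
    """Try several connections; failures (including timeouts) yield None."""
    results = []
    for _ in range(attempts):
        try:
            results.append(time_smtp_banner(host, port))
        except OSError:
            results.append(None)
    return results
```

Calling sample() against the gateway from an outside host on both the old and the new system would give comparable connect times and a count of timed-out attempts.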

Another explanation could of course be that the drivers in kernel
2.6.8.1 are buggier than the old ones in Redhat kernel 2.4.20-24.7smp.
I have still not investigated this fully.



Regards.

Lars Roland

Re: Sarge SMTP Performance

Johann Botha
Hi Lars                                       >@2005.05.22_00:00:03_GMT+0200

> I just moved my company's anti-virus/anti-spam mail gateway from
> Redhat 7.3 with kernel 2.4.20-24.7smp to Debian Sarge with kernel
> 2.6.8.1-i686-SMP. I had hoped that this transition would lead to
> better performance (new version of Perl, better drivers in the kernel
> and so on), but performance has instead dropped by about 30%.
>
> Here is my setup.
>
> Hardware:
> IBM 335 with dual 2.4 GHz Xeon, 1 GB RAM and one 10,000 RPM SCSI disk.

We had similar problems a while ago. Some things to try:

Try the latest 2.6 kernel.
Try compiling with High Memory Support on and off.
Try upgrading the BIOS.

In our case, for some weird reason, the server was slow when we used
High Memory; a BIOS upgrade solved our problem. (Our server is not an
IBM and not SMP.)

hth.

--
Regards
 Joe

  "I suggest a new strategy, R2: let the Wookie win."
     -- C3P0
_______________________________________________________
   frogfoot networks  +27.21.689.3867  www.frogfoot.com


Re: Sarge SMTP Performance

Adrian Minta-3
In reply to this post by Lars Roland
Make sure you have disabled IPv6. I guess you have to recompile the
kernel :(


--
To UNSUBSCRIBE, email to [hidden email]
with a subject of "unsubscribe". Trouble? Contact [hidden email]

Re: Sarge SMTP Performance

tv (Bugzilla)-2
In reply to this post by Lars Roland
Lars Roland wrote:

> Hi all
>
> I just moved my company's anti-virus/anti-spam mail gateway from
> Redhat 7.3 with kernel 2.4.20-24.7smp to Debian Sarge with kernel
> 2.6.8.1-i686-SMP. I had hoped that this transition would lead to
> better performance (new version of Perl, better drivers in the kernel
> and so on), but performance has instead dropped by about 30%.
>
> Here is my setup.
>
> Hardware:
> IBM 335 with dual 2.4 GHz Xeon, 1 GB RAM and one 10,000 RPM SCSI disk.
>
>
> Software:
>
> Minimal Debian Sarge (that is, I have turned all unnecessary services off).
> Kernel 2.6.8.1-SMP.
> ReiserFS on all partitions (except /boot).
> Qmail MTA configured with 70 incoming connections.
> ClamAV running as a daemon.
> 10 Spamassassin daemons (spamd)
> Qmail-scanner.
> ...

Hi Lars,

have you installed the optimized libc6 package?

libc6-i686 - GNU C Library: Shared libraries [i686 optimized]

I don't know if it will give you any extra speed but maybe it is worth a
try.


Regards,

Timo


Re: Sarge SMTP Performance

martin f krafft
In reply to this post by Adrian Minta-3
also sprach AdrianMinta <[hidden email]> [2005.05.22.0039 +0200]:
> Make sure you have disabled IPv6. I guess you have to recompile the
> kernel :(

Uh, how will that help?

--
Please do not send copies of list mail to me; I read the list!
 
 .''`.     martin f. krafft <[hidden email]>
: :'  :    proud Debian developer, admin, user, and author
`. `'`
  `-  Debian - when you have better things to do than fixing a system
 
Invalid/expired PGP subkeys? Use subkeys.pgp.net as keyserver!
 
the reason the mainstream is thought of as a stream
is because it is so shallow.

Re: Sarge SMTP Performance

martin f krafft
In reply to this post by Lars Roland
also sprach Lars Roland <[hidden email]> [2005.05.22.0000 +0200]:
> On my old Redhat system the hardware could scan around 60,000
> emails per hour with an average scan time of 5.6 seconds
> (including time from both ClamAV and Spamassassin) and an average
> load of 23.7.

spamassassin 2.x?

> My new Sarge installation on the same hardware scans 40,000 emails
> per hour with an average scan time of 4.8 seconds, but with a load
> average of 57.8.

spamassassin 3.x?

AFAICT, spamassassin 3.x is (a) a lot better, and (b) a *lot* more
resource hungry. I think this would explain your problem. The other
stuff -- SMTP timeouts and slow disk access for unpacks -- are
probably direct consequences, though I may well be wrong.

Did you enable DMA? Check your drives with hdparm. Or are you
using SCSI?

Maybe you can run bonnie++ on both systems and verify that
hard disk access is not the bottleneck?

> So as it is now I am a bit baffled by the slowness of Sarge,

This is not sarge, this is a configuration problem somewhere. Even
though "sarge" did not get faster per se, the 2.6 kernel *does*
speed things up a lot.

Another thing I seem to remember from my qmail times is that qmail
and reiserfs did not get along well. You have tried other
filesystems, but all of them were journaled, and I think qmail
doesn't play well with those. Have you tried another MTA, like
postfix? I have administered postfix servers on about the same
hardware as you, taking as much as 100,000 mails per hour at peak
times. Ralf Hildebrandt has a postfix+ext3 howto, which may be
useful even for other MTAs.

--
Please do not send copies of list mail to me; I read the list!
 
 .''`.     martin f. krafft <[hidden email]>
: :'  :    proud Debian developer, admin, user, and author
`. `'`
  `-  Debian - when you have better things to do than fixing a system
 
Invalid/expired PGP subkeys? Use subkeys.pgp.net as keyserver!
 
"the good thing about standards is
 that there are so many to choose from."
                                                -- andrew s. tanenbaum

Re: Sarge SMTP Performance

Maciej Matysiak
In reply to this post by Lars Roland
On the 22nd of May 2005 at 00:00, Lars Roland <lroland#gmail.com> wrote:

> Qmail MTA configured with 70 incoming connections.
> ClamAV running as a daemon.
> 10 Spamassassin daemons (spamd)
> Qmail-scanner.

Please replace qmail-scanner with simscan from http://inter7.com/simscan .
Qmail-scanner is written in perl and can kill a busy server. When I switched
to simscan, the load average on my machine dropped from ~20 to under 0.5.
Simscan also has useful features: rejecting viruses at the SMTP level and
rejecting spam with a score over <number>. Both are configurable at least
per domain. The installation is very easy. The most common problem: make
sure that clamav runs as the simscan user (write permissions) :)

 m.m.
--
 use gnus, not guns!


Re: Sarge SMTP Performance

Lars Roland
In reply to this post by Lars Roland
On 5/22/05, Lars Roland <[hidden email]> wrote:
> Hi all
>
> I just moved my company's anti-virus/anti-spam mail gateway from
> Redhat 7.3 with kernel 2.4.20-24.7smp to Debian Sarge with kernel
> 2.6.8.1-i686-SMP. I had hoped that this transition would lead to
> better performance (new version of Perl, better drivers in the kernel
> and so on), but performance has instead dropped by about 30%.

Well, it seems that at least part of the problem was related to the
new kernel. On my old Redhat system my Broadcom tg3 Ethernet card was
running at 100 Mbit with a tx queue length of 100. On kernel 2.6.8.1
it was running at 1000 Mbit with a tx queue length of 1000.

Setting the length to 100 again using:

ifconfig eth0 txqueuelen 100

made the load drop immediately, and hard disk performance also
improved, e.g.:

--------------------------------
txqueuelen: 1000

hdparm -t /dev/sda

/dev/sda:
 Timing buffered disk reads:  12 MB in  3.40 seconds =  3.53 MB/sec
--------------------------------

and

--------------------------------
txqueuelen: 100

hdparm -t /dev/sda

/dev/sda:
  Timing buffered disk reads:  152 MB in  3.40 seconds =  44.74 MB/sec
--------------------------------
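
To put a number on the difference between the two hdparm runs, here is a minimal Python sketch that parses the "Timing buffered disk reads" lines shown above and computes the speedup. The regular expression is written against exactly this output format and is an assumption about what other hdparm versions print.

```python
import re

# Matches e.g. "Timing buffered disk reads:  12 MB in  3.40 seconds =  3.53 MB/sec"
HDPARM_LINE = re.compile(
    r"Timing buffered disk reads:\s+(\d+) MB in\s+([\d.]+) seconds\s+=\s+([\d.]+) MB/sec"
)

def parse_hdparm(line: str):
    """Return (megabytes, seconds, mb_per_sec) from one hdparm -t result line."""
    match = HDPARM_LINE.search(line)
    if match is None:
        raise ValueError(f"not an hdparm -t result line: {line!r}")
    return int(match.group(1)), float(match.group(2)), float(match.group(3))

slow = " Timing buffered disk reads:  12 MB in  3.40 seconds =  3.53 MB/sec"
fast = "  Timing buffered disk reads:  152 MB in  3.40 seconds =  44.74 MB/sec"

slow_rate = parse_hdparm(slow)[2]
fast_rate = parse_hdparm(fast)[2]
ratio = fast_rate / slow_rate

print(f"txqueuelen 100 is {ratio:.1f}x faster on this test")  # 12.7x
```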

So now the load looks OK. The old Redhat system is still ahead, but
now only by about 10%. This may be due to further bugs in the tg3
driver that hopefully a new kernel will fix. If not, I must file a
bug report and send it to the driver developers; it cannot be
intended that their driver degrades system performance that much
because of its default queue length.
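
Since the default queue length depends on the kernel and driver, it may be worth checking it after every boot or upgrade. Below is a minimal Python sketch that reads the current value, assuming the kernel exposes it at /sys/class/net/<interface>/tx_queue_len. The function names, the "eth0" default and the expected value of 100 are illustrative and taken from this thread rather than from any recommendation.

```python
from pathlib import Path

def tx_queue_len(interface: str = "eth0", sysfs_root: str = "/sys") -> int:
    """Read the transmit queue length of a network interface from sysfs."""
    path = Path(sysfs_root) / "class" / "net" / interface / "tx_queue_len"
    return int(path.read_text().strip())

def check_tx_queue_len(interface: str = "eth0", expected: int = 100,
                       sysfs_root: str = "/sys") -> bool:
    """Return True if the interface uses the expected queue length."""
    actual = tx_queue_len(interface, sysfs_root)
    if actual != expected:
        print(f"{interface}: txqueuelen is {actual}, expected {expected}")
    return actual == expected
```

A False result from check_tx_queue_len() would be the cue to rerun the ifconfig command shown earlier in this message.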


Regards.

Lars Roland

Re: Sarge SMTP Performance / hints about tg3/bcm5700

Andreas John
Hi!

> above but now it is only with 10%. This may be due to further bugs in
> the tg3 driver that hopefully a new kernel will fix - if not then I
> must file a bug report and send it to the driver developers, it can

I would try the bcm5700 driver from Broadcom's website, if installing
an sk98lin or e1000 card is not an option. They have a GPL'd version,
but please do not open a discussion about firmware files which are
only available in binary form. This "BLOB" firmware has been removed
by Debian's kernel maintainers since 2.6.5 .. 7 or so. (Note: the
"vanilla" kernel from kernel.org includes the BLOB!)

There is also a "bcm5700-source" .deb package in non-free (at least in
sid), but this is version 7.3.5, while Broadcom's upstream source has
been at 8.x for some weeks. Please also note that you can patch
bcm5700 into the kernel while keeping tg3, so you can play with both.
I cannot tell you anything about the non-GPL driver Broadcom offers.

All this took me some time to figure out, so I take this opportunity
to post it, so the search engines can index it... ;)

rgds,
Andreas


Re: Sarge SMTP Performance / hints about tg3/bcm5700

Torsten Werner
Andreas John wrote:

>There is also a "bcm5700-source" .deb package in non-free (at least in
>sid) but this is Version 7.3.5 while broadcom's upstream source is 8.x
>since some weeks.
>

I have just uploaded version 8.1.55-1.

>Please also note that you can patch bcm5700 into the
>kernel while keeping tg3, so you can play with both.
>

That is not possible with the newer version unless someone sends me a
patch.


Regards,
Torsten


Re: Sarge SMTP Performance / hints about tg3/bcm5700

Lars Roland
On 5/22/05, Torsten Werner <[hidden email]> wrote:
> Andreas John wrote:
>
> >There is also a "bcm5700-source" .deb package in non-free (at least in
> >sid) but this is Version 7.3.5 while broadcom's upstream source is 8.x
> >since some weeks.
> >
>
> I have just uploaded version 8.1.55-1.

I will try that one out. It seems very likely that most of my trouble
will go away with another driver.


Regards

Lars Roland

RE: Sarge SMTP Performance

Ali Onur Uyar (Gestion IT)
In reply to this post by Lars Roland
> -----Original Message-----
> From: martin f krafft [mailto:[hidden email]]
> Sent: Sunday, 22 May 2005 04:48 a.m.
> To: [hidden email]
> Subject: Re: Sarge SMTP Performance
>
>
> also sprach AdrianMinta <[hidden email]>
> [2005.05.22.0039 +0200]:
> > Make sure you have disabled IPv6. I guess you have to recompile the
> > kernel :(
>
> Uh, how will that help?
>

I don't know if this has anything to do with the problem you are
experiencing at the moment, but I have observed that on systems with
IPv6 support, IPv6-enabled applications might fire off an additional
AAAA DNS query for each A query that times out.
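
For illustration, here is a minimal Python sketch of how an application chooses which address families to ask the resolver for. With AF_UNSPEC the system resolver is free to look up both IPv4 and IPv6 addresses, while AF_INET restricts the lookup to IPv4. Whether an extra AAAA query is actually sent depends on the resolver library and its configuration, so treat this as a sketch of the mechanism rather than a measurement; the function name is made up.

```python
import socket

def address_families(host: str, family: int = socket.AF_UNSPEC, port: int = 25):
    """Return the sorted names of the address families getaddrinfo yields."""
    infos = socket.getaddrinfo(host, port, family=family,
                               type=socket.SOCK_STREAM)
    # Each result tuple starts with the address family of that entry.
    return sorted({socket.AddressFamily(info[0]).name for info in infos})

# A numeric address needs no DNS lookup, so this runs without a network:
print(address_families("127.0.0.1", socket.AF_INET))  # ['AF_INET']
```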

Ali Onur Uyar
Infrastructure Administrator
GestionIT


Re: Sarge SMTP Performance

martin f krafft
also sprach Ali Onur Uyar (Gestion IT) <[hidden email]> [2005.05.23.1507 +0200]:
> I don't know if this has anything to do with the problem you are
> experiencing at the moment, but I have observed that on systems
> with IPv6 support, IPv6-enabled applications might fire off an
> additional AAAA DNS query for each A query that times out.

First, do not CC me on replies, that's against list policy.

And second: that extra DNS query is hardly going to make
a difference with a well-written application. And I do hope you are
using a (sensible) DNS cache on a mail server...

--
Please do not send copies of list mail to me; I read the list!
 
 .''`.     martin f. krafft <[hidden email]>
: :'  :    proud Debian developer, admin, user, and author
`. `'`
  `-  Debian - when you have better things to do than fixing a system
 
Invalid/expired PGP subkeys? Use subkeys.pgp.net as keyserver!
 
on the other hand, you have different fingers.
