Message-ID: <9a8748490707081428p11a9b728m8cec7cc2a122d907@mail.gmail.com>
Date:	Sun, 8 Jul 2007 23:28:11 +0200
From:	"Jesper Juhl" <jesper.juhl@...il.com>
To:	knobi@...bisoft.de
Cc:	linux-kernel@...r.kernel.org
Subject: Re: Understanding I/O behaviour

On 05/07/07, Jesper Juhl <jesper.juhl@...il.com> wrote:
> On 05/07/07, Martin Knoblauch <spamtrap@...bisoft.de> wrote:
> > Hi,
> >
> >  for a customer we are operating a rackful of HP/DL380/G4 boxes that
> > have given us some problems with system responsiveness under [I/O
> > triggered] system load.
> >
> >  The systems in question have the following HW:
> >
> > 2x Intel/EM64T CPUs
> > 8GB memory
> > CCISS Raid controller with 4x72GB SCSI disks as RAID5
> > 2x BCM5704 NIC (using tg3)
> >
> >  The distribution is RHEL4. We have tested several kernels including
> > the original 2.6.9, 2.6.19.2, 2.6.22-rc7 and 2.6.22-rc7+cfs-v18.
> >
> >  One part of the workload is when several processes try to write 5 GB
> > each to the local filesystem (ext2->LVM->CCISS). When this happens, the
> > load goes up to 12 and responsiveness goes down. This means from one
> > moment to the next things like opening a ssh connection to the host in
> > question, or doing "df" take forever (minutes). Especially bad with the
> > vendor kernel, better (but not perfect) with 2.6.19 and 2.6.22-rc7.
> >
> >  The load basically comes from the writing processes and up to 12
> > "pdflush" threads all being in "D" state.
> >
> >  So, what I would like to understand is how we can maximize the
> > responsiveness of the system, while keeping disk throughput at maximum.
> >
>
> I'd suspect you can't get both at 100%.
>
> I'd guess you are probably using a 100Hz no-preempt kernel.  Have you
> tried a 1000Hz + preempt kernel?   Sure, you'll get a bit lower
> overall throughput, but interactive responsiveness should be better -
> if it is, then you could experiment with various combinations of
> CONFIG_PREEMPT, CONFIG_PREEMPT_VOLUNTARY, CONFIG_PREEMPT_NONE and
> CONFIG_HZ_1000, CONFIG_HZ_300, CONFIG_HZ_250, CONFIG_HZ_100 to see
> what gives you the best balance between throughput and interactive
> responsiveness (you could also throw in CONFIG_PREEMPT_BKL and/or
> CONFIG_NO_HZ, but I don't think the impact will be as significant as
> with the other options, so to keep things simple I'd leave those out
> at first).
>
> I'd guess that something like CONFIG_PREEMPT_VOLUNTARY + CONFIG_HZ_300
> would probably be a good compromise for you, but just to see if
> there's any effect at all, start out with CONFIG_PREEMPT +
> CONFIG_HZ_1000.
>

I'm curious, did you ever try playing around with CONFIG_PREEMPT* and
CONFIG_HZ* to see if that had any noticeable impact on interactive
performance, e.g. when logging into the box via ssh?
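
When comparing runs, it helps to record which preemption model and tick
rate each test kernel was actually built with. A minimal sketch in
Python, assuming the kernel config is available as a text file (the
/boot/config-<release> path is a common distribution convention, not
something guaranteed here) and that the kernel exposes CONFIG_HZ and
the CONFIG_PREEMPT* choices at all, which older kernels may not. The
function names are made up for illustration:

```python
import os
import platform
import re

# Checked in this order; a config sets exactly one of them to "y".
PREEMPT_MODELS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT",
)


def parse_kernel_config(text):
    """Return {option: value} for every 'CONFIG_X=value' line.

    Comment lines such as '# CONFIG_X is not set' are skipped.
    """
    options = {}
    for line in text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match:
            options[match.group(1)] = match.group(2)
    return options


def summarize(options):
    """Return (preemption option name or None, HZ as int or None)."""
    preempt = next(
        (name for name in PREEMPT_MODELS if options.get(name) == "y"),
        None,
    )
    hz = options.get("CONFIG_HZ")
    return preempt, int(hz) if hz is not None else None


# Hand-written sample fragment, not taken from any real machine.
SAMPLE = """\
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_300=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=300
"""

if __name__ == "__main__":
    # Assumed location; adjust if the distribution stores it elsewhere.
    path = "/boot/config-" + platform.release()
    if os.path.exists(path):
        with open(path) as handle:
            print(path, summarize(parse_kernel_config(handle.read())))
    else:
        print("sample", summarize(parse_kernel_config(SAMPLE)))
```

Logging that pair next to each ssh/"df" timing makes it easier to tell
which combination a given result belongs to.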

-- 
Jesper Juhl <jesper.juhl@...il.com>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html
