Date:	Tue, 7 Jul 2009 23:05:48 +0100
From:	Daniel J Blueman <daniel.blueman@...il.com>
To:	Chetan.Loke@...lex.com, matthew@....cx, andi@...stfloor.org,
	jens.axboe@...cle.com, Arjan van de Ven <arjan@...radead.org>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: >10% performance degradation since 2.6.18

On Mon, Jul 6, 2009 at 10:58 PM, <Chetan.Loke@...lex.com> wrote:
>> -----Original Message-----
>> From: linux-kernel-owner@...r.kernel.org
>> [mailto:linux-kernel-owner@...r.kernel.org] On Behalf Of
>> Daniel J Blueman
>> Sent: Sunday, July 05, 2009 7:01 AM
>> To: Matthew Wilcox; Andi Kleen
>> Cc: Linux Kernel; Jens Axboe; Arjan van de Ven
>> Subject: Re: >10% performance degradation since 2.6.18
>>
>> On Jul 3, 9:10 pm, Arjan van de Ven <ar...@...radead.org> wrote:
>> > On Fri, 3 Jul 2009 21:54:58 +0200
>> >
>> > Andi Kleen <a...@...stfloor.org> wrote:
>> > > > That would seem to be a fruitful avenue of investigation --
>> > > > whether limiting the cards to a single RX/TX interrupt would be
>> > > > advantageous, or whether spreading the eight interrupts
>> out over
>> > > > the CPUs would be advantageous.
>> >
>> > > The kernel should really do the per cpu binding of MSIs
>> by default.
>> >
>> > ... so that you can't do power management on a per socket basis?
>> > hardly a good idea.
>> >
>> > just need to use a new enough irqbalance and it will spread out the
>> > interrupts unless your load is low enough to go into low power mode.
>>
>> I was finding newer kernels (>~2.6.24) would set the
>> Redirection Hint bit in the MSI address vector, allowing the
>> processors to deliver the interrupt to the lowest-interrupt-priority
>> (e.g. idle, not in a power-save state) core
>> (http://www.intel.com/Assets/PDF/manual/253668.pdf pp10-66),
>> whereas older irqbalance daemons would periodically and naively
>> rewrite the bitmask of cores, delivering the interrupt to a
>> static one.
>>
>> Thus, it may be worth checking if disabling any older
>> irqbalance daemon gives any win.
>>
>> Perhaps there is value in writing different subsets of cores
>> to the MSI address vector core bitmask (with the redirection
>> hint enabled) for different I/O queues on heavy interrupt
>> sources? By default, it's all cores.
>>
>
> Possible enhancement -
>
> 1) Drain the responses in the xmit_frame() path. That is, post the
>    TX-request and, just before returning, see if there are any more
>    responses in the RX-queue. This will minimize interrupt load (only
>    if the NIC f/w coalesces). The n/w core should drain the responses
>    rather than calling the drain-routine from the adapter's
>    xmit_frame() handler; this way there won't be any need to modify
>    individual xmit_frame handlers.
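As background on the redirection-hint behaviour quoted above, the MSI message address format is documented in the Intel SDM referenced in the quote. A minimal sketch of how the address word is composed (field layout per the SDM; the macro and function names here are mine, not the kernel's):

```c
#include <stdint.h>

/* Illustrative only: x86 MSI message address layout per the Intel SDM
 * ("Message Signalled Interrupts").  Real kernels build this word in
 * the arch/x86 MSI code; names below are invented for this sketch. */
#define MSI_ADDR_BASE        0xFEE00000u
#define MSI_ADDR_DEST(id)    (((uint32_t)(id) & 0xFFu) << 12)
#define MSI_ADDR_RH          (1u << 3)   /* Redirection Hint */
#define MSI_ADDR_DM_LOGICAL  (1u << 2)   /* logical destination mode */

uint32_t msi_addr(uint8_t dest_bitmask, int lowest_prio)
{
    uint32_t addr = MSI_ADDR_BASE | MSI_ADDR_DEST(dest_bitmask);

    /* RH=1 with logical destination mode lets the hardware deliver
     * to the lowest-interrupt-priority core in the bitmask, rather
     * than to one statically chosen core. */
    if (lowest_prio)
        addr |= MSI_ADDR_RH | MSI_ADDR_DM_LOGICAL;
    return addr;
}
```

With RH set and an all-cores logical destination (0xFF), msi_addr(0xFF, 1) yields 0xFEEFF00C: any core may take the interrupt, and the hardware picks the lowest-priority one, which is the behaviour an old irqbalance defeats by rewriting the destination bitmask to a single core.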

The problem with additional checking on such a hot path is that each
(synchronous) read over the PCIe bus takes ~1us, which is on the same
order as the cost of executing 1000 instructions (and the ratio grows
with faster processors and deeper serial buses). It may be
sufficiently low-cost if the NIC's RX queue status/structure lived in
main memory (rather than in registers read over PCIe).
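A hypothetical sketch of that cheaper variant: if the NIC DMA-writes a "done" flag into each RX descriptor in host memory, the xmit path can peek at the ring with an ordinary cached read and never touch a device register on the empty path. All names here (rx_desc, RX_DONE, the ring layout) are invented for illustration, not any real driver's:

```c
#include <stddef.h>
#include <stdint.h>

#define RX_RING_SIZE 256
#define RX_DONE      0x1u

struct rx_desc {
    volatile uint32_t status;   /* written by the NIC via DMA */
    uint32_t len;
};

struct rx_ring {
    struct rx_desc desc[RX_RING_SIZE];
    unsigned head;              /* next descriptor to reap */
};

/* Drain any completed RX descriptors; returns how many were reaped.
 * Only host-memory reads occur, so the common empty case costs a
 * cache hit, not an ~1us PCIe round trip. */
unsigned rx_drain(struct rx_ring *ring, void (*deliver)(struct rx_desc *))
{
    unsigned reaped = 0;

    while (ring->desc[ring->head].status & RX_DONE) {
        struct rx_desc *d = &ring->desc[ring->head];

        if (deliver)
            deliver(d);
        d->status = 0;          /* hand the descriptor back to the NIC */
        ring->head = (ring->head + 1) % RX_RING_SIZE;
        reaped++;
    }
    return reaped;
}
```

Calling something like rx_drain() at the tail of xmit_frame() would implement the quoted suggestion without a synchronous register read, at the cost of a host-memory status word per descriptor.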

Where throughput is favoured over latency, increasing the packet
coalescing watermarks may reduce the interrupt rate and recover some
of the performance loss.
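For drivers that support it, those watermarks are adjustable from userspace via ethtool; illustrative commands only (which parameters are honoured depends on the NIC driver, and the interface name is assumed):

```shell
# Inspect the current interrupt-coalescing settings:
ethtool -c eth0

# Wait up to 100us or 64 frames before raising an RX interrupt,
# trading added latency for fewer interrupts:
ethtool -C eth0 rx-usecs 100 rx-frames 64
```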

Daniel
-- 
Daniel J Blueman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
