Date:	Thu, 15 Apr 2010 01:48:57 -0700 (PDT)
From:	David Miller <davem@...emloft.net>
To:	hadi@...erus.ca
Cc:	eric.dumazet@...il.com, therbert@...gle.com,
	netdev@...r.kernel.org, robert@...julf.net, xiaosuo@...il.com,
	andi@...stfloor.org
Subject: Re: rps performance WAS (Re: rps: question)

From: jamal <hadi@...erus.ca>
Date: Wed, 14 Apr 2010 14:53:42 -0400

> On Wed, 2010-04-14 at 20:04 +0200, Eric Dumazet wrote:
> 
>> Yes, multiqueue is far better of course, but in the case of hardware
>> lacking multiqueue, RPS can help many workloads where the application
>> has _some_ work to do, not just counting frames or so...
> 
> Agreed. So to enumerate, the benefits come in if:
> a) you have many processors
> b) you have single-queue nic
> c) at sub-threshold traffic you don't care about a little latency
> d) you have a specific cache hierarchy
> e) app is working hard to process incoming messages

A single-queue NIC is actually not a requirement; RPS also helps in
cases where you have 'N' application threads and N is less than the
number of CPUs your multi-queue NIC is distributing traffic to.

Moving the bulk of the input packet processing to the CPUs where
the applications actually sit has a non-trivial benefit.  RFS takes
this aspect to yet another level.

> I think the main challenge for my pedantic mind is missing details. Is
> there a paper on rps? Example for #d above, the commit log mentions that
> rps benefits if you have certain types of "cache hierarchy". Probably
> some arch with a large shared L2/L3 (maybe inclusive) cache will benefit.
> Example: it does well on Nehalem and probably Opterons, as long as you
> don't start stacking these things on some interconnect like QPI or HT.
> But what happens when you have FSB sharing across cores (still a very
> common setup)? etc etc

I think for the case where application locality is important,
RPS/RFS can help regardless of cache details.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
