netdev - Re: rps: question

Date:	Mon, 08 Feb 2010 10:09:08 -0500
From:	jamal <hadi@...erus.ca>
To:	Tom Herbert <therbert@...gle.com>
Cc:	Eric Dumazet <eric.dumazet@...il.com>, netdev@...r.kernel.org,
	robert@...julf.net, David Miller <davem@...emloft.net>
Subject: Re: rps: question

On Sun, 2010-02-07 at 21:58 -0800, Tom Herbert wrote:

> I don't have specific numbers, although we are using this on
> application doing forwarding and numbers seem in line with what we see
> for an end host.
> 

When I get the chance I will give it a run. I have access to an i7
somewhere. It seems like I need some specific NICs?

> No, the cost of the IPIs hasn't been an issue for us performance-wise.
>  We are using them extensively-- up to one per core per device
> interrupt.

Ok, so you are not going across cores then? I wonder if there's
some new optimization to reduce IPI latency when both sender and
receiver reside on the same core?

> We're calling __smp_call_function_single which is asynchronous in that
> the caller provides the call structure and there is not waiting for
> the IPI to complete.  A flag is used with each call structure that is
> set when the IPI is in progress, this prevents simultaneous use of a
> call structure.

It is possible that is just an abstraction hiding the details.
AFAIK, IPIs are synchronous: the remote CPU has to ack with another
IPI, while the issuing CPU waits for that ack and then returns.
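
For reference, the scheme Tom describes (a caller-provided call
structure with an in-progress flag, and no waiting for completion) can
be modeled in user space roughly as follows. This is a minimal Python
sketch, not kernel code: CallStruct, Cpu, post_async and handle_ipi are
hypothetical names, the queue merely stands in for the IPI, and
refusing a busy structure instead of waiting for it is an assumption
made to keep the example short.

```python
from collections import deque


class CallStruct:
    """Caller-owned call structure: a function, its argument, a busy flag."""

    def __init__(self, func, arg):
        self.func = func
        self.arg = arg
        self.in_progress = False  # set while a call using this struct is pending


class Cpu:
    """Toy target CPU; the pending queue stands in for the IPI delivery."""

    def __init__(self):
        self.pending = deque()

    def handle_ipi(self):
        # Runs "on the target": execute each queued call, then release its struct.
        while self.pending:
            cs = self.pending.popleft()
            cs.func(cs.arg)
            cs.in_progress = False


def post_async(target, cs):
    """Queue cs on target without waiting; refuse if cs is still in flight."""
    if cs.in_progress:
        return False
    cs.in_progress = True
    target.pending.append(cs)
    return True


log = []
cpu1 = Cpu()
cs = CallStruct(log.append, "rx work on cpu1")

first = post_async(cpu1, cs)   # accepted; the sender returns immediately
second = post_async(cpu1, cs)  # refused; the structure is still in progress
cpu1.handle_ipi()              # target runs the call and clears the flag
third = post_async(cpu1, cs)   # accepted again now that the flag is clear
```

The point of the flag is only that one call structure is never queued
twice at the same time; it says nothing about whether the underlying
hardware IPI is acknowledged synchronously.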

> I haven't seen any architectural specific issues with the IPIs, I
> believe they are completing in < 2 usecs on platforms we're running
> (some opteron systems that are over 3yrs old).

2 usecs ain't bad (at 10G you only accumulate a few packets while
stalled). I think we saw much higher values.
I was asking about different architectures because I tried something
equivalent as recently as 2 years back on a MIPS multicore and the
forwarding results were horrible.
IPIs flush the processor pipeline so they ain't cheap - but that may
vary depending on the architecture. Someone more knowledgeable should
be able to give better insights.
My suspicion is that with a low transaction rate (with appropriate
traffic patterns) you will see much higher latency, since you will
be sending more IPIs.
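
As a rough check on the 2 usec figure above, the number of packets that
arrive on a 10G link during a stall depends on frame size. The sketch
below is back-of-the-envelope arithmetic only; it assumes standard
Ethernet framing (20 bytes of preamble, start delimiter and inter-frame
gap per frame) and a fully loaded link, and the function name is
hypothetical.

```python
LINK_BPS = 10e9      # 10 Gbit/s line rate
STALL_S = 2e-6       # 2 usec stall, the IPI completion time quoted above
WIRE_OVERHEAD = 20   # bytes per frame: 7 preamble + 1 SFD + 12 inter-frame gap


def packets_during_stall(frame_bytes, stall_s=STALL_S, link_bps=LINK_BPS):
    """Frames of the given size (FCS included) arriving at line rate in stall_s."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD) * 8
    return stall_s * link_bps / bits_on_wire


for size in (64, 512, 1518):
    print(f"{size:5d}-byte frames: {packets_during_stall(size):6.2f} packets")
```

Under these assumptions a 2 usec stall covers just under 30 minimum-size
frames, or between one and two full-size frames.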

cheers,
jamal
