netdev - Re: [PATCH v6] net: batch skb dequeueing from softnet input_pkt


Message-ID: <1272568347.2209.11.camel@edumazet-laptop>
Date:	Thu, 29 Apr 2010 21:12:27 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Andi Kleen <ak@...goyle.fritz.box>,
	Andi Kleen <andi@...stfloor.org>
Cc:	hadi@...erus.ca, Changli Gao <xiaosuo@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Tom Herbert <therbert@...gle.com>,
	Stephen Hemminger <shemminger@...tta.com>,
	netdev@...r.kernel.org, Andi Kleen <andi@...stfloor.org>,
	lenb@...nel.org, arjan@...radead.org
Subject: Re: [PATCH v6] net: batch skb dequeueing from softnet
 input_pkt_queue

On Thursday, 29 April 2010 at 20:23 +0200, Andi Kleen wrote:
> On Thu, Apr 29, 2010 at 07:56:12PM +0200, Eric Dumazet wrote:
> > On Thursday, 29 April 2010 at 19:42 +0200, Andi Kleen wrote:
> > > > Andi, what do you think of this one?
> > > > Don't we have a function to send an IPI to an individual CPU instead?
> > > 
> > > That's what this function already does. You only set a single CPU 
> > > in the target mask, right?
> > > 
> > > IPIs are unfortunately always a bit slow. Nehalem-EX systems have X2APIC
> > > which is a bit faster for this, but that's not available in the lower
> > > end Nehalems. But even then it's not exactly fast.
> > > 
> > > I don't think the IPI primitive can be optimized much. It's not a cheap 
> > > operation.
> > > 
> > > If it's a problem do it less often and batch IPIs.
> > > 
> > > It's essentially the same problem as interrupt mitigation or NAPI 
> > > are solving for NICs. I guess we just need a suitable mitigation mechanism.
> > > 
> > > Of course that would move more work to the sending CPU again, but 
> > > perhaps there's no alternative. I guess you could make it cheaper by
> > > minimizing access to packet data.
> > > 
> > > -Andi
> > 
> > Well, IPIs are already batched, and the rate is auto-adaptive.
> > 
> > After various changes, it seems things are going better; maybe there is
> > something related to cache line thrashing.
> > 
> > I 'solved' it by using idle=poll, but you might take a look at
> > clockevents_notify (acpi_idle_enter_bm) abuse of a shared and highly
> > contended spinlock...
> 
> acpi_idle_enter_bm should not be executed on a Nehalem, it's obsolete.
> If it does on your system something is wrong.
> 
> Ahh, that rings a bell. There's one issue: if the remote CPU is in a very
> deep idle state, it could take a long time to wake it up. Nehalem has deeper
> sleep states than earlier CPUs. When this happens the IPI sender will be slow
> too, I believe.
> 
> Are the target CPUs idle? 
> 

Yes, mostly, but with about 200,000 wakeups per second, I would say...

If a CPU in a deep state receives an IPI and processes a softirq, should it
go back to the deep state immediately, or should it wait for some
milliseconds?

> Perhaps we need to feed some information to cpuidle's governor to prevent this problem.
> 
> idle=poll is very drastic, better to limit to C1 
> 

How can I do this ?

Thanks !
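
For reference, one way such a limit might be applied, as a sketch under
assumptions that nobody in this thread has confirmed: on kernels that use
the ACPI processor idle driver, booting with processor.max_cstate=1 is a
common way to stay out of deeper C-states. At run time, the PM QoS
interface /dev/cpu_dma_latency accepts a 32-bit latency bound in
microseconds and honours it only while the file descriptor stays open.
The helper name hold_latency_limit and the value used below are
hypothetical; which bound actually maps to C1 depends on the exit
latencies the platform reports.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Ask PM QoS to keep CPU wakeup latency at or below max_latency_us.
 * The request is active only while the returned descriptor is open.
 * `path` would normally be "/dev/cpu_dma_latency"; it is a parameter
 * (an assumption of this sketch) so the helper can be tried against
 * an ordinary file.
 * Returns the open descriptor, or -1 on error.
 */
int hold_latency_limit(const char *path, int32_t max_latency_us)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* The interface expects the raw 32-bit value, not a text string. */
    if (write(fd, &max_latency_us, sizeof(max_latency_us)) !=
        (ssize_t)sizeof(max_latency_us)) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A test program would call hold_latency_limit("/dev/cpu_dma_latency", 20)
as root, keep the descriptor open for the duration of the benchmark, and
close() it afterwards to drop the request. The 20 microsecond bound is
only an illustrative guess.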

