netdev - Re: parallel networking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1191882618.4373.99.camel@localhost>
Date:	Mon, 08 Oct 2007 18:30:18 -0400
From:	jamal <hadi@...erus.ca>
To:	David Miller <davem@...emloft.net>
Cc:	jeff@...zik.org, peter.p.waskiewicz.jr@...el.com,
	krkumar2@...ibm.com, johnpol@....mipt.ru,
	herbert@...dor.apana.org.au, kaber@...sh.net,
	shemminger@...ux-foundation.org, jagana@...ibm.com,
	Robert.Olsson@...a.slu.se, rick.jones2@...com, xma@...ibm.com,
	gaagaan@...il.com, netdev@...r.kernel.org, rdreier@...co.com,
	mingo@...e.hu, mchan@...adcom.com, general@...ts.openfabrics.org,
	kumarkr@...ux.ibm.com, tgraf@...g.ch, randy.dunlap@...cle.com,
	sri@...ibm.com, linux-kernel@...r.kernel.org
Subject: Re: parallel networking

On Mon, 2007-08-10 at 14:11 -0700, David Miller wrote:

> The problem is that the packet schedulers want global guarantees
> on packet ordering, not flow centric ones.
> 
> That is the issue Jamal is concerned about.

indeed, thank you for giving it better wording. 

> The more I think about it, the more inevitable it seems that we really
> might need multiple qdiscs, one for each TX queue, to pull this full
> parallelization off.
> 
> But the semantics of that don't smell so nice either.  If the user
> attaches a new qdisc to "ethN", does it go to all the TX queues, or
> what?
> 
> All of the traffic shaping technology deals with the device as a unary
> object.  It doesn't fit to multi-queue at all.

If you let only one CPU at a time access the "xmit path" you solve all
the reordering. If you want to be more fine grained you make the
serialization point as low as possible in the stack - perhaps in the
driver.
But I think even what we have today with only one cpu entering the
dequeue/scheduler region, _for starters_, is not bad actually ;->  What
i am finding (and i can tell you i have been trying hard;->) is that a
sufficiently fast cpu doesnt sit in the dequeue area for "too long" (and
batching reduces the time spent further). Very quickly there are no more
packets for it to dequeue from the qdisc or the driver is stoped and it
has to get out of there. If you dont have any interupt tied to a
specific cpu then you can have many cpus enter and leave that region all
the time. 

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html