lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1191882618.4373.99.camel@localhost>
Date:	Mon, 08 Oct 2007 18:30:18 -0400
From:	jamal <hadi@...erus.ca>
To:	David Miller <davem@...emloft.net>
Cc:	jeff@...zik.org, peter.p.waskiewicz.jr@...el.com,
	krkumar2@...ibm.com, johnpol@....mipt.ru,
	herbert@...dor.apana.org.au, kaber@...sh.net,
	shemminger@...ux-foundation.org, jagana@...ibm.com,
	Robert.Olsson@...a.slu.se, rick.jones2@...com, xma@...ibm.com,
	gaagaan@...il.com, netdev@...r.kernel.org, rdreier@...co.com,
	mingo@...e.hu, mchan@...adcom.com, general@...ts.openfabrics.org,
	kumarkr@...ux.ibm.com, tgraf@...g.ch, randy.dunlap@...cle.com,
	sri@...ibm.com, linux-kernel@...r.kernel.org
Subject: Re: parallel networking

On Mon, 2007-08-10 at 14:11 -0700, David Miller wrote:

> The problem is that the packet schedulers want global guarantees
> on packet ordering, not flow centric ones.
> 
> That is the issue Jamal is concerned about.

indeed, thank you for giving it better wording. 

> The more I think about it, the more inevitable it seems that we really
> might need multiple qdiscs, one for each TX queue, to pull this full
> parallelization off.
> 
> But the semantics of that don't smell so nice either.  If the user
> attaches a new qdisc to "ethN", does it go to all the TX queues, or
> what?
> 
> All of the traffic shaping technology deals with the device as a unary
> object.  It doesn't fit to multi-queue at all.

If you let only one CPU at a time access the "xmit path" you solve all
the reordering. If you want to be more fine grained you make the
serialization point as low as possible in the stack - perhaps in the
driver.
But I think even what we have today with only one cpu entering the
dequeue/scheduler region, _for starters_, is not bad actually ;->  What
i am finding (and i can tell you i have been trying hard;->) is that a
sufficiently fast cpu doesnt sit in the dequeue area for "too long" (and
batching reduces the time spent further). Very quickly there are no more
packets for it to dequeue from the qdisc or the driver is stoped and it
has to get out of there. If you dont have any interupt tied to a
specific cpu then you can have many cpus enter and leave that region all
the time. 

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ