Message-ID: <1275556440.2456.19.camel@edumazet-laptop>
Date:	Thu, 03 Jun 2010 11:14:00 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Mitchell Erblich <erblichs@...thlink.net>
Cc:	netdev@...r.kernel.org
Subject: Re: Proposed linux kernel changes : scaling  tcp/ip stack

On Thursday, 03 June 2010 at 01:16 -0700, Mitchell Erblich wrote:
> To whom it may concern,
> 
> First, my intent is to keep this discussion local to just a few tcp/ip
> developers to see if there is any consensus that the below is a logical
> approach. Please also pass this email along if there is an "owner(s)" of
> this stack, to identify if a case exists for the below possible changes.
> 
> I am not currently on the linux kernel mail group.
> 			
> I have experience with modifications of the Linux tcp/ip stack, and have
> merged the changes into the company's local tree and left the possible 
> global integration to others.
> 
> I have been approached by a number of companies about scaling the
> stack with the assumption of a number of cpu cores. At present, I find extra
> time on my hands and am considering looking into this area on my own.
> 
> The first assumption is that if extra cores are available, that a single
> received homogeneous flow of a large number of packets/segments per
> second (pps) can be split into non-equal flows. This split can in effect
> allow a larger recv'd pps rate at the same core load while splitting off
> other workloads, such as xmit'ing pure ACKs.
> 
> Simply, again assuming Amdahl's law (and not looking to equalize the load
> between cores), the idea is to create logical separations so that, in a
> many-core system, different cores could have new kernel threads that
> operate in parallel within the tcp/ip stack. The initial separation points
> would be at the ip/tcp layer boundary and wherever any recv'd sk/pkt would
> generate some form of output.
> 
> The ip/tcp layer would be split like the vintage AT&T STREAMS protocol,
> and some form of queuing & scheduling would be needed. In addition,
> the queuing/scheduling of other kernel threads would occur within ip & tcp
> to separate the I/O.
> 
> A possible validation test is to identify the max recv'd pps rate within the
> tcp/ip modules in normal-flow TCP established state, with in-order,
> non-fragmented segments of, say, 64 bytes, before and after each
> incremental change. Or the same rate with fewer core/cpu cycles.
> 
> I am willing to have a private git Linux.org tree that collects the
> proposed changes and, if there is willingness and a perceived want/need,
> then identify how to implement the merge.

Hi Mitchell

We work every day to improve the network stack, and the standard Linux tree
is pretty scalable; you don't need to set up a separate git tree for that.

Our beloved maintainer David S. Miller handles two trees, net-2.6 and
net-next-2.6 where we put all our changes.

http://git.kernel.org/?p=linux/kernel/git/davem/net-next-2.6.git
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6.git

I suggest you read the latest patches (say, about 10,000 of them) to
get an idea of what we have done over the last few years.

keywords : RCU, multiqueue, RPS, percpu data, lockless algos, cache line
placement...
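
To make the RPS keyword concrete, here is a minimal sketch of the idea behind
it, written in Python rather than kernel C. The CRC32 hash and the names
flow_hash and pick_cpu are illustrative assumptions, not the kernel's code.
It shows a hash over a flow's addresses and ports selecting one CPU from an
allowed list, so many flows spread across cores while the packets of any one
flow stay on the same core.

```python
import zlib
from collections import Counter


def flow_hash(src_ip, src_port, dst_ip, dst_port):
    """Stand-in 32-bit flow hash (CRC32 over the 4-tuple); an assumption,
    not the hash the kernel uses."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) & 0xFFFFFFFF


def pick_cpu(flow, cpus):
    """Map a flow to one CPU from the allowed list."""
    h = flow_hash(*flow)
    # Scale the 32-bit hash into [0, len(cpus)) without a modulo.
    return cpus[(h * len(cpus)) >> 32]


if __name__ == "__main__":
    cpus = [0, 1, 2, 3]  # hypothetical set of CPUs allowed to process packets
    flows = [(f"10.0.0.{i % 250 + 1}", 1024 + i, "192.168.0.1", 80)
             for i in range(1000)]
    load = Counter(pick_cpu(f, cpus) for f in flows)
    for cpu in cpus:
        print(f"cpu{cpu}: {load[cpu]} flows")
```

Because the CPU is chosen from the flow hash, a single flow always lands on
the same core in this sketch; spreading load this way relies on having many
flows, not on splitting one flow.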

It's nice to see another man joining the team!

Thanks


