Date:	Thu, 10 May 2007 15:09:56 -0700
From:	Rick Jones <rick.jones2@...com>
To:	Eric Dumazet <dada1@...mosbay.com>
Cc:	David Stevens <dlstevens@...ibm.com>,
	Evgeniy Polyakov <johnpol@....mipt.ru>,
	Krishna Kumar2 <krkumar2@...ibm.com>, netdev@...r.kernel.org,
	netdev-owner@...r.kernel.org
Subject: Re: [RFC] New driver API to speed up small packets xmits

Eric Dumazet wrote:
> David Stevens wrote:
> 
>> The word "small" is coming up a lot in this discussion, and
>> I think packet size really has nothing to do with it. Multiple
>> streams generating packets of any size would benefit; the
>> key ingredient is a queue length greater than 1.
>>
>> I think the intent is to remove queue lock cycles by taking
>> the whole list (at least up to the count of free ring buffers)
>> when the queue is greater than one packet, thus effectively
>> removing the lock expense for n-1 packets.
>>
> 
> Yes, but on modern cpus, locked operations are basically free once the 
> CPU already has the cache line in exclusive access in its L1 cache.

But will it be here?  Many CPUs may be trying to add things to the qdisc, 
but only one CPU is pulling from it, right?  Even if the "pulling from it" 
happens in a loop, there can be scores or more other cores trying to add 
things to the queue, which would cause that cache line to migrate between them.
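For concreteness, here is a minimal sketch of the batching idea being 
discussed: drain up to the count of free ring slots under a single 
queue-lock acquisition, then hand the resulting list to the driver.  The 
function name, the 'budget' parameter, and the passed-in lock are made up 
for illustration; this is not an existing kernel interface:

#include <linux/skbuff.h>
#include <linux/spinlock.h>
#include <net/sch_generic.h>

/* Hypothetical sketch: dequeue up to 'budget' skbs from the qdisc
 * while holding the queue lock once, so the lock/unlock cost is
 * paid once for the whole batch rather than once per packet. */
static struct sk_buff *dequeue_batch(struct Qdisc *q, spinlock_t *lock,
				     int budget)
{
	struct sk_buff *head = NULL, **tail = &head;
	struct sk_buff *skb;

	spin_lock(lock);
	while (budget-- > 0 && (skb = q->dequeue(q)) != NULL) {
		skb->next = NULL;
		*tail = skb;		/* append to the local list */
		tail = &skb->next;
	}
	spin_unlock(lock);

	return head;	/* singly linked list, at most 'budget' entries */
}

With something like this, the n-1 packets after the first pay no lock 
cost at all, at the price of a new list-handoff contract with the driver.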

> I am not sure adding yet another driver API will help very much.
> It will for sure add some bugs and pain.

That could very well be.

> A less expensive (and less prone to bugs) optimization would be to 
> prefetch one cache line for the next qdisc skb, as a cache line miss is 
> far more expensive than a locked operation (if the lock is already in 
> the L1 cache, of course)

Might not one build on top of the other?
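If they do stack, the prefetch would slot naturally into the batched 
transmit loop: while the driver is pushing the current skb to the 
hardware, prefetch the next one so its cache line is (hopefully) resident 
by the next iteration.  A rough sketch, assuming the batch list from the 
earlier example; prefetch() is the real helper from <linux/prefetch.h>, 
the rest is illustrative:

#include <linux/prefetch.h>

	/* 'head' is the list returned by a batched dequeue */
	while ((skb = head) != NULL) {
		head = skb->next;
		if (head)
			prefetch(head);	/* warm the next skb's cache line */
		skb->next = NULL;
		dev->hard_start_xmit(skb, dev);	/* 2007-era driver entry */
	}

That way the lock is taken once per batch, and the cache miss on each 
following skb is overlapped with the transmit of the current one.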

rick jones


