Message-ID: <4B79C522.4040405@caviumnetworks.com>
Date:	Mon, 15 Feb 2010 14:05:22 -0800
From:	David Daney <ddaney@...iumnetworks.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	ralf@...ux-mips.org, linux-mips@...ux-mips.org,
	netdev@...r.kernel.org, gregkh@...e.de
Subject: Re: [PATCH 4/4] Staging: Octeon:  Free transmit SKBs in a timely
 manner.

On 02/15/2010 01:11 PM, Eric Dumazet wrote:
> On Monday, 15 February 2010 at 12:41 -0800, David Daney wrote:
>> On 02/15/2010 12:27 PM, Eric Dumazet wrote:
>>> On Monday, 15 February 2010 at 12:13 -0800, David Daney wrote:
>>>> If we wait for the once-per-second cleanup to free transmit SKBs,
>>>> sockets with small transmit buffer sizes might spend most of their
>>>> time blocked waiting for the cleanup.
>>>>
>>>> Normally we do a cleanup for each transmitted packet.  We add a
>>>> watchdog type timer so that we also schedule a timeout for 150uS after
>>>> a packet is transmitted.  The watchdog is reset for each transmitted
>>>> packet, so for high packet rates, it never expires.  At these high
>>>> rates, the cleanups are done for each packet so the extra watchdog
>>>> initiated cleanups are not needed.
>>>
>>> s/needed/fired/
>>>
>>
>> or perhaps s/are not needed/are neither needed nor fired/
>>
>>> Hmm, but re-arming a timer for each transmitted packet must have a cost?
>>>
>>
>> The cost is fairly low (less than 10 processor clock cycles).  We didn't
>> add this for amusement; people actually do things like only send UDP
>> packets from userspace.  Since we can fill the transmit queue faster
>> than it is emptied, the socket transmit buffer is quickly consumed.  If
>> we don't free the SKBs in short order, the transmitting process gets to
>> take a long sleep (until our previous once-per-second cleanup task was
>> run).
>
> I understand this, but traditionally, NIC drivers don't use a timer, but a
> 'TX complete' interrupt, that usually fires a few us after packet
> submission on Gigabit speed.
>

Indeed.  Lacking this type of interrupt, the watchdog seemed the best 
short-term solution.
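
To illustrate the re-arm behaviour described in the patch text, here is a 
minimal userspace sketch.  It is not the driver code: simulate() and its 
arguments are hypothetical names, and the only figure taken from the patch 
description is the 150uS timeout.  It models a single one-shot watchdog 
that every transmit pushes out to "now + timeout", and reports when that 
watchdog would actually expire.

```python
WATCHDOG_US = 150  # timeout taken from the patch description


def simulate(tx_times_us, timeout_us=WATCHDOG_US):
    """Model a one-shot watchdog that is re-armed on every transmit.

    tx_times_us: transmit timestamps in microseconds (hypothetical input).
    Returns (per_packet_cleanups, watchdog_fire_times_us).
    Each transmit counts as one per-packet cleanup and moves the watchdog
    deadline to t + timeout_us.  The watchdog fires only if no later
    transmit re-arms it before the deadline is reached.
    """
    fires = []
    deadline = None
    for t in sorted(tx_times_us):
        if deadline is not None and t >= deadline:
            fires.append(deadline)      # the timer expired before this transmit
        deadline = t + timeout_us       # re-arm for this packet
    if deadline is not None:
        fires.append(deadline)          # the last packet is always followed by one expiry
    return len(tx_times_us), fires


busy = simulate(range(0, 1000, 10))     # one packet every 10 us
sparse = simulate([0, 1000, 2000])      # one packet every 1000 us
print(busy)    # watchdog expires once, after the final packet
print(sparse)  # watchdog expires after every packet
```

In this model a steady stream with gaps shorter than the timeout never 
lets the watchdog expire until the stream stops, while a sparse sender 
gets a cleanup roughly 150 us after each packet instead of waiting for 
the once-per-second task.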

I am investigating the possibility of feeding TX complete notifications 
back through the RX path where it is possible to generate interrupts. 
The drawback to this is that it costs a lot more CPU cycles and adds 
cache pressure.

> A fast program could try to send X small UDP packets in less than 150
> us, X being greater than the size of your TX ring.

My TX queue (it is not a ring) size can be made arbitrarily large 
(currently 1000).  64 bytes * 1000 packets * 10 bits/byte / 1e9 
bits/sec == 640us.  My watchdog will fire after less than 1/4 of the 
queue capacity is freed.
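
The drain estimate above can be checked with a few lines of arithmetic.  
This is only a sketch: drain_time_us() is a hypothetical helper, and it 
assumes the figures are meant as roughly 10 bits on the wire per byte and 
a 1 Gbit/s line rate, which is the combination that yields 640 us.

```python
def drain_time_us(packets, bytes_per_packet, line_rate_bps, wire_bits_per_byte=10):
    """Time in microseconds to transmit a full queue at line rate.

    wire_bits_per_byte=10 is an assumed round figure for per-byte cost on
    the wire, not a value taken from any hardware documentation.
    """
    bits = packets * bytes_per_packet * wire_bits_per_byte
    return bits * 1_000_000 / line_rate_bps


WATCHDOG_US = 150  # timeout from the patch description

queue_1g = drain_time_us(1000, 64, 10**9)
print(queue_1g)                 # time to drain 1000 minimum-size packets at 1 Gbit/s
print(WATCHDOG_US / queue_1g)   # fraction of the queue sent before the watchdog fires
print(drain_time_us(1000, 64, 10**10))  # same queue at 10 Gbit/s
```

Under these assumptions the watchdog fires after a little under a quarter 
of the queue has been sent at 1 Gbit/s.  At 10 Gbit/s the same queue 
would drain in less than the 150 us timeout.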

>
> So your patch makes the window smaller, but it still is there (at
> physical layer, we'll see a burst of packets, a ~100us delay, then a
> second burst)
>

With this patch, there will be no burstiness using default socket buffer 
sizes and packets of arbitrary size on a standard 1gig port.

On the 10gig ports there is the possibility for burstiness as you aptly 
explain.  However, in practice it would be difficult to arrange things 
to achieve sufficiently high packet rates, so we can live with it like this.

David Daney