lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 11 Nov 2015 18:25:05 +0000
From:	Måns Rullgård <mans@...sr.com>
To:	David Miller <davem@...emloft.net>
Cc:	romieu@...zoreil.com, linux-kernel@...r.kernel.org,
	netdev@...r.kernel.org, slash.tmp@...e.fr
Subject: Re: [PATCH v5] net: ethernet: add driver for Aurora VLSI NB8800 Ethernet controller

David Miller <davem@...emloft.net> writes:

> From: Måns Rullgård <mans@...sr.com>
> Date: Wed, 11 Nov 2015 13:04:07 +0000
>
>> Måns Rullgård <mans@...sr.com> writes:
>> 
>>> David Miller <davem@...emloft.net> writes:
>>>
>>>> From: Måns Rullgård <mans@...sr.com>
>>>> Date: Wed, 11 Nov 2015 00:40:09 +0000
>>>>
>>>>> When the DMA complete interrupt arrives, the next chain should be
>>>>> kicked off as quickly as possible, and I don't see why that would
>>>>> benefit from being done in napi context.
>>>>
>>>> NAPI isn't about low latency, it's about fairness and interrupt
>>>> mitigation.
>>>>
>>>> You probably don't even realize that all of the TX SKB freeing you do
>>>> in the hardware interrupt handler end up being actually processed by a
>>>> scheduled software interrupt anyways.
>>>>
>>>> So you are gaining almost nothing by not doing TX completion in NAPI
>>>> context, whereas by doing so you would be gaining a lot including
>>>> more simplified locking or even the ability to do no locking at all.
>>>
>>> TX completion is separate from restarting the DMA, and moving that to
>>> NAPI may well be a good idea.  Should I simply napi_schedule() if the
>>> hardware indicates TX is complete and do the cleanup in the NAPI poll
>>> function?
>> 
>> I tried that, and throughput (as measured by iperf3) dropped by 2%.
>> Maybe I did something wrong.
>
> Did you fix all the locking in that change?
>
> Since all of your TX handling runs in software interrupt context, you
> can stop using IRQ locking and use BH locking driver-wide instead.
>
> And actually, no locking is really needed for TX processing.  With
> proper memory barriers and properly crafter queue state tests, you
> can run completely lockless.
>
> Again, look at example drivers.  I know, for example, that
> drivers/net/ethernet/broadcom/tg3.c runs TX lockless.  You'll
> see that tg3_tx() takes no locks at all.

The way the hardware works, once a DMA operation has been started,
adding more packets to the active chain can't be done reliably.  For
that reason, if start_xmit is called (with xmit_more zero) while a DMA
operation is in progress, the new packet(s) must be queued until the
hardware raises the DMA complete interrupt.  At that time, the next
pending DMA chain, if any, can be kicked off.  If the TX DMA channel is
idle when start_xmit is called, it can be started immediately.  Checking
the DMA status and starting it if idle has to be done atomically
somehow.

There is a separate indication for actual TX completion, and the
interrupt for that can be set to only fire every 7 frames or when a
timeout expires.  When this happens, the TX cleanup needs to run, and
that can obviously be done from NAPI without using any locks.

Bear in mind that this hardware is quite primitive compared to modern
high-performance Ethernet controllers from the likes of Intel and
Broadcom.  The documentation I have is dated 2003.

-- 
Måns Rullgård
mans@...sr.com
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ