Message-ID: <afcae723-5804-305f-2756-c4dbf9045943@gmx.de>
Date:   Wed, 7 Dec 2016 23:34:19 +0100
From:   Lino Sanfilippo <LinoSanfilippo@....de>
To:     Pavel Machek <pavel@....cz>
Cc:     bh74.an@...sung.com, ks.giri@...sung.com, vipul.pandya@...sung.com,
        peppe.cavallaro@...com, alexandre.torgue@...com,
        davem@...emloft.net, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org
Subject: Re: [PATCH 2/2] net: ethernet: stmmac: remove private tx queue lock

On 07.12.2016 22:43, Lino Sanfilippo wrote:
> Hi Pavel,
> 
> On 07.12.2016 22:37, Pavel Machek wrote:
>> On Wed 2016-12-07 21:05:38, Lino Sanfilippo wrote:
>>> The driver uses a private lock for synchronization between the xmit
>>> function and the xmit completion handler, but since the NETIF_F_LLTX flag
>>> is not set, the xmit function is also called with the xmit_lock held.
>>> 
>>> On the other hand the xmit completion handler first takes the private lock
>>> and (in case that the tx queue has been stopped) the xmit_lock, leading to
>>> a reverse locking order and the potential danger of a deadlock.
>>> 
>>> Fix this by removing the private lock completely and synchronizing the xmit
>>> function and completion handler solely by means of the xmit_lock. By doing
>>> this remove also the now unnecessary double check for a stopped tx queue.
>>> 
>> 
>> FYI, here's modified version. I believe _bh versions are needed, and
>> I'm testing that version now. (Oh and I also ported it to net-next).
>> 
>> It survived 30 minutes of testing so far...
>> 
> 
> First off, thanks for testing.
> Hmm. I don't understand why _bh would be needed. We call that function from
> BH context only (napi poll and timer).
> Any idea?
> 

Could this once again be caused by irq coalescing? Once the tx queue has been
stopped, the cleanup handler has to wake up the queue within a certain time
span, otherwise the watchdog will complain (as happened in your test). Could
you retest this with irq coalescing disabled?
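
To make the timing argument concrete, here is a rough userspace sketch. It is
not driver code: the struct, the field names and the millisecond values are
all made up for illustration, and it simplifies the real watchdog check by
measuring from the moment the queue was stopped.

```c
#include <stdbool.h>

/* Hypothetical model, not taken from stmmac. All times in milliseconds. */
struct tx_model {
	unsigned long stopped_at;  /* when the xmit path stopped the queue */
	unsigned long watchdog_ms; /* stand-in for the tx watchdog timeout */
	unsigned long hw_done_ms;  /* time the hardware needs to send the frames */
	unsigned long coalesce_ms; /* extra delay from tx irq coalescing */
};

/* Time at which the cleanup handler runs and wakes the queue. */
static unsigned long wake_time(const struct tx_model *m)
{
	return m->stopped_at + m->hw_done_ms + m->coalesce_ms;
}

/* True if the queue stays stopped for longer than the watchdog allows. */
static bool watchdog_fires(const struct tx_model *m)
{
	return wake_time(m) - m->stopped_at > m->watchdog_ms;
}
```

In this model, a coalescing delay that pushes the wakeup past the timeout is
enough to make the watchdog fire even if the locking is correct, which is why
a retest with coalescing disabled would help to tell the two cases apart.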
