lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <051e3043-8b58-0591-36e3-99e2267f67f4@gmx.de>
Date:   Thu, 8 Dec 2016 21:32:12 +0100
From:   Lino Sanfilippo <LinoSanfilippo@....de>
To:     Francois Romieu <romieu@...zoreil.com>
Cc:     bh74.an@...sung.com, ks.giri@...sung.com, vipul.pandya@...sung.com,
        peppe.cavallaro@...com, alexandre.torgue@...com, pavel@....cz,
        davem@...emloft.net, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org
Subject: Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock

Hi,

On 08.12.2016 00:15, Francois Romieu wrote:
> Lino Sanfilippo <LinoSanfilippo@....de> :
>> The driver uses a private lock for synchronization between the xmit
>> function and the xmit completion handler, but since the NETIF_F_LLTX flag
>> is not set, the xmit function is also called with the xmit_lock held.
>> 
>> On the other hand the xmit completion handler first takes the private lock
>> and (in case that the tx queue has been stopped) the xmit_lock, leading
>> to a reverse locking order and the potential danger of a deadlock.
> 
> netif_tx_stop_queue is used by:
> 1. xmit function before releasing lock and returning.
> 2. sxgbe_restart_tx_queue()
>    <- sxgbe_tx_interrupt
>    <- sxgbe_reset_all_tx_queues()
>       <- sxgbe_tx_timeout()
> 
> Given xmit won't be called again until tx queue is enabled, it's not clear
> how a deadlock could happen due to #1.
> 


After spending more thoughts on this I tend to agree with you. Yes, we have the
different locking order for the xmit_lock and the private lock in two concurrent
threads. And one of the first things one learns about locking is that this is a
good way to create a deadlock sooner or later. But in our case the deadlock 
can only occur if the xmit function and the tx completion handler perceive different
 states for the tx queue, or to be more specific: 
the completion handler sees the tx queue in state "stopped" while the xmit handler 
sees it in state "running" at the same time. Only then both functions would try to
take both locks, which could lead to a deadlock.

OTOH Pavel said that he actually could produce a deadlock. Now I wonder if this is caused
by that locking scheme (in a way I have not figured out yet) or if it is a different issue.

Regards,
Lino

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ