Message-ID: <d314e471-0251-461e-988d-70add0c6ebf6@I-love.SAKURA.ne.jp>
Date: Thu, 4 Jan 2024 19:34:59 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: Hillf Danton <hdanton@...a.com>,
        syzbot <syzbot+2b131f51bb4af224ab40@...kaller.appspotmail.com>,
        Alan Stern <stern@...land.harvard.edu>,
        Greg KH <gregkh@...uxfoundation.org>
Cc: krzysztof.kozlowski@...aro.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [net?] [nfc?] INFO: task hung in nfc_targets_found

On 2024/01/04 14:05, Hillf Danton wrote:
> On Wed, 03 Jan 2024 16:59:25 -0800
>> HEAD commit:    453f5db0619e Merge tag 'trace-v6.7-rc7' of git://git.kerne..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=141bc48de80000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=f8e72bae38c079e4
>> dashboard link: https://syzkaller.appspot.com/bug?extid=2b131f51bb4af224ab40
>> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>>
> 
> 	syz-executor.1:27827		kworker/u4:93/7607	kworker/0:1/11541
> 	===				===			===
> 	nci_close_device()		nci_rx_work()		nfc_urelease_event_work()
> 	mutex_lock(&ndev->req_lock)				device_lock()
> 	flush_workqueue(ndev->rx_wq)				mutex_lock(&ndev->req_lock)
> 					device_lock()
> 
> Looks like lockdep failed to detect deadlock once more because of device_lock().

Yes, this is yet another circular locking dependency hidden by device_lock().

Calling flush_workqueue(ndev->rx_wq) with ndev->req_lock held has to be avoided,
because nci_close_device() creates an ndev->req_lock => dev->dev dependency
(it waits for nci_rx_work(), which takes device_lock()) while
nfc_urelease_event_work() creates a dev->dev => ndev->req_lock dependency.

  nci_close_device() {
    mutex_lock(&ndev->req_lock); // ffff88802bed4350
    flush_workqueue(ndev->rx_wq); // wait for nci_rx_work() to complete
    mutex_unlock(&ndev->req_lock); // ffff88802bed4350
  }
  
  nci_rx_work() { // ndev->rx_work is on ndev->rx_wq
    nci_ntf_packet() {
      device_lock(&dev->dev); // ffff88802bed5100
      device_unlock(&dev->dev); // ffff88802bed5100
    }
  }

  nfc_urelease_event_work() {
    mutex_lock(&nfc_devlist_mutex); // ffffffff8ee4d808
    mutex_lock(&dev->genl_data.genl_data_mutex); // ffff88802bed5508
    nfc_stop_poll() {
      device_lock(&dev->dev); // ffff88802bed5100
      nci_stop_poll() {
        nci_request() {
          mutex_lock(&ndev->req_lock); // ffff88802bed4350
          mutex_unlock(&ndev->req_lock); // ffff88802bed4350
        }
      }
      device_unlock(&dev->dev); // ffff88802bed5100
    }
    mutex_unlock(&dev->genl_data.genl_data_mutex); // ffff88802bed5508
    mutex_unlock(&nfc_devlist_mutex); // ffffffff8ee4d808
  }
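
To make the cycle explicit, the two dependencies above can be modelled as
edges in a small lock-order graph. This is only a userspace sketch of the
idea, not how lockdep is implemented; the names lock_class, add_dependency()
and would_deadlock() are hypothetical and exist only for this illustration.
Waiting in flush_workqueue() for work that takes device_lock() is treated
as an ordinary "held => wanted" edge.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes for this sketch; names mirror the traces above. */
enum lock_class { REQ_LOCK, DEV_LOCK, NUM_LOCKS };

/* edge[a][b] is true when some path takes (or waits on) b while holding a. */
static bool edge[NUM_LOCKS][NUM_LOCKS];

/* Record that "wanted" is acquired, or waited for, while "held" is held. */
static void add_dependency(enum lock_class held, enum lock_class wanted)
{
    assert(held < NUM_LOCKS && wanted < NUM_LOCKS);
    edge[held][wanted] = true;
}

/* Depth-first search: is "target" reachable from "from" via recorded edges? */
static bool reaches(enum lock_class from, enum lock_class target, bool *visited)
{
    if (from == target)
        return true;
    visited[from] = true;
    for (int next = 0; next < NUM_LOCKS; next++) {
        if (edge[from][next] && !visited[next] &&
            reaches(next, target, visited))
            return true;
    }
    return false;
}

/*
 * A new dependency held => wanted closes a cycle if "wanted" can already
 * reach "held" through the dependencies recorded so far.
 */
static bool would_deadlock(enum lock_class held, enum lock_class wanted)
{
    bool visited[NUM_LOCKS] = { false };

    return reaches(wanted, held, visited);
}
```

After add_dependency(REQ_LOCK, DEV_LOCK) records the nci_close_device()
side, would_deadlock(DEV_LOCK, REQ_LOCK) returns true for the
nfc_urelease_event_work() side. A validator can only report this if both
locks are visible to it, which is the point about device_lock() here.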

I think we need to enable lockdep validation on the dev->dev mutex
( https://lkml.kernel.org/r/c7fb01a9-3e12-77ed-5c4c-db7deb64dc73@I-love.SAKURA.ne.jp ).
Has any alternative to my proposal at
https://lkml.kernel.org/r/1ad499bb-0c53-7529-ff00-e4328823f6fa@I-love.SAKURA.ne.jp
been proposed? If not, is it time to retry my proposal?

