lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BYAPR11MB37994715A3DD8259DF16A34DBA950@BYAPR11MB3799.namprd11.prod.outlook.com>
Date:   Wed, 24 Jun 2020 06:32:36 +0000
From:   "Christian Benvenuti (benve)" <benve@...co.com>
To:     Kaige Li <likaige@...ngson.cn>, David Miller <davem@...emloft.net>
CC:     "_govind@....com" <_govind@....com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "lixuefeng@...ngson.cn" <lixuefeng@...ngson.cn>,
        "yangtiezhu@...ngson.cn" <yangtiezhu@...ngson.cn>
Subject: RE: [PATCH RESEND] net/cisco: Fix a sleep-in-atomic-context bug in
 enic_init_affinity_hint()

> -----Original Message-----
> From: netdev-owner@...r.kernel.org <netdev-owner@...r.kernel.org>
> On Behalf Of Kaige Li
> Sent: Tuesday, June 23, 2020 8:41 PM
> To: David Miller <davem@...emloft.net>
> Cc: Christian Benvenuti (benve) <benve@...co.com>; _govind@....com;
> netdev@...r.kernel.org; linux-kernel@...r.kernel.org;
> lixuefeng@...ngson.cn; yangtiezhu@...ngson.cn
> Subject: Re: [PATCH RESEND] net/cisco: Fix a sleep-in-atomic-context bug in
> enic_init_affinity_hint()
> 
> 
> On 06/24/2020 11:23 AM, David Miller wrote:
> > From: Kaige Li <likaige@...ngson.cn>
> > Date: Wed, 24 Jun 2020 09:56:47 +0800
> >
> >> On 06/24/2020 06:26 AM, David Miller wrote:
> >>> From: David Miller <davem@...emloft.net>
> >>> Date: Tue, 23 Jun 2020 14:33:11 -0700 (PDT)
> >>>
> >>>> Calling a NIC driver open function from a context holding a
> >>>> spinlock is very much the real problem, so many operations have to
> >>>> sleep and in face that ->ndo_open() method is defined as being
> >>>> allowed to sleep and that's why the core networking never invokes
> >>>> it with spinlocks
> >>>                                                         ^^^^
> >>>
> >>> I mean "without" of course. :-)
> >>>
> >>>> held.
> >> Did you mean that open function should be out of spinlock? If so, I
> >> will send V2 patch.
> > Yes, but only if that is safe.
> >
> > You have to analyze the locking done by this driver and fix it properly.
> > I anticipate it is not just a matter of changing where the spinlock
> > is held, you will have to rearchitect things a bit.
> 
> Okay, I will careful analyze this question, and make a suitable patch in V2.
> 
> Thank you.

Hi David, Kaige,
I assume you are referring to the enic_api_lock spin_lock used in

  enic_reset()
  which is used to hard-reset the interface when the driver receives an error interrupt

and

  enic_tx_hang_reset()
  which is used to soft-reset the interface when the stack detects the TX timeout.

Both reset functions above are called in the context of a workqueue.
However, the same spin_lock (enic_api_lock) is taken by the enic_api_devcmd_proxy_by_index() api that is exported by the enic driver and that the usnic_verbs driver uses to send commands to the firmware.
This spin_lock was likely added to guarantee that no firmware command is sent by usnic_verbs to an enic interface that is undergoing a reset.
Unfortunately changing that spin_lock to a mutex will likely not work on the usnic_verbs side, and removing the spin_lock will require a rearchitect of the code as mentioned by David.

Kaige, V2 is of course more than welcome and we can test it too.
We/Cisco will also look into it, hopefully a small code reorg will be sufficient.

David, from your previous emails on this 3D I assume
- we can leave request_irq() in ndo_open (ie, no need to move it to pci/probe), which is done by a number of other drivers too.
- no need to change GFP_KERNEL to GFP_ATOMIC as it was suggested in the original patch.

Thanks!
/Chris

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ