Date:   Fri, 14 Oct 2022 15:44:19 +0200
From:   Íñigo Huguet <ihuguet@...hat.com>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     irusskikh@...vell.com, dbogdanov@...vell.com, davem@...emloft.net,
        edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
        netdev@...r.kernel.org, Li Liang <liali@...hat.com>
Subject: Re: [PATCH net] atlantic: fix deadlock at aq_nic_stop

On Fri, Oct 14, 2022 at 3:35 PM Andrew Lunn <andrew@...n.ch> wrote:
>
> On Fri, Oct 14, 2022 at 02:43:47PM +0200, Íñigo Huguet wrote:
> > On Fri, Oct 14, 2022 at 2:14 PM Andrew Lunn <andrew@...n.ch> wrote:
> > >
> > > > Fix trying to acquire rtnl_lock at the beginning of those functions, and
> > > > returning if NIC closing is ongoing. Also do the "linkstate" stuff in a
> > > > workqueue instead of in a threaded irq, where sleeping or waiting on a
> > > > mutex for a long time is discouraged.
> > >
> > > What happens when the same interrupt fires again, while the work queue
> > > is still active? The advantage of the threaded interrupt handler is
> > > that the interrupt will be kept disabled, and should not fire again
> > > until the threaded interrupt handler exits.
> >
> > Nothing happens: if it's already queued, it won't be queued again, and
> > when it runs it will evaluate the last link state. And in the worst
> > case, it will be enqueued to run again, and if linkstate has changed
> > it will be evaluated again. This will rarely happen and it's harmless.
> >
> > Also, I haven't checked it but these lines suggest that the IRQ is
> > auto-disabled in the hw until you enable it again. I didn't rely on
> > this, anyway.
> >         self->aq_hw_ops->hw_irq_enable(self->aq_hw,
> >                                        BIT(self->aq_nic_cfg.link_irq_vec));
> >
> > Honestly I was a bit in doubt about doing this, with the threaded irq it
> > would also work. I'd like to hear more opinions about this and I can
> > change it back.
>
> Ethernet PHYs do all their interrupt handling in threaded IRQs. That
> can require a number of MDIO transactions. So we can be talking about
> 64 bits at 2.5MHz, so 25uS or more. We have not seen issues with that.
>
> > > If MACSEC is enabled, aq_nic_update_link_status() is called with RTNL
> > > held. If it is not enabled, RTNL is not held. This sort of
> > > inconsistency could lead to further locking bugs, since it is not
> > > obvious. Please try to make this consistent.
> >
> > This is not new in these patches, that's what was already happening, I
> > just moved it to get the lock a bit earlier. In my opinion, this is as
> > it should be: why acquire a mutex if you don't have anything to
> > protect with it? And it's worse with rtnl_lock which is held by many
> > processes, and can be held for quite long times...
>
> Maybe the lock needs to be moved closer to what actually needs to be
> protected? What is it protecting?

It's protecting the operations of aq_macsec_enable and aq_macsec_work.
The locking used to be closer to them, but the idea of this patch is to
take the lock earlier so that, if we need to abort, we do it before
changing anything.
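
To illustrate the "lock early, abort before changing anything" idea, here is
a rough userspace sketch. It is not the driver code: big_lock stands in for
rtnl_lock, closing for a "NIC is closing" flag, and update_link_status() and
state_changes are made-up names used only for this example. It assumes POSIX
threads and C11 atomics.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins: big_lock plays the role of rtnl_lock and
 * closing plays the role of a "device is being stopped" flag. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool closing;
static int state_changes;	/* counts runs of the protected section */

/* Returns true if the protected work ran, false if we gave up because
 * the stop path is in progress. */
static bool update_link_status(void)
{
	const struct timespec retry = { .tv_sec = 0, .tv_nsec = 1000000 };

	/* Take the lock first. While it is busy, check whether the holder
	 * may be the stop path waiting for us; if so, return before any
	 * state has been modified, so there is nothing to undo. */
	while (pthread_mutex_trylock(&big_lock) != 0) {
		if (atomic_load(&closing))
			return false;
		nanosleep(&retry, NULL);	/* back off for 1 ms, then retry */
	}

	state_changes++;	/* the work that needs the lock */

	pthread_mutex_unlock(&big_lock);
	return true;
}
```

The point is only the ordering: the abort check happens while we are still
trying to get the lock, so the stop path never has to wait for a worker that
is itself blocked on the lock the stop path holds.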

>
>          Andrew
>


-- 
Íñigo Huguet
