Date: Mon, 17 Oct 2022 09:22:58 +0200
From: Íñigo Huguet <ihuguet@...hat.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: irusskikh@...vell.com, dbogdanov@...vell.com, davem@...emloft.net,
    edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
    netdev@...r.kernel.org, Li Liang <liali@...hat.com>
Subject: Re: [PATCH net] atlantic: fix deadlock at aq_nic_stop

On Sat, Oct 15, 2022 at 5:10 PM Andrew Lunn <andrew@...n.ch> wrote:
>
> > > Maybe the lock needs to be moved closer to what actually needs to be
> > > protect? What is it protecting?
> >
> > It's protecting the operations of aq_macsec_enable and aq_macsec_work.
> > The locking was closer to them, but the idea of this patch is to move
> > the locking to an earlier moment so, in the case we need to abort, do
> > it before changing anything.
>
> aq_check_txsa_expiration() seems to be one of the issues? At least,
> the lock is taken before and released afterwards. So what in
> aq_check_txsa_expiration() requires the lock?

Basically everything in the file aq_macsec.c seems to be implicitly
protected by rtnl_lock. One group of functions are all callbacks of the
`struct macsec_ops aq_macsec_ops`, which are responsible for configuring
macsec offload, all called under rtnl_lock. The rest of the functions in
the file are called from ethtool, also protected by rtnl_lock.

And part of the problem is that many of these operations are firmware
and/or phy configurations, and I don't have documentation about how they
work. Despite this, it seems reasonable to think that they need to be
lock protected.

> I don't like the use of rtnl_trylock(). It suggests the basic design is
> wrong, or overly complex, and so probably not working correctly.
>
> https://blog.ffwll.ch/2022/07/locking-engineering.html
>
> Please try to identify what is being protected. If it is driver
> internal state, could it be replaced with a driver mutex, rather than
> RTNL? Or is it network stack as a whole state, which really does
> require RTNL? If so, how do other drivers deal with this problem? Is
> it specific to MACSEC? Does MACSEC have a design problem?

I already considered this possibility but discarded it because, as I
say above, everything else is already legitimately protected by
rtnl_lock.

The only alternative I can think of is to add a driver-only mutex (let's
call it aq_macsec_mutex), as you say, and every time that macsec offload
is to be changed both rtnl_lock and aq_macsec_mutex would be taken. It's
true that aq_macsec_mutex wouldn't be much contended because almost
always rtnl_lock needs to be acquired first. From the workqueue and the
threaded irq there wouldn't be any deadlock because they only hold
aq_macsec_mutex and ndo_stop only holds rtnl_lock. It would also allow
putting the locking close to what it protects.

I thought that this solution would be a bit overkill, but maybe it's
less overkill than the one I chose. If you're OK with this, I can
prepare a v2.

--
Íñigo Huguet
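
The two-lock scheme proposed above can be sketched in userspace as follows.
This is only an illustration under stated assumptions, not the driver code:
pthread mutexes stand in for rtnl_lock and for the hypothetical
aq_macsec_mutex, pthread_join() stands in for the stop path waiting on
pending work, and every function name here is made up for the example.

```c
#include <assert.h>
#include <pthread.h>

/* Assumption: userspace stand-ins. `rtnl` models rtnl_lock and
 * `aq_macsec_mutex` models the hypothetical driver-only mutex. */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t aq_macsec_mutex = PTHREAD_MUTEX_INITIALIZER;

static int macsec_state; /* illustrative driver-internal MACsec state */
static int work_runs;    /* number of times the worker body has run */

/* Configuration path (macsec_ops callback or ethtool): the caller already
 * holds rtnl, and the driver mutex is nested inside it. */
static void macsec_configure_locked(int new_state)
{
    pthread_mutex_lock(&aq_macsec_mutex);
    macsec_state = new_state;
    pthread_mutex_unlock(&aq_macsec_mutex);
}

/* Workqueue / threaded IRQ path: takes only the driver mutex, never rtnl. */
static void *macsec_work(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&aq_macsec_mutex);
    work_runs++;
    pthread_mutex_unlock(&aq_macsec_mutex);
    return NULL;
}

/* Stop path (ndo_stop analogue): the caller holds rtnl and waits for the
 * worker to finish without touching the driver mutex. */
static void nic_stop_locked(pthread_t worker)
{
    pthread_join(worker, NULL);
}

/* Runs one configure / work / stop cycle. Returns 0 on success. */
int run_demo(void)
{
    pthread_t worker;

    macsec_state = 0;
    work_runs = 0;

    pthread_mutex_lock(&rtnl);
    macsec_configure_locked(1);
    if (pthread_create(&worker, NULL, macsec_work, NULL) != 0) {
        pthread_mutex_unlock(&rtnl);
        return -1;
    }
    /* The worker never asks for rtnl, so waiting for it here cannot
     * deadlock even though rtnl is held. */
    nic_stop_locked(worker);
    pthread_mutex_unlock(&rtnl);

    return (macsec_state == 1 && work_runs == 1) ? 0 : -1;
}
```

Calling run_demo() from a main() (built with -pthread) exercises the cycle.
The point of the design is the lock order: rtnl is always the outer lock
and the worker never takes it, so the stop path can wait for the worker
while rtnl is held.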