Message-ID: <20240613071343.019e7dca@kernel.org>
Date: Thu, 13 Jun 2024 07:13:43 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Larysa Zaremba <larysa.zaremba@...el.com>
Cc: <intel-wired-lan@...ts.osuosl.org>, Jesse Brandeburg
 <jesse.brandeburg@...el.com>, Tony Nguyen <anthony.l.nguyen@...el.com>,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet
 <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Alexei Starovoitov
 <ast@...nel.org>, "Daniel Borkmann" <daniel@...earbox.net>, Jesper Dangaard
 Brouer <hawk@...nel.org>, John Fastabend <john.fastabend@...il.com>, Maciej
 Fijalkowski <maciej.fijalkowski@...el.com>, <netdev@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>, <bpf@...r.kernel.org>,
 <magnus.karlsson@...el.com>, Michal Kubiak <michal.kubiak@...el.com>
Subject: Re: [PATCH iwl-net 0/3] ice: fix synchronization between .ndo_bpf()
 and reset

On Thu, 13 Jun 2024 10:54:12 +0200 Larysa Zaremba wrote:
> > > The locking mechanisms I use here do not look pretty, but if I am not missing 
> > > anything, the synchronization they provide must be robust.  
> > 
> > Robust as in they may be correct here, but you lose lockdep and all
> > other infra normal mutex would give you.
> 
> I know, but __netif_queue_set_napi() requires rtnl_lock() inside the potential 
> critical section and creates a deadlock this way. However, after reading 
> patches that introduce this function, I think it is called too early in the
> configuration. Seems like it should be called somewhere right after 
> netif_set_real_num_rx/_tx_queues(), much later in the configuration where we 
> already hold the rtnl_lock(). In such way, ice_vsi_rebuild() could be protected 
> with an internal mutex. WDYT?

On a quick look I think that may work. For setting the NAPI it makes
sense - netif_set_real_num_rx/_tx_queues() and netif_queue_set_napi()
both inform netdev about the queue config, so it's logical to keep them
together. I was worried there may be an inconveniently placed
netif_queue_set_napi() call which is clearing the NAPI pointer.
But I don't see one.
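
To illustrate the ordering argument, here is a minimal userspace sketch in C. It is not kernel code: struct model_qdev and the model_*() helpers are hypothetical stand-ins, and the rtnl_held flag only models the assumption from this thread that both calls expect the caller to hold rtnl_lock(). It shows the queue count and the per-queue NAPI association being published together in one step.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_QUEUES 8

/* Hypothetical stand-in for the netdev queue state; not a kernel type. */
struct model_qdev {
	bool rtnl_held;                 /* models "caller holds rtnl_lock()" */
	unsigned int real_num_rx;       /* models the real RX queue count */
	int rx_napi[MODEL_MAX_QUEUES];  /* models the queue -> NAPI association */
};

/* Models netif_set_real_num_rx_queues(): refuses to run without the lock. */
static int model_set_real_num_rx_queues(struct model_qdev *dev, unsigned int n)
{
	if (!dev->rtnl_held)
		return -EPERM;
	if (n == 0 || n > MODEL_MAX_QUEUES)
		return -EINVAL;
	dev->real_num_rx = n;
	return 0;
}

/* Models netif_queue_set_napi(): same lock expectation, queue must exist. */
static int model_queue_set_napi(struct model_qdev *dev, unsigned int q, int napi_id)
{
	if (!dev->rtnl_held)
		return -EPERM;
	if (q >= dev->real_num_rx)
		return -EINVAL;
	dev->rx_napi[q] = napi_id;
	return 0;
}

/* Publish the whole RX queue config in one place, count first, NAPI after. */
static int model_publish_rx_config(struct model_qdev *dev, unsigned int n)
{
	int err;

	assert(dev != NULL);

	err = model_set_real_num_rx_queues(dev, n);
	if (err)
		return err;

	for (unsigned int q = 0; q < n; q++) {
		err = model_queue_set_napi(dev, q, (int)q);
		if (err)
			return err;
	}
	return 0;
}
```

In this model the NAPI association is only set once the queue count is known and the lock is already held, so nothing in the earlier part of the rebuild needs to take the lock itself.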

> > > A prettier way of protecting the same critical sections would be replacing 
> > > ICE_CFG_BUSY around ice_vsi_rebuild() with rtnl_lock(), this would eliminate 
> > > locking code from .ndo_bpf() altogether, ice_rebuild_pending() logic will have 
> > > to stay.
> > > 
> > > At some point I have decided to avoid using rtnl_lock(), if I do not have to. I 
> > > think this is a goal worth pursuing?  
> > 
> > Is the reset for failure recovery, rather than reconfiguration? 
> > If so netif_device_detach() is generally the best way of avoiding
> > getting called (I think I mentioned it to someone @intel recently).  
> 
> AFAIK, netif_device_detach() does not affect .ndo_bpf() calls. We were trying 
> such approach with idpf and it does work for ethtool, but not for XDP.

I reckon that's an unintentional omission. In theory XDP is "pure
software" but if the device is running, the driver will likely have to
touch HW to reconfigure. So, if you're willing, do send an ndo_bpf
patch to add a detached check.
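
A rough idea of what such a check could look like, as a self-contained userspace sketch in C rather than an actual patch. Everything here is a hypothetical model: struct model_netdev stands in for struct net_device, the present flag stands in for what netif_device_present() reports, and returning -ENODEV is an assumption about the error code, not something settled in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; not the kernel type. */
struct model_netdev {
	bool present;       /* cleared by a modeled netif_device_detach() */
	int attached_prog;  /* last program id the driver accepted */
	int (*ndo_bpf)(struct model_netdev *dev, int prog_id);
};

/* Hypothetical driver callback; a real one would likely touch HW here. */
static int model_drv_bpf(struct model_netdev *dev, int prog_id)
{
	dev->attached_prog = prog_id;
	return 0;
}

/* Models the core-side XDP install path with the detached check added. */
static int model_xdp_install(struct model_netdev *dev, int prog_id)
{
	assert(dev != NULL);

	if (!dev->ndo_bpf)
		return -EOPNOTSUPP;

	/* Detached check: do not call into the driver while it is detached. */
	if (!dev->present)
		return -ENODEV;

	return dev->ndo_bpf(dev, prog_id);
}
```

With the check in the core path, a driver that detaches the device around its reset would not see .ndo_bpf() calls during that window, which matches how the thread describes the ethtool behaviour.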