Message-ID: <20250303160355.5f8d82d8@kernel.org>
Date: Mon, 3 Mar 2025 16:03:55 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Joe Damato <jdamato@...tly.com>
Cc: netdev@...r.kernel.org, mkarsten@...terloo.ca,
 gerhard@...leder-embedded.com, jasowang@...hat.com,
 xuanzhuo@...ux.alibaba.com, mst@...hat.com, leiyang@...hat.com, Eugenio
 Pérez <eperezma@...hat.com>, Andrew Lunn
 <andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>, Eric
 Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, "open
 list:VIRTIO CORE AND NET DRIVERS" <virtualization@...ts.linux.dev>, open
 list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next v5 3/4] virtio-net: Map NAPIs to queues

On Mon, 3 Mar 2025 13:33:10 -0500 Joe Damato wrote:
> > > @@ -2880,6 +2880,13 @@ static void refill_work(struct work_struct *work)
> > >         bool still_empty;
> > >         int i;
> > > 
> > > +       spin_lock(&vi->refill_lock);
> > > +       if (!vi->refill_enabled) {
> > > +               spin_unlock(&vi->refill_lock);
> > > +               return;
> > > +       }
> > > +       spin_unlock(&vi->refill_lock);
> > > +
> > >         for (i = 0; i < vi->curr_queue_pairs; i++) {
> > >                 struct receive_queue *rq = &vi->rq[i];
> > >  
> > 
> > Err, I suppose this also doesn't work because:
> > 
> > CPU0                       CPU1
> > rtnl_lock                  (before CPU0 calls disable_delayed_refill) 
> >   virtnet_close            refill_work
> >                              rtnl_lock()
> >   cancel_sync <= deadlock
> > 
> > Need to give this a bit more thought.  
> 
> How about we don't use the API at all from refill_work?
> 
> Patch 4 adds persistent NAPI config state, and refill_work isn't a
> queue resize. Maybe we don't need to call netif_queue_set_napi at
> all, since the NAPI IDs are persisted in the NAPI config state and
> refill_work shouldn't change them?
> 
> In which case, we could go back to what refill_work was doing
> before and avoid the problem entirely.
> 
> What do you think?

Should work, I think. Tho, I suspect someone will want to add queue API
support to virtio sooner or later, and they will run into the same
problem with the netdev instance lock, as all of ndo_close() will then
be covered with netdev->lock.

A more thorough and idiomatic way to solve the problem would be to
cancel the work non-sync in ndo_close, add a _sync cancel after the
netdev is unregistered (in virtnet_remove()), when the lock is no
longer held, then wrap the entire work in the relevant lock and check
netif_running() to return early in case of a race.

A middle ground would be to do what you suggested above and just leave
a well-worded comment somewhere that will show up in diffs adding
queue API support?
