Message-ID: <20251226022727-mutt-send-email-mst@kernel.org>
Date: Fri, 26 Dec 2025 02:37:40 -0500
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Jason Wang <jasowang@...hat.com>
Cc: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, netdev@...r.kernel.org,
	Eugenio Pérez <eperezma@...hat.com>,
	Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Jesper Dangaard Brouer <hawk@...nel.org>,
	John Fastabend <john.fastabend@...il.com>,
	Stanislav Fomichev <sdf@...ichev.me>,
	virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
	bpf@...r.kernel.org, Bui Quang Minh <minhquangbui99@...il.com>
Subject: Re: [PATCH net 1/3] virtio-net: make refill work a per receive queue
 work

On Fri, Dec 26, 2025 at 09:31:26AM +0800, Jason Wang wrote:
> On Fri, Dec 26, 2025 at 12:27 AM Michael S. Tsirkin <mst@...hat.com> wrote:
> >
> > On Thu, Dec 25, 2025 at 03:33:29PM +0800, Jason Wang wrote:
> > > On Wed, Dec 24, 2025 at 9:48 AM Michael S. Tsirkin <mst@...hat.com> wrote:
> > > >
> > > > On Wed, Dec 24, 2025 at 09:37:14AM +0800, Xuan Zhuo wrote:
> > > > >
> > > > > Hi Jason,
> > > > >
> > > > > I'm wondering why we even need this refill work. Why not simply let NAPI retry
> > > > > the refill on its next run if the refill fails? That would seem much simpler.
> > > > > This refill work complicates maintenance and often introduces a lot of
> > > > > concurrency issues and races.
> > > > >
> > > > > Thanks.
> > > >
> > > > refill work can refill from GFP_KERNEL, napi only from ATOMIC.
> > > >
> > > > And if GFP_ATOMIC failed, aggressively retrying might not be a great idea.
> > >
> > > Btw, I see some drivers are doing things as Xuan said. E.g
> > > mlx5e_napi_poll() did:
> > >
> > > busy |= INDIRECT_CALL_2(rq->post_wqes,
> > >                                 mlx5e_post_rx_mpwqes,
> > >                                 mlx5e_post_rx_wqes,
> > >
> > > ...
> > >
> > > if (busy) {
> > >          if (likely(mlx5e_channel_no_affinity_change(c))) {
> > >                 work_done = budget;
> > >                 goto out;
> > > ...
> >
> >
> > is busy a GFP_ATOMIC allocation failure?
> 
> Yes, and I think the logic here is to fallback to ksoftirqd if the
> allocation fails too much.
> 
> Thanks


True. I just don't know if this works better or worse than the
current design, but it is certainly simpler and we never actually
worried about the performance of the current one.


So you know, let's roll with this approach.

I do, however, ask that the patch be tested with these OOM situations
forced, just to see whether we are missing something obvious.


The beauty is that the patch can be very small:
1. Patch 1: never schedule the refill work, just retrigger NAPI.
2. A follow-up patch: remove all the now-dead code.

This way patch 1 will be small and backportable to stable.
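
To make the idea concrete, here is a rough userspace sketch of "retrigger NAPI
instead of scheduling refill work". It is not virtio-net code: struct sim_rq,
sim_alloc_atomic(), sim_try_refill(), sim_poll() and sim_run() are all
hypothetical names made up for illustration, and the alloc_failures counter
only stands in for forced GFP_ATOMIC failures. The one property it relies on
from real NAPI is that a poll routine reporting the full budget gets polled
again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a receive queue; not the virtio-net structure. */
struct sim_rq {
	int free_slots;     /* ring slots still waiting for a fresh buffer */
	int pending_pkts;   /* packets ready to be received */
	int alloc_failures; /* forced allocation failures left to inject */
};

/* Stand-in for a GFP_ATOMIC allocation: fails while failures are injected. */
static bool sim_alloc_atomic(struct sim_rq *rq)
{
	if (rq->alloc_failures > 0) {
		rq->alloc_failures--;
		return false;
	}
	return true;
}

/* Fill every free slot; returns false on the first failed allocation. */
static bool sim_try_refill(struct sim_rq *rq)
{
	while (rq->free_slots > 0) {
		if (!sim_alloc_atomic(rq))
			return false;
		rq->free_slots--;
	}
	return true;
}

/* Poll routine: returns the work done; returning budget means "poll again". */
static int sim_poll(struct sim_rq *rq, int budget)
{
	int received = 0;

	assert(budget > 0);

	/* Each received packet consumes a buffer and frees a ring slot. */
	while (received < budget && rq->pending_pkts > 0) {
		rq->pending_pkts--;
		rq->free_slots++;
		received++;
	}

	/* Refill failed: report the full budget so the poll is retriggered. */
	if (!sim_try_refill(rq))
		return budget;

	return received;
}

/* Tiny scheduler loop: repoll while the full budget is reported. */
static int sim_run(struct sim_rq *rq, int budget, int max_polls)
{
	int polls = 0;

	while (polls < max_polls) {
		polls++;
		if (sim_poll(rq, budget) < budget)
			break;
	}
	return polls;
}
```

With two injected failures the loop takes three polls to get the ring full
again, and no separate refill work is involved at any point. Whether retrying
this eagerly behaves well under real memory pressure is exactly what the
requested OOM testing would have to show.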

> >
> > > >
> > > > Not saying refill work is a great hack, but that is the reason for it.
> > > > --
> > > > MST
> > > >
> > >
> > > Thanks
> >

