Message-ID: <20250417072806.18660-1-minhquangbui99@gmail.com>
Date: Thu, 17 Apr 2025 14:28:02 +0700
From: Bui Quang Minh <minhquangbui99@...il.com>
To: virtualization@...ts.linux.dev
Cc: "Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
Andrew Lunn <andrew+netdev@...n.ch>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
Eugenio Pérez <eperezma@...hat.com>,
"David S. Miller" <davem@...emloft.net>,
netdev@...r.kernel.org,
linux-kernel@...r.kernel.org,
bpf@...r.kernel.org,
Bui Quang Minh <minhquangbui99@...il.com>
Subject: [PATCH v4 0/4] virtio-net: disable delayed refill when pausing rx
Hi everyone,
This series tries to fix a deadlock in virtio-net when binding/unbinding
an XDP program or an XDP socket, or when resizing the rx queue.
When pausing rx (e.g. when setting up XDP or an xsk pool, or resizing rx),
we call napi_disable() on the receive queue's napi. The delayed refill_work
also calls napi_disable() on the receive queue's napi. When
napi_disable() is called on an already disabled napi, it sleeps in
napi_disable_locked() while still holding the netdev_lock. As a result,
a later napi_enable() gets stuck too, as it cannot acquire the netdev_lock.
This leaves both refill_work and the pause-then-resume tx stuck.
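The sequence above can be sketched as a toy, single-threaded Python model.
This is not kernel code: ToyNapi and its lock_owner field are hypothetical
stand-ins for the napi state and the netdev_lock, and a call that would
sleep or block in the kernel simply returns a message here.

```python
class ToyNapi:
    """Toy stand-in for one receive queue's napi (hypothetical model)."""

    def __init__(self):
        self.disabled = False
        self.lock_owner = None  # models who holds the netdev_lock

    def napi_disable(self, caller):
        if self.lock_owner is not None:
            return f"{caller}: blocked on netdev_lock held by {self.lock_owner}"
        self.lock_owner = caller  # take the lock
        if self.disabled:
            # Already disabled: the caller sleeps and never drops the lock.
            return f"{caller}: sleeping in napi_disable with netdev_lock held"
        self.disabled = True
        self.lock_owner = None  # release the lock
        return f"{caller}: napi disabled"

    def napi_enable(self, caller):
        if self.lock_owner is not None:
            return f"{caller}: blocked on netdev_lock held by {self.lock_owner}"
        self.disabled = False
        return f"{caller}: napi enabled"


napi = ToyNapi()
events = [
    napi.napi_disable("rx_pause"),     # pause path disables the napi
    napi.napi_disable("refill_work"),  # delayed work disables it again
    napi.napi_enable("rx_resume"),     # resume path cannot take the lock
]
for line in events:
    print(line)
```

In this model the second napi_disable() never releases the lock, so the
resume step can never re-enable the napi that refill_work is waiting on.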
This scenario can be reproduced by binding an XDP socket to a virtio-net
interface without setting up the fill ring. As a result, try_fill_recv
fails until the fill ring is set up, and refill_work gets scheduled.
This fix adds virtnet_rx_(pause/resume)_all helpers and fixes up
virtnet_rx_resume to disable future delayed refill_work and cancel any
in-flight one before calling napi_disable() to pause rx.
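The ordering described above (disable future refills, cancel in-flight
ones, then disable the napi) can be sketched with a toy Python model. This
is an illustration only, not the driver's implementation: ToyRxQueue and
all of its fields and methods are hypothetical names.

```python
class ToyRxQueue:
    """Toy model of a paused/resumed rx queue (hypothetical names)."""

    def __init__(self):
        self.refill_enabled = True   # may new refill work be queued?
        self.refill_pending = False  # is delayed refill work queued?
        self.napi_disabled = False

    def schedule_refill(self):
        # Queue the delayed work only while refill is allowed.
        if self.refill_enabled:
            self.refill_pending = True
        return self.refill_pending

    def run_refill_work(self):
        if not self.refill_pending:
            return "nothing to run"
        self.refill_pending = False
        if self.napi_disabled:
            # The situation the fix is meant to rule out.
            return "would sleep in napi_disable"
        return "refilled"

    def rx_pause(self):
        self.refill_enabled = False  # 1. block future refill work
        self.refill_pending = False  # 2. cancel in-flight refill work
        self.napi_disabled = True    # 3. only now disable the napi

    def rx_resume(self):
        self.napi_disabled = False
        self.refill_enabled = True


q = ToyRxQueue()
q.schedule_refill()                      # refill queued before the pause
q.rx_pause()
queued_while_paused = q.schedule_refill()
ran_while_paused = q.run_refill_work()
q.rx_resume()
q.schedule_refill()
ran_after_resume = q.run_refill_work()
print(queued_while_paused, ran_while_paused, ran_after_resume)
```

With this ordering, no refill work is left to run against a disabled napi
while the queue is paused, and refills work again after the resume.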
Version 4 changes:
- Add force zerocopy mode to xdp_helper
- Make virtio_net selftest use force zerocopy mode
- Move virtio_net selftest to drivers/net/hw
Version 3 changes:
- Patch 1: refactor to avoid code duplication
Version 2 changes:
- Add selftest for deadlock scenario
Thanks,
Quang Minh.
Bui Quang Minh (4):
virtio-net: disable delayed refill when pausing rx
selftests: net: move xdp_helper to net/lib
selftests: net: add flag to force zerocopy mode in xdp_helper
selftests: net: add a virtio_net deadlock selftest
drivers/net/virtio_net.c | 69 +++++++++++++++----
tools/testing/selftests/drivers/net/Makefile | 2 -
.../testing/selftests/drivers/net/hw/Makefile | 1 +
.../selftests/drivers/net/hw/virtio_net.py | 65 +++++++++++++++++
tools/testing/selftests/drivers/net/queues.py | 4 +-
tools/testing/selftests/net/lib/.gitignore | 1 +
tools/testing/selftests/net/lib/Makefile | 1 +
.../{drivers/net => net/lib}/xdp_helper.c | 13 +++-
8 files changed, 138 insertions(+), 18 deletions(-)
create mode 100755 tools/testing/selftests/drivers/net/hw/virtio_net.py
rename tools/testing/selftests/{drivers/net => net/lib}/xdp_helper.c (90%)
--
2.43.0