Message-ID: <20260102180530.1559514-1-viswanathiyyappan@gmail.com>
Date: Fri, 2 Jan 2026 23:35:28 +0530
From: I Viswanath <viswanathiyyappan@...il.com>
To: edumazet@...gle.com,
andrew+netdev@...n.ch,
horms@...nel.org,
kuba@...nel.org,
pabeni@...hat.com,
mst@...hat.com,
eperezma@...hat.com,
jasowang@...hat.com,
xuanzhuo@...ux.alibaba.com
Cc: netdev@...r.kernel.org,
virtualization@...ts.linux.dev,
linux-kernel@...r.kernel.org,
I Viswanath <viswanathiyyappan@...il.com>
Subject: [PATCH net-next v7 0/2] net: Split ndo_set_rx_mode into snapshot and deferred write

This is an implementation of the idea suggested by Jakub here:
https://lore.kernel.org/netdev/20250923163727.5e97abdb@kernel.org/

ndo_set_rx_mode is problematic because it is called with the
netif_addr_lock spin lock held and therefore cannot sleep.

To address this, this series proposes dividing the concept of setting
rx_mode into two stages: snapshot and deferred I/O. To achieve this, we
change the semantics of set_rx_mode and add a new ndo, write_rx_mode.

The new set_rx_mode will be responsible for customizing the rx_mode
snapshot, which will be used by write_rx_mode to update the hardware.

In brief, the new flow will look something like:

set_rx_mode():
    ndo_set_rx_mode();
    prepare_rx_mode();

write_rx_mode():
    use_snapshot();
    ndo_write_rx_mode();

write_rx_mode() is called from a work item and doesn't hold the
netif_addr_lock spin lock during ndo_write_rx_mode(), making it sleepable
in that section.
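
As a rough illustration of the two stages, here is a minimal user-space
model in C. It is not the code in this series: struct rx_mode_model,
model_set_rx_mode(), model_write_rx_mode() and the MODEL_* flags are
hypothetical names made up for this sketch, and a plain boolean stands in
for netif_addr_lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
#define MODEL_PROMISC  0x1u
#define MODEL_ALLMULTI 0x2u

struct rx_mode_model {
	bool addr_lock_held;   /* stands in for netif_addr_lock */
	unsigned int flags;    /* live rx_mode state of the device */
	unsigned int snapshot; /* copy prepared for the deferred write */
	bool work_pending;     /* stands in for a queued work item */
	unsigned int hw_flags; /* what the "hardware" was last told */
};

/* Stage 1: runs under the lock, so it only copies state and must not sleep. */
static void model_set_rx_mode(struct rx_mode_model *m, unsigned int flags)
{
	m->addr_lock_held = true;
	m->flags = flags;
	m->snapshot = m->flags;  /* prepare the snapshot */
	m->work_pending = true;  /* schedule the deferred write */
	m->addr_lock_held = false;
}

/* Stage 2: runs from the work item; the lock is dropped before the write. */
static void model_write_rx_mode(struct rx_mode_model *m)
{
	unsigned int local;

	if (!m->work_pending)
		return;

	m->addr_lock_held = true;
	local = m->snapshot;     /* take a private copy of the snapshot */
	m->work_pending = false;
	m->addr_lock_held = false;

	/* The slow, possibly sleeping device write happens without the lock. */
	assert(!m->addr_lock_held);
	m->hw_flags = local;
}
```

Calling model_set_rx_mode() only records the request; the "hardware" value
changes once model_write_rx_mode() runs, which is the point where a real
driver would be allowed to sleep.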

This model should work correctly if the following conditions hold:

1. write_rx_mode() should use the rx_mode set by the most recent
   call to prepare_rx_mode() before its execution.

2. If a prepare_rx_mode() call happens during execution of write_rx_mode(),
   write_rx_mode() should be rescheduled.

3. All calls to modify rx_mode should pass through the prepare_rx_mode() +
   schedule write_rx_mode() execution flow. netif_schedule_rx_mode_work()
   has been implemented in core for this purpose.

1 and 2 are implemented in core.
Drivers need to ensure 3 using netif_schedule_rx_mode_work().
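
The three conditions can be sketched with a small single-threaded model in
C. This is only an illustration under assumptions: struct resched_model and
the model_* helpers are hypothetical, and the racing_flags argument is an
artificial way to simulate a new snapshot arriving while the device write
is in progress. The pending flag is cleared before the write, so a request
that arrives during the write marks the work pending again.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; the names below are not kernel APIs. */
struct resched_model {
	unsigned int snapshot;   /* latest prepared rx_mode */
	unsigned int hw_flags;   /* last value written to the "hardware" */
	unsigned int hw_writes;  /* number of device writes performed */
	bool work_pending;       /* work item is queued or needs a rerun */
};

/* Condition 3: every rx_mode change goes through prepare + schedule. */
static void model_prepare_and_schedule(struct resched_model *m,
				       unsigned int flags)
{
	m->snapshot = flags;
	m->work_pending = true;
}

/*
 * One execution of the work item. When racing_flags is non-NULL, it
 * simulates a prepare call that arrives during the device write.
 */
static void model_run_work(struct resched_model *m,
			   const unsigned int *racing_flags)
{
	/* Condition 1: consume the most recent snapshot before executing. */
	unsigned int local = m->snapshot;

	/* Clear pending first so a racing request can set it again. */
	m->work_pending = false;

	if (racing_flags)
		model_prepare_and_schedule(m, *racing_flags);

	m->hw_flags = local;
	m->hw_writes++;

	/* Condition 2: if work_pending is set again here, the work reruns. */
}

/* Keep running the work until no newer snapshot is outstanding. */
static void model_drain(struct resched_model *m)
{
	while (m->work_pending)
		model_run_work(m, NULL);
}
```

With one request queued and a second one arriving mid-write, the first run
writes the older value, and model_drain() then performs a second write so
that the "hardware" ends up with the newest snapshot.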

To use this model, a driver needs to implement the
ndo_write_rx_mode callback, change the set_rx_mode callback
appropriately, and replace all calls to modify rx mode with
netif_schedule_rx_mode_work().
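
The driver-side steps might look roughly like the following user-space mock
in C. Everything here is an assumption made for illustration: struct
mock_ops, struct mock_dev, mock_schedule_rx_mode_work() and the drv_*
functions are hypothetical stand-ins, and the real callback signatures and
the order of operations inside the core helper may differ.

```c
#include <stdbool.h>

/* Hypothetical mock of the driver-facing pieces; not kernel definitions. */
struct mock_dev;

struct mock_ops {
	/* Non-sleeping callback that only adjusts the snapshot. */
	void (*set_rx_mode)(struct mock_dev *dev);
	/* Sleepable callback that pushes the snapshot to the device. */
	void (*write_rx_mode)(struct mock_dev *dev);
};

struct mock_dev {
	const struct mock_ops *ops;
	unsigned int flags;     /* requested rx mode */
	unsigned int snapshot;  /* snapshot handed to write_rx_mode */
	unsigned int hw_flags;  /* what the "hardware" was last told */
	bool work_pending;      /* stands in for a queued work item */
};

/* Stand-in for the core helper that snapshots state and queues the work. */
static void mock_schedule_rx_mode_work(struct mock_dev *dev)
{
	dev->snapshot = dev->flags;
	if (dev->ops->set_rx_mode)
		dev->ops->set_rx_mode(dev);
	dev->work_pending = true;
}

/* Stand-in for the body of the work item. */
static void mock_rx_mode_work(struct mock_dev *dev)
{
	if (!dev->work_pending)
		return;
	dev->work_pending = false;
	dev->ops->write_rx_mode(dev);
}

/* Example driver: customize the snapshot without touching the device. */
static void drv_set_rx_mode(struct mock_dev *dev)
{
	dev->snapshot &= 0x3u; /* keep only the bits this device supports */
}

/* Example driver: device I/O, which could sleep in a real driver. */
static void drv_write_rx_mode(struct mock_dev *dev)
{
	dev->hw_flags = dev->snapshot;
}

static const struct mock_ops drv_ops = {
	.set_rx_mode = drv_set_rx_mode,
	.write_rx_mode = drv_write_rx_mode,
};

/* A driver path that changes rx mode goes through the scheduling helper. */
static void drv_change_flags(struct mock_dev *dev, unsigned int flags)
{
	dev->flags = flags;
	mock_schedule_rx_mode_work(dev);
}
```

In this mock, drv_change_flags() never writes to the device itself; the
write happens only when mock_rx_mode_work() runs.
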
Signed-off-by: I Viswanath <viswanathiyyappan@...il.com>
---
In v5, apart from the bug with netif_rx_mode_flush_work, netif_free_rx_mode_ctx()
was problematic because it needed to cancel and wait for the work to complete
before freeing memory. The problem was that the work needed to grab the RTNL
lock while the RTNL lock was already held, as this function was part of
dev_close(). This means we are guaranteed a deadlock if the work was pending.

Cancelling the work should be done in a context that doesn't hold the RTNL
lock. The only existing function in the teardown path that did this was
free_netdev(), and it isn't ideal to do the cleanup there.

My solution is to introduce a new struct netif_cleanup_work and a new
net_device member cleanup_work. I am not sure if there is a better solution
than this. cleanup_work will be a work item scheduled by dev_close() that will
execute the cleanup functions that need an RTNL-lock-free context.

v1:
Link: https://lore.kernel.org/netdev/20251020134857.5820-1-viswanathiyyappan@gmail.com/

v2:
- Exported set_and_schedule_rx_config as a symbol for use in modules
- Fixed incorrect cleanup for the case of rx_work alloc failing in alloc_netdev_mqs
- Removed the locked version (cp_set_rx_mode) and renamed __cp_set_rx_mode to cp_set_rx_mode
Link: https://lore.kernel.org/netdev/20251026175445.1519537-1-viswanathiyyappan@gmail.com/

v3:
- Added RFT tag
- Corrected mangled patch
Link: https://lore.kernel.org/netdev/20251028174222.1739954-1-viswanathiyyappan@gmail.com/

v4:
- Completely reworked the snapshot mechanism as per v3 comments
- Implemented the callback for virtio-net instead of 8139cp driver
- Removed RFC tag
Link: https://lore.kernel.org/netdev/20251118164333.24842-1-viswanathiyyappan@gmail.com/

v5:
- Fix broken code and titles
- Remove RFT tag
Link: https://lore.kernel.org/netdev/20251120141354.355059-1-viswanathiyyappan@gmail.com/

v6:
- Added struct netif_deferred_work_cleanup and members needs_deferred_cleanup
  and deferred_work_cleanup in net_device
- Moved out ctrl bits from netif_rx_mode_config to netif_rx_mode_work_ctx
Link: https://lore.kernel.org/netdev/20251227174225.699975-1-viswanathiyyappan@gmail.com/

v7:
- Improved function, enum and struct names

I Viswanath (2):
  net: refactor set_rx_mode into snapshot and deferred I/O
  virtio-net: Implement ndo_write_rx_mode callback

 drivers/net/virtio_net.c  |  55 +++-----
 include/linux/netdevice.h | 111 +++++++++++++++-
 net/core/dev.c            | 264 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 389 insertions(+), 41 deletions(-)

--
2.47.3