Message-ID: <20251104164625.5a18db43@kernel.org>
Date: Tue, 4 Nov 2025 16:46:25 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: I Viswanath <viswanathiyyappan@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com,
horms@...nel.org, sdf@...ichev.me, kuniyu@...gle.com, ahmed.zaki@...el.com,
aleksander.lobakin@...el.com, jacob.e.keller@...el.com,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
skhan@...uxfoundation.org, linux-kernel-mentees@...ts.linux.dev,
david.hunter.linux@...il.com, khalid@...nel.org
Subject: Re: [RFC/RFT PATCH net-next v3 1/2] net: Add ndo_write_rx_config
and helper structs and functions:

On Tue, 4 Nov 2025 22:13:49 +0530 I Viswanath wrote:
> On Fri, 31 Oct 2025 at 07:50, Jakub Kicinski <kuba@...nel.org> wrote:
> > The driver you picked is relatively trivial, advanced drivers need
> > to sync longer lists of mcast / ucast addresses. Bulk of the complexity
> > is in keeping those lists. Simple
> >
> > *rx_config = *(config_ptr);
> >
> > assignment is not enough.
>
> Apologies, I had the wrong mental model of the snapshot.
>
> From what I understand, the snapshot should look something like
>
> struct netif_rx_config {
> 	char *uc_addrs; // of size uc_count * dev->addr_len
> 	char *mc_addrs; // of size mc_count * dev->addr_len
> 	int uc_count;
> 	int mc_count;
> 	bool multi_en, promisc_en, vlan_en;
> 	void *device_specific_config;
> };
>
> Correct me if I have missed anything
>
> Does the following pseudocode/skeleton make sense?
>
> update_config() will be called at end of set_rx_mode()
>
> read_config() is execute_write_rx_config() and do_io() is
> dev->netdev_ops->ndo_write_rx_config() named that way
> for consistency (since read/update)
>
> atomic_t cfg_in_use = ATOMIC_INIT(false);
> atomic_t cfg_update_pending = ATOMIC_INIT(false);
>
> struct netif_rx_config *active, *staged;
>
> void update_config()
> {
> 	int was_config_pending = atomic_xchg(&cfg_update_pending, false);
>
> 	// If prepare_config fails, it leaves staged untouched
> 	// So, we check for and apply if pending update
> 	int rc = prepare_config(&staged);
> 	if (rc && !was_config_pending)
> 		return;
>
> 	if (atomic_read(&cfg_in_use)) {
> 		atomic_set(&cfg_update_pending, true);
> 		return;
> 	}
> 	swap(active, staged);
> }
>
> void read_config()
> {
> 	atomic_set(&cfg_in_use, true);
> 	do_io(active);
> 	atomic_set(&cfg_in_use, false);
>
> 	// To account for the edge case where update_config() is called
> 	// during the execution of read_config() and there are no subsequent
> 	// calls to update_config()
> 	if (atomic_xchg(&cfg_update_pending, false))
> 		swap(active, staged);
> }

I wouldn't use atomic flags. IIRC ndo_set_rx_mode is called under
netif_addr_lock_bh(), so we can reuse that lock, have update_config()
assume ownership of the pending config and update it directly.
And read_config() (which IIUC runs from a wq) can take that lock
briefly, and swap which config is pending.
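
To make the shape concrete, here is a rough user-space sketch of that
scheme, not kernel code. A pthread mutex stands in for
netif_addr_lock_bh(), struct rx_config is a cut-down hypothetical
snapshot, and set_rx_mode(), pending_dirty and io_writes are names
invented for the illustration. update_config() assumes the lock is
already held by its caller; read_config() only holds it for the pointer
swap and does the slow write afterwards.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, cut-down snapshot; the real layout is still being discussed. */
struct rx_config {
	int uc_count;
	int mc_count;
	bool promisc_en;
};

/* Stand-in for netif_addr_lock_bh(); a plain mutex in this sketch. */
static pthread_mutex_t addr_lock = PTHREAD_MUTEX_INITIALIZER;

static struct rx_config cfg_a, cfg_b;
static struct rx_config *pending = &cfg_a; /* written only under addr_lock */
static struct rx_config *active = &cfg_b;  /* owned by read_config() */
static bool pending_dirty;                 /* protected by addr_lock */
static int io_writes;                      /* counts simulated device writes */

/* Caller holds addr_lock, as the rx mode callback is assumed to. */
static void update_config(int uc, int mc, bool promisc)
{
	pending->uc_count = uc;
	pending->mc_count = mc;
	pending->promisc_en = promisc;
	pending_dirty = true;
}

/* Hypothetical caller that mimics the core taking the lock first. */
static void set_rx_mode(int uc, int mc, bool promisc)
{
	pthread_mutex_lock(&addr_lock);
	update_config(uc, mc, promisc);
	pthread_mutex_unlock(&addr_lock);
}

/* Placeholder for the slow device write; runs without the lock. */
static void do_io(const struct rx_config *cfg)
{
	(void)cfg;
	io_writes++;
}

/* Work item body: swap under the lock, then write outside of it. */
static bool read_config(void)
{
	bool dirty;

	pthread_mutex_lock(&addr_lock);
	dirty = pending_dirty;
	if (dirty) {
		struct rx_config *tmp = active;

		active = pending;
		pending = tmp;
		pending_dirty = false;
	}
	pthread_mutex_unlock(&addr_lock);

	if (dirty)
		do_io(active);
	return dirty;
}
```

Two back-to-back updates before the work item runs collapse into one
device write, and no atomics are needed because both the pointers and
the dirty flag are only touched under the lock.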

> > The driver needs to know old and new entries
> > and send ADD/DEL commands to FW. Converting virtio_net would be better,
> > but it does one huge dump which is also not representative of most
> > advanced NICs.
>
> We can definitely do this in prepare_config()
> Speaking of which, How big can uc_count and mc_count be?
>
> Would krealloc(buffer, uc_count * dev->addr_len, GFP_ATOMIC) be a good idea?

Not sure about the max value but I'd think low thousands is probably
a good target. IOW yes, I think one linear buffer may be a concern.
I'd think order 1 allocation may be fine tho..
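
For a sense of scale, the arithmetic can be sketched as below. This
assumes 4 KiB pages and 6-byte (ETH_ALEN) addresses; alloc_order() and
max_addrs() are hypothetical helpers written for this illustration,
similar in spirit to the kernel's get_order() but not taken from it.

```c
#include <stddef.h>

/* Smallest order such that (page_size << order) covers 'bytes'. */
static unsigned int alloc_order(size_t bytes, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	while (span < bytes) {
		span <<= 1;
		order++;
	}
	return order;
}

/* How many addr_len-sized addresses fit in one allocation of 'order'. */
static size_t max_addrs(unsigned int order, size_t page_size, size_t addr_len)
{
	return (page_size << order) / addr_len;
}
```

Under those assumptions max_addrs(1, 4096, 6) is 1365, so a single
order 1 buffer covers a bit over a thousand addresses, while
alloc_order(2000 * 6, 4096) comes out as 2.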

> Well, virtio-net does kmalloc((uc_count + mc_count) * ETH_ALEN) + ...,
> GFP_ATOMIC),
> so this shouldn't introduce any new failures for virtio-net

Right but IDK if virtio is used on systems with the same sort of scale
as a large physical function driver..

> > Let's only allocate any extra state if driver has the NDO
> > We need to shut down sooner, some time between ndo_stop and ndo_uninit
>
> Would it make sense to move init (if ndo exists) and cleanup to
> __dev_open and __dev_close?

Yes, indeed.