Message-ID: <1770205572.2731752-1-xuanzhuo@linux.alibaba.com>
Date: Wed, 4 Feb 2026 19:46:12 +0800
From: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: lorenzo@...nel.org,
andrew+netdev@...n.ch,
pabeni@...hat.com,
vadim.fedorenko@...ux.dev,
davem@...emloft.net,
guwen@...ux.alibaba.com,
lulie@...ux.alibaba.com,
hkallweit1@...il.com,
edumazet@...gle.com,
lukas.bulwahn@...hat.com,
andrew@...n.ch,
dong100@...se.com,
dust.li@...ux.alibaba.com,
netdev@...r.kernel.org
Subject: Re: [net-next,v25,4/6] eea: create/destroy rx,tx queues for netdevice open and stop
On Tue, 3 Feb 2026 20:12:37 -0800, Jakub Kicinski <kuba@...nel.org> wrote:
> On Tue, 3 Feb 2026 20:00:55 -0800 Jakub Kicinski wrote:
> > > + err = enet_bind_new_q_and_cfg(enet, ctx);
> > > + if (err) {
> > > + netdev_err(enet->netdev,
> > > + "eea reset: bind new queues failed. err %d\n",
> > > + err);
> > > +
> > > + return err;
> > > + }
> >
> > When enet_bind_new_q_and_cfg() fails, what happens to the queues allocated
> > by eea_alloc_rxtx_q_mem() at line 289? They're now assigned to ctx->rx and
> > ctx->tx but haven't been bound to enet yet.
> >
> > After eea_netdev_stop() sets enet->started = false, a subsequent call to
> > eea_netdev_stop() will return early at line 228 without calling
> > eea_free_rxtx_q_mem(). If enet_bind_new_q_and_cfg() fails before binding,
> > the queues remain in ctx with no cleanup path.
> >
> > The comment suggests deferring cleanup to "normal NIC cleanup" but
> > eea_net_remove() doesn't call eea_free_rxtx_q_mem(), and future reset
> > attempts would allocate new queues without freeing these.
>
> I think AI is slightly confused here but so am I. I don't get where you
> free the previous resources in this flow. The "bind_new_q_and_cfg" just
> overrides stuff, who frees the old set of rings?
	err = eea_alloc_rxtx_q_mem(ctx);
	if (err) {
		netdev_warn(enet->netdev,
			    "eea reset: alloc q failed. stop reset. err %d\n",
			    err);
		return err;
	}

	eea_netdev_stop(enet->netdev);   <---- eea_free_rxtx_q_mem() is called here

	err = enet_bind_new_q_and_cfg(enet, ctx);
	if (err) {
		netdev_err(enet->netdev,
			   "eea reset: bind new queues failed. err %d\n",
			   err);
		return err;
	}

	err = eea_active_ring_and_irq(enet);
	if (err) {
		netdev_err(enet->netdev,
			   "eea reset: active new ring and irq failed. err %d\n",
			   err);
		return err;
	}

	eea_start_rxtx(enet->netdev);
>
> Also as I already mentioned in previous manual review you are not
> pre-allocating enough. You should also request necessary extra IRQs
> _before_ you start tearing down the old state.
I might have misunderstood your point; I will implement it that way in the next
version.
Thanks.
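
For illustration only, the ordering requested above (acquire every resource
that can fail, IRQs included, before touching the old state) could be
sketched in standalone C as below. Every name here (struct ring_set,
prealloc, release, dev_reset, the fail_irq flag) is a hypothetical stand-in
and not eea driver code; it is a toy model of the flow, not the planned
implementation.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the eea driver. */
struct ring_set {
	int *rings;	/* placeholder for rx/tx queue memory */
	int nirqs;	/* placeholder for reserved IRQ vectors */
};

struct dev {
	struct ring_set *cur;	/* resources owned by the running device */
	bool started;
};

static int live_sets;	/* number of ring sets allocated and not yet freed */

/* Allocate rings and "IRQs" together; fail_irq simulates an IRQ shortage. */
static struct ring_set *prealloc(int nrings, int nirqs, bool fail_irq)
{
	struct ring_set *rs = calloc(1, sizeof(*rs));

	if (!rs)
		return NULL;

	rs->rings = calloc(nrings, sizeof(*rs->rings));
	if (!rs->rings || fail_irq) {
		free(rs->rings);
		free(rs);
		return NULL;
	}

	rs->nirqs = nirqs;
	live_sets++;
	return rs;
}

static void release(struct ring_set *rs)
{
	if (!rs)
		return;

	free(rs->rings);
	free(rs);
	live_sets--;
}

/* Returns 0 on success; on failure the old state is left untouched. */
static int dev_reset(struct dev *d, int nrings, int nirqs, bool fail_irq)
{
	struct ring_set *new_rs, *old_rs;

	/* 1. Acquire everything that can fail while the old state still runs. */
	new_rs = prealloc(nrings, nirqs, fail_irq);
	if (!new_rs)
		return -1;

	/* 2. Only now tear down: swap in the new set and free the old one. */
	old_rs = d->cur;
	d->cur = new_rs;
	d->started = true;
	release(old_rs);

	return 0;
}
```

In this model a failed dev_reset() returns before step 2, so the device keeps
its previous rings and nothing is leaked; the old set is freed only after the
new one is fully in hand.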
>
> > > +static struct sk_buff *eea_rx_build_split_hdr_skb(struct eea_net_rx *rx,
> > > + struct eea_rx_ctx *ctx)
> > > +{
> > > + struct eea_rx_meta *meta = ctx->meta;
> > > + struct sk_buff *skb;
> > > + u32 truesize;
> > > +
> > > + dma_sync_single_for_cpu(rx->enet->edev->dma_dev, meta->hdr_dma,
> > > + ctx->hdr_len, DMA_FROM_DEVICE);
> > > +
> > > + skb = napi_alloc_skb(&rx->napi, ctx->hdr_len);
> > > + if (unlikely(!skb))
> > > + return NULL;
> > > +
> > > + truesize = meta->headroom + ctx->len;
> > > +
> > > + skb_put_data(skb, ctx->meta->hdr_addr, ctx->hdr_len);
> > > +
> > > + if (ctx->len) {
> > > + skb_add_rx_frag(skb, 0, meta->page,
> > > + meta->offset + meta->headroom,
> > > + ctx->len, truesize);
> > > +
> > > + eea_consume_rx_buffer(rx, meta, truesize);
> > > + }
> >
> > Is the truesize calculation correct for split header mode? Looking at line
> > 255, truesize is calculated as meta->headroom + ctx->len.
> >
> > In eea_rx_post() at line 500, buffers are allocated with space for
> > [headroom][data][tailroom], where tailroom is typically 128 bytes for
> > skb_shared_info. The tailroom is reserved but not included in the truesize
> > calculation here.
> >
> > When eea_consume_rx_buffer() advances meta->offset by only
> > (headroom + data_len), the reserved tailroom space remains unconsumed. After
> > alignment in meta_align_offset(), the next fragment may overlap with the
> > previous fragment's tailroom space.
> >
> > Compare with the non-split header path in eea_rx_build_skb() at line 290,
> > which includes shinfo_size in truesize:
> >
> > truesize = meta->headroom + ctx->len + shinfo_size;
> >
> > Should the split header path also include meta->tailroom or shinfo_size in
> > the truesize calculation?
>
> This one - I think the AI is just confused by how frags work.
>