netdev - Re: [net-next,v25,4/6] eea: create/destroy rx,tx queues for netdevice open and stop

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1770256081.5424275-1-xuanzhuo@linux.alibaba.com>
Date: Thu, 5 Feb 2026 09:48:01 +0800
From: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: lorenzo@...nel.org,
 andrew+netdev@...n.ch,
 pabeni@...hat.com,
 vadim.fedorenko@...ux.dev,
 davem@...emloft.net,
 guwen@...ux.alibaba.com,
 lulie@...ux.alibaba.com,
 hkallweit1@...il.com,
 edumazet@...gle.com,
 lukas.bulwahn@...hat.com,
 andrew@...n.ch,
 dong100@...se.com,
 dust.li@...ux.alibaba.com,
 netdev@...r.kernel.org
Subject: Re: [net-next,v25,4/6] eea: create/destroy rx,tx queues for netdevice open and stop

On Tue, 3 Feb 2026 20:12:37 -0800, Jakub Kicinski <kuba@...nel.org> wrote:
> On Tue,  3 Feb 2026 20:00:55 -0800 Jakub Kicinski wrote:
> > > +	err = enet_bind_new_q_and_cfg(enet, ctx);
> > > +	if (err) {
> > > +		netdev_err(enet->netdev,
> > > +			   "eea reset: bind new queues failed. err %d\n",
> > > +			   err);
> > > +
> > > +		return err;
> > > +	}
> >
> > When enet_bind_new_q_and_cfg() fails, what happens to the queues allocated
> > by eea_alloc_rxtx_q_mem() at line 289? They're now assigned to ctx->rx and
> > ctx->tx but haven't been bound to enet yet.
> >
> > After eea_netdev_stop() sets enet->started = false, a subsequent call to
> > eea_netdev_stop() will return early at line 228 without calling
> > eea_free_rxtx_q_mem(). If enet_bind_new_q_and_cfg() fails before binding,
> > the queues remain in ctx with no cleanup path.
> >
> > The comment suggests deferring cleanup to "normal NIC cleanup" but
> > eea_net_remove() doesn't call eea_free_rxtx_q_mem(), and future reset
> > attempts would allocate new queues without freeing these.
>
> I think AI is slightly confused here but so am I. I don't get where you
> free he previous resources in this flow. The "bind_new_q_and_cfg" just
> overrides stuff, who frees the old set of rings?
>
> Also as I already mentioned in previous manual review you are not
> pre-allocating enough. You should also request necessary extra IRQs
> _before_ you start tearing down the old state.

Regarding this point, I did some experimentation and believe there are several
issues. First, since request_irq requires passing the RX structure as callback
data, we cannot reuse the same IRQ directlyâ€”we must first free the existing
IRQ and then request a new one. To safely free an IRQ, the most reasonable
approach is to stop the backend NIC's operations before releasing the IRQ. This
constraint means we can only free the IRQ after tearing down the NIC. I've
reviewed implementations of other NIC drivers, and they generally follow similar
logic.

Moreover, I believe it would be quite difficult to achieve this with only minor
modifications. In the future, we might consider introducing a proxy or an
intermediate structure to serve as the IRQ callback data, which could enable IRQ
reuse. However, that would likely require more substantial changes. Given that
we've already done extensive review and refinement, I'd prefer to avoid
introducing major refactoring at this stage.

Thanks.



>
> > > +static struct sk_buff *eea_rx_build_split_hdr_skb(struct eea_net_rx *rx,
> > > +						  struct eea_rx_ctx *ctx)
> > > +{
> > > +	struct eea_rx_meta *meta = ctx->meta;
> > > +	struct sk_buff *skb;
> > > +	u32 truesize;
> > > +
> > > +	dma_sync_single_for_cpu(rx->enet->edev->dma_dev, meta->hdr_dma,
> > > +				ctx->hdr_len, DMA_FROM_DEVICE);
> > > +
> > > +	skb = napi_alloc_skb(&rx->napi, ctx->hdr_len);
> > > +	if (unlikely(!skb))
> > > +		return NULL;
> > > +
> > > +	truesize = meta->headroom + ctx->len;
> > > +
> > > +	skb_put_data(skb, ctx->meta->hdr_addr, ctx->hdr_len);
> > > +
> > > +	if (ctx->len) {
> > > +		skb_add_rx_frag(skb, 0, meta->page,
> > > +				meta->offset + meta->headroom,
> > > +				ctx->len, truesize);
> > > +
> > > +		eea_consume_rx_buffer(rx, meta, truesize);
> > > +	}
> >
> > Is the truesize calculation correct for split header mode? Looking at line
> > 255, truesize is calculated as meta->headroom + ctx->len.
> >
> > In eea_rx_post() at line 500, buffers are allocated with space for
> > [headroom][data][tailroom], where tailroom is typically 128 bytes for
> > skb_shared_info. The tailroom is reserved but not included in the truesize
> > calculation here.
> >
> > When eea_consume_rx_buffer() advances meta->offset by only
> > (headroom + data_len), the reserved tailroom space remains unconsumed. After
> > alignment in meta_align_offset(), the next fragment may overlap with the
> > previous fragment's tailroom space.
> >
> > Compare with the non-split header path in eea_rx_build_skb() at line 290,
> > which includes shinfo_size in truesize:
> >
> >     truesize = meta->headroom + ctx->len + shinfo_size;
> >
> > Should the split header path also include meta->tailroom or shinfo_size in
> > the truesize calculation?
>
> This one - I think the AI is just confused by how frags work.
>