[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20260205122953.lscemcctayrvszdu@skbuf>
Date: Thu, 5 Feb 2026 14:29:53 +0200
From: Vladimir Oltean <vladimir.oltean@....com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Larysa Zaremba <larysa.zaremba@...el.com>, bpf@...r.kernel.org,
Claudiu Manoil <claudiu.manoil@....com>,
Wei Fang <wei.fang@....com>, Clark Wang <xiaoning.wang@....com>,
Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Tony Nguyen <anthony.l.nguyen@...el.com>,
Przemek Kitszel <przemyslaw.kitszel@...el.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
Stanislav Fomichev <sdf@...ichev.me>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>,
Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>,
KP Singh <kpsingh@...nel.org>, Hao Luo <haoluo@...gle.com>,
Jiri Olsa <jolsa@...nel.org>, Simon Horman <horms@...nel.org>,
Shuah Khan <shuah@...nel.org>,
Alexander Lobakin <aleksander.lobakin@...el.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
"Bastien Curutchet (eBPF Foundation)" <bastien.curutchet@...tlin.com>,
Tushar Vyavahare <tushar.vyavahare@...el.com>,
Jason Xing <kernelxing@...cent.com>,
Ricardo B. Marlière <rbm@...e.com>,
Eelco Chaudron <echaudro@...hat.com>,
Lorenzo Bianconi <lorenzo@...nel.org>,
Toke Hoiland-Jorgensen <toke@...hat.com>, imx@...ts.linux.dev,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
intel-wired-lan@...ts.osuosl.org, linux-kselftest@...r.kernel.org,
Aleksandr Loktionov <aleksandr.loktionov@...el.com>
Subject: Re: [PATCH bpf 6/6] net: enetc: use truesize as XDP RxQ info
frag_size
On Wed, Feb 04, 2026 at 05:34:01PM -0800, Jakub Kicinski wrote:
> On Thu, 5 Feb 2026 02:59:01 +0200 Vladimir Oltean wrote:
> > Thanks! This is an extremely subtle corner case. I appreciate the patch
> > and explanation.
> >
> > I did run tests on the blamed commit (which I still have), but to catch
> > a real issue in a meaningful way it would have been required to have a
> > program which calls bpf_xdp_adjust_tail() with a very large offset.
> > I'm noting that I'm seeing the WARN_ON() much easier after your fix, but
> > before, it was mostly inconsequential for practical cases.
> >
> > Namely, the ENETC truesize is 2048, and XDP_PACKET_HEADROOM is 256.
> > First buffers also contain the skb_shared_info (320 bytes), while
> > subsequent buffers don't.
>
> I can't wrap my head around this series, hope you can tell me where I'm
> going wrong. AFAICT enetc splits the page into two halves for small MTU.
>
> So we have
>
> | 2k | 2k |
> ----------------------------- -----------------------------
> | hroom | data | troom/shinfo | hroom | data | troom/shinfo |
> ----------------------------- -----------------------------
>
> If we attach the second chunk as frag well have:
> offset = 2k + hroom
> size = data.len
> But we use
> truesize / frag_size = 2k
> so
> tailroom = rxq->frag_size - skb_frag_size(frag) - skb_frag_off(frag);
> tailroom = 2k - data.len - 2k
> tailroom = -data.len
> WARN(tailroom < 0) -> yes
>
> The frag_size thing is unusable for any driver that doesn't hand out
> full pages to frags?
This is an excellent question.
Yes, you're right, bpf_xdp_frags_increase_tail() only has a 50% chance
of working - the paged data has to be in the first half of the page,
otherwise the tailroom calculations are not correct due to rxq->frag_size,
and the WARN_ON() will trigger.
The reason why I didn't notice this during my testing is stupid. I was
attaching the BPF program to the interface and then detaching it after
each test, and each test was sending less than the RX ring size (2048)
worth of packets. So all multi-buffer frames were using buffers which
were fresh out of enetc_setup_rxbdr() -> ... -> enetc_new_page() (first
halves) and never out of flipped pages (enetc_bulk_flip_buff()).
This seems to be a good reason to convert this driver to use page pool,
which I can look into. I'm not sure that there's anything that can be
done to make the rxq->frag_size mechanism compatible with the current
buffer allocation scheme.
Powered by blists - more mailing lists