Message-ID: <willemdebruijn.kernel.24bd73d3718ec@gmail.com>
Date: Tue, 18 Nov 2025 09:13:26 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jakub Kicinski <kuba@...nel.org>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: davem@...emloft.net,
netdev@...r.kernel.org,
edumazet@...gle.com,
pabeni@...hat.com,
andrew+netdev@...n.ch,
horms@...nel.org,
shuah@...nel.org,
sdf@...ichev.me,
krakauer@...gle.com,
linux-kselftest@...r.kernel.org
Subject: Re: [PATCH net-next 00/12] selftests: drv-net: convert GRO and
Toeplitz tests to work for drivers in NIPA
Jakub Kicinski wrote:
> On Mon, 17 Nov 2025 21:11:31 -0500 Willem de Bruijn wrote:
> > > Note that neither GRO nor the Toeplitz test fully passes for me on
> > > any HW I have access to. But this is unrelated to the conversion.
> >
> > You observed the same failures with the old and new tests? Are they
> > deterministic failures or flakes.
>
> Deterministic for Toeplitz - all NICs I have calculate the Rx
> hash the same as the test for at least one of traffic types.
> But none of them exactly as the test is expecting.
> One IIRC also uses non-standard RSS indir table pattern by default.
> The indirection table will be a trivial fix.

Ugh yes we've had a bug open for ages internally to add indirection
table parsing to the test:

  The (upstream) RSS test is too simplistic: it calculates

    queue_id = hash % num_queues

  Real RSS uses an indirection table:

    queue_id = indir_table[hash % indir_table_len]

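
To make the difference concrete, here is a minimal sketch in Python. The function names and the round-robin default table are illustrative assumptions, not code from the selftest. It shows that the two formulas disagree as soon as the table length is not a multiple of the queue count, or the table holds a non-trivial pattern.

```python
def queue_naive(rx_hash, num_queues):
    """Queue selection as the test currently computes it."""
    return rx_hash % num_queues


def queue_via_indir(rx_hash, indir_table):
    """Queue selection through an RSS indirection table."""
    return indir_table[rx_hash % len(indir_table)]


def round_robin_table(table_len, num_queues):
    """Hypothetical default table: entry i points at queue i % num_queues."""
    return [i % num_queues for i in range(table_len)]


if __name__ == "__main__":
    table = round_robin_table(128, 3)
    rx_hash = 128
    # 128 % 3 selects queue 2, but slot 128 % 128 = 0 of the table holds queue 0.
    print(queue_naive(rx_hash, 3), queue_via_indir(rx_hash, table))
```

With a power-of-two table and a power-of-two queue count the two results coincide, which may be why the simple formula passes on some setups.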
> For HW-GRO I investigated less closely I mostly focused on making sure
> netdevsim is solid as a replacement for veth. There was more flakiness
> on HW (admittedly I was running inter-dc-building). But the failures
> looked rather sus - the test was reporting that packets which were
> not supposed to be coalesced got coalesced.
The reverse is a known cause of flakiness, due to the context closure
timer firing. But unexpected coalescing definitely seems suspicious.
> BTW it's slightly inconvenient that we disable HW-GRO when normal GRO
> is disabled :( Makes it quite hard to run the test to check device
> behavior. My current plan is to rely on device counters to check
> whether traffic is getting coalesced but better ideas most welcome :(
We probably have to maintain this behavior, but could add an override
to enable only HW-GRO.
Alternatively, just for measurement, a bpf fentry program could work.
But that is a lot more complex than reading the counters, which give
sufficient signal.
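
A rough sketch of the counter approach, assuming the counters are read as `ethtool -S` style text before and after the test run. The counter name `rx_hw_gro_packets` is a made-up placeholder, since names vary per driver, and both helper functions are hypothetical.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines from `ethtool -S` style output into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header and any non-numeric lines.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def counter_delta(before, after, name):
    """Increase of one counter between two snapshots (missing counts as 0)."""
    return after.get(name, 0) - before.get(name, 0)


if __name__ == "__main__":
    # Placeholder snapshots; a real run would capture these from the device.
    before = parse_ethtool_stats("NIC statistics:\n     rx_hw_gro_packets: 10\n")
    after = parse_ethtool_stats("NIC statistics:\n     rx_hw_gro_packets: 25\n")
    print(counter_delta(before, after, "rx_hw_gro_packets"))
```

A non-zero delta across a test case would then indicate that the device coalesced traffic, independent of what the receiver observed.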
> > > This series is not making any real functional changes to the tests,
> > > it is limited to improving the "test harness" scripts.
> >
> > No significant actionable comments, just a few trivial typos.
>
> Thanks!