Message-ID: <pXV1wsavqcYDq5HfAVaW_gMoTITR9M0PBWKhnz9n6VHYxhW56DQU7qfCEoaYcCixz4iqrj31Mt9vL9bHqTNGygLK5pYvyw1z3san5ndlkkQ=@1g4.org>
Date: Fri, 30 Jan 2026 20:47:52 +0000
From: Paul Moses <p@....org>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: netdev@...r.kernel.org, xiyou.wangcong@...il.com, jiri@...nulli.us, davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com, horms@...nel.org, linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH net] net: sched: act_api: size RTM_GETACTION reply by fill size
What version of act_gate.c are you currently testing, and did you actually run the tests?

The “large dump” selftest creates ONE action at base_index with num_entries=100 and then immediately does a GETACTION on it. “tc actions ls action gate | grep index | wc -l” won’t exercise this path: it only counts actions, and the number of actions does nothing to grow the per-action dump size; the gate entry list is what grows it.

The tool uses libmnl (mnl_socket_sendto / mnl_socket_recvfrom) with MNL_SOCKET_BUFFER_SIZE and no custom netlink handling, so the failure is returned by the kernel before userspace parses anything. The dumps are transactional at the netlink level, but an individual action dump still has to fit in the skb backing that message.
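For reference, the receive side is just the stock libmnl pattern; a minimal sketch below (my own condensation, not the literal gatebench code; do_getaction is a made-up name and the RTM_GETACTION request construction is elided):

#include <libmnl/libmnl.h>
#include <unistd.h>

/* Sketch: send an already-built RTM_GETACTION request and run the
 * callback over whatever the kernel replies with. The buffer is the
 * stock MNL_SOCKET_BUFFER_SIZE (min(page size, 8KiB)). */
static int do_getaction(struct mnl_socket *nl, struct nlmsghdr *nlh,
			mnl_cb_t cb, void *data)
{
	char buf[MNL_SOCKET_BUFFER_SIZE];
	ssize_t len;

	if (mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) < 0)
		return -1;

	len = mnl_socket_recvfrom(nl, buf, sizeof(buf));
	if (len < 0)
		return -1;

	/* Runs cb over each nlmsg in buf; an NLMSG_ERROR from the
	 * kernel shows up here as a negative return with errno set. */
	return mnl_cb_run(buf, len, nlh->nlmsg_seq,
			  mnl_socket_get_portid(nl), cb, data);
}

So whatever error the kernel produces while building the reply is what this returns; there is no userspace buffering cleverness involved.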
Look at net/netlink/af_netlink.c (both comments below are from netlink_dump()):
/* NLMSG_GOODSIZE is small to avoid high order allocations being
* required, but it makes sense to _attempt_ a 32KiB allocation
* to reduce number of system calls on dump operations, if user
* ever provided a big enough buffer.
*/
...
/* Trim skb to allocated size. User is expected to provide buffer as
* large as max(min_dump_alloc, 32KiB (max_recvmsg_len capped at
* netlink_recvmsg())). dump will pack as many smaller messages as
* could fit within the allocated skb. skb is typically allocated
* with larger space than required (could be as much as near 2x the
* requested size with align to next power of 2 approach). Allowing
* dump to use the excess space makes it difficult for a user to have a
* reasonable static buffer based on the expected largest dump of a
* single netdev. The outcome is MSG_TRUNC error.
*/
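Condensed, the allocation those comments sit on looks roughly like this (paraphrased from memory, so the exact flags and structure may differ between kernel versions):

	/* netlink_dump(), condensed sketch -- not a verbatim quote */
	alloc_min_size = max_t(int, cb->min_dump_alloc, NLMSG_GOODSIZE);

	if (alloc_min_size < nlk->max_recvmsg_len) {
		/* opportunistic larger allocation if the user has been
		 * reading with a big buffer (max_recvmsg_len is capped
		 * at 32KiB in netlink_recvmsg()) */
		alloc_size = nlk->max_recvmsg_len;
		skb = alloc_skb(alloc_size,
				(GFP_KERNEL & ~__GFP_DIRECT_RECLAIM) |
				__GFP_NOWARN | __GFP_NORETRY);
	}
	if (!skb) {
		alloc_size = alloc_min_size;
		skb = alloc_skb(alloc_size, GFP_KERNEL);
	}
	...
	/* the "Trim skb to allocated size" comment above is here */
	skb_reserve(skb, skb_tailroom(skb) - alloc_size);

As I read it, a single action whose fill exceeds the allocated skb cannot be split across messages the way several small actions can, so min_dump_alloc has to account for the largest single action, which is what sizing the reply by the fill size is trying to achieve.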
This is where I am currently, but I have seen these bugs appear throughout all my iterations, including what's in the tree right now. If you can show me better alternatives that solve my problems, I'll gladly accept them.
https://github.com/torvalds/linux/compare/master...jopamo:linux:net-stable-upstream-v4
gatebench --selftest
Configuration:
Iterations per run: 1000
Warmup iterations: 100
Runs: 5
Gate entries: 10
Gate interval: 1000000 ns
Starting index: 1000
CPU pinning: no
Netlink timeout: 1000 ms
Selftest: yes
JSON output: no
Sampling: no
Clock ID: 11
Base time: 0 ns
Cycle time: 0 ns
Cycle time ext: 0 ns
Environment:
Kernel: Linux 6.18.7 x86_64
Current CPU: 7
Clock source: CLOCK_MONOTONIC_RAW
Running selftests...
Running 20 selftests...
create missing parms PASS (got -22)
create missing entry list PASS (got -22)
create empty entry list PASS (got -22)
create zero interval PASS (got -22)
create bad clockid PASS (got -22)
replace without existing PASS (got 0)
duplicate create PASS (got -17)
dump correctness PASS (got 0)
replace persistence PASS (got 0)
clockid variants PASS (got 0)
cycle time derivation PASS (got 0)
cycle time extension parsing PASS (got 0)
replace preserve schedule PASS (got 0)
base time update PASS (got 0)
multiple entries PASS (got 0)
malformed nesting PASS (got -22)
bad attribute size PASS (got -22)
param validation PASS (got 0)
replace invalid PASS (got 0)
large dump DEBUG: msg->len = 3112
PASS (got 0)
Selftests: 20/20 passed
Selftests passed
Running benchmark...
Run 1/5... done (311721.5 ops/sec)
Run 2/5... done (321045.7 ops/sec)
Run 3/5... done (336402.3 ops/sec)
Run 4/5... done (338419.7 ops/sec)
Run 5/5... done (316618.9 ops/sec)
Benchmark completed successfully