Message-ID: <CAM0EoMkBb+d_5dn6vdtSxPJ-HuUUL9uei65euSQfX3bXYm9RAw@mail.gmail.com>
Date: Sat, 31 Jan 2026 11:51:04 -0500
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Paul Moses <p@....org>
Cc: netdev@...r.kernel.org, xiyou.wangcong@...il.com, jiri@...nulli.us, 
	davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com, 
	horms@...nel.org, linux-kernel@...r.kernel.org, stable@...r.kernel.org, 
	Po Liu <Po.Liu@....com>
Subject: Re: [PATCH net] net: sched: act_api: size RTM_GETACTION reply by fill size


On Fri, Jan 30, 2026 at 3:48 PM Paul Moses <p@....org> wrote:
>
> What version of act_gate.c are you currently testing?

I am running plain Ubuntu on this machine, using their shipped 6.8.0 kernel.
But I did look at the latest kernel tree and the dumping code has not changed.
+Cc Po Liu, who I believe added that code.

> Did you actually run the tests? “large dump” creates ONE action at
> base_index, with num_entries=100, then immediately does GETACTION. So
> “tc actions ls action gate | grep index | wc -l” won’t exercise this,
> because it only counts actions. It doesn’t amplify the per action dump
> size (the entry list does). It uses libmnl (mnl_socket_sendto /
> mnl_socket_recvfrom) with MNL_SOCKET_BUFFER_SIZE. There is no custom
> netlink handling. The failure is returned by the kernel before
> userspace parses anything. The dumps are transactional at the netlink
> level, but an individual action dump still has to fit in the skb
> backing that message.

Sorry - I am not running your code (I didn't want to compile anything on
this machine), just plain tc. I have to admit I don't know much
about the mechanics or spec for gate, so my example is based on
something Po Liu posted. Here's a script to add 100 entries:
---
for i in {1..100}; do
    echo "$i"
    tc actions add action gate clockid CLOCK_TAI \
        sched-entry open 200000000 -1 8000000 \
        sched-entry close 100000000 -1 -1
done
---

Then dumping:

$ sudo tc actions ls action gate | grep index
index 1 ref 1 bind 0
index 2 ref 1 bind 0
index 3 ref 1 bind 0
index 4 ref 1 bind 0
index 5 ref 1 bind 0
index 6 ref 1 bind 0
..
...
....
index 95 ref 1 bind 0
index 96 ref 1 bind 0
index 97 ref 1 bind 0
index 98 ref 1 bind 0
index 99 ref 1 bind 0
index 100 ref 1 bind 0
$


>
> look at af_netlink.c
>         /* NLMSG_GOODSIZE is small to avoid high order allocations being
>          * required, but it makes sense to _attempt_ a 32KiB allocation
>          * to reduce number of system calls on dump operations, if user
>          * ever provided a big enough buffer.
>          */
>          ...
>         /* Trim skb to allocated size. User is expected to provide buffer as
>          * large as max(min_dump_alloc, 32KiB (max_recvmsg_len capped at
>          * netlink_recvmsg())). dump will pack as many smaller messages as
>          * could fit within the allocated skb. skb is typically allocated
>          * with larger space than required (could be as much as near 2x the
>          * requested size with align to next power of 2 approach). Allowing
>          * dump to use the excess space makes it difficult for a user to have a
>          * reasonable static buffer based on the expected largest dump of a
>          * single netdev. The outcome is MSG_TRUNC error.
>          */
>
> This is where I am currently but I have seen these bugs appear throughout all my iterations including what's in the tree currently, if you show me better alternatives that solve my problems, I'll gladly accept.
> https://github.com/torvalds/linux/compare/master...jopamo:linux:net-stable-upstream-v4
>

I don't see a problem with "dump" as you seem to be suggesting. I asked
earlier whether you can create a single entry (not 100 as shown above)
that will consume more than NLMSG_GOODSIZE. My
limited knowledge is not helping me see such a scenario.
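
One way to probe the single-action case with plain tc would be to put many
sched-entry items into ONE action instead of creating many actions. The
sketch below only builds and prints such a command; it does not run it. The
helper name build_gate_cmd and the count of 100 are hypothetical, the entry
values are copied from the script above, and whether tc accepts that many
sched-entry arguments on one command line is an assumption I have not checked.

```shell
# Build (but do not execute) a tc command that adds ONE gate action
# carrying N sched-entry items, alternating the open/close entries
# used in the script above. build_gate_cmd is a hypothetical helper.
build_gate_cmd() {
    n=$1
    cmd="tc actions add action gate clockid CLOCK_TAI"
    i=0
    while [ "$i" -lt "$n" ]; do
        if [ $((i % 2)) -eq 0 ]; then
            cmd="$cmd sched-entry open 200000000 -1 8000000"
        else
            cmd="$cmd sched-entry close 100000000 -1 -1"
        fi
        i=$((i + 1))
    done
    printf '%s\n' "$cmd"
}

# Sanity check: count the sched-entry keywords in the generated command.
build_gate_cmd 100 | tr ' ' '\n' | grep -c '^sched-entry$'
```

If the generated command is accepted, a follow-up "tc actions ls action gate"
would then show whether a single large action still dumps cleanly.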

I looked at how the 100 entries are dumped, and I see the following:
$ sudo tc actions ls action gate | grep total
total acts 12
total acts 12
total acts 76

User space received batches of 12, 12, and finally 76 before it
received an empty message with NLMSG_DONE.
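
As a quick cross-check that the batches account for every action, the
"total acts" lines can be summed. This is a minimal sketch: the helper name
sum_total_acts is hypothetical, and it assumes each batch line has the form
"total acts N" as in the output above. The sample input is copied from that
output; on a live system one would pipe "sudo tc actions ls action gate"
into the function instead.

```shell
# Sum the per-batch "total acts N" lines and count the batches.
# sum_total_acts is a hypothetical helper; it reads tc output on stdin.
sum_total_acts() {
    awk '/^total acts / { sum += $3; batches++ }
         END { printf "%d actions in %d batches\n", sum, batches }'
}

# Sample input taken from the dump shown above.
printf 'total acts 12\ntotal acts 12\ntotal acts 76\n' | sum_total_acts
```

For the three batches above this adds up to the 100 actions that were
created.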

cheers,
jamal
