Message-ID: <CAM0EoMnseQw6H+a4wzhg7BkPJraFwN-=2x4FOSOUp5f7=XbyaQ@mail.gmail.com>
Date: Sat, 31 Jan 2026 12:14:01 -0500
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Paul Moses <p@....org>
Cc: netdev@...r.kernel.org, xiyou.wangcong@...il.com, jiri@...nulli.us, 
	davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com, 
	horms@...nel.org, linux-kernel@...r.kernel.org, stable@...r.kernel.org, 
	Po Liu <Po.Liu@....com>
Subject: Re: [PATCH net] net: sched: act_api: size RTM_GETACTION reply by fill size

On Sat, Jan 31, 2026 at 11:51 AM Jamal Hadi Salim <jhs@...atatu.com> wrote:
>
> .
>
> On Fri, Jan 30, 2026 at 3:48 PM Paul Moses <p@....org> wrote:
> >
> > What version of act_gate.c are you currently testing?
>
> I am running plain ubuntu on this machine using their shipped kernel 6.8.0.
> But i did look at the latest kernel tree and the dumping code has not changed.
> +Cc Po Liu who i believe added that code.
>
> >Did you actually run the tests? “large dump” creates ONE action at base_index, with num_entries=100, then immediately does GETACTION. So “tc actions ls action gate | grep index | wc -l” won’t exercise this, because it only counts actions. It doesn’t amplify the per action dump size (the entry list does). It uses libmnl (mnl_socket_sendto / mnl_socket_recvfrom) with MNL_SOCKET_BUFFER_SIZE. There is no custom netlink handling. The failure is returned by the kernel before userspace parses anything. The dumps are transactional at the netlink level, but an individual action dump still has to fit in the skb backing that message.
>
> Sorry - I am not running your code (didnt want to compile anything on
> this machine), just plain tc and i have to admit I dont know much
> about the mechanics or spec for gate, so my example is based on
> something Po Liu posted, here's a script to add 100 entries:
> ---
> for i in {1..100}; do
>     echo "$i"
>     tc actions add action gate clockid CLOCK_TAI sched-entry open \
>         200000000 -1 8000000 sched-entry close 100000000 -1 -1
> done
> ---
>
> Then dumping:
>
> $ sudo tc actions ls action gate | grep index
> index 1 ref 1 bind 0
> index 2 ref 1 bind 0
> index 3 ref 1 bind 0
> index 4 ref 1 bind 0
> index 5 ref 1 bind 0
> index 6 ref 1 bind 0
> ..
> ...
> ....
> index 95 ref 1 bind 0
> index 96 ref 1 bind 0
> index 97 ref 1 bind 0
> index 98 ref 1 bind 0
> index 99 ref 1 bind 0
> index 100 ref 1 bind 0
> $
>
>
> >
> > look at af_netlink.c
> >         /* NLMSG_GOODSIZE is small to avoid high order allocations being
> >          * required, but it makes sense to _attempt_ a 32KiB allocation
> >          * to reduce number of system calls on dump operations, if user
> >          * ever provided a big enough buffer.
> >          */
> >          ...
> >         /* Trim skb to allocated size. User is expected to provide buffer as
> >          * large as max(min_dump_alloc, 32KiB (max_recvmsg_len capped at
> >          * netlink_recvmsg())). dump will pack as many smaller messages as
> >          * could fit within the allocated skb. skb is typically allocated
> >          * with larger space than required (could be as much as near 2x the
> >          * requested size with align to next power of 2 approach). Allowing
> >          * dump to use the excess space makes it difficult for a user to have a
> >          * reasonable static buffer based on the expected largest dump of a
> >          * single netdev. The outcome is MSG_TRUNC error.
> >          */
> >
> > This is where I am currently but I have seen these bugs appear throughout all my iterations including what's in the tree currently, if you show me better alternatives that solve my problems, I'll gladly accept.
> > https://github.com/torvalds/linux/compare/master...jopamo:linux:net-stable-upstream-v4
> >
>
> I dont see a problem with "dump" as you seem to be suggesting. I asked
> earlier if it is possible that you can create some single entry - not
> 100 as shown above that will consume more than NLMSG_GOODSIZE? My
> limited knowledge is not helping me see such a scenario.

Aha. I think there is a terminology mixup ;->
"dump" (a very unfortunate use of that word in the netlink world ;->)
is a very special word. So when you take a dump in this world you are
GETing a whole table. In this case all the gate actions.
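
To make the distinction concrete, here is a minimal sketch (in Python, not
taken from the thread) of the two kinds of request header. The constant
values are the ones in the Linux uapi headers; the helper names
build_nlmsghdr and is_dump are hypothetical. Only the 16-byte struct
nlmsghdr is built here; a real RTM_GETACTION request would also carry a
struct tcamsg and attributes.

```python
import struct

# Constants as defined in <linux/netlink.h> and <linux/rtnetlink.h>.
NLM_F_REQUEST = 0x01
NLM_F_ROOT = 0x100
NLM_F_MATCH = 0x200
NLM_F_DUMP = NLM_F_ROOT | NLM_F_MATCH   # "give me the whole table"
RTM_GETACTION = 50

NLMSG_HDRLEN = 16  # sizeof(struct nlmsghdr)


def build_nlmsghdr(msg_type, flags, seq=1, pid=0, payload_len=0):
    """Pack a struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid.

    Hypothetical helper for illustration; it emits the header only.
    """
    return struct.pack("=IHHII", NLMSG_HDRLEN + payload_len,
                       msg_type, flags, seq, pid)


def is_dump(hdr):
    """Return True if both NLM_F_DUMP bits are set in the header flags."""
    _len, _type, flags, _seq, _pid = struct.unpack("=IHHII", hdr[:NLMSG_HDRLEN])
    return (flags & NLM_F_DUMP) == NLM_F_DUMP


# A dump: GET of the whole table (all gate actions), reply may span many messages.
dump_req = build_nlmsghdr(RTM_GETACTION, NLM_F_REQUEST | NLM_F_DUMP)

# A plain GET: one specific action, reply is a single message.
get_req = build_nlmsghdr(RTM_GETACTION, NLM_F_REQUEST)

print("dump request is a dump:", is_dump(dump_req))
print("plain GET is a dump:   ", is_dump(get_req))
```

The same message type is used in both cases; only the flags say whether
the whole table or a single object is being requested.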

If I am not mistaken, in your case this is not a dump; rather, you are
CREATing a single entry which is bigger than NLMSG_GOODSIZE, as I
suspected. I don't believe iproute2 will allow you to do that.
What's happening then is that the netlink event notification generated
for that single entry is too big to fit in NLMSG_GOODSIZE.
Let me try to craft something for that...
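
A rough back-of-the-envelope sketch of the sizes involved (Python, not a
reproducer and not from the thread). Two things here are assumptions:
the per-entry attribute layout (a nest holding a u32 index, an optional
gate flag, a u32 interval, and two s32 values), and the NLMSG_GOODSIZE
value, which assumes a 4 KiB page and roughly 320 bytes of
skb_shared_info overhead. The netlink attribute alignment rules
(4-byte header, 4-byte alignment) are the standard ones.

```python
NLA_HDRLEN = 4     # sizeof(struct nlattr)
NLA_ALIGNTO = 4    # attributes are padded to 4 bytes

# ASSUMPTION: 4 KiB page minus ~320 bytes of skb_shared_info overhead.
ASSUMED_NLMSG_GOODSIZE = 4096 - 320


def nla_align(n):
    """Round n up to the netlink attribute alignment."""
    return (n + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def nla_total_size(payload):
    """Bytes one attribute occupies: header plus payload, aligned."""
    return nla_align(NLA_HDRLEN + payload)


def gate_entry_size(gate_open=True):
    """Size of one schedule entry nest (ASSUMED layout, see lead-in)."""
    inner = nla_total_size(4)               # index (u32)
    if gate_open:
        inner += nla_total_size(0)          # gate flag, no payload
    inner += nla_total_size(4)              # interval (u32)
    inner += 2 * nla_total_size(4)          # two s32 values
    return nla_total_size(inner)            # wrap in the per-entry nest


def entry_list_size(num_entries, gate_open=True):
    """Size of the nested list holding num_entries schedule entries."""
    return nla_total_size(num_entries * gate_entry_size(gate_open))


for n in (1, 50, 100):
    size = entry_list_size(n)
    verdict = "exceeds" if size > ASSUMED_NLMSG_GOODSIZE else "fits in"
    print(f"{n:3d} entries: {size} bytes, {verdict} the assumed NLMSG_GOODSIZE")
```

Under these assumptions the entry list alone for one action with 100
entries is already larger than the skb, before counting the netlink
header and the rest of the action attributes, whereas 100 separate
actions with two entries each are all individually small.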

cheers,
jamal
