Date:   Wed, 19 Apr 2017 07:24:26 -0400
From:   Jamal Hadi Salim <jhs@...atatu.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     davem@...emloft.net, netdev@...r.kernel.org, jiri@...nulli.us,
        xiyou.wangcong@...il.com
Subject: Re: [PATCH net-next 1/2 v2] net sched actions: dump more than
 TCA_ACT_MAX_PRIO actions per batch

On 17-04-18 11:17 PM, Eric Dumazet wrote:
> On Tue, 2017-04-18 at 22:32 -0400, Jamal Hadi Salim wrote:
>> On 17-04-18 09:49 PM, Eric Dumazet wrote:
>>> On Tue, 2017-04-18 at 21:14 -0400, Jamal Hadi Salim wrote:

>>
>> Make sense?
>
> What if we have 1024 actions, and user provides a 4KB buffer ?
>

No problem - we fit as many actions per batch as the 4KB allows
and keep sending batches for as long as the user calls recvmsg().
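
To make the batching concrete, here is a minimal Python sketch of the behaviour described above: each recvmsg() carries as many actions as fit in the caller's buffer, and the dump resumes where it left off on the next call. The 128-byte action size and 32-byte per-message overhead are assumptions chosen for illustration, and `dump_batches` is a hypothetical helper, not kernel code.

```python
def dump_batches(total_actions, buf_size, action_size=128, overhead=32):
    """Yield how many actions each simulated recvmsg() would carry.

    action_size and overhead are illustrative assumptions, not values
    taken from the kernel sources.
    """
    per_batch = (buf_size - overhead) // action_size
    if per_batch < 1:
        raise ValueError("buffer too small for even one action")
    remaining = total_actions
    while remaining > 0:
        n = min(per_batch, remaining)
        yield n
        remaining -= n


# The case asked about: 1024 actions, 4KB user buffer.
batches = list(dump_batches(1024, 4096))
print(len(batches), "batches, at most", max(batches), "actions each")
```

Under these assumptions a smaller buffer only means more recvmsg() calls; no single batch has to hold all the actions.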

> Normally multiple recvmsg() calls would be needed, but I do not see how
> the nla_put_u32(skb, TCAA_ACT_COUNT, cb->args[1]) can always succeed.

Oh, I see the cross-talk, Eric ;->
We don't pack the actions in TCAA_ACT_COUNT - we put them in
TCAA_ACT_TAB.

Here's an strace capture of what happens before and after the
change, which I hope will make this clear. Granted, tc uses a 32KB
buffer from user space and not the 4KB you mention.

We have 400 actions in the kernel at this point:

tc with no changes (doesn't do well for this large dump):
---
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"\240\16\0\0002\0\2\0^6\367XE\3\0\0\0\0\0\0\204\16\1\0\200\0\0\0\t\0\1\0"..., 
32768,g}], msg_controllen=0, msg_flags=0}, 0) = 3744

===> total actions dumped 29

recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{" \20\0\0002\0\2\0s6\367XI\3\0\0\0\0\0\0\4\20\1\0\200\0\0\0\t\0\1\0"..., 
32768}], msg_controllen=0, msg_flags=0}, 0) = 4128

==> total actions dumped 32

recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{" \20\0\0002\0\2\0s6\367XI\3\0\0\0\0\0\0\4\20\1\0\200\0\0\0\t\0\1\0"..., 
32768}], msg_controllen=0, msg_flags=0}, 0) = 4128

==> total actions dumped 32
....
.....
.........
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"\24\0\0\0\3\0\2\0s6\367XI\3\0\0\0\0\0\0", 32768}], 
msg_controllen=0, msg_flags=0}, 0) = 20
-----

This goes on a few more times until we get all 400 entries - the last
recvmsg (20B) carries no actions and indicates the dump is complete.
Issue: the kernel refuses to put more than 32 entries in the skb
even though 32KB is allocated for it.
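
For readers who want to check the capture themselves, the following Python sketch decodes the 16-byte netlink header at the start of the first and last messages. The byte strings are copied from the strace output; little-endian byte order is an assumption about the machine the capture was taken on, and `parse_nlmsghdr` is a hypothetical helper name.

```python
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid.
# "<" assumes the capture comes from a little-endian host.
NLMSGHDR = struct.Struct("<IHHII")

NLMSG_DONE = 3      # from <linux/netlink.h>
NLM_F_MULTI = 0x2   # from <linux/netlink.h>


def parse_nlmsghdr(buf):
    """Decode the netlink header at the start of buf into a dict."""
    length, msg_type, flags, seq, pid = NLMSGHDR.unpack_from(buf, 0)
    return {"len": length, "type": msg_type, "flags": flags,
            "seq": seq, "pid": pid}


# Bytes as strace printed them (octal escapes), truncated where strace did.
first = b"\240\16\0\0002\0\2\0^6\367XE\3\0\0"
last = b"\24\0\0\0\3\0\2\0s6\367XI\3\0\0\0\0\0\0"

for name, raw in (("first", first), ("last", last)):
    hdr = parse_nlmsghdr(raw)
    kind = "done" if hdr["type"] == NLMSG_DONE else "data"
    print(name, kind, hdr)
```

The first header reports a length of 3744, matching the recvmsg return value, and the last one is a 20-byte NLMSG_DONE with NLM_F_MULTI set, which is what terminates the dump.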

So now let's see what happens with this change:
------
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"\240\16\0\0002\0\2\0^6\367XE\3\0\0\0\0\0\0\204\16\1\0\200\0\0\0\t\0\1\0"..., 
32768,g}], msg_controllen=0, msg_flags=0}, 0) = 3744

==> total actions dumped 29

recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"\240~\0\0002\0\2\0^6\367XE\3\0\0\0\0\0\0\204~\1\0\200\0\0\0\t\0\1\0"..., 
32768,g}], msg_controllen=0, msg_flags=0}, 0) = 32416

==> total actions dumped 253

recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{" ;\0\0002\0\2\0^6\367XE\3\0\0\0\0\0\0\4;\1\0\200\0\0\0\t\0\1\0"..., 
32768,g}], msg_controllen=0, msg_flags=0}, 0) = 15136

==> total actions dumped 118

recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"\24\0\0\0\3\0\2\0s6\367XI\3\0\0\0\0\0\0", 32768}], 
msg_controllen=0, msg_flags=0}, 0) = 20

--------

We got all 400 in 3 recvmsg calls (29 + 253 + 118). Imagine what we
have to deal with when we have 2M actions (the improvement is about 10x).
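
The totals can be cross-checked from the recvmsg() return values alone. The sketch below assumes every action occupies 128 bytes and every data message carries 32 bytes of overhead; both figures are inferred from the capture (they reproduce all the reported counts), not stated in the patch. It then makes a back-of-envelope estimate of batch counts for 2M actions, which counts messages only and says nothing about elapsed time.

```python
import math

ACTION_SIZE = 128   # assumed bytes per dumped action (inferred from the capture)
OVERHEAD = 32       # assumed per-message bytes that are not action data


def actions_in(msg_len):
    """Estimate how many actions a data message of msg_len bytes carries."""
    return (msg_len - OVERHEAD) // ACTION_SIZE


before = [3744, 4128]           # sizes seen with the 32-per-batch limit
after = [3744, 32416, 15136]    # sizes seen with the change applied

print([actions_in(n) for n in before])   # [29, 32]
counts = [actions_in(n) for n in after]
print(counts, sum(counts))               # [29, 253, 118] 400

# Rough batch counts for 2M actions, using the largest batch seen in each run.
total = 2_000_000
old_batches = math.ceil(total / 32)
new_batches = math.ceil(total / 253)
print(old_batches, new_batches, round(old_batches / new_batches, 1))
```

By this count alone the change cuts the number of batches by roughly 8x with a 32KB buffer, the same order of magnitude as the figure quoted above.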

cheers,
jamal
