Message-ID: <1492609668.10587.164.camel@edumazet-glaptop3.roam.corp.google.com>
Date:   Wed, 19 Apr 2017 06:47:48 -0700
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Jamal Hadi Salim <jhs@...atatu.com>
Cc:     davem@...emloft.net, netdev@...r.kernel.org, jiri@...nulli.us,
        xiyou.wangcong@...il.com
Subject: Re: [PATCH net-next 1/2 v2] net sched actions: dump more than
 TCA_ACT_MAX_PRIO actions per batch

On Wed, 2017-04-19 at 07:24 -0400, Jamal Hadi Salim wrote:
> On 17-04-18 11:17 PM, Eric Dumazet wrote:
> > On Tue, 2017-04-18 at 22:32 -0400, Jamal Hadi Salim wrote:
> >> On 17-04-18 09:49 PM, Eric Dumazet wrote:
> >>> On Tue, 2017-04-18 at 21:14 -0400, Jamal Hadi Salim wrote:
> 
> >>
> >> Make sense?
> >
> > What if we have 1024 actions, and user provides a 4KB buffer ?
> >
> 
> No problem - we will fit as many per batch to consume 4KB
> and will send them as long as user calls recvmsg.
> 
> > Normally multiple recvmsg() calls would be needed, but I do not see how
> > the nla_put_u32(skb, TCAA_ACT_COUNT, cb->args[1]) can always succeed.
> 
> Oh, I see the cross-talk Eric;->
> We don't pack the actions in TCAA_ACT_COUNT - we put them in
> TCAA_ACT_TAB.
> 
> Here's an strace capture that best describes what happened before
> and after, which I hope will make sense. Granted, tc uses 32KB from
> user space and not the 4KB you mention.
> 
> We have 400 actions in the kernel at this point:
> 
> tc with no changes (doesnt for this large dump):
> ---
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
> msg_iov(1)=[{"\240\16\0\0002\0\2\0^6\367XE\3\0\0\0\0\0\0\204\16\1\0\200\0\0\0\t\0\1\0"..., 
> 32768,g}], msg_controllen=0, msg_flags=0}, 0) = 3744
> 
> ===> total actions dumped 29
> 
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
> msg_iov(1)=[{" 
> \20\0\0002\0\2\0s6\367XI\3\0\0\0\0\0\0\4\20\1\0\200\0\0\0\t\0\1\0"..., 
> 32768}], msg_controllen=0, msg_flags=0}, 0) = 4128
> 
> ==> total actions dumped 32
> 
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
> msg_iov(1)=[{" 
> \20\0\0002\0\2\0s6\367XI\3\0\0\0\0\0\0\4\20\1\0\200\0\0\0\t\0\1\0"..., 
> 32768}], msg_controllen=0, msg_flags=0}, 0) = 4128
> 
> ==> total actions dumped 32
> ....
> .....
> .........
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
> msg_iov(1)=[{"\24\0\0\0\3\0\2\0s6\367XI\3\0\0\0\0\0\0", 32768}], 
> msg_controllen=0, msg_flags=0}, 0) = 20
> -----
> 
> Goes on a few times until we get all 400 entries - last recvmsg (with
> 20B) has no actions and indicates dump is complete.
> Issue: The kernel is refusing to add more than 32 entries in the skb
> even though we get allocated 32KB for the skb.
> 
> So now let's see what happens with this change:
> ------
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
> msg_iov(1)=[{"\240\16\0\0002\0\2\0^6\367XE\3\0\0\0\0\0\0\204\16\1\0\200\0\0\0\t\0\1\0"..., 
> 32768,g}], msg_controllen=0, msg_flags=0}, 0) = 3744
> 
> ==> total actions dumped 29
> 
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
> msg_iov(1)=[{"\240~\0\0002\0\2\0^6\367XE\3\0\0\0\0\0\0\204~\1\0\200\0\0\0\t\0\1\0"..., 
> 32768,g}], msg_controllen=0, msg_flags=0}, 0) = 32416
> 
> ==> total actions dumped 253
> 
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
> msg_iov(1)=[{" 
> ;\0\0002\0\2\0^6\367XE\3\0\0\0\0\0\0\4;\1\0\200\0\0\0\t\0\1\0"..., 
> 32768,g}], msg_controllen=0, msg_flags=0}, 0) = 15136
> 
> ==> total actions dumped 118
> 
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
> msg_iov(1)=[{"\24\0\0\0\3\0\2\0s6\367XI\3\0\0\0\0\0\0", 32768}], 
> msg_controllen=0, msg_flags=0}, 0) = 20
> 
> --------
> 
> We got all 400 in 3 requests. Imagine what we have to deal with
> when we have 2M actions (the improvement is about 10x).

Just because your strace works with the size of 32768 does not mean it
will work in the future.

Please read again my question.

Try to _not_ use 32768 bytes for the recvmsg() size, but 4KB.

You pack XXX actions until the 4KB skb is full.

Then the code does:

nla_put_u32(skb, TCAA_ACT_COUNT, cb->args[1])

This might fail, and then you

goto out_module_put;

Then we are stuck?

What am I missing?
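
To make the concern concrete, here is a rough user-space sketch, not kernel code. struct toy_skb, toy_put() and toy_dump_batch() are hypothetical stand-ins (the real nla_put_u32() and dump path are not used), and the 128-byte action size is assumed purely for illustration. It packs fixed-size actions until the buffer is full, then tries to append an 8-byte count attribute (4-byte attribute header plus a u32), which fails whenever fewer than 8 bytes of tailroom remain.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an skb: tracks only capacity and bytes used. */
struct toy_skb {
	size_t cap;
	size_t len;
};

/* Size of a u32 attribute: 4-byte attribute header + 4-byte payload. */
#define TOY_COUNT_ATTR_LEN 8

/* Append len bytes. Returns 0 on success, -1 if the tailroom is too small. */
static int toy_put(struct toy_skb *skb, size_t len)
{
	assert(skb->len <= skb->cap);
	if (skb->cap - skb->len < len)
		return -1;
	skb->len += len;
	return 0;
}

/*
 * Pack up to total actions of action_len bytes each into a buffer of
 * cap bytes, then try to append the count attribute at the end.
 * Stores the number of actions packed in *packed.
 * Returns 0 if the count attribute fit, -1 if it did not.
 */
static int toy_dump_batch(size_t cap, size_t action_len, size_t total,
			  size_t *packed)
{
	struct toy_skb skb = { .cap = cap, .len = 0 };
	size_t n = 0;

	while (n < total && toy_put(&skb, action_len) == 0)
		n++;
	*packed = n;

	/* This is the step that can fail once the buffer is full. */
	return toy_put(&skb, TOY_COUNT_ATTR_LEN);
}
```

Under these assumed sizes, toy_dump_batch(4096, 128, 1024, &n) packs 32 actions, fills the buffer exactly, and returns -1 for the count, while an action size of 129 leaves enough slack for the count to fit. Whether the real dump hits the failing case depends on the actual attribute sizes and headers, which this sketch does not model.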



