netdev - Re: [PATCH net-next v8 2/3] net sched actions: dump more than TCA_ACT_MAX


Message-ID: <10fe2c22-8e76-543e-dd24-ddce5813ab69@mojatatu.com>
Date:   Wed, 26 Apr 2017 09:14:38 -0400
From:   Jamal Hadi Salim <jhs@...atatu.com>
To:     Jiri Pirko <jiri@...nulli.us>
Cc:     davem@...emloft.net, xiyou.wangcong@...il.com,
        eric.dumazet@...il.com, netdev@...r.kernel.org,
        Simon Horman <simon.horman@...ronome.com>,
        Benjamin LaHaise <bcrl@...ck.org>
Subject: Re: [PATCH net-next v8 2/3] net sched actions: dump more than
 TCA_ACT_MAX_PRIO actions per batch

On 17-04-26 08:08 AM, Jiri Pirko wrote:
> Wed, Apr 26, 2017 at 01:48:29PM CEST, jhs@...atatu.com wrote:
>> On 17-04-26 02:19 AM, Jiri Pirko wrote:
>>> Tue, Apr 25, 2017 at 10:29:40PM CEST, jhs@...atatu.com wrote:
>>>> On 17-04-25 12:04 PM, Jiri Pirko wrote:

>> I have experience with dealing with a massive amount of various dumps
>> and (batch) sets and it always boils down to one thing:
>> _how much data is exchanged between user and kernel_
>> 3 flags encoded as bits in a u32 attribute cost 64 bits.
>> Encoded separately they cost 3x that.
>>
>> Believe me, it _does make a difference_ in performance.
>>
>> My least favorite subsystem is bridge. The bridge code has
>> tons of flags in those entries that are sent to/from kernel as u8
>> attributes. It is painful.
>>
>> For something more recent, let's look at this commit from Ben on Flower:
>> +       TCA_FLOWER_KEY_MPLS_TTL,        /* u8 - 8 bits */
>> +       TCA_FLOWER_KEY_MPLS_BOS,        /* u8 - 1 bit */
>> +       TCA_FLOWER_KEY_MPLS_TC,         /* u8 - 3 bits */
>> +       TCA_FLOWER_KEY_MPLS_LABEL,      /* be32 - 20 bits */
>>
>> Yes, that looks pretty, but:
>> That would have fit in one attribute with a u32. Mask attributes would
>> be eliminated with a second 32-bit word, all in the same single
>> attribute.
>>
>> Imagine if I have 1M flower entries. If you add up the masks, the cost
>> of these things is about 3*2*64 bits more per entry compared to putting
>> the mpls info/mask in one attribute.
>> At 1M entries that is tens of MBs of data being exchanged.
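
The single-attribute alternative described in the quoted text can be
sketched as below. This is only an illustration, not code from the patch
or from flower: the `ex_` names and the `struct ex_mpls_key_mask` payload
are hypothetical, and the bit layout is assumed to mirror the on-wire
MPLS label stack entry (label 20 bits, TC 3 bits, BOS 1 bit, TTL 8 bits).

```c
#include <stdint.h>

/* Hypothetical layout, assumed to mirror the MPLS label stack entry:
 * label (bits 31..12) | TC (bits 11..9) | BOS (bit 8) | TTL (bits 7..0). */
#define EX_MPLS_LABEL_SHIFT 12
#define EX_MPLS_TC_SHIFT     9
#define EX_MPLS_BOS_SHIFT    8

#define EX_MPLS_LABEL_MASK 0xFFFFF000u
#define EX_MPLS_TC_MASK    0x00000E00u
#define EX_MPLS_BOS_MASK   0x00000100u
#define EX_MPLS_TTL_MASK   0x000000FFu

/* Hypothetical payload of one attribute: key and mask side by side. */
struct ex_mpls_key_mask {
    uint32_t key;   /* packed label/TC/BOS/TTL to match */
    uint32_t mask;  /* same layout; set bits are compared */
};

/* Pack the four fields into one 32-bit word, truncating each to its width. */
static uint32_t ex_mpls_pack(uint32_t label, uint8_t tc, uint8_t bos, uint8_t ttl)
{
    return ((label << EX_MPLS_LABEL_SHIFT) & EX_MPLS_LABEL_MASK) |
           (((uint32_t)tc << EX_MPLS_TC_SHIFT) & EX_MPLS_TC_MASK) |
           (((uint32_t)bos << EX_MPLS_BOS_SHIFT) & EX_MPLS_BOS_MASK) |
           ((uint32_t)ttl & EX_MPLS_TTL_MASK);
}

/* Accessors recover the individual fields from the packed word. */
static uint32_t ex_mpls_label(uint32_t v)
{
    return (v & EX_MPLS_LABEL_MASK) >> EX_MPLS_LABEL_SHIFT;
}

static uint8_t ex_mpls_tc(uint32_t v)
{
    return (uint8_t)((v & EX_MPLS_TC_MASK) >> EX_MPLS_TC_SHIFT);
}

static uint8_t ex_mpls_bos(uint32_t v)
{
    return (uint8_t)((v & EX_MPLS_BOS_MASK) >> EX_MPLS_BOS_SHIFT);
}

static uint8_t ex_mpls_ttl(uint32_t v)
{
    return (uint8_t)(v & EX_MPLS_TTL_MASK);
}
```

For example, `ex_mpls_pack(16, 5, 1, 64)` yields `0x00010B40`, and a
match on the label only would use `EX_MPLS_LABEL_MASK` as the mask word.
The trade-off raised in the thread still applies: a packed word is
cheaper to transport but less self-describing than one attribute per
field.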
>
> I can do the math :) Yet still, I would like to see the numbers :)
> Because I believe that is the only way to end this lengthy conversation
> once and forever...
>
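
The arithmetic being debated can be sketched as follows. It assumes the
usual netlink attribute framing (a 4-byte header, with header plus
payload padded to a 4-byte boundary) and, for the MPLS case, one mask
attribute per key attribute, which is an assumption for illustration and
not something the quoted commit necessarily adds. All `ex_` names are
hypothetical. These are byte counts only, not the measured performance
numbers asked for above.

```c
#include <stddef.h>

/* Assumed netlink attribute framing: 4-byte header, 4-byte alignment. */
#define EX_NLA_ALIGNTO 4u
#define EX_NLA_HDRLEN  4u

/* Bytes on the wire for one attribute carrying `payload` bytes. */
static size_t ex_nla_total_size(size_t payload)
{
    return (EX_NLA_HDRLEN + payload + EX_NLA_ALIGNTO - 1) &
           ~(size_t)(EX_NLA_ALIGNTO - 1);
}

/* Three flags sent as three separate u8 attributes. */
static size_t ex_flags_separate(void)
{
    return 3 * ex_nla_total_size(1);
}

/* The same three flags as bits in a single u32 attribute. */
static size_t ex_flags_packed(void)
{
    return ex_nla_total_size(4);
}

/* MPLS key: three u8 attributes plus one be32 attribute, and
 * (by assumption) a matching mask attribute for each of them. */
static size_t ex_mpls_separate(void)
{
    return 2 * (3 * ex_nla_total_size(1) + ex_nla_total_size(4));
}

/* MPLS key and mask as two 32-bit words in one attribute. */
static size_t ex_mpls_packed(void)
{
    return ex_nla_total_size(8);
}

/* Extra bytes transported for `entries` filters when not packed. */
static unsigned long long ex_mpls_saving(unsigned long long entries)
{
    return entries * (ex_mpls_separate() - ex_mpls_packed());
}
```

Under these assumptions three separate flags take 24 bytes against 8
bytes packed, and the MPLS fields take 64 bytes against 12, a difference
of 52 bytes per entry, close to the 3*2*64-bit estimate quoted above. At
one million entries `ex_mpls_saving(1000000)` comes to 52,000,000 bytes.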

Jiri, what are you arguing about if you have done the math? ;->
You want me to show you that getting or setting less data is good for
performance?
Look at the third patch: Why do I think it is necessary to send only
actions that have changed? Precisely to reduce the amount of data
being transported. The second patch reduces the number of crossings
from user space to kernel space (which happen more often as the amount
of data I have to transport between the user and the kernel grows).

Again: You are looking at this from a manageability point of view, which
is useful but not the only input into a design. If I can squeeze in more
data without killing usability - I am all for it. It just doesn't
compute that it is ok to use an attribute per flag because it looks
beautiful.

cheers,
jamal