netdev - Re: [PATCH net-next v2 0/4] Implement classifier-action terse dump mode

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <vbftv0bvz6b.fsf@mellanox.com>
Date:   Wed, 20 May 2020 10:33:00 +0300
From:   Vlad Buslov <vladbu@...lanox.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     Vlad Buslov <vladbu@...lanox.com>, Jiri Pirko <jiri@...nulli.us>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Davide Caratti <dcaratti@...hat.com>,
        Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
        Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net-next v2 0/4] Implement classifier-action terse dump mode


On Tue 19 May 2020 at 21:39, Cong Wang <xiyou.wangcong@...il.com> wrote:
> On Tue, May 19, 2020 at 2:10 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>>
>>
>> On Mon 18 May 2020 at 21:46, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> > On Sun, May 17, 2020 at 11:44 PM Vlad Buslov <vladbu@...lanox.com> wrote:
>> >>
>> >>
>> >> On Sun 17 May 2020 at 22:13, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> >> > On Fri, May 15, 2020 at 4:40 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>> >> >>
>> >> >> Output rate of current upstream kernel TC filter dump implementation if
>> >> >> relatively low (~100k rules/sec depending on configuration). This
>> >> >> constraint impacts performance of software switch implementation that
>> >> >> rely on TC for their datapath implementation and periodically call TC
>> >> >> filter dump to update rules stats. Moreover, TC filter dump output a lot
>> >> >> of static data that don't change during the filter lifecycle (filter
>> >> >> key, specific action details, etc.) which constitutes significant
>> >> >> portion of payload on resulting netlink packets and increases amount of
>> >> >> syscalls necessary to dump all filters on particular Qdisc. In order to
>> >> >> significantly improve filter dump rate this patch sets implement new
>> >> >> mode of TC filter dump operation named "terse dump" mode. In this mode
>> >> >> only parameters necessary to identify the filter (handle, action cookie,
>> >> >> etc.) and data that can change during filter lifecycle (filter flags,
>> >> >> action stats, etc.) are preserved in dump output while everything else
>> >> >> is omitted.
>> >> >>
>> >> >> Userspace API is implemented using new TCA_DUMP_FLAGS tlv with only
>> >> >> available flag value TCA_DUMP_FLAGS_TERSE. Internally, new API requires
>> >> >> individual classifier support (new tcf_proto_ops->terse_dump()
>> >> >> callback). Support for action terse dump is implemented in act API and
>> >> >> don't require changing individual action implementations.
>> >> >
>> >> > Sorry for being late.
>> >> >
>> >> > Why terse dump needs a new ops if it only dumps a subset of the
>> >> > regular dump? That is, why not just pass a boolean flag to regular
>> >> > ->dump() implementation?
>> >> >
>> >> > I guess that might break user-space ABI? At least some netlink
>> >> > attributes are not always dumped anyway, so it does not look like
>> >> > a problem?
>> >> >
>> >> > Thanks.
>> >>
>> >> Hi Cong,
>> >>
>> >> I considered adding a flag to ->dump() callback but decided against it
>> >> for following reasons:
>> >>
>> >> - It complicates fl_dump() code by adding additional conditionals. Not a
>> >>   big problem but it seemed better for me to have a standalone callback
>> >>   because with combined implementation it is even hard to deduce what
>> >>   does terse dump actually output.
>> >
>> > This is not a problem, at least you can add a big if in fl_dump(),
>> > something like:
>> >
>> > if (terse) {
>> >   // do terse dump
>> >   return 0;
>> > }
>> > // normal dump
>>
>> That is what I was trying to prevent with my implementation: having big
>> "superfunctions" that implement multiple things with branching. Why not
>> just have dedicated callbacks that do exactly one thing?
>
> 1. Saving one unnecessary ops.
> 2. Easier to trace the code: all dumpings are in one implementation.

Okay, I see your point.

>
>>
>> >
>> >>
>> >> - My initial implementation just called regular dump for classifiers
>> >>   that don't support terse dump, but in internal review Jiri insisted
>> >>   that cls API should fail if it can't satisfy user's request and having
>> >>   dedicated callback allows implementation to return an error if
>> >>   classifier doesn't define ->terse_dump(). With flag approach it would
>> >>   be not trivial to determine if implementation actually uses the flag.
>> >
>> > Hmm? For those not support terse dump, we can just do:
>> >
>> > if (terse)
>> >   return -EOPNOTSUPP;
>> > // normal dump goes here
>> >
>> > You just have to pass 'terse' flag to all implementations and let them
>> > to decide whether to support it or not.
>>
>> But why duplicate the same code to all existing cls dump implementations
>> instead of having such check nicely implemented in cls API (via callback
>> existence or a flag)?
>
> I am confused, your fl_terse_dump() also has duplication with fl_dump()...
>
> Thanks.

I meant duplicating the "if terse not supported return -EOPNOTSUPP" in
dump callback of every classifier implementation. With current
implementation cls API handles such case by checking whether classifier
implementation has ->terse_dump() defined and returns error otherwise.
This can also be achieved by having a new classifier flag, in case we
decide to proceed with folding both dump and terse_dump into single
->dump(bool terse) callback.