[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <vbftv0bvz6b.fsf@mellanox.com>
Date: Wed, 20 May 2020 10:33:00 +0300
From: Vlad Buslov <vladbu@...lanox.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: Vlad Buslov <vladbu@...lanox.com>, Jiri Pirko <jiri@...nulli.us>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Jamal Hadi Salim <jhs@...atatu.com>,
Davide Caratti <dcaratti@...hat.com>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net-next v2 0/4] Implement classifier-action terse dump mode
On Tue 19 May 2020 at 21:39, Cong Wang <xiyou.wangcong@...il.com> wrote:
> On Tue, May 19, 2020 at 2:10 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>>
>>
>> On Mon 18 May 2020 at 21:46, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> > On Sun, May 17, 2020 at 11:44 PM Vlad Buslov <vladbu@...lanox.com> wrote:
>> >>
>> >>
>> >> On Sun 17 May 2020 at 22:13, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> >> > On Fri, May 15, 2020 at 4:40 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>> >> >>
>> >> >> Output rate of current upstream kernel TC filter dump implementation if
>> >> >> relatively low (~100k rules/sec depending on configuration). This
>> >> >> constraint impacts performance of software switch implementation that
>> >> >> rely on TC for their datapath implementation and periodically call TC
>> >> >> filter dump to update rules stats. Moreover, TC filter dump output a lot
>> >> >> of static data that don't change during the filter lifecycle (filter
>> >> >> key, specific action details, etc.) which constitutes significant
>> >> >> portion of payload on resulting netlink packets and increases amount of
>> >> >> syscalls necessary to dump all filters on particular Qdisc. In order to
>> >> >> significantly improve filter dump rate this patch sets implement new
>> >> >> mode of TC filter dump operation named "terse dump" mode. In this mode
>> >> >> only parameters necessary to identify the filter (handle, action cookie,
>> >> >> etc.) and data that can change during filter lifecycle (filter flags,
>> >> >> action stats, etc.) are preserved in dump output while everything else
>> >> >> is omitted.
>> >> >>
>> >> >> Userspace API is implemented using new TCA_DUMP_FLAGS tlv with only
>> >> >> available flag value TCA_DUMP_FLAGS_TERSE. Internally, new API requires
>> >> >> individual classifier support (new tcf_proto_ops->terse_dump()
>> >> >> callback). Support for action terse dump is implemented in act API and
>> >> >> don't require changing individual action implementations.
>> >> >
>> >> > Sorry for being late.
>> >> >
>> >> > Why terse dump needs a new ops if it only dumps a subset of the
>> >> > regular dump? That is, why not just pass a boolean flag to regular
>> >> > ->dump() implementation?
>> >> >
>> >> > I guess that might break user-space ABI? At least some netlink
>> >> > attributes are not always dumped anyway, so it does not look like
>> >> > a problem?
>> >> >
>> >> > Thanks.
>> >>
>> >> Hi Cong,
>> >>
>> >> I considered adding a flag to ->dump() callback but decided against it
>> >> for following reasons:
>> >>
>> >> - It complicates fl_dump() code by adding additional conditionals. Not a
>> >> big problem but it seemed better for me to have a standalone callback
>> >> because with combined implementation it is even hard to deduce what
>> >> does terse dump actually output.
>> >
>> > This is not a problem, at least you can add a big if in fl_dump(),
>> > something like:
>> >
>> > if (terse) {
>> > // do terse dump
>> > return 0;
>> > }
>> > // normal dump
>>
>> That is what I was trying to prevent with my implementation: having big
>> "superfunctions" that implement multiple things with branching. Why not
>> just have dedicated callbacks that do exactly one thing?
>
> 1. Saving one unnecessary ops.
> 2. Easier to trace the code: all dumpings are in one implementation.
Okay, I see your point.
>
>>
>> >
>> >>
>> >> - My initial implementation just called regular dump for classifiers
>> >> that don't support terse dump, but in internal review Jiri insisted
>> >> that cls API should fail if it can't satisfy user's request and having
>> >> dedicated callback allows implementation to return an error if
>> >> classifier doesn't define ->terse_dump(). With flag approach it would
>> >> be not trivial to determine if implementation actually uses the flag.
>> >
>> > Hmm? For those not support terse dump, we can just do:
>> >
>> > if (terse)
>> > return -EOPNOTSUPP;
>> > // normal dump goes here
>> >
>> > You just have to pass 'terse' flag to all implementations and let them
>> > to decide whether to support it or not.
>>
>> But why duplicate the same code to all existing cls dump implementations
>> instead of having such check nicely implemented in cls API (via callback
>> existence or a flag)?
>
> I am confused, your fl_terse_dump() also has duplication with fl_dump()...
>
> Thanks.
I meant duplicating the "if terse not supported return -EOPNOTSUPP" in
dump callback of every classifier implementation. With current
implementation cls API handles such case by checking whether classifier
implementation has ->terse_dump() defined and returns error otherwise.
This can also be achieved by having a new classifier flag, in case we
decide to proceed with folding both dump and terse_dump into single
->dump(bool terse) callback.
Powered by blists - more mailing lists