[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM_iQpXdyFqBO=AkmLqVW=dZxQ3SfjKp71BxsKRuyhaoVuMEfg@mail.gmail.com>
Date: Mon, 18 May 2020 11:46:21 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Vlad Buslov <vladbu@...lanox.com>
Cc: Jiri Pirko <jiri@...nulli.us>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Jamal Hadi Salim <jhs@...atatu.com>,
Davide Caratti <dcaratti@...hat.com>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net-next v2 0/4] Implement classifier-action terse dump mode
On Sun, May 17, 2020 at 11:44 PM Vlad Buslov <vladbu@...lanox.com> wrote:
>
>
> On Sun 17 May 2020 at 22:13, Cong Wang <xiyou.wangcong@...il.com> wrote:
> > On Fri, May 15, 2020 at 4:40 AM Vlad Buslov <vladbu@...lanox.com> wrote:
> >>
> >> Output rate of current upstream kernel TC filter dump implementation if
> >> relatively low (~100k rules/sec depending on configuration). This
> >> constraint impacts performance of software switch implementation that
> >> rely on TC for their datapath implementation and periodically call TC
> >> filter dump to update rules stats. Moreover, TC filter dump output a lot
> >> of static data that don't change during the filter lifecycle (filter
> >> key, specific action details, etc.) which constitutes significant
> >> portion of payload on resulting netlink packets and increases amount of
> >> syscalls necessary to dump all filters on particular Qdisc. In order to
> >> significantly improve filter dump rate this patch sets implement new
> >> mode of TC filter dump operation named "terse dump" mode. In this mode
> >> only parameters necessary to identify the filter (handle, action cookie,
> >> etc.) and data that can change during filter lifecycle (filter flags,
> >> action stats, etc.) are preserved in dump output while everything else
> >> is omitted.
> >>
> >> Userspace API is implemented using new TCA_DUMP_FLAGS tlv with only
> >> available flag value TCA_DUMP_FLAGS_TERSE. Internally, new API requires
> >> individual classifier support (new tcf_proto_ops->terse_dump()
> >> callback). Support for action terse dump is implemented in act API and
> >> don't require changing individual action implementations.
> >
> > Sorry for being late.
> >
> > Why terse dump needs a new ops if it only dumps a subset of the
> > regular dump? That is, why not just pass a boolean flag to regular
> > ->dump() implementation?
> >
> > I guess that might break user-space ABI? At least some netlink
> > attributes are not always dumped anyway, so it does not look like
> > a problem?
> >
> > Thanks.
>
> Hi Cong,
>
> I considered adding a flag to ->dump() callback but decided against it
> for following reasons:
>
> - It complicates fl_dump() code by adding additional conditionals. Not a
> big problem but it seemed better for me to have a standalone callback
> because with combined implementation it is even hard to deduce what
> does terse dump actually output.
This is not a problem, at least you can add a big if in fl_dump(),
something like:
if (terse) {
// do terse dump
return 0;
}
// normal dump
>
> - My initial implementation just called regular dump for classifiers
> that don't support terse dump, but in internal review Jiri insisted
> that cls API should fail if it can't satisfy user's request and having
> dedicated callback allows implementation to return an error if
> classifier doesn't define ->terse_dump(). With flag approach it would
> be not trivial to determine if implementation actually uses the flag.
Hmm? For those not support terse dump, we can just do:
if (terse)
return -EOPNOTSUPP;
// normal dump goes here
You just have to pass 'terse' flag to all implementations and let them
to decide whether to support it or not.
> I guess I could have added new tcf_proto_ops->flags value to designate
> terse dump support, but checking for dedicated callback existence
> seemed like obvious approach.
This does not look necessary, as long as we can just pass the flag
down to each ->dump().
Thanks.
Powered by blists - more mailing lists