[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM0EoM=EwoXgLW=pxadYjL-OCWE8c-EUTcz57W=vkJmkJp6wZQ@mail.gmail.com>
Date: Sun, 2 Oct 2022 11:27:08 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
Cc: Cong Wang <xiyou.wangcong@...il.com>,
Hangbin Liu <liuhangbin@...il.com>, netdev@...r.kernel.org,
Jiri Pirko <jiri@...nulli.us>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
David Ahern <dsahern@...nel.org>
Subject: Re: [PATCH (repost) net-next] sched: add extack for tfilter_notify
On Sat, Oct 1, 2022 at 4:39 PM Marcelo Ricardo Leitner
<marcelo.leitner@...il.com> wrote:
>
> On Sat, Oct 01, 2022 at 11:39:07AM -0700, Cong Wang wrote:
> > On Thu, Sep 29, 2022 at 11:35:05AM +0800, Hangbin Liu wrote:
> > > In commit 81c7288b170a ("sched: cls: enable verbose logging") Marcelo
> > > made cls could log verbose info for offloading failures, which helps
> > > improving Open vSwitch debuggability when using flower offloading.
> > >
> > > It would also be helpful if "tc monitor" could log this message, as it
> > > doesn't require vswitchd log level adjusment. Let's add the extack message
> > > in tfilter_notify so the monitor program could receive the failures.
> > > e.g.
> > >
> >
> > I don't think tc monitor is supposed to carry any error messages, it
> > only serves the purpose for monitoring control path events.
>
> But, precisely. In the example Hangbin gave, it is showing why the
> entry is not_in_hw. That's still data that belongs to the event that
> happened and that can't be queried afterwards even if the user/app
> monitoring it want to. Had it failed entirely, I agree, as the control
> path never changed.
>
> tc monitor is easier to use than perf probes in some systems. It's not
> uncommon to have tc installed but not perf. It's also easier to ask a
> customer to run it than explain how to enable the tracepoint and print
> ftrace buffer via /sys files, and the output is more meaningful for us
> as well: we know exactly which filter triggered the message. The only
> other place that we can correlate the filter and the warning, is on
> vswitchd log. Which is not easy to read either.
To Jakub's point: I think one of those NLMSGERR TLVs is the right place
and while traces look attractive I see the value of having a unified
collection point via the tc monitor.
Since you cant really batch events - it seems the NLMSG_DONE/MULTI
hack is done just to please iproute2::tc?
IMO:
I think if you need to do this, then you have to teach iproute2
new ways of interpreting the message (which is nice because you
dont have to worry about backward compat). Some of that code
should be centralized and reused by netlink generically
instead of just cls_api, example the whole NLM_F_ACK_TLVS dance.
Also - i guess it will depend on the underlying driver?
This seems very related to a specific driver:
"Warning: mlx5_core: matching on ct_state +new isn't supported."
Debuggability is always great but so is backwards compat.
What happens when you run old userspace tc? There are tons
of punting systems that process these events out there and
depend on the current event messages as is.
cheers,
jamal
Powered by blists - more mailing lists