Message-ID: <CAM_iQpV3LjswT8pwGc755Ncc0cT1qH433KhD8VZ-7FKQOTs3Fg@mail.gmail.com>
Date: Sun, 11 Jul 2021 20:34:32 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
Tonghao Zhang <xiangxia.m.yue@...il.com>,
Qitao Xu <qitao.xu@...edance.com>,
Cong Wang <cong.wang@...edance.com>,
Jamal Hadi Salim <jhs@...atatu.com>,
Jiri Pirko <jiri@...nulli.us>
Subject: Re: [Patch net-next v2] net_sched: introduce tracepoint trace_qdisc_enqueue()
On Sun, Jul 11, 2021 at 8:01 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>
> On 2021/7/12 3:03, Cong Wang wrote:
> > From: Qitao Xu <qitao.xu@...edance.com>
> >
> > Tracepoint trace_qdisc_enqueue() is introduced to trace skb at
> > the entrance of TC layer on TX side. This is kinda symmetric to
> > trace_qdisc_dequeue(), and together they can be used to calculate
> > the packet queueing latency. It is more accurate than
> > trace_net_dev_queue(), because we already successfully enqueue
> > the packet at that point.
> >
> > Note, trace ring buffer is only accessible to privileged users,
> > it is safe to use %px to print a real kernel address here.
> >
> > Reviewed-by: Cong Wang <cong.wang@...edance.com>
> > Cc: Jamal Hadi Salim <jhs@...atatu.com>
> > Cc: Jiri Pirko <jiri@...nulli.us>
> > Signed-off-by: Qitao Xu <qitao.xu@...edance.com>
> > ---
> > include/trace/events/qdisc.h | 26 ++++++++++++++++++++++++++
> > net/core/dev.c | 9 +++++++++
> > 2 files changed, 35 insertions(+)
> >
> > diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h
> > index 58209557cb3a..c3006c6b4a87 100644
> > --- a/include/trace/events/qdisc.h
> > +++ b/include/trace/events/qdisc.h
> > @@ -46,6 +46,32 @@ TRACE_EVENT(qdisc_dequeue,
> > __entry->txq_state, __entry->packets, __entry->skbaddr )
> > );
> >
> > +TRACE_EVENT(qdisc_enqueue,
> > +
> > + TP_PROTO(struct Qdisc *qdisc, const struct netdev_queue *txq, struct sk_buff *skb),
> > +
> > + TP_ARGS(qdisc, txq, skb),
> > +
> > + TP_STRUCT__entry(
> > + __field(struct Qdisc *, qdisc)
> > + __field(void *, skbaddr)
> > + __field(int, ifindex)
> > + __field(u32, handle)
> > + __field(u32, parent)
> > + ),
> > +
> > + TP_fast_assign(
> > + __entry->qdisc = qdisc;
> > + __entry->skbaddr = skb;
> > + __entry->ifindex = txq->dev ? txq->dev->ifindex : 0;
> > + __entry->handle = qdisc->handle;
> > + __entry->parent = qdisc->parent;
> > + ),
> > +
> > + TP_printk("enqueue ifindex=%d qdisc handle=0x%X parent=0x%X skbaddr=%px",
> > + __entry->ifindex, __entry->handle, __entry->parent, __entry->skbaddr)
> > +);
> > +
> > TRACE_EVENT(qdisc_reset,
> >
> > TP_PROTO(struct Qdisc *q),
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index c253c2aafe97..20b9376de301 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -131,6 +131,7 @@
> > #include <trace/events/napi.h>
> > #include <trace/events/net.h>
> > #include <trace/events/skb.h>
> > +#include <trace/events/qdisc.h>
> > #include <linux/inetdevice.h>
> > #include <linux/cpu_rmap.h>
> > #include <linux/static_key.h>
> > @@ -3864,6 +3865,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
> > if (unlikely(!nolock_qdisc_is_empty(q))) {
> > rc = q->enqueue(skb, q, &to_free) &
> > NET_XMIT_MASK;
> > + if (rc == NET_XMIT_SUCCESS)
>
> If NET_XMIT_CN is returned, the skb seems to be enqueued too?
Sure. See my other reply on why dropped packets are not
interesting here.
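
To make the check concrete, here is a rough userspace sketch of which
->enqueue() return values would reach the tracepoint with the
"rc == NET_XMIT_SUCCESS" test in the patch. The constants are copied from
include/linux/netdevice.h and include/net/sch_generic.h as I understand
them; would_trace_enqueue() is a hypothetical helper for illustration only,
not something in the kernel.

```c
#include <stdbool.h>

/* Values copied from include/linux/netdevice.h for a userspace build. */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01	/* skb dropped */
#define NET_XMIT_CN      0x02	/* congestion notification */
#define NET_XMIT_MASK    0x0f	/* qdisc flags live above this mask */

/* Qdisc-internal flag from include/net/sch_generic.h (assumed value). */
#define __NET_XMIT_STOLEN 0x00010000

/*
 * Hypothetical helper: mirrors the masking and comparison done in
 * __dev_xmit_skb() in the patch, and reports whether
 * trace_qdisc_enqueue() would be called for this return value.
 */
static bool would_trace_enqueue(int enqueue_ret)
{
	int rc = enqueue_ret & NET_XMIT_MASK;

	return rc == NET_XMIT_SUCCESS;
}
```

With this check, NET_XMIT_DROP and NET_XMIT_CN are both skipped, and
internal flag bits above NET_XMIT_MASK do not affect the decision.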
>
> Also, instead of checking the rc before calling the trace_*, maybe
> it makes more sense to add the rc to the tracepoint, so that the check
> is avoided, and we are able to tell the enqueuing result of a specific skb
> from that tracepoint too.
Totally disagree. trace_qdisc_dequeue() is only called for
successful cases too (see dequeue_skb()), so it does not make sense
for the two to differ.
>
> > + trace_qdisc_enqueue(q, txq, skb);
>
> Does it make sense to wrap the above in something like:
Nope, because ->enqueue() is also called by lower-layer qdiscs,
but here we only want to track the root, i.e. the entrance of TC.
I know this may be confusing; please blame trace_qdisc_dequeue(),
which only tracks the exit. ;)
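
For reference, the commit message says the enqueue and dequeue events
together can be used to calculate queueing latency. Below is a minimal
userspace sketch of that pairing, assuming a consumer has already parsed a
timestamp and the skbaddr field out of each event and matches the two
events on skbaddr. The table, its size, and the on_enqueue()/on_dequeue()
names are all hypothetical; this is not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define PENDING_SLOTS 256	/* hypothetical cap on in-flight skbs */

struct pending {
	uint64_t skbaddr;	/* 0 marks an empty slot */
	uint64_t enq_ns;	/* timestamp of the enqueue event */
};

static struct pending pending_tbl[PENDING_SLOTS];

/* Record an enqueue event. Returns 0 on success, -1 if no slot is free. */
static int on_enqueue(uint64_t skbaddr, uint64_t ts_ns)
{
	struct pending *free_slot = NULL;
	size_t i;

	if (!skbaddr)
		return -1;

	for (i = 0; i < PENDING_SLOTS; i++) {
		struct pending *p = &pending_tbl[i];

		if (p->skbaddr == skbaddr) {
			/* Address reused before we saw a dequeue: restart. */
			p->enq_ns = ts_ns;
			return 0;
		}
		if (!p->skbaddr && !free_slot)
			free_slot = p;
	}
	if (!free_slot)
		return -1;

	free_slot->skbaddr = skbaddr;
	free_slot->enq_ns = ts_ns;
	return 0;
}

/*
 * Match a dequeue event against a recorded enqueue.
 * Returns 0 and stores the latency, or -1 if no enqueue was recorded.
 */
static int on_dequeue(uint64_t skbaddr, uint64_t ts_ns, uint64_t *latency_ns)
{
	size_t i;

	for (i = 0; i < PENDING_SLOTS; i++) {
		struct pending *p = &pending_tbl[i];

		if (skbaddr && p->skbaddr == skbaddr) {
			*latency_ns = ts_ns - p->enq_ns;
			p->skbaddr = 0;	/* release the slot */
			return 0;
		}
	}
	return -1;
}
```

The linear scan keeps the sketch short; a real consumer would more likely
use a hash map keyed by skbaddr.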
Thanks.