[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87pl6hzveb.fsf@toke.dk>
Date: Fri, 06 Feb 2026 17:31:08 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Jesper Dangaard Brouer <hawk@...nel.org>, atenart@...hat.com,
netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>, Paolo Abeni
<pabeni@...hat.com>
Cc: bpf@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>, horms@...nel.org,
jiri@...nulli.us, edumazet@...gle.com, xiyou.wangcong@...il.com,
jhs@...atatu.com, carges@...udflare.com, kernel-team@...udflare.com,
"David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH net-next v2 1/6] net: sched: introduce qdisc-specific
drop reason tracing
Jesper Dangaard Brouer <hawk@...nel.org> writes:
> On 06/02/2026 10.31, Toke Høiland-Jørgensen wrote:
>> Jesper Dangaard Brouer <hawk@...nel.org> writes:
>>
>>> Create new enum qdisc_drop_reason and trace_qdisc_drop tracepoint
>>> for qdisc layer drop diagnostics with direct qdisc context visibility.
>>>
>>> The new tracepoint includes qdisc handle, parent, kind (name), and
>>> device information. Existing SKB_DROP_REASON_QDISC_DROP is retained
>>> for backwards compatibility via kfree_skb_reason().
>>>
>>> Convert FQ, FQ_CoDel, CoDel, SFB, and pfifo_fast to use the new
>>> infrastructure.
>>>
>>> Signed-off-by: Jesper Dangaard Brouer <hawk@...nel.org>
> [...]
>>> diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
>>> index a7b7abd66e21..3d8d284e05c8 100644
>>> --- a/include/net/dropreason-core.h
>>> +++ b/include/net/dropreason-core.h
>>> @@ -68,12 +68,6 @@
>>> FN(SECURITY_HOOK) \
>>> FN(QDISC_DROP) \
>>> FN(QDISC_BURST_DROP) \
>>> - FN(QDISC_OVERLIMIT) \
>>> - FN(QDISC_CONGESTED) \
>>> - FN(CAKE_FLOOD) \
>>> - FN(FQ_BAND_LIMIT) \
>>> - FN(FQ_HORIZON_LIMIT) \
>>> - FN(FQ_FLOW_LIMIT) \
>>> FN(CPU_BACKLOG) \
>>> FN(XDP) \
>>> FN(TC_INGRESS) \
>>> @@ -371,8 +365,10 @@ enum skb_drop_reason {
>>> /** @SKB_DROP_REASON_SECURITY_HOOK: dropped due to security HOOK */
>>> SKB_DROP_REASON_SECURITY_HOOK,
>>> /**
>>> - * @SKB_DROP_REASON_QDISC_DROP: dropped by qdisc when packet outputting (
>>> - * failed to enqueue to current qdisc)
>>> + * @SKB_DROP_REASON_QDISC_DROP: dropped by qdisc during enqueue or
>>> + * dequeue. More specific drop reasons are available via the
>>> + * qdisc:qdisc_drop tracepoint, which also provides qdisc handle
>>> + * and name for identifying the source.
>>
>> IIUC, this is not needed, see below:
>>
>> [...]
>>
>>> @@ -37,6 +37,24 @@
>>> const struct Qdisc_ops *default_qdisc_ops = &pfifo_fast_ops;
>>> EXPORT_SYMBOL(default_qdisc_ops);
>>>
>>> +void tcf_kfree_skb_list(struct sk_buff *skb, struct Qdisc *q,
>>> + struct netdev_queue *txq,
>>> + struct net_device *dev)
>>> +{
>>> + while (unlikely(skb)) {
>>> + struct sk_buff *next = skb->next;
>>> + enum qdisc_drop_reason reason = tcf_get_qdisc_drop_reason(skb);
>>> +
>>> + prefetch(next);
>>> + /* Catch wrong enum: skb_drop_reason vs qdisc_drop_reason */
>>> + WARN_ON_ONCE(reason && reason < __QDISC_DROP_REASON);
>>> + trace_qdisc_drop(q, txq, dev, skb, reason);
>>> + kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP);
>>
>> AFAIU, the idea here is that you just pass the same 'reason' value to
>> kfree_skb_reason(). Because of the subsys shift and offset, these will
>> all be unique values, so anyone just watching the old tracepoint will
>> get QDISC_DROP_* reasons with no context, and if you want the context
>> you listen to the qdisc_drop tracepoint.
>
> I'm already changing this code in V3, as syzbot revealed that some
> existing skb_drop_reason's do arrive here, which is the TC filter drop
> reasons (SKB_DROP_REASON_TC_*). We primarily need subsys encoding as
> we share the SKB->cb area.
>
> This is the new code, that pass 'reason' through:
>
> /* TC classifier and qdisc share drop_reason storage.
> * Check subsystem mask to identify qdisc drop reasons,
> * else pass through skb_drop_reason set by TC classifier.
> */
> if ((reason & SKB_DROP_REASON_SUBSYS_MASK) == __QDISC_DROP_REASON) {
> trace_qdisc_drop(q, txq, dev, skb, reason);
> skb_reason = SKB_DROP_REASON_QDISC_DROP;
> } else {
> skb_reason = (enum skb_drop_reason)reason;
> }
> kfree_skb_reason(skb, skb_reason);
>
> I do realize that the reason is now a unique value, that in principle
> could be passed through, *BUT* that will break existing userspace
> tools. E.g. the `perf trace -e skb:kfree_skb` cannot decode these to
> strings. (The net/core/drop_monitor.c also need special handling.)
Ah, that's a bit unfortunate. I assumed the tools would handle these
correctly automatically :(
> I actually want to give userspace a *single* SKB_DROP_REASON_QDISC_DROP
> reason (that existing tools already knows).
>
> Leaking extra qdisc drop reasons is not helpful to consume as it is not
> actionable without knowing the qdisc. A QDISC_DROP_OVERLIMIT from FQ is
> very different from one in CAKE - knowing just the reason without the
> qdisc type/config is insufficient for debugging. The trace_qdisc_drop
> tracepoint provides both the reason AND the qdisc context together
Depends; if you only have one of those installed on your system it'll do
just fine :)
In any case we're changing things here (removing reasons, or changing
them). I would lean towards having more information available (i.e.,
passing through the reason), but if common tools can't decode them,
that's a bit of a bummer.
I suppose it'll be too much churn to start out with the single reason,
then fix the tools to understand subsystem drop reasons, then change it
again?
-Toke
Powered by blists - more mailing lists