lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87pl6hzveb.fsf@toke.dk>
Date: Fri, 06 Feb 2026 17:31:08 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Jesper Dangaard Brouer <hawk@...nel.org>, atenart@...hat.com,
 netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>, Paolo Abeni
 <pabeni@...hat.com>
Cc: bpf@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>, horms@...nel.org,
 jiri@...nulli.us, edumazet@...gle.com, xiyou.wangcong@...il.com,
 jhs@...atatu.com, carges@...udflare.com, kernel-team@...udflare.com,
 "David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH net-next v2 1/6] net: sched: introduce qdisc-specific
 drop reason tracing

Jesper Dangaard Brouer <hawk@...nel.org> writes:

> On 06/02/2026 10.31, Toke Høiland-Jørgensen wrote:
>> Jesper Dangaard Brouer <hawk@...nel.org> writes:
>> 
>>> Create new enum qdisc_drop_reason and trace_qdisc_drop tracepoint
>>> for qdisc layer drop diagnostics with direct qdisc context visibility.
>>>
>>> The new tracepoint includes qdisc handle, parent, kind (name), and
>>> device information. Existing SKB_DROP_REASON_QDISC_DROP is retained
>>> for backwards compatibility via kfree_skb_reason().
>>>
>>> Convert FQ, FQ_CoDel, CoDel, SFB, and pfifo_fast to use the new
>>> infrastructure.
>>>
>>> Signed-off-by: Jesper Dangaard Brouer <hawk@...nel.org>
> [...]
>>> diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
>>> index a7b7abd66e21..3d8d284e05c8 100644
>>> --- a/include/net/dropreason-core.h
>>> +++ b/include/net/dropreason-core.h
>>> @@ -68,12 +68,6 @@
>>>   	FN(SECURITY_HOOK)		\
>>>   	FN(QDISC_DROP)			\
>>>   	FN(QDISC_BURST_DROP)		\
>>> -	FN(QDISC_OVERLIMIT)		\
>>> -	FN(QDISC_CONGESTED)		\
>>> -	FN(CAKE_FLOOD)			\
>>> -	FN(FQ_BAND_LIMIT)		\
>>> -	FN(FQ_HORIZON_LIMIT)		\
>>> -	FN(FQ_FLOW_LIMIT)		\
>>>   	FN(CPU_BACKLOG)			\
>>>   	FN(XDP)				\
>>>   	FN(TC_INGRESS)			\
>>> @@ -371,8 +365,10 @@ enum skb_drop_reason {
>>>   	/** @SKB_DROP_REASON_SECURITY_HOOK: dropped due to security HOOK */
>>>   	SKB_DROP_REASON_SECURITY_HOOK,
>>>   	/**
>>> -	 * @SKB_DROP_REASON_QDISC_DROP: dropped by qdisc when packet outputting (
>>> -	 * failed to enqueue to current qdisc)
>>> +	 * @SKB_DROP_REASON_QDISC_DROP: dropped by qdisc during enqueue or
>>> +	 * dequeue. More specific drop reasons are available via the
>>> +	 * qdisc:qdisc_drop tracepoint, which also provides qdisc handle
>>> +	 * and name for identifying the source.
>> 
>> IIUC, this is not needed, see below:
>> 
>> [...]
>> 
>>> @@ -37,6 +37,24 @@
>>>   const struct Qdisc_ops *default_qdisc_ops = &pfifo_fast_ops;
>>>   EXPORT_SYMBOL(default_qdisc_ops);
>>>   
>>> +void tcf_kfree_skb_list(struct sk_buff *skb, struct Qdisc *q,
>>> +			struct netdev_queue *txq,
>>> +			struct net_device *dev)
>>> +{
>>> +	while (unlikely(skb)) {
>>> +		struct sk_buff *next = skb->next;
>>> +		enum qdisc_drop_reason reason = tcf_get_qdisc_drop_reason(skb);
>>> +
>>> +		prefetch(next);
>>> +		/* Catch wrong enum: skb_drop_reason vs qdisc_drop_reason */
>>> +		WARN_ON_ONCE(reason && reason < __QDISC_DROP_REASON);
>>> +		trace_qdisc_drop(q, txq, dev, skb, reason);
>>> +		kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP);
>> 
>> AFAIU, the idea here is that you just pass the same 'reason' value to
>> kfree_skb_reason(). Because of the subsys shift and offset, these will
>> all be unique values, so anyone just watching the old tracepoint will
>> get QDISC_DROP_* reasons with no context, and if you want the context
>> you listen to the qdisc_drop tracepoint.
>
> I'm already changing this code in V3, as syzbot revealed that some
> existing skb_drop_reason's do arrive here, which is the TC filter drop
> reasons (SKB_DROP_REASON_TC_*).  We primarily need subsys encoding as
> we share the SKB->cb area.
>
> This is the new code, that pass 'reason' through:
>
>   /* TC classifier and qdisc share drop_reason storage.
>    * Check subsystem mask to identify qdisc drop reasons,
>    * else pass through skb_drop_reason set by TC classifier.
>    */
>   if ((reason & SKB_DROP_REASON_SUBSYS_MASK) == __QDISC_DROP_REASON) {
>   	trace_qdisc_drop(q, txq, dev, skb, reason);
>   	skb_reason = SKB_DROP_REASON_QDISC_DROP;
>   } else {
>   	skb_reason = (enum skb_drop_reason)reason;
>   }
>   kfree_skb_reason(skb, skb_reason);
>
> I do realize that the reason is now a unique value, that in principle
> could be passed through, *BUT* that will break existing userspace
> tools.  E.g. the `perf trace -e skb:kfree_skb` cannot decode these to
> strings. (The net/core/drop_monitor.c also need special handling.)

Ah, that's a bit unfortunate. I assumed the tools would handle these
correctly automatically :(

> I actually want to give userspace a *single* SKB_DROP_REASON_QDISC_DROP
> reason (that existing tools already knows).
>
> Leaking extra qdisc drop reasons is not helpful to consume as it is not
> actionable without knowing the qdisc. A QDISC_DROP_OVERLIMIT from FQ is
> very different from one in CAKE - knowing just the reason without the
> qdisc type/config is insufficient for debugging. The trace_qdisc_drop
> tracepoint provides both the reason AND the qdisc context together

Depends; if you only have one of those installed on your system it'll do
just fine :)

In any case we're changing things here (removing reasons, or changing
them). I would lean towards having more information available (i.e.,
passing through the reason), but if common tools can't decode them,
that's a bit of a bummer.

I suppose it'll be too much churn to start out with the single reason,
then fix the tools to understand subsystem drop reasons, then change it
again?

-Toke


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ