Message-ID: <CANn89iKDx52BnKZhw=hpCCG1dHtXOGx8pbynDoFRE0h_+a7JhQ@mail.gmail.com>
Date: Sun, 9 Nov 2025 11:28:33 -0800
From: Eric Dumazet <edumazet@...gle.com>
To: Jonas Köppeler <j.koeppeler@...berlin.de>
Cc: Toke Høiland-Jørgensen <toke@...hat.com>,
"David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>, Jiri Pirko <jiri@...nulli.us>,
Kuniyuki Iwashima <kuniyu@...gle.com>, Willem de Bruijn <willemb@...gle.com>, netdev@...r.kernel.org,
eric.dumazet@...il.com
Subject: Re: [PATCH v1 net-next 5/5] net: dev_queue_xmit() llist adoption
On Sun, Nov 9, 2025 at 11:18 AM Jonas Köppeler <j.koeppeler@...berlin.de> wrote:
>
> On 11/9/25 5:33 PM, Toke Høiland-Jørgensen wrote:
> > Not sure why there's this difference between your setup and mine; some
> > .config or hardware difference related to the use of atomics? Any other
> > ideas?
>
> Hi Eric, hi Toke,
>
> I observed a similar behavior where CAKE's throughput collapses after the patch.
>
> Test setup:
> - 4 queues CAKE root qdisc
Please send
tc -s -d qd sh
> - 64-byte packets at ~21 Mpps
> - Intel Xeon Gold 6209U + 25GbE Intel XXV710 NIC
> - DuT forwards incoming traffic back to traffic generator through cake
>
> Throughput over 10 seconds before/after patch:
>
> Before patch:
> 0.475 Mpps
> 0.481 Mpps
> 0.477 Mpps
> 0.478 Mpps
> 0.478 Mpps
> 0.477 Mpps
> 0.479 Mpps
> 0.481 Mpps
> 0.481 Mpps
>
> After patch:
> 0.265 Mpps
> 0.035 Mpps
> 0.003 Mpps
> 0.002 Mpps
> 0.001 Mpps
> 0.002 Mpps
> 0.002 Mpps
> 0.002 Mpps
> 0.002 Mpps
>
> ---
>
>
> From the qdisc I also see a large number of drops. Running:
>
> perf record -a -e skb:kfree_skb
>
> shows `QDISC_OVERLIMIT` and `CAKE_FLOOD` as the drop reasons.
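For a quick per-reason breakdown, something like this should work
(assuming perf script prints the tracepoint's "reason: <NAME>" field,
as recent kernels do):

perf script | grep -o 'reason: [A-Z_0-9]*' | sort | uniq -c | sort -rn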
Cake drops packets from dequeue() while the qdisc spinlock is held,
unfortunately.
So it is quite possible that feeding more packets to the qdisc than before
pushes it into a mode where dequeue() has to drop more packets and slows
the whole thing down.
Presumably cake enqueue() should 'drop' the packet when the queue is
under high pressure, because enqueue() can drop the packet without
holding the qdisc spinlock.
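Something along these lines, as a pure sketch (this is not the current
sch_cake code, and the pressure test is only a stand-in): a packet
dropped via qdisc_drop() at enqueue time is merely chained onto the
to_free list, and the actual kfree_skb_list() runs in __dev_xmit_skb()
after the qdisc spinlock has been released, so the freeing cost is paid
outside the lock, unlike drops done from cake_dequeue().

#include <net/sch_generic.h>

static int cake_enqueue_sketch(struct sk_buff *skb, struct Qdisc *sch,
			       struct sk_buff **to_free)
{
	/* Hypothetical "high pressure" test: backlog already at the
	 * configured packet limit.  The real cake tracks a byte-based
	 * buffer limit; this only shows the shape of an enqueue-side drop.
	 */
	if (unlikely(qdisc_qlen(sch) >= READ_ONCE(sch->limit)))
		return qdisc_drop(skb, sch, to_free);

	/* ... normal enqueue path: flow hashing, cobalt state, shaper ... */
	return NET_XMIT_SUCCESS;
}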
>
> `tc` statistics before/after the patch:
>
> Before patch:
> - drops: 32
> - packets: 4,786,109
> - memory_used: 8,916,480
> - requeues: 254
>
> After patch:
> - drops: 13,601,075
> - packets: 322,540
> - memory_used: 15,504,576
> - requeues: 273
>
> ---
>
> Call graph of `__dev_queue_xmit` after the patch (CPU time percentages):
>
> 53.37% __dev_queue_xmit
> 21.02% __qdisc_run
> 13.79% sch_direct_xmit
> 12.01% _raw_spin_lock
> 11.30% do_raw_spin_lock
> 11.06% __pv_queued_spin_lock_slowpath
> 0.73% _raw_spin_unlock
> 0.58% lock_release
> 0.69% dev_hard_start_xmit
> 6.91% cake_dequeue
> 1.82% sk_skb_reason_drop
> 1.10% skb_release_data
> 0.65% kfree_skbmem
> 0.61% kmem_cache_free
> 1.64% get_random_u32
> 0.97% ktime_get
> 0.86% seqcount_lockdep_reader_access.constprop.0
> 0.91% cake_dequeue_one
> 16.49% _raw_spin_lock
> 15.71% do_raw_spin_lock
> 15.54% __pv_queued_spin_lock_slowpath
> 10.00% dev_qdisc_enqueue
> 9.94% cake_enqueue
> 4.90% cake_hash
> 2.85% __skb_flow_dissect
> 1.08% lock_acquire
> 0.65% lock_release
> 1.17% __siphash_unaligned
> 2.20% ktime_get
> 1.94% seqcount_lockdep_reader_access.constprop.0
> 0.69% cake_get_flow_quantum / get_random_u16
> 1.99% netdev_core_pick_tx
> 1.79% i40e_lan_select_queue
> 1.62% netdev_pick_tx
> 0.78% lock_acquire
> 0.52% lock_release
> 0.82% lock_acquire
> 0.76% kfree_skb_list_reason
> 0.52% skb_release_data
> 1.02% lock_acquire
> 0.63% lock_release
>
> ---
>
> The `_raw_spin_lock` portion under `__qdisc_run -> sch_direct_xmit` roughly doubles after the patch (from 5.68% to 12.01%).
> It feels like once sch_cake starts dropping packets (due to overlimit and cobalt drops) the throughput collapses. Could it be that the overlimit
> is reached "faster" when there are more CPUs trying to enqueue packets, i.e. cake's queue limit is hit sooner because of the "batch" enqueue behavior,
> which then leads to cake starting to drop packets?
>
Yes, probably.
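With the llist scheme, the contending CPUs park their skbs on a
lock-free list and whoever owns the qdisc spinlock drains the whole
batch into the qdisc in one go, so cake can overshoot its buffer limit
by a full burst before it gets a chance to react. A very rough sketch
of that kind of handoff (the ll_node field and the function name are
made up for illustration, this is not the actual patch code):

#include <linux/llist.h>
#include <net/sch_generic.h>

static void qdisc_drain_deferred_sketch(struct llist_head *deferred,
					struct Qdisc *q,
					spinlock_t *root_lock,
					struct sk_buff **to_free)
{
	/* Grab everything the other CPUs queued in the meantime. */
	struct llist_node *batch = llist_del_all(deferred);
	struct sk_buff *skb, *next;

	/* llist_add() builds a LIFO chain; restore submission order. */
	batch = llist_reverse_order(batch);

	spin_lock(root_lock);
	/* One lock hold, many enqueues: cake sees a burst, not a trickle. */
	llist_for_each_entry_safe(skb, next, batch, ll_node)
		q->enqueue(skb, q, to_free);
	spin_unlock(root_lock);
}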