Message-ID: <ef601e55-16a0-4d3e-bd0d-536ed9dd29cd@tu-berlin.de>
Date: Sun, 9 Nov 2025 20:18:08 +0100
From: Jonas Köppeler <j.koeppeler@...berlin.de>
To: Toke Høiland-Jørgensen <toke@...hat.com>, "Eric
 Dumazet" <edumazet@...gle.com>
CC: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski
	<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman
	<horms@...nel.org>, Jamal Hadi Salim <jhs@...atatu.com>, Cong Wang
	<xiyou.wangcong@...il.com>, Jiri Pirko <jiri@...nulli.us>, Kuniyuki Iwashima
	<kuniyu@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
	<netdev@...r.kernel.org>, <eric.dumazet@...il.com>
Subject: Re: [PATCH v1 net-next 5/5] net: dev_queue_xmit() llist adoption

On 11/9/25 5:33 PM, Toke Høiland-Jørgensen wrote:
> Not sure why there's this difference between your setup or mine; some
> .config or hardware difference related to the use of atomics? Any other
> ideas?

Hi Eric, hi Toke,

I observed a similar behavior where CAKE's throughput collapses after the patch.

Test setup (a rough sketch of the setup commands follows the list):
- CAKE as root qdisc, NIC configured with 4 TX queues
- 64-byte packets at ~21 Mpps offered load
- Intel Xeon Gold 6209U + 25GbE Intel XXV710 NIC
- the DuT forwards the incoming traffic back to the traffic generator through CAKE
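
Roughly, the qdisc setup looks like this (interface name is a placeholder and the cake options are omitted, so take it as a sketch rather than the literal script):

    # DuT egress port with 4 hardware TX queues (placeholder interface name)
    ethtool -L eth0 combined 4
    # single cake instance as root qdisc, shared by all TX queues
    tc qdisc replace dev eth0 root cake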

Throughput over 10 seconds before/after patch:

Before patch:
0.475   mpps
0.481   mpps
0.477   mpps
0.478   mpps
0.478   mpps
0.477   mpps
0.479   mpps
0.481   mpps
0.481   mpps

After patch:
0.265  mpps
0.035  mpps
0.003  mpps
0.002  mpps
0.001  mpps
0.002  mpps
0.002  mpps
0.002  mpps
0.002  mpps

---


From the qdisc side I also see a large number of drops. Running:

     perf record -a -e skb:kfree_skb

shows `QDISC_OVERLIMIT` and `CAKE_FLOOD` as the drop reasons.
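
To group the drops by reason, something along these lines over the recorded data works:

    perf script | grep -o 'reason: [A-Z_0-9]*' | sort | uniq -c | sort -rn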

`tc` statistics before/after the patch:

Before patch:
- drops: 32
- packets: 4,786,109
- memory_used: 8,916,480
- requeues: 254

After patch:
- drops: 13,601,075
- packets: 322,540
- memory_used: 15,504,576
- requeues: 273
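
These counters are just from `tc -s` on the root qdisc, i.e. something like (placeholder interface name again):

    tc -s qdisc show dev eth0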

---

Call graph of `__dev_queue_xmit` after the patch (CPU time percentages):

53.37%  __dev_queue_xmit
   21.02%  __qdisc_run
     13.79%  sch_direct_xmit
       12.01%  _raw_spin_lock
         11.30%  do_raw_spin_lock
           11.06%  __pv_queued_spin_lock_slowpath
     0.73%  _raw_spin_unlock
       0.58%  lock_release
     0.69%  dev_hard_start_xmit
     6.91%  cake_dequeue
       1.82%  sk_skb_reason_drop
         1.10%  skb_release_data
         0.65%  kfree_skbmem
           0.61%  kmem_cache_free
       1.64%  get_random_u32
       0.97%  ktime_get
         0.86%  seqcount_lockdep_reader_access.constprop.0
       0.91%  cake_dequeue_one
   16.49%  _raw_spin_lock
     15.71%  do_raw_spin_lock
       15.54%  __pv_queued_spin_lock_slowpath
   10.00%  dev_qdisc_enqueue
     9.94%  cake_enqueue
       4.90%  cake_hash
       2.85%  __skb_flow_dissect
         1.08%  lock_acquire
         0.65%  lock_release
       1.17%  __siphash_unaligned
       2.20%  ktime_get
         1.94%  seqcount_lockdep_reader_access.constprop.0
       0.69%  cake_get_flow_quantum / get_random_u16
   1.99%  netdev_core_pick_tx
     1.79%  i40e_lan_select_queue
     1.62%  netdev_pick_tx
       0.78%  lock_acquire
       0.52%  lock_release
     0.82%  lock_acquire
   0.76%  kfree_skb_list_reason
     0.52%  skb_release_data
   1.02%  lock_acquire
     0.63%  lock_release
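
For completeness, a profile like the one above can be collected with something along the lines of:

    perf record -a -g -- sleep 10
    perf report --children --stdio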

---

The `_raw_spin_lock` share under `__qdisc_run -> sch_direct_xmit` roughly doubles after the patch (from 5.68% before to 12.01% after).
It looks like the throughput collapses once sch_cake starts dropping packets (due to overlimit and COBALT drops). Could it be that the overlimit
is reached faster when more CPUs are trying to enqueue packets, i.e. the batched enqueue behavior fills cake's queue/memory limit sooner,
which in turn makes cake start dropping?
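
One way to check that would be to raise cake's memory limit and see whether the collapse is delayed or goes away, e.g. (value picked arbitrarily):

    tc qdisc change dev eth0 root cake memlimit 64mb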


Jonas

