Message-ID: <CANn89iLZBMWpU7kMjd8akT+L8FbsnO+wqgjCaXF2KOCFz9Hiag@mail.gmail.com>
Date: Wed, 15 Oct 2025 03:40:33 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Wang Liang <wangliang74@...wei.com>
Cc: nhorman@...driver.com, davem@...emloft.net, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, yuehaibing@...wei.com,
zhangchangzhong@...wei.com
Subject: Re: [PATCH RFC net-next] net: drop_monitor: Add debugfs support

On Wed, Oct 15, 2025 at 2:51 AM Wang Liang <wangliang74@...wei.com> wrote:
>
> This patch adds debugfs interfaces for drop monitor. Similar to kmemleak,
> the monitor can be used with the commands below:
>
> echo clear > /sys/kernel/debug/drop_monitor/trace
> echo start > /sys/kernel/debug/drop_monitor/trace
> echo stop > /sys/kernel/debug/drop_monitor/trace
> cat /sys/kernel/debug/drop_monitor/trace
>
> The trace skb number limit can be set dynamically:
>
> cat /sys/kernel/debug/drop_monitor/trace_limit
> echo 200 > /sys/kernel/debug/drop_monitor/trace_limit
>
> Compared to the original netlink method, call stack dumps are supported.
> Here is an example for a received UDP packet with a bad checksum:
>
> reason : UDP_CSUM (11)
> pc : udp_queue_rcv_one_skb+0x14b/0x350
> len : 12
> protocol : 0x0800
> stack :
> sk_skb_reason_drop+0x8f/0x120
> udp_queue_rcv_one_skb+0x14b/0x350
> udp_unicast_rcv_skb+0x71/0x90
> ip_protocol_deliver_rcu+0xa6/0x160
> ip_local_deliver_finish+0x90/0x100
> ip_sublist_rcv_finish+0x65/0x80
> ip_sublist_rcv+0x130/0x1c0
> ip_list_rcv+0xf7/0x130
> __netif_receive_skb_list_core+0x21d/0x240
> netif_receive_skb_list_internal+0x186/0x2b0
> napi_complete_done+0x78/0x190
> e1000_clean+0x27f/0x860
> __napi_poll+0x25/0x1e0
> net_rx_action+0x2ca/0x330
> handle_softirqs+0xbc/0x290
> irq_exit_rcu+0x90/0xb0
>
> It is friendlier to use and does not need a cooperating user application.
> Furthermore, it is easier to add new features. We can add reason/ip/port
> filters through debugfs parameters, like ftrace, rather than netlink messages.

I do not understand the fascination with net/core/drop_monitor.c,
which looks very old school to me and lacks all the features,
flexibility, and scalability that perf, eBPF tracing, bpftrace, etc.
have today.

Adding /sys/kernel/debug/drop_monitor/* is even more old school,
not to mention the maintenance burden.

For me the choice is easy:

# CONFIG_NET_DROP_MONITOR is not set
perf record -ag -e skb:kfree_skb sleep 1
perf script # or perf report