Message-ID: <bcb84f88-4bdd-4095-b5ea-e806e7733a54@huawei.com>
Date: Thu, 16 Oct 2025 14:26:33 +0800
From: Wang Liang <wangliang74@...wei.com>
To: Eric Dumazet <edumazet@...gle.com>, Florian Westphal <fw@...len.de>, Simon
Horman <horms@...nel.org>
CC: <nhorman@...driver.com>, <davem@...emloft.net>, <kuba@...nel.org>,
<pabeni@...hat.com>, <netdev@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <yuehaibing@...wei.com>,
<zhangchangzhong@...wei.com>
Subject: Re: [PATCH RFC net-next] net: drop_monitor: Add debugfs support
On 2025/10/15 18:40, Eric Dumazet wrote:
> On Wed, Oct 15, 2025 at 2:51 AM Wang Liang <wangliang74@...wei.com> wrote:
>> This patch adds debugfs interfaces for drop monitor. Similar to kmemleak, the
>> monitor can be driven with the commands below:
>>
>> echo clear > /sys/kernel/debug/drop_monitor/trace
>> echo start > /sys/kernel/debug/drop_monitor/trace
>> echo stop > /sys/kernel/debug/drop_monitor/trace
>> cat /sys/kernel/debug/drop_monitor/trace
>>
>> The limit on the number of traced skbs can be set dynamically:
>>
>> cat /sys/kernel/debug/drop_monitor/trace_limit
>> echo 200 > /sys/kernel/debug/drop_monitor/trace_limit
>>
>> Compared to the original netlink method, call stack dumps are supported. Here
>> is an example for a received UDP packet with a bad checksum:
>>
>> reason : UDP_CSUM (11)
>> pc : udp_queue_rcv_one_skb+0x14b/0x350
>> len : 12
>> protocol : 0x0800
>> stack :
>> sk_skb_reason_drop+0x8f/0x120
>> udp_queue_rcv_one_skb+0x14b/0x350
>> udp_unicast_rcv_skb+0x71/0x90
>> ip_protocol_deliver_rcu+0xa6/0x160
>> ip_local_deliver_finish+0x90/0x100
>> ip_sublist_rcv_finish+0x65/0x80
>> ip_sublist_rcv+0x130/0x1c0
>> ip_list_rcv+0xf7/0x130
>> __netif_receive_skb_list_core+0x21d/0x240
>> netif_receive_skb_list_internal+0x186/0x2b0
>> napi_complete_done+0x78/0x190
>> e1000_clean+0x27f/0x860
>> __napi_poll+0x25/0x1e0
>> net_rx_action+0x2ca/0x330
>> handle_softirqs+0xbc/0x290
>> irq_exit_rcu+0x90/0xb0
>>
>> It is friendlier to use and needs no cooperating user application.
>> Furthermore, it is easier to add new features. We can add reason/ip/port
>> filters through debugfs parameters, like ftrace, rather than netlink messages.
> I do not understand the fascination with net/core/drop_monitor.c,
> which looks very old school to me,
> and misses all the features, flexibility, scalability that 'perf',
> eBPF tracing, bpftrace, .... have today.
>
> Adding /sys/kernel/debug/drop_monitor/* is even more old school.
>
> Not mentioning the maintenance burden.
>
> For me the choice is easy :
>
> # CONFIG_NET_DROP_MONITOR is not set
>
> perf record -ag -e skb:kfree_skb sleep 1
>
> perf script # or perf report
Thank you for taking the time to review this patch!
My initial thought was that drop_monitor could cover more drop
positions (not just kfree_skb), show more skb info, filter skbs by ip/port
through debugfs parameters (not supported yet), and not need userspace tools.
Currently perf is indeed a better choice, so adding debugfs is not necessary.
Thanks!
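
For reference, the per-drop record format quoted in the RFC description is
regular enough to post-process with a short script. What follows is only a
minimal sketch: it assumes the "key : value" layout and the "stack :" block
exactly as shown in the example above (the RFC was not merged, so this format
is not a stable interface), and the names DropRecord and parse_records are
hypothetical, chosen for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DropRecord:
    """One drop entry, mirroring the fields in the RFC's example output."""
    reason: str = ""
    reason_code: int = -1
    pc: str = ""
    length: int = 0
    protocol: int = 0
    stack: list = field(default_factory=list)


# "key : value" lines; stack frames such as "foo+0x1/0x2" do not match
# because the symbol name is followed by '+', not ':'.
_FIELD = re.compile(r"^(\w+)\s*:\s*(.*)$")
# Reason values look like "UDP_CSUM (11)".
_REASON = re.compile(r"^(\w+)\s*\((\d+)\)$")


def parse_records(text):
    """Parse trace text into DropRecord objects (assumed format, see above)."""
    records = []
    current = None
    in_stack = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = _FIELD.match(line)
        if m:
            key, value = m.group(1), m.group(2).strip()
            if key == "reason":
                # A "reason" line is assumed to start a new record.
                current = DropRecord()
                records.append(current)
                in_stack = False
                r = _REASON.match(value)
                if r:
                    current.reason = r.group(1)
                    current.reason_code = int(r.group(2))
                else:
                    current.reason = value
            elif current is None:
                continue
            elif key == "pc":
                current.pc = value
            elif key == "len":
                current.length = int(value)
            elif key == "protocol":
                current.protocol = int(value, 16)
            elif key == "stack":
                in_stack = True
        elif in_stack and current is not None:
            # Every non-field line after "stack :" is treated as a frame.
            current.stack.append(line)
    return records
```

Counting drops per reason is then a matter of feeding the parsed records into
collections.Counter keyed on the reason field.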