Message-ID: <65795E11DBF1E645A09CEC7EAEE94B9C3FBC0709@USINDEVS02.corp.hds.com>
Date: Wed, 15 Jun 2011 15:18:09 -0400
From: Satoru Moriya <satoru.moriya@....com>
To: Neil Horman <nhorman@...driver.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"dle-develop@...ts.sourceforge.net"
<dle-develop@...ts.sourceforge.net>,
Seiji Aguchi <seiji.aguchi@....com>
Subject: RE: [RFC][PATCH] add tracepoint to __sk_mem_schedule
Hi Neil,

Thank you for looking at it. I'm afraid my explanation was not clear enough.

On 06/15/2011 07:07 AM, Neil Horman wrote:
> On Tue, Jun 14, 2011 at 03:24:14PM -0400, Satoru Moriya wrote:
>
> There are several ways to do this already. Every drop that occurs in the stack
> should have a corresponding statistical counter exposed for it, and we also have
> a tracepoint in kfree_skb that the dropwatch system monitors to inform us of
> dropped packets in a centralized fashion. Not saying this tracepoint isn't
> worthwhile, only that it covers already covered ground.

Right. We can catch packet drop events via dropwatch and the kfree_skb
tracepoint. OTOH, what I'd like to know is why the kernel dropped the
packets. Normally we can tell the reason because the kfree_skb tracepoint
records its return address, i.e. the call site of the drop (see the sketch
below). But in this case it is difficult to pin down the exact reason,
because several different conditions lead to the same call site. I'll try
to explain the UDP case.
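To make "return address" concrete, kfree_skb() hands its caller's address
to the tracepoint, roughly like this (paraphrased from my reading of
net/core/skbuff.c, not a verbatim copy):

void kfree_skb(struct sk_buff *skb)
{
        if (unlikely(!skb))
                return;
        if (likely(atomic_read(&skb->users) == 1))
                smp_rmb();
        else if (likely(!atomic_dec_and_test(&skb->users)))
                return;
        /* records *where* the skb was freed, but not *why* */
        trace_kfree_skb(skb, __builtin_return_address(0));
        __kfree_skb(skb);
}

So dropwatch can tell us that the drop happened in __udp_queue_rcv_skb(),
but not which check triggered it.
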
> Again, this is why dropwatch exists. UDP gets into this path from:
> __udp_queue_rcv_skb
> ip_queue_rcv_skb
> sock_queue_rcv_skb
> sk_rmem_schedule
> __sk_mem_schedule
>
> If ip_queue_rcv_skb fails we increment the UDP_MIB_RCVBUFERRORS counter as well
> as the UDP_MIB_INERRORS counter, and on the kfree_skb call after those
> increments, dropwatch will report the frame loss and the fact that it occurred in
> __udp_queue_rcv_skb

As you said, we're able to catch the packet drop in __udp_queue_rcv_skb,
and that tells us that ip_queue_rcv_skb/sock_queue_rcv_skb returned a
negative value. But within sock_queue_rcv_skb there are three places where
it can return a negative value, and we can't tell them apart from the
kfree_skb tracepoint alone.
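The three failure paths look roughly like this (a simplified sketch
paraphrased from my reading of net/core/sock.c, not a verbatim copy):

int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
{
        int err;

        /* (1) the socket receive buffer (sk_rcvbuf) is already full */
        if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >=
            (unsigned)sk->sk_rcvbuf)
                return -ENOMEM;

        /* (2) a socket filter rejected the packet */
        err = sk_filter(sk, skb);
        if (err)
                return err;

        /* (3) memory accounting (__sk_mem_schedule) refused the charge */
        if (!sk_rmem_schedule(sk, skb->truesize))
                return -ENOBUFS;

        /* otherwise charge the skb to the socket and queue it on
         * sk_receive_queue */
        return 0;
}

All three end at the same kfree_skb() call in __udp_queue_rcv_skb(), so
they are indistinguishable in the dropwatch output.
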
Moreover, sock_queue_rcv_skb calls __sk_mem_schedule, and inside it there
are several if-statements that decide whether the kernel should drop the
packet. I'd like to know which condition caused the error, because some of
them are tied to sysctl knobs (e.g. /proc/sys/net/ipv4/udp_mem, udp_rmem_min
and udp_wmem_min in the UDP case), so we could easily tune the kernel's
behavior. We also need that information to show our customers the root
cause of a packet drop.
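Roughly, the decision logic in __sk_mem_schedule() looks like this (a
simplified sketch of the UDP case from my reading of net/core/sock.c, not
a verbatim copy; a few branches are elided):

int __sk_mem_schedule(struct sock *sk, int size, int kind)
{
        struct proto *prot = sk->sk_prot;
        int amt = sk_mem_pages(size);
        long allocated;

        sk->sk_forward_alloc += amt * SK_MEM_QUANTUM;
        allocated = atomic_long_add_return(amt, prot->memory_allocated);

        if (allocated <= prot->sysctl_mem[0])   /* below udp_mem[0] */
                return 1;                       /* accept */

        if (allocated > prot->sysctl_mem[1])    /* above udp_mem[1] */
                if (prot->enter_memory_pressure)
                        prot->enter_memory_pressure(sk);

        if (allocated > prot->sysctl_mem[2])    /* above udp_mem[2] */
                goto suppress_allocation;       /* may drop */

        /* under pressure, but this socket is still below its minimum */
        if (kind == SK_MEM_RECV) {
                if (atomic_read(&sk->sk_rmem_alloc) < prot->sysctl_rmem[0])
                        return 1;               /* udp_rmem_min */
        } else {
                if (atomic_read(&sk->sk_wmem_alloc) < prot->sysctl_wmem[0])
                        return 1;               /* udp_wmem_min */
        }

        /* (a couple of further global checks elided) */

suppress_allocation:
        /* undo the charge and tell the caller to drop */
        sk->sk_forward_alloc -= amt * SK_MEM_QUANTUM;
        atomic_long_sub(amt, prot->memory_allocated);
        return 0;
}

Each early return/goto above is a different reason, and only
__sk_mem_schedule() itself knows which one fired. That is exactly the
information I'd like the tracepoint to record.
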
Does it make sense?
> I still think it's an interesting tracepoint, just because it might be nice to
> know which sockets are expanding their snd/rcv buffer space, but why not modify
> the tracepoint so that it accepts the return code of __sk_mem_schedule and call
> it from both sk_rmem_schedule and sk_wmem_schedule. That way you can use the
> tracepoint to record both successful and failed expansions.

It may be a good idea, but as I mentioned above it is not enough for me
on its own.
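If I understand your suggestion correctly, it would be something like the
following (just my guess at the shape; the name and the fields are
illustrative, not from any actual patch), called from
sk_rmem_schedule()/sk_wmem_schedule() with the return value of
__sk_mem_schedule():

TRACE_EVENT(sk_mem_schedule,

        TP_PROTO(struct sock *sk, int size, int kind, int ret),

        TP_ARGS(sk, size, kind, ret),

        TP_STRUCT__entry(
                __field(void *, sk)
                __field(int, size)
                __field(int, kind)      /* SK_MEM_RECV or SK_MEM_SEND */
                __field(int, ret)       /* 1 = charged, 0 = suppressed */
        ),

        TP_fast_assign(
                __entry->sk     = sk;
                __entry->size   = size;
                __entry->kind   = kind;
                __entry->ret    = ret;
        ),

        TP_printk("sk=%p size=%d kind=%d ret=%d",
                  __entry->sk, __entry->size, __entry->kind, __entry->ret)
);

That would tell us a charge was refused, but it still wouldn't tell us which
condition inside __sk_mem_schedule() refused it, which is what I need in
order to tune the sysctls.
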
Regards,
Satoru