lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110615200405.GB23380@hmsreliant.think-freely.org>
Date:	Wed, 15 Jun 2011 16:04:05 -0400
From:	Neil Horman <nhorman@...driver.com>
To:	Satoru Moriya <satoru.moriya@....com>
Cc:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"dle-develop@...ts.sourceforge.net" 
	<dle-develop@...ts.sourceforge.net>,
	Seiji Aguchi <seiji.aguchi@....com>
Subject: Re: [RFC][PATCH] add tracepoint to __sk_mem_schedule

On Wed, Jun 15, 2011 at 03:18:09PM -0400, Satoru Moriya wrote:
> Hi Neil,
> 
> Thank you for looking at it.
> I'm afraid that my explanation was not enough.
> 
> On 06/15/2011 07:07 AM, Neil Horman wrote:
> > On Tue, Jun 14, 2011 at 03:24:14PM -0400, Satoru Moriya wrote:
> >
> > There are several ways to do this already.  Every drop that occurs in the stack
> > should have a corresponding statistical counter exposed for it, and we also have
> > a tracepoint in kfree_skb that the dropwatch system monitors to inform us of
> > dropped packets in a certralized fashion.  Not saying this tracepoint isn't
> > worthwhile, only that it covers already covered ground.
> 
> Right. We can catch packet drop events via dropwatch/kfree_skb tracepoint.
> OTOH, what I'd like to know is why kernel dropped packets. Basically,
> we're able to know the reason because kfree_skb tracepoint records its
> return address. But in this case it is difficult to know a detailed reason
> because there are some possibilities. I'll try to explain UDP case.
Ok, but you're patch only gets us incrmentally closer to that goal.  You can
tell if theres too much memory pressue to accept another packet on a socket, but
it doesn't tell you if for instance if sk_filter indicated an error, or
something else failed.  In short it sounds to me like you're trying to debug a
specific problem, rather than add something generally useful.

> 
> > Again, this is why dropwatch exists.  UDP gets into this path from:
> > __udp_queue_rcv_skb
> >  ip_queue_rcv_skb
> >   sock_queue_rcv_skb
> >    sk_rmem_schedule
> >     __sk_mem_schedule
> > 
> > If ip_queue_rcv_skb fails we increment the UDP_MIB_RCVBUFERRORS counter as well
> > as the UDP_MIB_INERRORS counter, and on the kfree_skb call after those
> > increments, dropwatch will report the frame loss and the fact that it occured in
> > __udp_queue_rcv_skb
> 
> As you said, we're able to catch the packet drop in __udp_queue_rcv_skb
> and it means ip_queue_rcv_skb/sock_queue_rcv_skb returned negative value.
> In sock_queue_rcv_skb there are 3 possibilities where it returns nagative
> val but we can't separate them from the kfree_skb tracepoint. Moreover
This is true, but you're patch doesn't do that either (not that either feature
holds that as a specific goal).

> sock_queue_rcv_skb calls __sk_mem_schedule and there are several
> if-statement to decide whether kernel should drop a packet in it. I'd like
> to know the condition when it returned error because some of them are
> related to sysctl knob e.g. /proc/sys/net/ipv4/udp_mem, udp_rmem_min,
> udp_wmem_min for UDP case and we can tune kernel behavior easily.
> Also they are needed to show root cause of packet drop to our customers.
> 
> Does it make sense?
> 
Yes, it makes sense, but I think the patch could be made generally more useful.

> > I still think its an interesting tracepoint, just because it might be nice to
> > know which sockets are expanding their snd/rcv buffer space, but why not modify
> > the tracepoint so that it accepts the return code of __sk_mem_schedule and call
> > it from both sk_rmem_schedule and sk_wmem_schedule.   That way you can use the
> > tracepoint to record both successfull expansion and failed expansions.
> 
> It may be good but not enough for me as I mentioned above.
> 
> 
> Regards,
> Satoru--
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ