Date:	Fri, 23 Nov 2012 11:45:39 +0400
From:	Andrew Savchenko <bircoph@...il.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	netdev@...r.kernel.org
Subject: Re: [BUG] Kernel recieves DNS reply, but doesn't deliver it to a
 waiting application

Hello,

On Sun, 21 Oct 2012 03:25:43 +0400 Andrew Savchenko wrote:
> > On Sat, 13 Oct 2012 15:44:20 +0200 Eric Dumazet wrote:
[...]
> > > You should investigate and check where the incoming packet is lost
> > > 
> > > Tools :
> > > 
> > > netstat -s
> > > 
> > > drop_monitor module and dropwatch command
> > > 
> > > cat /proc/net/udp
> > 
> > Thank you for your reply; I updated my kernel to 3.4.14, enabled
> > CONFIG_NET_DROP_MONITOR, and installed the dropwatch utility.
> > 
> > I will report back when the bug strikes again.
> > This may take a week or two, however.
> 
> This bug is back again on kernel 3.4.14, but this time I was able to
> get debug data and to recover the running kernel without a reboot.
> 
> Dropwatch showed that DNS UDP replies are always dropped here:
> 1 drops at __udp_queue_rcv_skb+61 (0xffffffff813bd670)
> 
> Other observations:
> - only UDP replies are lost, TCP works fine;
> - if network load is reduced dramatically (ip_forward disabled, most
> network daemons stopped), UDP DNS queries work again; but as the load
> gradually increases, replies first become slow and then cease altogether;
> - CPU load is very low (load average reported by uptime is below 0.05),
> so this shouldn't be an insufficient computing power issue.
> 
> I found the __udp_queue_rcv_skb function in net/ipv4/udp.c. From the code
> and the observations above it follows that this is likely an ENOMEM
> condition leading to packet loss.
[...]
> net.ipv4.udp_mem = 100000       150000  200000
> 
> This solved my issue, at least for a while: DNS queries are working
> fine now.

And this solved the problem only temporarily: after 40 days of uptime the
same problem struck again with the same symptoms. I "solved" this
by increasing the UDP memory limits again:

net.ipv4.udp_mem = 200000  300000  400000
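
For scale: the three udp_mem values (min, pressure, max) are counted in
memory pages, not bytes. Below is a minimal sketch that converts them to
MiB. It assumes a 4096-byte page size (typical for x86_64, but an
assumption here; `getconf PAGESIZE` gives the real value), and the helper
name udp_mem_to_mib is hypothetical.

```python
# Assumption: 4096-byte pages; check with `getconf PAGESIZE` on the host.
PAGE_SIZE = 4096


def udp_mem_to_mib(setting, page_size=PAGE_SIZE):
    """Convert a 'min pressure max' udp_mem string (in pages) to MiB."""
    pages = [int(field) for field in setting.split()]
    if len(pages) != 3:
        raise ValueError("udp_mem expects three values: min pressure max")
    return [p * page_size / (1024 * 1024) for p in pages]


if __name__ == "__main__":
    # The two settings used in this thread.
    for setting in ("100000 150000 200000", "200000 300000 400000"):
        low, pressure, high = udp_mem_to_mib(setting)
        print(f"{setting}: min={low:.0f} MiB, "
              f"pressure={pressure:.0f} MiB, max={high:.0f} MiB")
```

Under that page-size assumption, the new hard limit of 400000 pages is
roughly 1.5 GiB for all UDP sockets together.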

Of course, this is only a temporary workaround. Such behaviour
strengthens my suspicion that there is some kind of memory leak.
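
One way to check the leak suspicion would be to log the UDP "mem" counter
from /proc/net/sockstat (pages currently accounted to UDP sockets) and see
whether it creeps toward the udp_mem limit while the number of sockets in
use stays flat. A minimal sketch follows; the function names are
hypothetical, and it assumes the usual "UDP: inuse N mem M" line format.

```python
import os
import time


def parse_udp_sockstat(text):
    """Return (inuse, mem_pages) from the 'UDP:' line of sockstat text."""
    for line in text.splitlines():
        # 'UDPLITE:' does not match because of the colon.
        if line.startswith("UDP:"):
            fields = line.split()[1:]
            # Fields alternate between a name and an integer value.
            stats = dict(zip(fields[0::2], map(int, fields[1::2])))
            return stats["inuse"], stats["mem"]
    raise ValueError("no UDP line found")


def read_udp_sockstat(path="/proc/net/sockstat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_udp_sockstat(f.read())


if __name__ == "__main__":
    # Takes one sample; run it periodically (e.g. from cron) to see a trend.
    if os.path.exists("/proc/net/sockstat"):
        inuse, mem = read_udp_sockstat()
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} UDP sockets in use: {inuse}, pages: {mem}")
```

The "mem" value is in pages, so it compares directly with the udp_mem
thresholds.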

This host is still on 3.4.14, however: I can't reboot it now due to
workload. I will try the 3.7 branch as soon as this is possible.

Best regards,
Andrew Savchenko
