netdev - Re: [BUG] Kernel recieves DNS reply, but doesn't deliver it to a waiting application

Message-Id: <20121022073636.98462bc6.bircoph@gmail.com>
Date:	Mon, 22 Oct 2012 07:36:36 +0400
From:	Andrew Savchenko <bircoph@...il.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	netdev@...r.kernel.org
Subject: Re: [BUG] Kernel recieves DNS reply, but doesn't deliver it to a
 waiting application

Hello,

On Sun, 21 Oct 2012 14:52:21 +0200 Eric Dumazet wrote:
> On Sun, 2012-10-21 at 03:25 +0400, Andrew Savchenko wrote:
[...]
> > This bug is back again on kernel 3.4.14, but this time I was able to
> > get debug data and to recover running kernel without reboot.
> > 
> > Dropwatch showed that DNS UDP replies are always dropped here:
> > 1 drops at __udp_queue_rcv_skb+61 (0xffffffff813bd670)
> > 
> > Other observations:
> > - only UDP replies are lost, TCP works fine;
> > - if network load is dropped dramatically (ip_forward disabled, most
> > network daemons stopped), UDP DNS queries work again; but as the load
> > gradually increases, replies first become slow and then cease altogether;
> > - CPU load is very low (load average from uptime is below 0.05), so
> > this shouldn't be an insufficient computing power issue.
> > 
> > I found the __udp_queue_rcv_skb function in net/ipv4/udp.c. From the
> > code and the observations above it follows that this is likely an
> > ENOMEM condition leading to packet loss.
> > 
> > This is the memory data after the bug happened:
> > # cat /proc/meminfo
> > MemTotal:        1021576 kB
> > MemFree:           32056 kB
> > Buffers:          105204 kB
> > Cached:           646716 kB
> > SwapCached:          236 kB
> > Active:           205932 kB
> > Inactive:         587156 kB
> > Active(anon):      20636 kB
> > Inactive(anon):    22488 kB
> > Active(file):     185296 kB
> > Inactive(file):   564668 kB
> > Unevictable:        2152 kB
> > Mlocked:            2152 kB
> > SwapTotal:        995992 kB
> > SwapFree:         995020 kB
> > Dirty:                 0 kB
> > Writeback:             0 kB
> > AnonPages:         43120 kB
> > Mapped:             7504 kB
> > Shmem:               148 kB
> > Slab:             176004 kB
> > SReclaimable:     118636 kB
> > SUnreclaim:        57368 kB
> > KernelStack:         688 kB
> > PageTables:         2948 kB
> > NFS_Unstable:          0 kB
> > Bounce:                0 kB
> > WritebackTmp:          0 kB
> > CommitLimit:     1506780 kB
> > Committed_AS:      62708 kB
> > VmallocTotal:   34359738367 kB
> > VmallocUsed:      262732 kB
> > VmallocChunk:   34359474615 kB
> > AnonHugePages:         0 kB
> > DirectMap4k:       33536 kB
> > DirectMap2M:     1013760 kB
> > 
> > # sysctl -a | grep mem
> > net.core.optmem_max = 20480
> > net.core.rmem_default = 229376
> > net.core.rmem_max = 131071
> > net.core.wmem_default = 229376
> > net.core.wmem_max = 131071
> > net.ipv4.igmp_max_memberships = 20
> > net.ipv4.tcp_mem = 22350        29801   44700
> > net.ipv4.tcp_rmem = 4096        87380   6291456
> > net.ipv4.tcp_wmem = 4096        16384   4194304
> > net.ipv4.udp_mem = 24150        32202   48300
> > net.ipv4.udp_rmem_min = 4096
> > net.ipv4.udp_wmem_min = 4096
> > vm.lowmem_reserve_ratio = 256   256     32
> > vm.overcommit_memory = 0
> > 
> > Sysctl memory parameters are system defaults, I haven't changed them
> > via sysctl or /proc interfaces.
> > 
> > I tried to increase udp_mem values to the following:
> > net.ipv4.udp_mem = 100000       150000  200000
> > 
> > This solved my issue, at least for a while: DNS queries are working
> > fine now.
> > 
> > But I suspect that there is some memory loss in the kernel UDP stack,
> > because this issue never happens after reboot and always after about
> > a week of network operation. So this memory increase should help only
> > for a month or so, if memory loss is linear.
> > 
> > If you need some memory debug information, let me know which one and
> > what tools will be needed.
> 
> If the drop is in __udp_queue_rcv_skb(), it's not because you don't have
> enough memory. The frame was already received and handled by the IP stack.
> 
> That's because sock_queue_rcv_skb() said: there are already enough
> frames in the socket receive buffer, I don't want to add another frame.

Actually there was a lot of dropwatch output. I concluded that the
drops are in __udp_queue_rcv_skb() by comparing the dropwatch outputs
under normal operation and under the bug condition. I'm attaching both
of them.
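
For illustration, the check Eric describes above can be sketched as a
toy model. This is not kernel code: the class, its field names and the
byte values are hypothetical, and the exact comparison and memory
accounting in the kernel may differ between versions.

```python
from dataclasses import dataclass


@dataclass
class SocketModel:
    """Toy model of a socket receive queue; an illustration, not kernel code."""
    rcvbuf: int          # receive buffer limit in bytes (plays the role of sk_rcvbuf)
    rmem_alloc: int = 0  # bytes currently charged to the receive queue
    drops: int = 0       # frames rejected because the queue was already full

    def queue_rcv(self, truesize: int) -> bool:
        """Accept the frame if it still fits under rcvbuf, else count a drop."""
        if self.rmem_alloc + truesize > self.rcvbuf:
            self.drops += 1
            return False
        self.rmem_alloc += truesize
        return True

    def consume(self, truesize: int) -> None:
        """The application read one frame, releasing its share of the buffer."""
        self.rmem_alloc -= truesize


# Hypothetical numbers: a 4096-byte limit and frames charged at 1024 bytes each.
sock = SocketModel(rcvbuf=4096)
results = [sock.queue_rcv(1024) for _ in range(5)]  # the application never reads
print(results, sock.drops)
```

In this model the fifth frame is refused although the host as a whole
has plenty of free memory, which is the distinction Eric draws between
a full socket queue and a system-wide ENOMEM.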

> You forgot to attach :
> 
> cat /proc/net/udp
> netstat -s

Now the bug is perfectly reproducible using the net.ipv4.udp_mem values
described in my previous mail. I reproduced the bug conditions and am
attaching these outputs.
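
To put the udp_mem values in perspective, here is a small conversion
sketch. It assumes that udp_mem is counted in pages (as the kernel's
ip-sysctl documentation describes it) and that the page size on this
host is 4 KiB; both the helper name and the page size are assumptions
of this example.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages on this host


def udp_mem_to_kib(pages, page_kib=PAGE_KIB):
    """Convert a (min, pressure, max) udp_mem triple from pages to KiB."""
    return tuple(p * page_kib for p in pages)


default_kib = udp_mem_to_kib((24150, 32202, 48300))    # system defaults
raised_kib = udp_mem_to_kib((100000, 150000, 200000))  # values set by hand

mem_total_kib = 1021576  # MemTotal from /proc/meminfo
raised_max_fraction = raised_kib[2] / mem_total_kib

print(default_kib)  # (96600, 128808, 193200)
print(raised_kib)   # (400000, 600000, 800000)
```

Under these assumptions the raised maximum corresponds to more than
three quarters of the RAM installed in this host.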

> By the way, I suspect you are hit by skb recycling.
> (skb truesize is too big after few iterations)
> 
> We removed skb recycling in linux-3.7-rc1
> 
> If so, linux-3.7-rc1 or linux-3.7-rc2 should be fine.

I'll try the 3.7 branch in a week or so. The new kernel will require
some reconfiguration.
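
As a rough illustration of why an inflated truesize would matter, the
arithmetic below counts how many frames fit into one receive buffer.
It assumes the socket uses net.core.rmem_default from the sysctl output
above; both truesize values are hypothetical and were not measured on
this host.

```python
RCVBUF = 229376  # net.core.rmem_default from the sysctl output above


def frames_that_fit(rcvbuf: int, truesize: int) -> int:
    """Number of frames of a given truesize that fit into the receive buffer."""
    return rcvbuf // truesize


# Hypothetical truesize values, for illustration only.
normal = frames_that_fit(RCVBUF, 768)     # small DNS reply in a small buffer
inflated = frames_that_fit(RCVBUF, 8192)  # same reply charged as a large buffer

print(normal, inflated)  # 298 28
```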

> What NIC are you using ?

This host has four NICs:
2x Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (10ec:8139)
2x D-Link System Inc DGE-528T Gigabit Ethernet Adapter (1462:223c)

One D-Link card operates at 1Gbit/FD with mtu 7000; the other cards
run at 100Mbit/FD with mtu 1500.

Two D-Link cards and one Realtek card form a bridge, and the remaining
Realtek card is used for the uplink. This host serves as a NAT between
the bridge and the uplink (both MASQUERADE and DNAT are used). It also
has several ipsec tunnels to multiple hosts (mostly AH), an l2tp tunnel
(independent of ipsec), and serves as a multicast router using
mrouted. A rather sophisticated ebtables, iptables and ipset setup is
used.

Best regards,
Andrew Savchenko

Download attachment "dropwatch.bug" of type "application/octet-stream" (6228 bytes)

Download attachment "dropwatch.normal" of type "application/octet-stream" (3613 bytes)

Download attachment "netstat-s" of type "application/octet-stream" (3429 bytes)

Download attachment "proc_net_udp" of type "application/octet-stream" (4073 bytes)
