Message-ID: <27467ed4-0520-8642-f4c7-6f4aeb54ef2a@pm.me>
Date:   Wed, 09 Oct 2019 17:05:02 +0000
From:   Nate Sweet <nathanjsweet@...me>
To:     netdev@...r.kernel.org
Subject: UDP Statistics Bug?

Hey net devs,

I would like some clarity on a problem I ran into last week. I was 
diagnosing a DNS issue and got very sidetracked by how netstat reported 
stats to me. My issue was that UDP packets were being dropped by all UDP 
sockets on the host, so when I ran `netstat -naus` and it informed me 
that UdpInErrors 
(https://elixir.bootlin.com/linux/v5.4-rc2/source/include/uapi/linux/snmp.h#L156) 
was my main problem, I spent a day trying to figure out what 
application/mechanism was dropping UDP packets on the host. My 
suspicion, based on the statistic I was seeing, was that it was going to 
be something like BPF or a security module. To be fair to me, these two 
mechanisms do indeed report their drops within this statistic 
(https://elixir.bootlin.com/linux/v5.4-rc2/source/net/ipv4/udp.c#L2051). 
Imagine my surprise when I discovered that the error that was actually 
happening was that the global UDP socket min was being reached, and all 
the host UDP sockets were, indeed, experiencing buffer errors. The 
problem is that within the regular UDP socket datapath 
`UDP_MIB_RCVBUFERRORS` only seems to be incremented here 
(https://elixir.bootlin.com/linux/v5.4-rc2/source/net/ipv4/udp.c#L1945) 
when the error is "ENOMEM". However, when `__sk_mem_raise_allocated` 
fails 
(https://elixir.bootlin.com/linux/v5.4-rc2/source/net/ipv4/udp.c#L1455) 
it reports "ENOBUFS". The issue ended up being an application that was 
not processing its backlog, because it wasn't closing old UDP sockets. 
IMO, I would have gotten to this diagnosis quicker if, when I ran 
`netstat -naus`, I had gotten UdpRcvBuffErrors (`UDP_MIB_RCVBUFERRORS`) 
instead of UdpInErrors. I realize that it is too late to change this 
error reporting now, because it would break user space, but I think a 
new counter could be added to the kernel for UDP, such as 
UdpRcvBuffGlobalErrors, or something like that, which could be double 
reported alongside UdpInErrors. I think this would be a real time saver 
for folks, because reporting this case only as UdpInErrors is really 
counterintuitive.
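
For anyone who wants to see the gap between the two counters directly, 
here is a minimal sketch in Python that parses the `Udp:` header/value 
line pair from `/proc/net/snmp`-style text and reports how many InErrors 
are not also counted as RcvbufErrors. The field names `InErrors` and 
`RcvbufErrors` are the `/proc/net/snmp` spellings as I understand them, 
the function names are hypothetical, and the sample numbers are made up 
purely for illustration. It also assumes, per the v5.4-rc2 code linked 
above, that every drop counted in RcvbufErrors is counted in InErrors 
as well.

```python
# Illustrative sample only: these numbers are invented, not from a real host.
SAMPLE = (
    "Udp: InDatagrams NoPorts InErrors OutDatagrams "
    "RcvbufErrors SndbufErrors InCsumErrors IgnoredMulti\n"
    "Udp: 1000 5 120 900 20 0 0 0\n"
)


def parse_udp_counters(snmp_text):
    """Return the Udp counters from /proc/net/snmp-style text as a dict."""
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp header/value line pair found")
    names = udp_lines[0][1:]    # first "Udp:" line holds the field names
    values = udp_lines[1][1:]   # second "Udp:" line holds the counters
    return dict(zip(names, (int(v) for v in values)))


def in_errors_without_rcvbuf(counters):
    """InErrors that were not also reported as RcvbufErrors (assumption:
    RcvbufErrors is a subset of InErrors)."""
    return counters["InErrors"] - counters.get("RcvbufErrors", 0)


if __name__ == "__main__":
    counters = parse_udp_counters(SAMPLE)
    print("InErrors:", counters["InErrors"])
    print("RcvbufErrors:", counters["RcvbufErrors"])
    print("InErrors without RcvbufErrors:", in_errors_without_rcvbuf(counters))
```

On a real host the same function could be fed the contents of 
`/proc/net/snmp` instead of `SAMPLE`. In my case a large difference 
between the two numbers would have been the hint that the drops were 
not being attributed to receive-buffer errors.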

Thanks,

Nate Sweet


