Date: Thu, 6 Jul 2023 10:49:44 -0700
From: Ivan Babrou <ivan@...udflare.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	kernel-team@...udflare.com, 
	Willem de Bruijn <willemdebruijn.kernel@...il.com>, "David S. Miller" <davem@...emloft.net>, 
	David Ahern <dsahern@...nel.org>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Neil Horman <nhorman@...driver.com>, 
	Satoru Moriya <satoru.moriya@....com>
Subject: Re: [PATCH] udp6: add a missing call into udp_fail_queue_rcv_skb tracepoint

On Thu, Jul 6, 2023 at 10:39 AM Paolo Abeni <pabeni@...hat.com> wrote:
>
> Hi,
>
> On Thu, 2023-07-06 at 10:22 -0700, Ivan Babrou wrote:
> > The tracepoint has existed for 12 years, but it only covered udp
> > over the legacy IPv4 protocol. Having it enabled for udp6 removes
> > the unnecessary difference in error visibility.
> >
> > Signed-off-by: Ivan Babrou <ivan@...udflare.com>
> > Fixes: 296f7ea75b45 ("udp: add tracepoints for queueing skb to rcvbuf")
> > ---
> >  net/ipv6/udp.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index e5a337e6b970..debb98fb23c0 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -45,6 +45,7 @@
> >  #include <net/tcp_states.h>
> >  #include <net/ip6_checksum.h>
> >  #include <net/ip6_tunnel.h>
> > +#include <trace/events/udp.h>
> >  #include <net/xfrm.h>
> >  #include <net/inet_hashtables.h>
> >  #include <net/inet6_hashtables.h>
> > @@ -680,6 +681,7 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> >               }
> >               UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
> >               kfree_skb_reason(skb, drop_reason);
> > +             trace_udp_fail_queue_rcv_skb(rc, sk);
> >               return -1;
> >       }
>
> The patch looks correct and consistency is a nice thing, but I'm
> wondering if we should instead remove the tracepoint from the UDP v4
> code? We already have drop reason and MIBs to pin-point quite
> accurately UDP drops, and the trace point does not cover a few UDPv4
> spots (e.g. mcast). WDYT?

We are using this tracepoint in production monitoring:

* https://github.com/cloudflare/ebpf_exporter/blob/master/examples/udp-drops.bpf.c

It gives us a metric labeled with a port, and through internal port
ownership data we can automatically notify the responsible people to
address the issue. This is not possible with MIBs, as they lack the
port information.
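
To illustrate the per-port idea only, here is a minimal user-space sketch
that tallies drops by port from textual trace output. It is not how the
exporter linked above works (that attaches an eBPF program to the
tracepoint in the kernel). The "rc=<int> port=<u16>" event text is an
assumption about how the tracepoint prints its fields, and the names
count_drop and drops_by_port are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 65536

/* Per-port drop counters, indexed by local UDP port (hypothetical name). */
static unsigned long drops_by_port[MAX_PORTS];

/*
 * Parse one line of trace output and bump the counter for its port.
 * Assumption: the event text contains "rc=<int> port=<u16>".
 * Returns the port on success, or -1 if the line does not match.
 */
static int count_drop(const char *line)
{
	const char *p = strstr(line, "rc=");
	int rc;
	unsigned int port;

	if (!p)
		return -1;
	if (sscanf(p, "rc=%d port=%u", &rc, &port) != 2)
		return -1;
	if (port >= MAX_PORTS)
		return -1;

	drops_by_port[port]++;
	return (int)port;
}
```

Feeding each trace line to count_drop() leaves a table that can be
joined against port ownership, which is the piece the MIB counters
cannot provide.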

As for kfree_skb, it fires at a much higher frequency (literally
infinitely more frequent in a happy state):

$ sudo perf stat -a -e skb:kfree_skb,udp:udp_fail_queue_rcv_skb -- sleep 10
            70,546      skb:kfree_skb
                 0      udp:udp_fail_queue_rcv_skb

It would be a lot more expensive to use kfree_skb to drive the metric
we have today. It would be even more expensive for machines that have
high bandwidth traffic, since they would see a lot more skbs (the one
above is not that busy).

As a matter of fact, I have a local patch to introduce a tracepoint
for tcp listen drops with similar reasoning, waiting for net-next
to open.
