Message-ID: <6c6eee2832c658d689895aa9585fd30f54ab3ed9.camel@redhat.com>
Date: Fri, 02 Jul 2021 16:06:24 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Matthias Treydte <mt@...dheinz.de>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: David Ahern <dsahern@...il.com>, stable@...r.kernel.org,
netdev@...r.kernel.org, regressions@...ts.linux.dev,
davem@...emloft.net, yoshfuji@...ux-ipv6.org, dsahern@...nel.org
Subject: Re: [regression] UDP recv data corruption
Hello,
On Fri, 2021-07-02 at 14:36 +0200, Matthias Treydte wrote:
> And to answer Paolo's questions from his mail to the list (@Paolo: I'm
> not subscribed, please also send to me directly so I don't miss your mail)
(yup, that is what I did ?!?)
> > Could you please:
> > - tell how frequent is the pkt corruption, even a rough estimate of the
> > frequency.
>
> # journalctl --since "5min ago" | grep "Packet corrupt" | wc -l
> 167
>
> So there are 167 detected failures in 5 minutes, while the system is receiving
> at a moderate rate of about 900 pkts/s (according to Prometheus' node exporter
> at least, but seems about right)
Interesting. The relevant UDP GRO features are already off, and this
happens infrequently. Something is happening on a per-packet basis, but
I can't guess what.
It looks like you should be able to collect more info WRT the packet
corruption by enabling debug logging at the ffmpeg level, but I guess
that will flood the logfile.
If you have the kernel debuginfo and the 'perf' tool available, could
you please try:
perf probe -a 'udp_gro_receive sk sk->__sk_common.skc_dport'
perf probe -a 'udp_gro_receive_segment'
# need to wait until at least one pkt corruption happens, 10 seconds
# should be more than enough
perf record -a -e probe:udp_gro_receive -e probe:udp_gro_receive_segment sleep 10
perf script | gzip > perf_script.gz
and share the above? I fear it could be too big for the ML, feel free
to send it directly to me.
> Next I'll try to capture some broken packets and reply in a separate mail,
> I'll have to figure out a good way to do this first.
Looks like there is a corrupted packet every ~2K UDP ones. If you capture
a few thousand consecutive ones, then wireshark should probably help
find the suspicious ones.
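As a rough cross-check of the ~2K figure and of how many packets to
capture, here is a minimal shell sketch. The input numbers come from the
report above (167 failures in 5 minutes at about 900 pkts/s). The
interface name eth0, the capture size of 5000 and the output file name
are assumptions for illustration only, and the capture helper is defined
but not run here.

```shell
# Figures taken from the report above.
failures=167          # "Packet corrupt" log lines in the window
window_s=300          # 5 minutes, in seconds
rate_pps=900          # approximate receive rate, pkts/s

# Average number of received packets per detected corruption
# (integer division, so the result is rounded down).
total_pkts=$(( rate_pps * window_s ))
interval=$(( total_pkts / failures ))

# Assumed capture size; a handful of corrupted packets should land in it.
count=5000
expected=$(( count / interval ))

echo "about 1 corrupted packet every $interval packets"
echo "a capture of $count packets should hold about $expected corrupted ones"

# Hypothetical helper: capture $count UDP packets to a file for wireshark.
# The interface (eth0) and file name are placeholders; needs root to run.
capture_udp() {
    tcpdump -i "${1:-eth0}" -c "$count" -w udp_capture.pcap udp
}
```

Running capture_udp with the receiving interface as its argument would
then write udp_capture.pcap, which can be opened in wireshark.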
Thanks!
Paolo