Message-ID: <a46bb3de011002c2446a6d836aaddc9f6bce71bc.camel@redhat.com>
Date: Thu, 06 Jul 2023 19:10:35 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Ian Kumlien <ian.kumlien@...il.com>
Cc: Eric Dumazet <edumazet@...gle.com>, Willem de Bruijn
<willemb@...gle.com>, Alexander Lobakin <aleksander.lobakin@...el.com>,
intel-wired-lan <intel-wired-lan@...ts.osuosl.org>, Jakub Kicinski
<kuba@...nel.org>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?
On Thu, 2023-07-06 at 18:17 +0200, Ian Kumlien wrote:
> On Thu, Jul 6, 2023 at 4:04 PM Paolo Abeni <pabeni@...hat.com> wrote:
> >
> > On Thu, 2023-07-06 at 15:56 +0200, Eric Dumazet wrote:
> > > On Thu, Jul 6, 2023 at 3:02 PM Paolo Abeni <pabeni@...hat.com> wrote:
> > > >
> > > > On Thu, 2023-07-06 at 13:27 +0200, Ian Kumlien wrote:
> > > > > On Thu, Jul 6, 2023 at 10:42 AM Paolo Abeni <pabeni@...hat.com> wrote:
> > > > > > On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> > > > > > > On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <pabeni@...hat.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > > > > > > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <pabeni@...hat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > > > > > > > More stacktraces.. =)
> > > > > > > > > > >
> > > > > > > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > > > > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > > > > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > > > > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > > > > > > > net/ipv6/udp.c:787)
> > > > > > > > > >
> > > > > > > > > > I'm really running out of ideas here...
> > > > > > > > > >
> > > > > > > > > > This is:
> > > > > > > > > >
> > > > > > > > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > > > > > > > >
> > > > > > > > > > sort of hints at the skb being shared (skb->users > 1) while enqueued in
> > > > > > > > > > multiple places (bridge local input and br forward/flood to tun
> > > > > > > > > > device). I audited the bridge mc flooding code, and I could not find
> > > > > > > > > > how a shared skb could land in the local input path.
> > > > > > > > > >
> > > > > > > > > > Anyway the other splats reported here and in later emails are
> > > > > > > > > > compatible with shared skbs.
> > > > > > > > > >
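For reference, "shared" here simply means that the skb's user refcount is
above one. A minimal sketch of the helper involved (simplified from
include/linux/skbuff.h, not a verbatim copy):

static inline int skb_shared(const struct sk_buff *skb)
{
	/* more than one holder of this sk_buff: the current path must
	 * not modify or free it without taking its own copy first
	 */
	return refcount_read(&skb->users) > 1;
}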
> > > > > > > > > > The above leads to another bunch of questions:
> > > > > > > > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > > > > > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > > > > > > > >
> > > > > > > > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > > > > > > > >
> > > > > > > > > > * do you have by chance qdiscs on top of the VM tun devices?
> > > > > > > > >
> > > > > > > > > default qdisc is fq
> > > > > > > >
> > > > > > > > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > > > > > > > devices.
> > > > > > > >
> > > > > > > > Could you please report the output of:
> > > > > > > >
> > > > > > > > tc -d -s qdisc show dev <tun dev name>
> > > > > > >
> > > > > > > I don't have these set:
> > > > > > > CONFIG_NET_SCH_INGRESS
> > > > > > > CONFIG_NET_SCHED
> > > > > > >
> > > > > > > so tc just gives an error...
> > > > > >
> > > > > > The above is confusing. As CONFIG_NET_SCH_DEFAULT depends on
> > > > > > CONFIG_NET_SCHED, you should not have a default qdisc either ;)
> > > > >
> > > > > Well it's still set in sysctl - dunno if it fails
> > > > >
> > > > > > Could you please share your kernel config?
> > > > >
> > > > > Sure...
> > > > >
> > > > > As a side note, it hasn't crashed - no traces since we did the last change
> > > >
> > > > It sounds like an encouraging sign! (famous last words...). I'll wait 1
> > > > more day, then I'll submit formally...
> > > >
> > > > > For reference, this is the git diff on the running kernel's source tree:
> > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > > > index cea28d30abb5..1b2394ebaf33 100644
> > > > > --- a/net/core/skbuff.c
> > > > > +++ b/net/core/skbuff.c
> > > > > @@ -4270,6 +4270,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > > > >
> > > > > skb_push(skb, -skb_network_offset(skb) + offset);
> > > > >
> > > > > + if (WARN_ON_ONCE(skb_shared(skb))) {
> > > > > + skb = skb_share_check(skb, GFP_ATOMIC);
> > > > > + if (!skb)
> > > > > + goto err_linearize;
> > > > > + }
> > > > > +
> > > > > + /* later code will clear the gso area in the shared info */
> > > > > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > > > > + if (err)
> > > > > + goto err_linearize;
> > > > > +
> > > > > skb_shinfo(skb)->frag_list = NULL;
> > > > >
> > > > > while (list_skb) {
> > > >
> > > > ...the above check only, as the other 2 should only catch side
> > > > effects of the lack of this one. In any case the above addresses a real
> > > > issue, so we likely want it no matter what.
> > > >
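For clarity, the skb_share_check() call in the first hunk roughly behaves
like the sketch below (simplified from include/linux/skbuff.h): if the skb
is shared it swaps in a clone and drops our reference to the original. Note
that the clone still shares the data buffer and skb_shared_info, which is
why the skb_header_unclone() call is needed as well before touching
skb_shinfo():

static inline struct sk_buff *skb_share_check(struct sk_buff *skb, gfp_t pri)
{
	if (skb_shared(skb)) {
		struct sk_buff *nskb = skb_clone(skb, pri);

		if (likely(nskb))
			consume_skb(skb);	/* drop our reference to the shared skb */
		else
			kfree_skb(skb);
		skb = nskb;			/* NULL if the clone allocation failed */
	}
	return skb;
}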
> > >
> > > Interesting, I wonder if this could also fix some syzbot reports
> > > Willem and I are investigating.
> > >
> > > Any idea of when the bug was 'added' or 'revealed'?
> >
> > The issue specifically addressed above should be present since the
> > frag_list introduction, commit 3a1296a38d0c ("net: Support GRO/GSO
> > fraglist chaining."). AFAICS triggering it requires a non-trivial setup -
> > mcast rx on a bridge with frag-list enabled and forwarding to multiple
> > ports - so perhaps syzkaller found it later due to improvements on its
> > side?!?
>
> I'm also a bit afraid that we just haven't triggered it - I don't see
> any warnings or anything... :/
Let me try to clarify: I hope/think that this chunk alone:
+ /* later code will clear the gso area in the shared info */
+ err = skb_header_unclone(skb, GFP_ATOMIC);
+ if (err)
+ goto err_linearize;
+
skb_shinfo(skb)->frag_list = NULL;
while (list_skb) {
does the magic/avoids the skb corruptions -> if everything goes well,
you should not see any warnings at all. Running 'nstat' on the DUT
should give some hints about reaching the relevant code paths.
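
For reference, skb_header_unclone() roughly boils down to the sketch below
(simplified from include/linux/skbuff.h): if the header is cloned,
pskb_expand_head() reallocates the head, so this skb gets a private
skb_shared_info and clearing the frag_list/gso fields no longer corrupts
clones still pointing at the old head:

static inline int skb_header_unclone(struct sk_buff *skb, gfp_t pri)
{
	if (skb_header_cloned(skb))
		/* reallocate the head: gives us a private skb_shared_info */
		return pskb_expand_head(skb, 0, 0, pri);

	return 0;
}

Comparing 'nstat -az' output before and after the test run should be enough
to see which counters move.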
Cheers,
Paolo