Message-ID: <CAJ8uoz2_nvDd+n_YfZZyd1m6xByQ6wo_D2HKSPRVi061+2M1RQ@mail.gmail.com>
Date: Sun, 25 Apr 2021 11:45:16 +0200
From: Magnus Karlsson <magnus.karlsson@...il.com>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: Jesper Dangaard Brouer <brouer@...hat.com>,
Lorenzo Bianconi <lorenzo@...nel.org>,
bpf <bpf@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>,
David Ahern <dsahern@...nel.org>,
Eelco Chaudron <echaudro@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Saeed Mahameed <saeed@...nel.org>,
"Fijalkowski, Maciej" <maciej.fijalkowski@...el.com>,
Tirthendu <tirthendu.sarkar@...el.com>
Subject: Re: Crash for i40e on net-next (was: [PATCH v8 bpf-next 00/14]
mvneta: introduce XDP multi-buffer support)
On Fri, Apr 23, 2021 at 6:43 PM Alexander Duyck
<alexander.duyck@...il.com> wrote:
>
> On Thu, Apr 22, 2021 at 10:28 PM Magnus Karlsson
> <magnus.karlsson@...il.com> wrote:
> >
> > On Thu, Apr 22, 2021 at 5:05 PM Jesper Dangaard Brouer
> > <brouer@...hat.com> wrote:
> > >
> > > On Thu, 22 Apr 2021 16:42:23 +0200
> > > Jesper Dangaard Brouer <brouer@...hat.com> wrote:
> > >
> > > > On Thu, 22 Apr 2021 12:24:32 +0200
> > > > Magnus Karlsson <magnus.karlsson@...il.com> wrote:
> > > >
> > > > > On Wed, Apr 21, 2021 at 5:39 PM Jesper Dangaard Brouer
> > > > > <brouer@...hat.com> wrote:
> > > > > >
> > > > > > On Wed, 21 Apr 2021 16:12:32 +0200
> > > > > > Magnus Karlsson <magnus.karlsson@...il.com> wrote:
> > > > > >
> > > > [...]
> > > > > > > more than I get.
> > > > > >
> > > > > > I clearly have a bug in the i40e driver. As I wrote later, I don't see
> > > > > > > any packets transmitted for XDP_TX. Hmm, I am using Mel Gorman's tree,
> > > > > > which contains the i40e/ice/ixgbe bug we fixed earlier.
> > > >
> > > > Something is wrong with i40e, I changed git-tree to net-next (at
> > > > commit 5d869070569a) and XDP seems to have stopped working on i40e :-(
> >
> > Found this out too when switching to the net tree yesterday to work on
> > proper packet drop tracing as you spotted/requested yesterday. The
> > commit below completely broke XDP support on i40e (if you do not run
> > with a zero-copy AF_XDP socket because that path still works). I am
> > working on a fix that does not just revert the patch, but fixes the
> > original problem without breaking XDP. Will post it and the tracing
> > fixes as soon as I can.
> >
> > commit 12738ac4754ec92a6a45bf3677d8da780a1412b3
> > Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@...el.com>
> > Date: Fri Mar 26 19:43:40 2021 +0100
> >
> > i40e: Fix sparse errors in i40e_txrx.c
> >
> > Remove error handling through pointers. Instead use plain int
> > to return value from i40e_run_xdp(...).
> >
> > Previously:
> > - sparse errors were produced during compilation:
> > i40e_txrx.c:2338 i40e_run_xdp() error: (-2147483647) too low for ERR_PTR
> > i40e_txrx.c:2558 i40e_clean_rx_irq() error: 'skb' dereferencing
> > possible ERR_PTR()
> >
> > - sk_buff* was used to return value, but it has never had valid
> > pointer to sk_buff. Returned value was always int handled as
> > a pointer.
> >
> > Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
> > Fixes: 2e6893123830 ("i40e: split XDP_TX tail and XDP_REDIRECT map
> > flushing")
> > Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@...el.com>
> > Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@...el.com>
> > Tested-by: Dave Switzer <david.switzer@...el.com>
> > Signed-off-by: Tony Nguyen <anthony.l.nguyen@...el.com>
>
> Yeah, this patch would horribly break things, especially in the
> multi-buffer case. The idea behind using the skb pointer to indicate
> the error is that it is persistent until we hit the EOP descriptor.
> With that removed you end up mangling the entire list of frames since
> it will start trying to process the next frame in the middle of a
> packet.
>
> >
> > > Renamed subj as this is without this patchset applied.
> > >
> > > > $ uname -a
> > > > Linux broadwell 5.12.0-rc7-net-next+ #600 SMP PREEMPT Thu Apr 22 15:13:15 CEST 2021 x86_64 x86_64 x86_64 GNU/Linux
> > > >
> > > > When I load any XDP prog almost no packets are let through:
> > > >
> > > > [kernel-bpf-samples]$ sudo ./xdp1 i40e2
> > > > libbpf: elf: skipping unrecognized data section(16) .eh_frame
> > > > libbpf: elf: skipping relo section(17) .rel.eh_frame for section(16) .eh_frame
> > > > proto 17: 1 pkt/s
> > > > proto 0: 0 pkt/s
> > > > proto 17: 0 pkt/s
> > > > proto 0: 0 pkt/s
> > > > proto 17: 1 pkt/s
> > >
> > > Trying out xdp_redirect:
> > >
> > > [kernel-bpf-samples]$ sudo ./xdp_redirect i40e2 i40e2
> > > input: 7 output: 7
> > > libbpf: elf: skipping unrecognized data section(20) .eh_frame
> > > libbpf: elf: skipping relo section(21) .rel.eh_frame for section(20) .eh_frame
> > > libbpf: Kernel error message: XDP program already attached
> > > WARN: link set xdp fd failed on 7
> > > ifindex 7: 7357 pkt/s
> > > ifindex 7: 7909 pkt/s
> > > ifindex 7: 7909 pkt/s
> > > ifindex 7: 7909 pkt/s
> > > ifindex 7: 7909 pkt/s
> > > ifindex 7: 7909 pkt/s
> > > ifindex 7: 6357 pkt/s
> > >
> > > > And then it crashes (see below) at page_frag_free+0x31 which calls
> > > virt_to_head_page() with a wrong addr (I guess). This is called by
> > > i40e_clean_tx_irq+0xc9.
> >
> > Did not see a crash myself, just 4 Kpps. But the rings and DMA
> > mappings got completely mangled by the patch above, so could be the
> > same cause.
>
> Are you running with jumbo frames enabled? I would think this change
> would really blow things up in the jumbo enabled case.
I did not. Just using XDP_DROP or XDP_TX would crash the system just fine.
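
To illustrate the point quoted above about the skb pointer being persistent
until the EOP descriptor, here is a minimal user-space sketch. It is not the
i40e code: struct desc, struct frame, XDP_CONSUMED and count_xdp_runs() are
hypothetical names invented for this example, and it assumes the XDP program
consumes every frame (as XDP_DROP would). It only models why a verdict kept in
a pointer that survives across descriptors behaves differently from a verdict
returned as a plain int and then forgotten.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch, not the driver's real types. */
struct desc { bool eop; };   /* one Rx descriptor; eop marks the last buffer */
struct frame;                /* stand-in for the frame/skb being assembled */

/* Simplified ERR_PTR-style sentinel: non-NULL, but not a real frame. */
#define XDP_CONSUMED ((struct frame *)(uintptr_t)-1)

/*
 * Count how many times a descriptor is treated as the start of a frame
 * (which is where the XDP program would run).
 * persistent == true : the verdict is stored in the per-ring pointer and
 *                      stays there until the EOP descriptor.
 * persistent == false: the verdict is a plain int that is not stored, so
 *                      the per-ring pointer stays NULL.
 */
int count_xdp_runs(const struct desc *ring, size_t n, bool persistent)
{
	struct frame *skb = NULL;  /* state carried across descriptors */
	int runs = 0;

	for (size_t i = 0; i < n; i++) {
		if (!skb) {
			/* No frame in progress: treat this buffer as a frame start. */
			runs++;
			if (persistent)
				skb = XDP_CONSUMED;  /* remember the verdict */
		}
		if (ring[i].eop)
			skb = NULL;  /* frame done, next descriptor starts a new one */
	}
	return runs;
}
```

With a ring holding one three-buffer frame followed by a one-buffer frame
(eop = 0, 0, 1, 1), the persistent variant treats two descriptors as frame
starts, one per frame. The non-persistent variant treats all four as frame
starts, i.e. it begins "a new frame" in the middle of the first packet.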