[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMArcTXMMy8p1QiPho7ckAvMLL+DH5oO3nTnFGbOAK8XMBqr2g@mail.gmail.com>
Date: Wed, 19 Jun 2024 16:34:00 +0900
From: Taehee Yoo <ap420073@...il.com>
To: "Nelson, Shannon" <shannon.nelson@....com>
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
edumazet@...gle.com, brett.creeley@....com, drivers@...sando.io,
netdev@...r.kernel.org, jacob.e.keller@...el.com
Subject: Re: [PATCH net] ionic: fix kernel panic due to multi-buffer handling
On Wed, Jun 19, 2024 at 2:40 AM Nelson, Shannon <shannon.nelson@....com> wrote:
Hi Nelson, Shannon,
Thank you for the review!
>
> On 6/17/2024 9:14 PM, Taehee Yoo wrote:
> >
> > Currently, the ionic_run_xdp() doesn't handle multi-buffer packets
> > properly for XDP_TX and XDP_REDIRECT.
> > When a jumbo frame is received, the ionic_run_xdp() first makes xdp
> > frame with all necessary pages in the rx descriptor.
> > And if the action is either XDP_TX or XDP_REDIRECT, it should unmap
> > dma-mapping and reset page pointer to NULL for all pages, not only the
> > first page.
> > But it doesn't for SG pages. So, SG pages unexpectedly will be reused.
> > It eventually causes kernel panic.
> >
> > Oops: general protection fault, probably for non-canonical address 0x504f4e4dbebc64ff: 0000 [#1] PREEMPT SMP NOPTI
> > CPU: 3 PID: 0 Comm: swapper/3 Not tainted 6.10.0-rc3+ #25
> > RIP: 0010:xdp_return_frame+0x42/0x90
> > Code: 01 75 12 5b 4c 89 e6 5d 31 c9 41 5c 31 d2 41 5d e9 73 fd ff ff 44 8b 6b 20 0f b7 43 0a 49 81 ed 68 01 00 00 49 29 c5 49 01 fd <41> 80 7d0
> > RSP: 0018:ffff99d00122ce08 EFLAGS: 00010202
> > RAX: 0000000000005453 RBX: ffff8d325f904000 RCX: 0000000000000001
> > RDX: 00000000670e1000 RSI: 000000011f90d000 RDI: 504f4e4d4c4b4a49
> > RBP: ffff99d003907740 R08: 0000000000000000 R09: 0000000000000000
> > R10: 000000011f90d000 R11: 0000000000000000 R12: ffff8d325f904010
> > R13: 504f4e4dbebc64fd R14: ffff8d3242b070c8 R15: ffff99d0039077c0
> > FS: 0000000000000000(0000) GS:ffff8d399f780000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f41f6c85e38 CR3: 000000037ac30000 CR4: 00000000007506f0
> > PKRU: 55555554
> > Call Trace:
> > <IRQ>
> > ? die_addr+0x33/0x90
> > ? exc_general_protection+0x251/0x2f0
> > ? asm_exc_general_protection+0x22/0x30
> > ? xdp_return_frame+0x42/0x90
> > ionic_tx_clean+0x211/0x280 [ionic 15881354510e6a9c655c59c54812b319ed2cd015]
> > ionic_tx_cq_service+0xd3/0x210 [ionic 15881354510e6a9c655c59c54812b319ed2cd015]
> > ionic_txrx_napi+0x41/0x1b0 [ionic 15881354510e6a9c655c59c54812b319ed2cd015]
> > __napi_poll.constprop.0+0x29/0x1b0
> > net_rx_action+0x2c4/0x350
> > handle_softirqs+0xf4/0x320
> > irq_exit_rcu+0x78/0xa0
> > common_interrupt+0x77/0x90
> >
> > Fixes: 5377805dc1c0 ("ionic: implement xdp frags support")
> > Signed-off-by: Taehee Yoo <ap420073@...il.com>
> > ---
> >
> > In the XDP_DROP and XDP_ABORTED path, the ionic_rx_page_free() is called
> > to free page and unmap dma-mapping.
> > It resets only the first page pointer to NULL.
> > DROP/ABORTED path looks like it also has the same problem, but it's not.
> > Because all pages in the XDP_DROP/ABORTED path are not used anywhere it
> > can be reused without any free and unmap.
> > So, it's okay to remove the function in the xdp path.
> >
> > I will send a separated patch for removing ionic_rx_page_free() in the
> > xdp path.
> >
> > .../net/ethernet/pensando/ionic/ionic_txrx.c | 26 +++++++++++++++----
> > 1 file changed, 21 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
> > index 2427610f4306..792e530e3281 100644
> > --- a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
> > +++ b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
> > @@ -487,14 +487,13 @@ static bool ionic_run_xdp(struct ionic_rx_stats *stats,
> > struct ionic_buf_info *buf_info,
> > int len)
> > {
> > + int remain_len, frag_len, i, err = 0;
> > + struct skb_shared_info *sinfo;
> > u32 xdp_action = XDP_ABORTED;
> > struct xdp_buff xdp_buf;
> > struct ionic_queue *txq;
> > struct netdev_queue *nq;
> > struct xdp_frame *xdpf;
> > - int remain_len;
> > - int frag_len;
> > - int err = 0;
>
> There's no need to change these 3 declarations.
Okay, I will add a new declaration 'i', not change these declarations.
>
> >
> > xdp_init_buff(&xdp_buf, IONIC_PAGE_SIZE, rxq->xdp_rxq_info);
> > frag_len = min_t(u16, len, IONIC_XDP_MAX_LINEAR_MTU + VLAN_ETH_HLEN);
> > @@ -513,7 +512,6 @@ static bool ionic_run_xdp(struct ionic_rx_stats *stats,
> > */
> > remain_len = len - frag_len;
> > if (remain_len) {
> > - struct skb_shared_info *sinfo;
> > struct ionic_buf_info *bi;
> > skb_frag_t *frag;
> >
> > @@ -576,7 +574,6 @@ static bool ionic_run_xdp(struct ionic_rx_stats *stats,
> >
> > dma_unmap_page(rxq->dev, buf_info->dma_addr,
> > IONIC_PAGE_SIZE, DMA_FROM_DEVICE);
> > -
>
> You can leave the whitespace alone
Thanks, I will remove it :)
>
> > err = ionic_xdp_post_frame(txq, xdpf, XDP_TX,
> > buf_info->page,
> > buf_info->page_offset,
> > @@ -587,12 +584,22 @@ static bool ionic_run_xdp(struct ionic_rx_stats *stats,
> > goto out_xdp_abort;
> > }
> > buf_info->page = NULL;
> > + if (xdp_frame_has_frags(xdpf)) {
>
> You could use xdp_buff_has_frags(&xdp_buf) instead, or just drop this
> check and let nr_frags value handle it in the for-loop test, as long as
> you init sinfo->nr_frags = 0 at the start.
Right, xdp_buff_has_frags() is not needed here indeed.
I will remove it, Thanks!
>
>
> > + for (i = 0; i < sinfo->nr_frags; i++) {
> > + buf_info++;
> > + dma_unmap_page(rxq->dev, buf_info->dma_addr,
> > + IONIC_PAGE_SIZE, DMA_FROM_DEVICE);
> > + buf_info->page = NULL;
> > + }
> > + }
> > +
> > stats->xdp_tx++;
> >
> > /* the Tx completion will free the buffers */
> > break;
> >
> > case XDP_REDIRECT:
> > + xdpf = xdp_convert_buff_to_frame(&xdp_buf);
> > /* unmap the pages before handing them to a different device */
> > dma_unmap_page(rxq->dev, buf_info->dma_addr,
> > IONIC_PAGE_SIZE, DMA_FROM_DEVICE);
> > @@ -603,6 +610,15 @@ static bool ionic_run_xdp(struct ionic_rx_stats *stats,
> > goto out_xdp_abort;
> > }
> > buf_info->page = NULL;
> > + if (xdp_frame_has_frags(xdpf)) {
>
> If you use xdp_buff_has_frags() then you would not have to waste time
> with xdp_convert_buff_to_frame(). Or, again, if you init nr_flags at
> the start then you could skip the test altogether and let the for-loop
> test deal with it.
>
> > + for (i = 0; i < sinfo->nr_frags; i++) {
> > + buf_info++;
> > + dma_unmap_page(rxq->dev, buf_info->dma_addr,
> > + IONIC_PAGE_SIZE, DMA_FROM_DEVICE);
> > + buf_info->page = NULL;
> > + }
> > + }
> > +
>
> Since this is repeated code, maybe build a little function, something like
>
> static void ionic_xdp_rx_put_frags(struct ionic_queue *q,
> struct ionic_buf_info *buf_info,
> int nfrags)
> {
> int i;
>
> for (i = 0; i < nfrags; i++) {
> buf_info++;
> dma_unmap_page(rxq->dev, buf_info->dma_addr,
> IONIC_PAGE_SIZE, DMA_FROM_DEVICE);
> buf_info->page = NULL;
> }
> }
>
> and call this with
> ionic_xdp_rx_put_frags(rxq, buf_info, sinfo->nr_frags);
It looks great, I will apply your suggestion.
I will send a v2 patch with the above change :)
>
>
> Meanwhile, before removing ionic_rx_page_free(), be sure to check that
> ionic_rx_fill() notices that there are useful pages already there that
> can be reused and doesn't end up leaking pages.
>
> We have some draft patches for converting this all to page_pool, which I
> think takes care of this and some other messy bits. We've been side
> tracked internally, but really need to get back to those.
>
Oh, I also planned to send a patch for converting page_pool.
And at that patch, ionic_rx_page_free() was going to be removed.
But you're already preparing to send a patch for converting to page_pool,
so I will not send my page_pool patch. And I will test your page_pool patch.
I agree with the page_pool approach because I checked the
improvement performance and it makes the code a little bit simpler.
Thanks a lot for sharing your plan :)
> Thanks,
> sln
>
> > rxq->xdp_flush = true;
> > stats->xdp_redirect++;
> > break;
> > --
> > 2.34.1
> >
Thanks a lot!
Taehee Yoo
Powered by blists - more mailing lists