Message-ID:
 <PAXPR04MB851008E9B50351D84CD89A8D880C2@PAXPR04MB8510.eurprd04.prod.outlook.com>
Date: Wed, 25 Dec 2024 01:38:52 +0000
From: Wei Fang <wei.fang@....com>
To: Kevin Groeneveld <kgroeneveld@...brook.com>
CC: Shenwei Wang <shenwei.wang@....com>, Clark Wang <xiaoning.wang@....com>,
	Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo
 Abeni <pabeni@...hat.com>, "imx@...ts.linux.dev" <imx@...ts.linux.dev>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] net: fec: handle page_pool_dev_alloc_pages error

> I will simplify it.
> 
> >> @@ -1943,10 +1943,12 @@ static int fec_enet_rx_napi(struct napi_struct *napi, int budget)
> >>      struct fec_enet_private *fep = netdev_priv(ndev);
> >>      int done = 0;
> >>
> >> +    fep->rx_err_nomem = false;
> >> +
> >>      do {
> >>              done += fec_enet_rx(ndev, budget - done);
> >>              fec_enet_tx(ndev, budget);
> >> -    } while ((done < budget) && fec_enet_collect_events(fep));
> >> +    } while ((done < budget) && !fep->rx_err_nomem && fec_enet_collect_events(fep));
> >
> > Is the condition "!fep->rx_err_nomem" necessary here? If not, then there
> > is no need to add this variable to fec_enet_private.
> 
> For my test case it often seems to loop forever without making any
> progress unless I add that condition.
> 
> > One situation I am concerned about is that when the issue occurs, the Rx
> > rings are full. At the same time, because the 'done < budget' condition is
> > met, the interrupt mode will be used to receive the packets. However,
> > since the Rx rings are full, no Rx interrupt events will be generated. This
> > means that the packets on the Rx rings may not be received by the CPU
> > for a long time unless Tx interrupt events are generated.
> 
> These are the types of things I was worried might exist with my patch.
> 
> > Another approach is to discard the packets when the issue occurs, as
> > shown below. Note that the following modification has not been verified.
> >
> > -static void fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq,
> > +static int fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq,
> >                                  struct bufdesc *bdp, int index)
> >   {
> >          struct page *new_page;
> >          dma_addr_t phys_addr;
> >
> >          new_page = page_pool_dev_alloc_pages(rxq->page_pool);
> > -       WARN_ON(!new_page);
> > +       if (unlikely(!new_page))
> > +               return -ENOMEM;
> > +
> >          rxq->rx_skb_info[index].page = new_page;
> >
> >          rxq->rx_skb_info[index].offset = FEC_ENET_XDP_HEADROOM;
> >          phys_addr = page_pool_get_dma_addr(new_page) + FEC_ENET_XDP_HEADROOM;
> >          bdp->cbd_bufaddr = cpu_to_fec32(phys_addr);
> > +
> > +       return 0;
> >   }
> >
> >   static u32
> > @@ -1771,7 +1775,10 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id)
> >                                          pkt_len,
> >                                          DMA_FROM_DEVICE);
> >                  prefetch(page_address(page));
> > -               fec_enet_update_cbd(rxq, bdp, index);
> > +               if (fec_enet_update_cbd(rxq, bdp, index)) {
> > +                       ndev->stats.rx_dropped++;
> > +                       goto rx_processing_done;
> > +               }
> >
> >                  if (xdp_prog) {
> >                          xdp_buff_clear_frags_flag(&xdp);
> 
> Thanks for the suggestion. I had considered something similar but I was
> not sure it was safe to just jump to rx_processing_done at that point in
> the code. I will try your patch and if it seems to work okay I will
> submit a new version.

Okay, but this is just an example. The final patch still needs to be
improved, for example by moving the 'goto rx_processing_done' statement
before dma_sync_single_for_cpu().
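
For illustration only, here is a minimal user-space sketch of that
ordering: try to get the replacement buffer first, and if that fails,
count a drop and leave the descriptor's old buffer in place before any
per-packet work is done. This is not driver code. The names rx_ring,
ring_refill, ring_receive, alloc_ok and alloc_fail are hypothetical
stand-ins, malloc() stands in for page_pool_dev_alloc_pages(), and the
assumption that the descriptor simply keeps its old buffer on failure
has not been verified against the FEC driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 4

/* Hypothetical stand-in for the driver's Rx ring and its counters. */
struct rx_ring {
	void *buf[RING_SIZE];	/* buffer attached to each descriptor */
	unsigned long rx_packets;
	unsigned long rx_dropped;
};

/* Hypothetical allocator hook so a caller can force failures. */
typedef void *(*alloc_fn)(void);

static void *alloc_ok(void)   { return malloc(64); }
static void *alloc_fail(void) { return NULL; }

static void ring_init(struct rx_ring *ring)
{
	for (size_t i = 0; i < RING_SIZE; i++)
		ring->buf[i] = malloc(64);
	ring->rx_packets = 0;
	ring->rx_dropped = 0;
}

/*
 * Attach a fresh buffer to descriptor `index` and hand the filled one
 * back through `filled`. On failure nothing is modified, so the
 * descriptor can be given back to the hardware with its old buffer.
 */
static int ring_refill(struct rx_ring *ring, size_t index,
		       alloc_fn alloc, void **filled)
{
	void *fresh = alloc();

	if (!fresh)
		return -1;	/* plays the role of -ENOMEM */

	*filled = ring->buf[index];
	ring->buf[index] = fresh;
	return 0;
}

/* Handle one received frame; returns true if it was delivered. */
static bool ring_receive(struct rx_ring *ring, size_t index, alloc_fn alloc)
{
	void *filled;

	assert(index < RING_SIZE);

	/* Check the allocation first, before any per-packet work. */
	if (ring_refill(ring, index, alloc, &filled) != 0) {
		ring->rx_dropped++;
		return false;	/* frame dropped, old buffer reused */
	}

	ring->rx_packets++;
	free(filled);		/* stand-in for passing the data up the stack */
	return true;
}
```

Calling ring_receive() with alloc_fail increments rx_dropped and leaves
the buffer pointer untouched; calling it with alloc_ok swaps in a new
buffer and increments rx_packets.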

> 
> I probably will not have time to work on this further until the new year.
> 

Ok, thanks a lot for helping fix this.

