netdev - Re: [PATCH net v3] virtio_net: Fix error unwinding of XDP initialization

Message-ID: <1683596602.483001-1-xuanzhuo@linux.alibaba.com>
Date: Tue, 9 May 2023 09:43:22 +0800
From: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
To: Feng Liu <feliu@...dia.com>
Cc: Jason Wang <jasowang@...hat.com>,
 "Michael S . Tsirkin" <mst@...hat.com>,
 Simon Horman <simon.horman@...igine.com>,
 Bodong Wang <bodong@...dia.com>,
 William Tu <witu@...dia.com>,
 Parav Pandit <parav@...dia.com>,
 virtualization@...ts.linux-foundation.org,
 netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org,
 bpf@...r.kernel.org
Subject: Re: [PATCH net v3] virtio_net: Fix error unwinding of XDP initialization

On Mon, 8 May 2023 11:00:10 -0400, Feng Liu <feliu@...dia.com> wrote:
>
>
> On 2023-05-07 p.m.9:45, Xuan Zhuo wrote:
> >
> >
> > On Sat, 6 May 2023 08:08:02 -0400, Feng Liu <feliu@...dia.com> wrote:
> >>
> >>
> >> On 2023-05-05 p.m.10:33, Xuan Zhuo wrote:
> >>>
> >>>
> >>> On Tue, 2 May 2023 20:35:25 -0400, Feng Liu <feliu@...dia.com> wrote:
> >>>> When initializing XDP in virtnet_open(), the XDP initialization of
> >>>> some rq may hit an error, causing the net device open to fail.
> >>>> However, previous rqs have already initialized XDP and enabled NAPI,
> >>>> which is not the expected behavior. We need to roll back the
> >>>> initialization of the previous rqs to avoid leaks in the error
> >>>> unwinding of the init code.
> >>>>
> >>>> Also extract a helper function that disables a queue pair, and use the
> >>>> newly introduced helper in error unwinding and in virtnet_close.
> >>>>
> >>>> Issue: 3383038
> >>>> Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
> >>>> Signed-off-by: Feng Liu <feliu@...dia.com>
> >>>> Reviewed-by: William Tu <witu@...dia.com>
> >>>> Reviewed-by: Parav Pandit <parav@...dia.com>
> >>>> Reviewed-by: Simon Horman <simon.horman@...igine.com>
> >>>> Acked-by: Michael S. Tsirkin <mst@...hat.com>
> >>>> Change-Id: Ib4c6a97cb7b837cfa484c593dd43a435c47ea68f
> >>>> ---
> >>>>    drivers/net/virtio_net.c | 30 ++++++++++++++++++++----------
> >>>>    1 file changed, 20 insertions(+), 10 deletions(-)
> >>>>
> >>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>>> index 8d8038538fc4..3737cf120cb7 100644
> >>>> --- a/drivers/net/virtio_net.c
> >>>> +++ b/drivers/net/virtio_net.c
> >>>> @@ -1868,6 +1868,13 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
> >>>>         return received;
> >>>>    }
> >>>>
> >>>> +static void virtnet_disable_qp(struct virtnet_info *vi, int qp_index)
> >>>> +{
> >>>> +     virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
> >>>> +     napi_disable(&vi->rq[qp_index].napi);
> >>>> +     xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
> >>>> +}
> >>>> +
> >>>>    static int virtnet_open(struct net_device *dev)
> >>>>    {
> >>>>         struct virtnet_info *vi = netdev_priv(dev);
> >>>> @@ -1883,20 +1890,26 @@ static int virtnet_open(struct net_device *dev)
> >>>>
> >>>>                 err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, vi->rq[i].napi.napi_id);
> >>>>                 if (err < 0)
> >>>> -                     return err;
> >>>> +                     goto err_xdp_info_reg;
> >>>>
> >>>>                 err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
> >>>>                                                  MEM_TYPE_PAGE_SHARED, NULL);
> >>>> -             if (err < 0) {
> >>>> -                     xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
> >>>> -                     return err;
> >>>> -             }
> >>>> +             if (err < 0)
> >>>> +                     goto err_xdp_reg_mem_model;
> >>>>
> >>>>                 virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
> >>>>                 virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
> >>>>         }
> >>>>
> >>>>         return 0;
> >>>> +
> >>>> +err_xdp_reg_mem_model:
> >>>> +     xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
> >>>> +err_xdp_info_reg:
> >>>> +     for (i = i - 1; i >= 0; i--)
> >>>> +             virtnet_disable_qp(vi, i);
> >>>
> >>>
> >>> I would like to know whether we should also handle these here:
> >>>
> >>>           disable_delayed_refill(vi);
> >>>           cancel_delayed_work_sync(&vi->refill);
> >>>
> >>>
> >>> Maybe we should call virtnet_close() with "i" directly.
> >>>
> >>> Thanks.
> >>>
> >>>
> >> We can't use i directly here: if xdp_rxq_info_reg() fails, NAPI has
> >> not been enabled for the current qp yet. I have to roll back starting
> >> from the last queue pair where NAPI was enabled (i - 1); otherwise
> >> napi_disable() will hang.
> >
> > That is not my point. The key question is whether we should also handle:
> >
> >            disable_delayed_refill(vi);
> >            cancel_delayed_work_sync(&vi->refill);
> >
> > Thanks.
> >
> >
>
> OK, I get the point. Thanks for your careful review. I checked the code
> again.
>
> There are two points that I need to explain:
>
> 1. All users of the delayed refill work (vi->refill, vi->refill_enabled)
> assume that the virtio interface has been opened successfully, for
> example virtnet_receive, virtnet_rx_resize, and _virtnet_set_queues. If
> the xdp reg fails here, none of these functions will run afterwards, so
> there is no need to call disable_delayed_refill() and
> cancel_delayed_work_sync().

I think something is wrong here. These lines in virtnet_open() may schedule
the delayed work:

static int virtnet_open(struct net_device *dev)
{
	struct virtnet_info *vi = netdev_priv(dev);
	int i, err;

	enable_delayed_refill(vi);

	for (i = 0; i < vi->max_queue_pairs; i++) {
		if (i < vi->curr_queue_pairs)
			/* Make sure we have some buffers: if oom use wq. */
-->			if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
-->				schedule_delayed_work(&vi->refill, 0);

		err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, vi->rq[i].napi.napi_id);
		if (err < 0)
			return err;

		err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
						 MEM_TYPE_PAGE_SHARED, NULL);
		if (err < 0) {
			xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
			return err;
		}

		virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
		virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
	}

	return 0;
}


And I think that if virtnet_open() returns an error, the state of virtnet
should be the same as the state after virtnet_close().

Or does someone have a different opinion?
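
To illustrate the concern, here is a minimal userspace model of the open
path. It is only a sketch, not driver code: every name in it (model_dev,
model_open, model_close, model_is_closed) is hypothetical, the two bool
flags merely stand in for vi->refill_enabled and a scheduled vi->refill work
item, and the fill and registration failures are injected by queue index.

```c
/* Userspace model only. These types and helpers are hypothetical
 * stand-ins and are not part of drivers/net/virtio_net.c. */
#include <errno.h>
#include <stdbool.h>

#define MAX_QP 4

struct model_qp {
	bool xdp_registered;
	bool napi_enabled;
};

struct model_dev {
	struct model_qp qp[MAX_QP];
	bool refill_enabled;	/* models vi->refill_enabled */
	bool refill_pending;	/* models a scheduled vi->refill work item */
	int fail_reg_at;	/* queue whose registration fails, -1 for none */
	int oom_at;		/* queue whose initial fill fails, -1 for none */
};

void model_disable_qp(struct model_dev *d, int i)
{
	d->qp[i].napi_enabled = false;
	d->qp[i].xdp_registered = false;
}

/* full_unwind selects whether the error path also stops the refill work. */
int model_open(struct model_dev *d, bool full_unwind)
{
	int i, err = 0;

	d->refill_enabled = true;		/* enable_delayed_refill() */

	for (i = 0; i < MAX_QP; i++) {
		if (i == d->oom_at)
			d->refill_pending = true; /* schedule_delayed_work() */

		if (i == d->fail_reg_at) {
			err = -ENOMEM;
			goto err_unwind;
		}
		d->qp[i].xdp_registered = true;
		d->qp[i].napi_enabled = true;
	}
	return 0;

err_unwind:
	if (full_unwind) {
		d->refill_enabled = false;	/* disable_delayed_refill() */
		d->refill_pending = false;	/* cancel_delayed_work_sync() */
	}
	/* Queue i never enabled NAPI, so start from the previous one. */
	for (i = i - 1; i >= 0; i--)
		model_disable_qp(d, i);
	return err;
}

void model_close(struct model_dev *d)
{
	int i;

	d->refill_enabled = false;
	d->refill_pending = false;
	for (i = 0; i < MAX_QP; i++)
		model_disable_qp(d, i);
}

/* True when the device looks the way model_close() leaves it. */
bool model_is_closed(const struct model_dev *d)
{
	int i;

	if (d->refill_enabled || d->refill_pending)
		return false;
	for (i = 0; i < MAX_QP; i++)
		if (d->qp[i].napi_enabled || d->qp[i].xdp_registered)
			return false;
	return true;
}
```

In this model, a failed model_open(&d, false) with oom_at = 0 and
fail_reg_at = 2 leaves refill_pending set even though every queue pair was
rolled back, while model_open(&d, true) ends in the same state that
model_close() produces.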

Thanks.

> The logic here is different from that of
> virtnet_close. virtnet_close assumes that virtnet_open succeeded and
> that tx and rx have been running normally. For error unwinding, only
> disabling the qp is needed. I also encapsulated a helper function to
> disable a qp, which is used in both error unwinding and virtnet_close.
> 2. The qp that hit the error has not enabled NAPI, so it can only call
> xdp unreg and must not call the NAPI disable interface; otherwise the
> kernel will get stuck. That is why the loop starts from i - 1 and calls
> disable qp on the previous queues.
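
The rollback order described in point 2 can be sketched in userspace as
follows. This is an illustration under assumptions, not the patch itself:
sketch_vi, sketch_open, sketch_disable_qp and the bad_napi_disable counter
are hypothetical names, and the counter only marks the spot where the real
napi_disable() would block on a queue that never enabled NAPI.

```c
/* Userspace sketch only. The names below are hypothetical and do not
 * exist in the virtio_net driver. */
#include <errno.h>
#include <stdbool.h>

#define NQ 4

enum fail_step { FAIL_NONE, FAIL_INFO_REG, FAIL_MEM_MODEL };

struct sketch_rq {
	bool xdp_reg;	/* models a registered xdp_rxq */
	bool napi_on;	/* models an enabled NAPI instance */
};

struct sketch_vi {
	struct sketch_rq rq[NQ];
	int bad_napi_disable;	/* disables issued on a queue without NAPI */
};

void sketch_disable_qp(struct sketch_vi *vi, int i)
{
	if (!vi->rq[i].napi_on)
		vi->bad_napi_disable++;	/* real napi_disable() would hang */
	vi->rq[i].napi_on = false;
	vi->rq[i].xdp_reg = false;
}

/* Fail at queue fail_q during the given step (FAIL_NONE never fails). */
int sketch_open(struct sketch_vi *vi, int fail_q, enum fail_step step)
{
	int i, err = 0;

	for (i = 0; i < NQ; i++) {
		if (i == fail_q && step == FAIL_INFO_REG) {
			err = -EINVAL;
			goto err_info_reg;
		}
		vi->rq[i].xdp_reg = true;

		if (i == fail_q && step == FAIL_MEM_MODEL) {
			err = -ENOMEM;
			goto err_mem_model;
		}
		vi->rq[i].napi_on = true;
	}
	return 0;

err_mem_model:
	/* The failing queue registered xdp_rxq but never enabled NAPI. */
	vi->rq[i].xdp_reg = false;
err_info_reg:
	/* Only queues 0..i-1 completed setup, so only they are disabled. */
	for (i = i - 1; i >= 0; i--)
		sketch_disable_qp(vi, i);
	return err;
}
```

Whichever step fails, bad_napi_disable stays at zero because the failing
queue is handled by its own label and the loop touches only the queue pairs
before it.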
>
> Thanks
>
> >>
> >>>> +
> >>>> +     return err;
> >>>>    }
> >>>>
> >>>>    static int virtnet_poll_tx(struct napi_struct *napi, int budget)
> >>>> @@ -2305,11 +2318,8 @@ static int virtnet_close(struct net_device *dev)
> >>>>         /* Make sure refill_work doesn't re-enable napi! */
> >>>>         cancel_delayed_work_sync(&vi->refill);
> >>>>
> >>>> -     for (i = 0; i < vi->max_queue_pairs; i++) {
> >>>> -             virtnet_napi_tx_disable(&vi->sq[i].napi);
> >>>> -             napi_disable(&vi->rq[i].napi);
> >>>> -             xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
> >>>> -     }
> >>>> +     for (i = 0; i < vi->max_queue_pairs; i++)
> >>>> +             virtnet_disable_qp(vi, i);
> >>>>
> >>>>         return 0;
> >>>>    }
> >>>> --
> >>>> 2.37.1 (Apple Git-137.1)
> >>>>