Message-ID: <YJl8IC7EbXKpARWL@mail-itl>
Date: Mon, 10 May 2021 20:32:00 +0200
From: Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
To: Michael Brown <mbrown@...systems.co.uk>
Cc: paul@....org, xen-devel@...ts.xenproject.org,
netdev@...r.kernel.org, wei.liu@...nel.org, pdurrant@...zon.com
Subject: Re: [PATCH] xen-netback: Check for hotplug-status existence before watching

On Tue, Apr 13, 2021 at 04:25:12PM +0100, Michael Brown wrote:
> The logic in connect() is currently written with the assumption that
> xenbus_watch_pathfmt() will return an error for a node that does not
> exist. This assumption is incorrect: xenstore does allow a watch to
> be registered for a nonexistent node (and will send notifications
> should the node be subsequently created).
>
> As of commit 1f2565780 ("xen-netback: remove 'hotplug-status' once it
> has served its purpose"), this leads to a failure when a domU
> transitions into XenbusStateConnected more than once. On the first
> domU transition into Connected state, the "hotplug-status" node will
> be deleted by the hotplug_status_changed() callback in dom0. On the
> second or subsequent domU transition into Connected state, the
> hotplug_status_changed() callback will therefore never be invoked, and
> so the backend will remain stuck in InitWait.
>
> This failure prevents scenarios such as reloading the xen-netfront
> module within a domU, or booting a domU via iPXE. There is
> unfortunately no way for the domU to work around this dom0 bug.
>
> Fix by explicitly checking for existence of the "hotplug-status" node,
> thereby creating the behaviour that was previously assumed to exist.

This change is wrong. The 'hotplug-status' node is created _only_ by the
hotplug script, at the point when the script is executed. When the kernel
waits for the hotplug script to run, it waits for the node to _appear_, not
to _change_. So this change effectively makes the kernel not wait for the
hotplug script at all.
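
To illustrate the difference, here is a minimal user-space sketch. It is not
kernel code and not the xenstore API: toy_node, toy_watch(), toy_write() and
events_seen() are hypothetical names made up for this example, and the toy
models only the event caused by the node being written. It assumes, as the
patch description itself states, that a watch can be registered on a node
that does not exist yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the "hotplug-status" node plus one watch on it.
 * Hypothetical types and names; this is not the xenbus/xenstore API. */
struct toy_node {
    bool exists;                 /* has the node been created yet? */
    void (*watch_cb)(void *ctx); /* registered watch callback, or NULL */
    void *ctx;                   /* opaque argument passed to watch_cb */
};

/* Registering a watch succeeds whether or not the node exists. */
static int toy_watch(struct toy_node *n, void (*cb)(void *), void *ctx)
{
    n->watch_cb = cb;
    n->ctx = ctx;
    return 0;
}

/* Creating (or rewriting) the node notifies the watcher, if there is one. */
static void toy_write(struct toy_node *n)
{
    n->exists = true;
    if (n->watch_cb)
        n->watch_cb(n->ctx);
}

/* Watch callback: count how many events were delivered. */
static void count_event(void *ctx)
{
    assert(ctx != NULL);
    ++*(int *)ctx;
}

/* Returns how many watch events the backend sees when the hotplug script
 * creates the node only after connect() has already run. */
static int events_seen(bool check_exists_first)
{
    struct toy_node node = { .exists = false, .watch_cb = NULL, .ctx = NULL };
    int events = 0;

    /* connect(): the hotplug script has not run yet, so the node is absent.
     * With the existence check, the watch is never registered. */
    if (!check_exists_first || node.exists) {
        if (toy_watch(&node, count_event, &events) != 0)
            return -1;
    }

    /* Later: the hotplug script runs and creates the node. */
    toy_write(&node);
    return events;
}
```

In this toy model, events_seen(false) reports one event (the unconditional
watch catches the node appearing), while events_seen(true) reports none,
because the existence check skipped the watch before the script had run.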

Furthermore, there is an additional side effect: in the case of a driver
domain, 'xl devd' may be started after the backend node is created (this
may happen if you start the frontend domain in parallel with the
backend). In this case, 'xl devd' will see the vif backend already in
XenbusStateConnected and will not execute the hotplug script at all.

I think the proper fix is to re-register the watch when necessary,
instead of not registering it at all.

> Signed-off-by: Michael Brown <mbrown@...systems.co.uk>
> ---
> drivers/net/xen-netback/xenbus.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index a5439c130130..d24b7a7993aa 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -824,11 +824,15 @@ static void connect(struct backend_info *be)
> xenvif_carrier_on(be->vif);
>
> unregister_hotplug_status_watch(be);
> - err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch, NULL,
> - hotplug_status_changed,
> - "%s/%s", dev->nodename, "hotplug-status");
> - if (!err)
> + if (xenbus_exists(XBT_NIL, dev->nodename, "hotplug-status")) {
> + err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch,
> + NULL, hotplug_status_changed,
> + "%s/%s", dev->nodename,
> + "hotplug-status");
> + if (err)
> + goto err;
> be->have_hotplug_status_watch = 1;
> + }
>
> netif_tx_wake_all_queues(be->vif->dev);
>
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab