[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YJmMvTkp2Y1hlLLm@mail-itl>
Date: Mon, 10 May 2021 21:42:52 +0200
From: Marek Marczykowski-Górecki
<marmarek@...isiblethingslab.com>
To: Michael Brown <mbrown@...systems.co.uk>, paul@....org
Cc: paul@....org, xen-devel@...ts.xenproject.org,
netdev@...r.kernel.org, wei.liu@...nel.org, pdurrant@...zon.com
Subject: Re: [PATCH] xen-netback: Check for hotplug-status existence before
watching
On Mon, May 10, 2021 at 08:06:55PM +0100, Michael Brown wrote:
> If you have a suggested patch, I'm happy to test that it doesn't reintroduce
> the regression bug that was fixed by this commit.
Actually, I've just tested with a simple reloading xen-netfront module. It
seems in this case, the hotplug script is not re-executed. In fact, I
think it should not be re-executed at all, since the vif interface
remains in place (it just gets NO-CARRIER flag).
This brings a question, why removing hotplug-status in the first place?
The interface remains correctly configured by the hotplug script after
all. From the commit message:
xen-netback: remove 'hotplug-status' once it has served its purpose
Removing the 'hotplug-status' node in netback_remove() is wrong; the script
may not have completed. Only remove the node once the watch has fired and
has been unregistered.
I think the intention was to remove 'hotplug-status' node _later_ in
case of quickly adding and removing the interface. Is that right, Paul?
In that case, letting hotplug_status_changed() remove the entry wont
work, because the watch was unregistered few lines earlier in
netback_remove(). And keeping the watch is not an option, because the
whole backend_info struct is going to be free-ed already.
If my guess about the original reason for the change is right, I think
it should be fixed at the hotplug script level - it should check if the
device is still there before writing 'hotplug-status' node. I'm not sure
if doing it race-free is possible from a shell script (I think it
requires doing xenstore read _and_ write in a single transaction). But
in the worst case, the aftermath of loosing the race is leaving stray
'hotplug-status' xenstore node - not ideal, but also less harmful than
failing to bring up an interface. At this point, the toolstack could cleanup
it later, perhaps while setting up that interface again (if it gets
re-connected)?
Anyway, perhaps the best thing to do now, is to revert both commits, and
think of an alternative solution for the original issue? That of course
assumes I guessed correctly why it was done in the first place...
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)
Powered by blists - more mailing lists