netdev - Re: [PATCH] xen-netback: Check for hotplug-status existence before watching

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YJmMvTkp2Y1hlLLm@mail-itl>
Date:   Mon, 10 May 2021 21:42:52 +0200
From:   Marek Marczykowski-Górecki 
        <marmarek@...isiblethingslab.com>
To:     Michael Brown <mbrown@...systems.co.uk>, paul@....org
Cc:     paul@....org, xen-devel@...ts.xenproject.org,
        netdev@...r.kernel.org, wei.liu@...nel.org, pdurrant@...zon.com
Subject: Re: [PATCH] xen-netback: Check for hotplug-status existence before
 watching

On Mon, May 10, 2021 at 08:06:55PM +0100, Michael Brown wrote:
> If you have a suggested patch, I'm happy to test that it doesn't reintroduce
> the regression bug that was fixed by this commit.

Actually, I've just tested with a simple reloading xen-netfront module. It
seems in this case, the hotplug script is not re-executed. In fact, I
think it should not be re-executed at all, since the vif interface
remains in place (it just gets NO-CARRIER flag).

This brings a question, why removing hotplug-status in the first place?
The interface remains correctly configured by the hotplug script after
all. From the commit message:

    xen-netback: remove 'hotplug-status' once it has served its purpose

    Removing the 'hotplug-status' node in netback_remove() is wrong; the script
    may not have completed. Only remove the node once the watch has fired and
    has been unregistered.

I think the intention was to remove 'hotplug-status' node _later_ in
case of quickly adding and removing the interface. Is that right, Paul?
In that case, letting hotplug_status_changed() remove the entry wont
work, because the watch was unregistered few lines earlier in
netback_remove(). And keeping the watch is not an option, because the
whole backend_info struct is going to be free-ed already.

If my guess about the original reason for the change is right, I think
it should be fixed at the hotplug script level - it should check if the
device is still there before writing 'hotplug-status' node. I'm not sure
if doing it race-free is possible from a shell script (I think it
requires doing xenstore read _and_ write in a single transaction). But
in the worst case, the aftermath of loosing the race is leaving stray
'hotplug-status' xenstore node - not ideal, but also less harmful than
failing to bring up an interface. At this point, the toolstack could cleanup
it later, perhaps while setting up that interface again (if it gets
re-connected)?

Anyway, perhaps the best thing to do now, is to revert both commits, and
think of an alternative solution for the original issue? That of course
assumes I guessed correctly why it was done in the first place...

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)