linux-kernel - Re: [PATCH] xen/xenbus: better handle backend crash

Message-ID: <abd6f188-4c5e-4e56-9dbf-3bc942622b6f@suse.com>
Date: Mon, 9 Feb 2026 10:02:28 +0100
From: Jürgen Groß <jgross@...e.com>
To: Marek Marczykowski-Górecki
 <marmarek@...isiblethingslab.com>
Cc: linux-kernel@...r.kernel.org, Stefano Stabellini
 <sstabellini@...nel.org>,
 Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>,
 Peng Jiang <jiang.peng9@....com.cn>, Qiu-ji Chen <chenqiuji666@...il.com>,
 Jason Andryuk <jason.andryuk@....com>,
 "moderated list:XEN HYPERVISOR INTERFACE" <xen-devel@...ts.xenproject.org>
Subject: Re: [PATCH] xen/xenbus: better handle backend crash

On 06.02.26 17:57, Marek Marczykowski-Górecki wrote:
> On Thu, Jan 29, 2026 at 08:02:35AM +0100, Jürgen Groß wrote:
>> On 26.01.26 08:08, Jürgen Groß wrote:
>>> On 17.11.25 12:06, Jürgen Groß wrote:
>>>> On 02.11.25 04:20, Marek Marczykowski-Górecki wrote:
>>>>> When the backend domain crashes, coordinated device cleanup is not
>>>>> possible (as it involves waiting for the backend state change). In that
>>>>> case, toolstack forcefully removes frontend xenstore entries.
>>>>> xenbus_dev_changed() handles this case, and triggers device cleanup.
>>>>> It's possible that toolstack manages to connect new device in that
>>>>> place, before xenbus_dev_changed() notices the old one is missing. If
>>>>> that happens, new one won't be probed and will forever remain in
>>>>> XenbusStateInitialising.
>>>>>
>>>>> Fix this by checking backend-id and if it changes, consider it
>>>>> unplug+plug operation. It's important that cleanup on such unplug
>>>>> doesn't modify xenstore entries (especially the "state" key) as they
>>>>> belong to the new device to be probed - changing them would derail
>>>>> establishing connection to the new backend (most likely, closing the
>>>>> device before it was even connected). Handle this case by setting new
>>>>> xenbus_device->vanished flag to true, and check it before changing state
>>>>> entry.
>>>>>
>>>>> And even if xenbus_dev_changed() correctly detects the device was
>>>>> forcefully removed, the cleanup handling is still racy. Since this whole
>>>>> handling doesn't happen in a single xenstore transaction, it's possible
>>>>> that toolstack might put a new device there already. Avoid re-creating
>>>>> the state key (which in the case of losing the race would actually
>>>>> close newly attached device).
>>>>>
>>>>> The problem does not apply to frontend domain crash, as this case
>>>>> involves coordinated cleanup.
>>>>>
>>>>> Problem originally reported at
>>>>> https://lore.kernel.org/xen-devel/aOZvivyZ9YhVWDLN@mail-itl/T/#t,
>>>>> including reproduction steps.
>>>>>
>>>>> Signed-off-by: Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
>>>>
>>>> Sorry I didn't get to this earlier.
>>>>
>>>> My main problem with this patch is that it is basically just papering over
>>>> a more general problem.
>>>>
>>>> You are just making the problem much more improbable, but not impossible to
>>>> occur again. In case the new driver domain has the same domid as the old one
>>>> you can still have the same race.
>>>>
>>>> The clean way to handle that would be to add a unique Id in Xenstore to each
>>>> device on the backend side, which can be tested on the frontend side to
>>>> match. In case it doesn't match, an old device with the same kind and devid
>>>> can be cleaned up.
>>>>
>>>> The unique Id would obviously need to be set by the Xen tools inside the
>>>> transaction writing the initial backend Xenstore nodes, as doing that from
>>>> the backend would add another potential ambiguity by the driver domain
>>>> choosing the same unique id as the previous one did.
>>>>
>>>> The question is whether something like your patch should be used as a
>>>> fallback in case there is no unique Id on the backend side of the device
>>>> due to a too old Xen version.
>>>
>>> I think I have found a solution which is much more simple, as it doesn't
>>> need any change of the protocol or any addition of new identifiers.
>>>
>>> When creating a new PV device, Xen tools will always write all generic
>>> frontend- and backend-nodes. This includes the frontend state, which is
>>> initialized as XenbusStateInitialising.
>>>
>>> The Linux kernel's xenbus driver is already storing the last known state
>>> of a xenbus device in struct xenbus_device. When changing the state, the
>>> xenbus driver is even reading the state from Xenstore (even if only for
>>> making sure the path is still existing). So all that is needed is to check,
>>> whether the read current state is matching the locally saved state. If it
>>> is not matching AND the read state is XenbusStateInitialising, you can be
>>> sure that the backend has been replaced.
>>>
>>> Handling this will need to check the return value of xenbus_switch_state()
>>> in all related drivers, but this is just a more or less mechanical change.
>>>
>>> I'll prepare a patch series for that.
>>
>> In the end the result is more like your patch, avoiding the need to modify
>> all drivers.
>>
>> I just added my idea to your patch and modified some of your code to be more
>> simple. I _think_ I have covered all possible scenarios now, resulting in
>> the need to keep the backend id check in case the backend died during the
>> early init phase of the device.
>>
>> Could you please verify the attached patch is working for you?
> 
> Thanks for the patch!
> 
> I ran it through relevant tests, and I got inconsistent results.
> Specifically, sometimes, the domU hangs (actually, just one vCPU spins
> inside xenwatch thread). Last console messages are:
> 
>      systemd[626]: Starting dconf.service - User preferences database...
>      gnome-keyring-daemon[975]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
>      gnome-keyring-daemon[975]: discover_other_daemon: 0
>      xen vif-0: xenbus: state reset occurred, reconnecting
>      gnome-keyring-daemon[974]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
>      gnome-keyring-daemon[976]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
>      gnome-keyring-daemon[976]: discover_other_daemon: 0
>      gnome-keyring-daemon[974]: discover_other_daemon: 0
>      xen vif-0: xenbus: state reset occurred, reconnecting
>      systemd[626]: Started dconf.service - User preferences database.
>      xen_netfront: Initialising Xen virtual ethernet driver
>      vif vif-0: xenbus: state reset occurred, reconnecting
> 
> And the call trace of the spinning xenwatch thread is:
> 
>      task:xenwatch        state:D stack:0     pid:64    tgid:64    ppid:2      task_flags:0x288040 flags:0x00080000
>      Call Trace:
>       <TASK>
>       __schedule+0x2f3/0x780
>       schedule+0x27/0x80
>       xs_wait_for_reply+0xab/0x1f0
>       ? __pfx_autoremove_wake_function+0x10/0x10
>       xs_talkv+0xec/0x200
>       xs_single+0x4a/0x70
>       xenbus_gather+0xe4/0x1a0
>       xenbus_read_driver_state+0x42/0x70
>       xennet_bus_close+0x113/0x2c0 [xen_netfront]
>       ? __pfx_autoremove_wake_function+0x10/0x10
>       xennet_remove+0x16/0x80 [xen_netfront]
>       xenbus_dev_remove+0x71/0xf0
>       device_release_driver_internal+0x19c/0x200
>       bus_remove_device+0xc6/0x130
>       device_del+0x160/0x3e0
>       device_unregister+0x17/0x60
>       xenbus_dev_changed.cold+0x5e/0x6b
>       ? __pfx_xenwatch_thread+0x10/0x10
>       xenwatch_thread+0x92/0x1c0
>       ? __pfx_autoremove_wake_function+0x10/0x10
>       kthread+0xfc/0x240
>       ? __pfx_kthread+0x10/0x10
>       ret_from_fork+0xf5/0x110
>       ? __pfx_kthread+0x10/0x10
>       ret_from_fork_asm+0x1a/0x30
>       </TASK>
>      task:xenbus          state:S stack:0     pid:63    tgid:63    ppid:2      task_flags:0x208040 flags:0x00080000
>      Call Trace:
>       <TASK>
>       __schedule+0x2f3/0x780
>       ? __pfx_xenbus_thread+0x10/0x10
>       schedule+0x27/0x80
>       xenbus_thread+0x1a8/0x200
>       ? __pfx_autoremove_wake_function+0x10/0x10
>       kthread+0xfc/0x240
>       ? __pfx_kthread+0x10/0x10
>       ret_from_fork+0xf5/0x110
>       ? __pfx_kthread+0x10/0x10
>       ret_from_fork_asm+0x1a/0x30
>       </TASK>
> 
> (technically, `top` says it's the xenbus thread spinning, but it looks
> like the actual issue is in xenwatch one)
> 
> Note that other xenwatch actions in this domU are not executed, for
> example `xl sysrq` does nothing. Not surprising, given xenwatch thread
> is busy... Fortunately, it blocks only a single vCPU, so I'm able to
> interact with the domU over console (to get the above traces).
> 
> It isn't a reliable failure, in this test run it failed once, out of 4
> related tests.
> 
> The specific test is: https://github.com/QubesOS/qubes-core-admin/blob/main/qubes/tests/integ/network.py#L234
> In short:
> 1. Start a domU
> 2. Pause it
> 3. Attach network (backend is != dom0)
> 4. Unpause
> 
> TBH, I'm not sure why the "state reset occurred" message is triggered at
> all, I think it shouldn't be in this case...
> 

Second try.


Juergen

View attachment "0001-xenbus-add-xenbus_device-parameter-to-xenbus_read_dr.patch" of type "text/x-patch" (10690 bytes)

View attachment "0002-xen-xenbus-better-handle-backend-crash.patch" of type "text/x-patch" (6192 bytes)

Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3684 bytes)

Download attachment "OpenPGP_signature.asc" of type "application/pgp-signature" (496 bytes)
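
For readers without the attachments at hand, the detection rule discussed in
the quoted messages (compare the state read from Xenstore with the locally
saved one, and keep the backend-id check for a backend that died during early
init) can be sketched roughly as below. This is only an illustration of the
idea, not the code from the attached patches: struct frontend_view and
backend_was_replaced() are hypothetical names made up for this sketch, and
only the enum values follow the Xen public xenbus header.

```c
#include <stdbool.h>

/* State values as defined in the Xen public header io/xenbus.h. */
enum xenbus_state {
	XenbusStateUnknown       = 0,
	XenbusStateInitialising  = 1,
	XenbusStateInitWait      = 2,
	XenbusStateInitialised   = 3,
	XenbusStateConnected     = 4,
	XenbusStateClosing       = 5,
	XenbusStateClosed        = 6,
	XenbusStateReconfiguring = 7,
	XenbusStateReconfigured  = 8,
};

/*
 * Hypothetical stand-in for what the frontend remembers about a device.
 * The real struct xenbus_device has many more fields.
 */
struct frontend_view {
	enum xenbus_state saved_state;  /* last state the frontend wrote */
	int saved_backend_id;           /* backend domid seen at probe time */
};

/*
 * Return true if the values just read from Xenstore indicate that the
 * toolstack removed the old device and created a new one in its place.
 */
static bool backend_was_replaced(const struct frontend_view *v,
				 enum xenbus_state read_state,
				 int read_backend_id)
{
	/* A different backend domain always means unplug+plug. */
	if (read_backend_id != v->saved_backend_id)
		return true;

	/*
	 * The toolstack initialises the frontend state of a new device to
	 * XenbusStateInitialising. Reading that value while a later state
	 * is saved locally means the nodes were rewritten behind our back.
	 */
	return read_state == XenbusStateInitialising &&
	       v->saved_state != XenbusStateInitialising;
}
```

Note that with this rule a device whose saved state is still
XenbusStateInitialising cannot be told apart from its replacement by the
state alone, which is why the sketch keeps the backend-id comparison; if the
new backend also has the same domid, the sketch reports no replacement.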