netdev - Re: xen-netback hotplug-status regression bug

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <58ccc3b7-9ccb-b9bf-84e7-4a023ccb5c56@ipxe.org>
Date:   Tue, 13 Apr 2021 11:48:21 +0100
From:   Michael Brown <mcb30@...e.org>
To:     paul@....org, Wei Liu <wei.liu@...nel.org>,
        xen-devel@...ts.xenproject.org, netdev@...r.kernel.org,
        Paul Durrant <pdurrant@...zon.com>
Subject: Re: xen-netback hotplug-status regression bug

On 13/04/2021 08:12, Paul Durrant wrote:
>> If the frontend subsequently disconnects and reconnects (e.g. 
>> transitions through Closed->Initialising->Connected) then:
>>
>> - Nothing recreates "hotplug-status"
>>
>> - When the frontend re-enters Connected state, connect() sets up a 
>> watch on "hotplug-status" again
>>
>> - The callback hotplug_status_changed() is never triggered, and so the 
>> backend device never transitions to Connected state.
> 
> That's not how I read it. Given that "hotplug-status" is removed by the 
> call to hotplug_status_changed() then the next call to connect() should 
> fail to register the watch and 'have_hotplug_status_watch' should be 0. 
> Thus backend_switch_state() should not defer the transition to 
> XenbusStateConnected in any subsequent interaction with the frontend.

Thank you for the reply.  I've tested and confirmed my initial 
hypothesis: the call to xenbus_watch_pathfmt() succeeds even if the node 
does not exist.

I confirmed this with ftrace using:

   cd /sys/kernel/debug/tracing
   echo function_graph > current_tracer
   echo set_backend_state > set_ftrace_filter
   echo xenbus_watch_pathfmt >> set_ftrace_filter
   echo register_xenbus_watch >> set_ftrace_filter
   echo xenbus_dev_fatal >> set_ftrace_filter

On the second time that the frontend transitions to Connected, this 
produced the trace:

   set_backend_state [xen_netback]() {
     register_xenbus_watch();
     register_xenbus_watch();
     xenbus_watch_pathfmt() {
       register_xenbus_watch();
     }
   }

which seems to confirm that the error path in xenbus_watch_path() is 
*not* taken, i.e. that the call to register_xenbus_watch() succeeded 
even though the node did not exist.


Other observations also seem to confirm this behaviour:

- Running "xenstore ls" in dom0 confirms that on the second frontend 
transition to Connected, the frontend state is indeed Connected (4) but 
the backend state remains in InitWait (2)

- Running "xenstore watch 
/local/domain/0/backend/vif/<domU>/0/hotplug-status" *before* starting 
the domU confirms that it is possible to create a watch on a node that 
does not (yet) exist, and that the watch *is* notified when the node is 
later created.

> Are you seeing the watch successfully re-registered even though the node 
> does not exist? Perhaps there has been a change in xenstore behaviour?

So, the TL;DR is that yes, the watch does successfully register even 
though the node does not exist.

 From a quick look through the xenstored source, it looks as though the 
only check on the node name is the call to is_valid_nodename(), which 
seems to perform a syntactic validity check only.  I can't immediately 
find any commit that would have changed this behaviour.

Thanks,

Michael