linux-kernel - Re: [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <57960530.7090608@oracle.com>
Date:	Mon, 25 Jul 2016 20:25:20 +0800
From:	Bob Liu <bob.liu@...cle.com>
To:	Roger Pau Monné <roger.pau@...rix.com>
CC:	linux-kernel@...r.kernel.org, xen-devel@...ts.xenproject.org,
	konrad.wilk@...cle.com, jgross@...e.com
Subject: Re: [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources


On 07/25/2016 08:11 PM, Roger Pau Monné wrote:
> On Mon, Jul 25, 2016 at 07:08:36PM +0800, Bob Liu wrote:
>>
>> On 07/25/2016 06:53 PM, Roger Pau Monné wrote:
>> ..[snip..]
>>>>>>  * We get the device lock and blk_mq_freeze_queue() in dynamic_reconfig_device(),
>>>>>>    and then have to release in blkif_recover() asynchronously which makes the code more difficult to readable.
>>>>>
>>>>> I'm not sure I'm following here, do you mean that you will pick the lock in 
>>>>> dynamic_reconfig_device and release it in blkif_recover? Why wouldn't you 
>>>>
>>>> Yes.
>>>>
>>>>> release the lock in dynamic_reconfig_device after doing whatever is needed?
>>>>>
>>>>
>>>> Both 'dynamic configuration' and migration:xenbus_dev_resume() use { blkfront_resume(); blkif_recover() } and depends on the change of xbdev->state.
>>>> If they happen simultaneously, the State machine of xbdev->state is going to be a mess and very difficult to track.
>>>>
>>>> The lock(actually a mutex) is like a big lock to make sure no race would happen at all.
>>>
>>> So the only thing that you should do is set the frontend state to closed and 
>>> wait for the backend to also switch to state closed, and then switch the
>>> frontend state to init again in order to trigger a reconnection.
>>>
>>
>> But if migration:xenbus_dev_resume() is triggered at the same time, any state be set might be changed.
>> =====
>> E.g
>> Dynamic_reconfig_device:                                Migration:xenbus_dev_resume()
>>
>> Set the frontend state to closed       
>>                  
>> 							^^^^
>> 							frontend state will be reset to XenbusStateInitialising by xenbus_dev_resume()
>>
>> Wait for the backend to also switch to state closed
> 
> This is not really a race, the backend will not switch to state closed, and 
> instead will appear at state init again, which is what we wanted anyway in 
> order to reconnect, so I don't see an issue with this flow.
> 
>> =====
>> Similar situation may happen at any other place regarding set the state.
> 
> Right, but I don't see how your proposed patch also avoids this issues. I 
> see that you pick the device lock in dynamic_reconfig_device, but I don't 
> see xenbus_dev_resume picking the lock at all, and in any case I don't think 

The lock is picked from the power management framework.

Anyway, I'm convinced and will follow your suggestion.
Thank you!

> you should prevent the VM from migrating.
> 
> Depending on the toolstack it might decide to kill the VM because it has not 
> been able to migrate it, in which case the result is not better.
>