Message-ID: <f9a601af-4413-ed1d-f7f4-89343118a2f1@suse.com>
Date:   Tue, 17 Dec 2019 18:10:19 +0100
From:   Jürgen Groß <jgross@...e.com>
To:     SeongJae Park <sjpark@...zon.com>
Cc:     axboe@...nel.dk, konrad.wilk@...cle.com, roger.pau@...rix.com,
        linux-block@...r.kernel.org, pdurrant@...zon.com,
        SeongJae Park <sjpark@...zon.de>, linux-kernel@...r.kernel.org,
        sj38.park@...il.com, xen-devel@...ts.xenproject.org
Subject: Re: [Xen-devel] [PATCH v11 2/6] xenbus/backend: Protect xenbus
 callback with lock

On 17.12.19 17:24, SeongJae Park wrote:
> On Tue, 17 Dec 2019 17:13:42 +0100 "Jürgen Groß" <jgross@...e.com> wrote:
> 
>> On 17.12.19 17:07, SeongJae Park wrote:
>>> From: SeongJae Park <sjpark@...zon.de>
>>>
>>> 'reclaim_memory' callback can race with a driver code as this callback
>>> will be called from any memory pressure detected context.  To deal with
>>> the case, this commit adds a spinlock in the 'xenbus_device'.  Whenever
>>> 'reclaim_memory' callback is called, the lock of the device which passed
>>> to the callback as its argument is locked.  Thus, drivers registering
>>> their 'reclaim_memory' callback should protect the data that might race
>>> with the callback with the lock by themselves.
>>>
>>> Signed-off-by: SeongJae Park <sjpark@...zon.de>
>>> ---
>>>    drivers/xen/xenbus/xenbus_probe.c         |  1 +
>>>    drivers/xen/xenbus/xenbus_probe_backend.c | 10 ++++++++--
>>>    include/xen/xenbus.h                      |  2 ++
>>>    3 files changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>>> index 5b471889d723..b86393f172e6 100644
>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>> @@ -472,6 +472,7 @@ int xenbus_probe_node(struct xen_bus_type *bus,
>>>    		goto fail;
>>>    
>>>    	dev_set_name(&xendev->dev, "%s", devname);
>>> +	spin_lock_init(&xendev->reclaim_lock);
>>>    
>>>    	/* Register with generic device framework. */
>>>    	err = device_register(&xendev->dev);
>>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>>> index 7e78ebef7c54..516aa64b9967 100644
>>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>>> @@ -251,12 +251,18 @@ static int backend_probe_and_watch(struct notifier_block *notifier,
>>>    static int backend_reclaim_memory(struct device *dev, void *data)
>>>    {
>>>    	const struct xenbus_driver *drv;
>>> +	struct xenbus_device *xdev;
>>> +	unsigned long flags;
>>>    
>>>    	if (!dev->driver)
>>>    		return 0;
>>>    	drv = to_xenbus_driver(dev->driver);
>>> -	if (drv && drv->reclaim_memory)
>>> -		drv->reclaim_memory(to_xenbus_device(dev));
>>> +	if (drv && drv->reclaim_memory) {
>>> +		xdev = to_xenbus_device(dev);
>>> +		spin_trylock_irqsave(&xdev->reclaim_lock, flags);
>>
>> You need spin_lock_irqsave() here. Or maybe spin_lock() would be fine,
>> too? I can't see a reason why you'd want to disable irqs here.
> 
> I needed to disable irq here as this is called from the memory shrinker context.

Okay.

> 
> Also, used 'trylock' because the 'probe()' and 'remove()' code of the driver
> might include memory allocation.  And the xen-blkback actually does.  If the
> allocation shows a memory pressure during the allocation, it will trigger this
> shrinker callback again and then deadlock.

In that case you need to either return when you didn't get the lock or

- when obtaining the lock during probe() and remove() set a variable
   containing the current cpu number
- and reset that to e.g. NR_CPUS before releasing the lock again
- in the shrinker callback do trylock, and if you didn't get the lock
   test whether the cpu-variable above is set to your current cpu and
   continue only if yes; if not, redo the trylock
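
The owner-tracking scheme in the list above can be sketched in user space
roughly as follows. This is only an illustration under stated assumptions,
not kernel code and not the patch's implementation: 'struct reclaim_guard',
'NO_OWNER' (standing in for NR_CPUS), the caller-supplied 'self' id (standing
in for the current cpu number) and all function names are hypothetical, and
C11 atomics replace the kernel spinlock. What the callback should do once it
detects recursion is not spelled out above, so the sketch only reports that
case to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)	/* hypothetical stand-in for NR_CPUS */

/* Hypothetical lock plus "who holds it" marker. */
struct reclaim_guard {
	atomic_bool locked;
	atomic_int owner;
};

enum guard_result { GUARD_ACQUIRED, GUARD_RECURSION };

static void guard_init(struct reclaim_guard *g)
{
	atomic_init(&g->locked, false);
	atomic_init(&g->owner, NO_OWNER);
}

/* One lock attempt; true means the caller now holds the lock. */
static bool guard_trylock(struct reclaim_guard *g)
{
	return !atomic_exchange_explicit(&g->locked, true,
					 memory_order_acquire);
}

/* probe()/remove() side: take the lock, then record the owner. */
static void guard_lock(struct reclaim_guard *g, int self)
{
	assert(self != NO_OWNER);
	while (!guard_trylock(g))
		;	/* spin until the lock is free */
	atomic_store(&g->owner, self);
}

/* probe()/remove() side: reset the owner before releasing the lock. */
static void guard_unlock(struct reclaim_guard *g)
{
	atomic_store(&g->owner, NO_OWNER);
	atomic_store_explicit(&g->locked, false, memory_order_release);
}

/*
 * Shrinker side: trylock; on failure, stop retrying only if the lock
 * is held by ourselves (recursion), otherwise redo the trylock.
 */
static enum guard_result guard_enter_reclaim(struct reclaim_guard *g,
					     int self)
{
	for (;;) {
		if (guard_trylock(g))
			return GUARD_ACQUIRED;
		if (atomic_load(&g->owner) == self)
			return GUARD_RECURSION;
	}
}

/* Shrinker side: release after a GUARD_ACQUIRED result only. */
static void guard_exit_reclaim(struct reclaim_guard *g)
{
	atomic_store_explicit(&g->locked, false, memory_order_release);
}
```

The simpler alternative mentioned first (return when the lock was not
obtained) corresponds to calling 'guard_trylock()' once and skipping the
device when it returns false.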


Juergen
