lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 1 Mar 2017 07:52:18 -0800
From:   Dan Williams <dan.j.williams@...el.com>
To:     Heiko Carstens <heiko.carstens@...ibm.com>
Cc:     Michal Hocko <mhocko@...nel.org>,
        Sebastian Ott <sebott@...ux.vnet.ibm.com>,
        Linux MM <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Ben Hutchings <ben@...adent.org.uk>
Subject: Re: [PATCH] mm, add_memory_resource: hold device_hotplug lock over
 mem_hotplug_{begin, done}

On Wed, Mar 1, 2017 at 4:51 AM, Heiko Carstens
<heiko.carstens@...ibm.com> wrote:
> On Tue, Feb 28, 2017 at 12:57:29PM +0100, Heiko Carstens wrote:
>> On Mon, Feb 27, 2017 at 05:20:31PM +0100, Michal Hocko wrote:
>> > [CC Rafael]
>> >
>> > I've got lost in the acpi indirection (again). I can see
>> > acpi_device_hotplug calling lock_device_hotplug() but i cannot find a
>> > path down to add_memory() which might call add_memory_resource. But the
>> > patch below sounds suspicious to me. Is it possible that this could lead
>> > to a deadlock. I would suspect that it is the s390 code which needs to
>> > do the locking. But I would have to double check - it is really easy to
>> > get lost there.
>>
>> To me it rather looks like bfc8c90139eb ("mem-hotplug: implement
>> get/put_online_mems") introduced quite subtle and probably wrong locking
>> rules.
>>
>> The patch introduced mem_hotplug_begin() in order to have something like
>> cpu_hotplug_begin() for memory. Note that for cpu hotplug all
>> cpu_hotplug_begin() calls are serialized by cpu_maps_update_begin().
>>
>> Especially this makes sure that active_writer can only be changed by one
>> process. (See also Dan's commit which introduced the lock_device_hotplug()
>> calls: https://marc.info/?l=linux-kernel&m=148693912419972&w=2 )
>>
>> If you look at the above commit bfc8c90139eb: there is nothing like
>> cpu_maps_update_begin() for memory. And therefore it's possible to have
>> concurrent writers to active_writer.
>>
>> It looks like now lock_device_hotplug() is supposed to be the new
>> cpu_maps_update_begin() for memory. But.. this looks like a mess, unless I
>> read the code completely wrong ;)
>
> [Full quote since I now hopefully use a non-bouncing email address from
> Vladimir]
>
> Since it is anything but obvious why Dan wrote in changelog of b5d24fda9c3d
> ("mm, devm_memremap_pages: hold device_hotplug lock over
> mem_hotplug_{begin, done}") that write accesses to
> mem_hotplug.active_writer are coordinated via lock_device_hotplug() I'd
> rather propose a new private memory_add_remove_lock which has similar
> semantics like the cpu_add_remove_lock for cpu hotplug (see patch below).
>
> However instead of sprinkling locking/unlocking of that new lock around all
> calls of mem_hotplug_begin() and mem_hotplug_end() simply include locking
> and unlocking into these two functions.
>
> This still allows get_online_mems() and put_online_mems() to work, while at
> the same time preventing mem_hotplug.active_writer corruption.
>
> Any opinions?

Sorry, yes, I didn't make it clear that I derived that locking
requirement from store_mem_state() and its usage of
lock_device_hotplug_sysfs().

That routine is trying very hard not trip the soft-lockup detector. It
seems like that wants to be an interruptible wait.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ