Message-ID: <4fa0a559-dd5a-8405-0533-37cfe6973eeb@redhat.com>
Date: Sat, 11 Jan 2020 15:52:13 +0100
From: David Hildenbrand <david@...hat.com>
To: Qian Cai <cai@....pw>
Cc: Dan Williams <dan.j.williams@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
stable <stable@...r.kernel.org>,
Vishal Verma <vishal.l.verma@...el.com>,
Pavel Tatashin <pasha.tatashin@...een.com>,
Michal Hocko <mhocko@...e.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Linux MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Greg KH <gregkh@...uxfoundation.org>
Subject: Re: [PATCH v4] mm/memory_hotplug: Fix remove_memory() lockdep splat
On 11.01.20 15:25, David Hildenbrand wrote:
>
>
>> On 11.01.2020 at 14:56, Qian Cai <cai@....pw> wrote:
>>
>>
>>
>>> On Jan 11, 2020, at 6:03 AM, David Hildenbrand <david@...hat.com> wrote:
>>>
>>> So I just remembered why I think this (and the one previously reported
>>> for ACPI DIMMs) are false positives. The actual locking order is
>>>
>>> onlining/offlining from user space:
>>>
>>> kn->count -> device_hotplug_lock -> cpu_hotplug_lock -> mem_hotplug_lock
>>>
>>> memory removal:
>>>
>>> device_hotplug_lock -> cpu_hotplug_lock -> mem_hotplug_lock -> kn->count
>>>
>>>
>>> This looks like a locking inversion - but it's not. Whenever we come via
>>> user space we do a mutex_trylock(), which resolves this issue by backing
>>> off. The device_hotplug_lock will prevent
>>>
>>> I have no clue why the device_hotplug_lock does not pop up in the
>>> lockdep report here. Sounds wrong to me.
>>>
>>> I think this is a false positive and not stable material.
>>
>> The point is that there are other paths that take kn->count -> cpu_hotplug_lock
>> without needing device_hotplug_lock, and these can race with memory removal.
>>
>> kmem_cache_shrink_all+0x50/0x100 (cpu_hotplug_lock.rw_sem/mem_hotplug_lock.rw_sem)
>> shrink_store+0x34/0x60
>> slab_attr_store+0x6c/0x170
>> sysfs_kf_write+0x70/0xb0
>> kernfs_fop_write+0x11c/0x270 (kn->count)
>> __vfs_write+0x3c/0x70
>> vfs_write+0xcc/0x200
>> ksys_write+0x7c/0x140
>> system_call+0x5c/0x6
>>
>
> But not the lock of the memory devices, or am I missing something?
>
To clarify:

Memory unplug will remove e.g., /sys/devices/system/memory/memoryX/,
which has a dedicated kn->count AFAIK.

If you do an "echo 1 > /sys/kernel/slab/X/shrink", you would not take the
kn->count of /sys/devices/system/memory/memoryX/, but the one of some
slab sysfs file.

The only scenario I could see is if remove_memory_block_devices() does
not only remove /sys/devices/system/memory/memoryX/, but also implicitly
e.g., /sys/kernel/slab/X/. If that is the case, then this is indeed not
a false positive, but something rather hard to trigger (which would
still qualify as stable material).
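
For illustration only, here is a rough user-space sketch of the trylock
back-off mentioned in the quoted text. It is not kernel code:
hotplug_lock, struct node and sysfs_store() are made-up stand-ins for
device_hotplug_lock, a kernfs node with its kn->count, and a sysfs store
handler, and the plain int counter assumes a single-threaded demo.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for device_hotplug_lock. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for a kernfs node; 'active' models kn->count. */
struct node {
	int active;	/* not atomic: good enough for a single-threaded sketch */
};

/*
 * Models a sysfs store handler: the active reference (kn->count) is
 * already held when the hotplug lock is wanted, i.e. the reverse of the
 * removal order. Using trylock means this path never sleeps on the
 * hotplug lock while pinning the node; on contention it backs off.
 */
static int sysfs_store(struct node *kn)
{
	int ret = 0;

	kn->active++;				/* take kn->count */
	if (pthread_mutex_trylock(&hotplug_lock) != 0) {
		ret = -EAGAIN;			/* contended: back off, caller retries */
	} else {
		/* online/offline work would happen here */
		pthread_mutex_unlock(&hotplug_lock);
	}
	kn->active--;				/* drop kn->count on every path */
	return ret;
}
```

If hotplug_lock is held elsewhere, sysfs_store() returns -EAGAIN with the
count back at zero, so a remover that holds the lock and waits for the
count to drain is not blocked by this path.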
--
Thanks,
David / dhildenb