linux-kernel - Re: Memory hotplug softlock issue

Message-ID: <20181114100058.GK23419@dhcp22.suse.cz>
Date:   Wed, 14 Nov 2018 11:00:58 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Baoquan He <bhe@...hat.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        akpm@...ux-foundation.org, aarcange@...hat.com, david@...hat.com,
        Vladimir Davydov <vdavydov.dev@...il.com>
Subject: Re: Memory hotplug softlock issue

[Cc Vladimir]

On Wed 14-11-18 15:09:09, Baoquan He wrote:
> Hi,
> 
> I tested memory hotplug on a bare metal system, and hot removing always
> triggers a soft lockup. It usually takes several hot plug/unplug cycles, and
> then the hot remove hangs at the last block. This is with memory pressure
> added by running "stress -m 200".
> 
> I will attach part of the log. Any ideas or suggestions are appreciated.
> 
[...]
> [  +0.007169]       Not tainted 4.20.0-rc2+ #4
> [  +0.004630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  +0.008001] kworker/181:1   D    0  1187      2 0x80000000
> [  +0.005711] Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
> [  +0.006467] Call Trace:
> [  +0.002591]  ? __schedule+0x24e/0x880
> [  +0.004995]  schedule+0x28/0x80
> [  +0.003380]  rwsem_down_read_failed+0x103/0x190
> [  +0.006528]  call_rwsem_down_read_failed+0x14/0x30
> [  +0.004937]  __percpu_down_read+0x4f/0x80
> [  +0.004204]  get_online_mems+0x2d/0x30
> [  +0.003871]  memcg_create_kmem_cache+0x1b/0x120
> [  +0.004740]  memcg_kmem_cache_create_func+0x1b/0x60
> [  +0.004986]  process_one_work+0x1a1/0x3a0
> [  +0.004255]  worker_thread+0x30/0x380
> [  +0.003764]  ? drain_workqueue+0x120/0x120
> [  +0.004238]  kthread+0x112/0x130
> [  +0.003320]  ? kthread_park+0x80/0x80
> [  +0.003796]  ret_from_fork+0x35/0x40

For quick context: we hold the exclusive mem hotplug lock
throughout the whole offlining, and that can take quite some time.
So I am wondering whether we absolutely have to take the shared lock
in this path (introduced by 03afc0e25f7f ("slab: get_online_mems for
kmem_cache_{create,destroy,shrink}")). Is there any way to relax this
requirement? E.g. nodes stay around even when they are completely
offline. Does that help?
-- 
Michal Hocko
SUSE Labs