linux-kernel - Re: [PATCH] zsmalloc: use workqueue to destroy pool in zpool callback

Message-ID: <CALZtONCDqBjL9TFmUEwuHaNU3n55k0VwbYWqW-9dODuNWyzkLQ@mail.gmail.com>
Date:	Thu, 31 Mar 2016 18:05:35 -0400
From:	Dan Streetman <ddstreet@...e.org>
To:	Yu Zhao <yuzhao@...gle.com>
Cc:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
	Minchan Kim <minchan@...nel.org>,
	Nitin Gupta <ngupta@...are.org>, Linux-MM <linux-mm@...ck.org>,
	Seth Jennings <sjenning@...hat.com>,
	Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] zsmalloc: use workqueue to destroy pool in zpool callback

On Thu, Mar 31, 2016 at 5:46 PM, Yu Zhao <yuzhao@...gle.com> wrote:
> On Thu, Mar 31, 2016 at 05:46:39PM +0900, Sergey Senozhatsky wrote:
>> On (03/30/16 08:59), Minchan Kim wrote:
>> > On Tue, Mar 29, 2016 at 03:02:57PM -0700, Yu Zhao wrote:
>> > > zs_destroy_pool() might sleep so it shouldn't be used in zpool
>> > > destroy callback which can be invoked in softirq context when
>> > > zsmalloc is configured to work with zswap.
>> >
>> > I think it's a limitation of zswap design, not zsmalloc.
>> > Could you handle it in zswap?
>>
>> agree. hm, looking at this backtrace
>>
>> >   [<ffffffffaea0224b>] mutex_lock+0x1b/0x2f
>> >   [<ffffffffaebca4f0>] kmem_cache_destroy+0x50/0x130
>> >   [<ffffffffaec10405>] zs_destroy_pool+0x85/0xe0
>> >   [<ffffffffaec1046e>] zs_zpool_destroy+0xe/0x10
>> >   [<ffffffffaec101a4>] zpool_destroy_pool+0x54/0x70
>> >   [<ffffffffaebedac2>] __zswap_pool_release+0x62/0x90
>> >   [<ffffffffaeb1037e>] rcu_process_callbacks+0x22e/0x640
>> >   [<ffffffffaeb15a3e>] ? run_timer_softirq+0x3e/0x280
>> >   [<ffffffffaeabe13b>] __do_softirq+0xcb/0x250
>> >   [<ffffffffaeabe4dc>] irq_exit+0x9c/0xb0
>> >   [<ffffffffaea03e7a>] smp_apic_timer_interrupt+0x6a/0x80
>> >   [<ffffffffaf0a394f>] apic_timer_interrupt+0x7f/0x90
>>
>> it also can hit the following path
>>
>>       rcu_process_callbacks()
>>               __zswap_pool_release()
>>                       zswap_pool_destroy()
>>                               zswap_cpu_comp_destroy()
>>                                       cpu_notifier_register_begin()
>>                                               mutex_lock(&cpu_add_remove_lock);  <<<
>>
>> can't it?
>>
>>       -ss
>
> Thanks, Sergey. Now I'm convinced the problem should be fixed in
> zswap. Since the RCU callback is already executed asynchronously,
> using a workqueue to defer it further doesn't seem to introduce
> any additional race condition, at least.

It certainly seems appropriate to fix this in zswap. I'll work on a patch
unless Seth or anyone else is already working on one.
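
For illustration only, here is a minimal userspace sketch of the deferral discussed above. It is not a proposed zswap patch and uses no real kernel API. Every name in it (in_softirq_ctx, might_sleep_model, schedule_work_model, run_workqueue_model, struct pool and the two rcu_cb_* helpers) is hypothetical. It only models the idea that a callback running in softirq context should queue the sleeping teardown instead of performing it directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these are the real kernel APIs. */
static bool in_softirq_ctx;   /* true while a simulated RCU callback runs */
static int sleep_violations;  /* times a sleeping call ran in that context */

struct work {
	void (*fn)(struct work *w);
	struct work *next;
};

/* Simple FIFO of pending work items. */
static struct work *pending_head;
static struct work **pending_tail = &pending_head;

/* Stand-in for a might_sleep()-style check: record misuse, don't abort. */
static void might_sleep_model(void)
{
	if (in_softirq_ctx)
		sleep_violations++;
}

/* Queueing never sleeps, so it is safe from the simulated softirq. */
static void schedule_work_model(struct work *w)
{
	assert(w->fn != NULL);
	w->next = NULL;
	*pending_tail = w;
	pending_tail = &w->next;
}

/* Simulated worker thread: runs queued items in process context. */
static void run_workqueue_model(void)
{
	in_softirq_ctx = false;
	while (pending_head) {
		struct work *w = pending_head;

		pending_head = w->next;
		if (!pending_head)
			pending_tail = &pending_head;
		w->fn(w);
	}
}

/* Hypothetical pool whose teardown may sleep (e.g. takes a mutex). */
struct pool {
	struct work release_work;
	bool destroyed;
};

static void pool_destroy(struct pool *p)
{
	might_sleep_model();
	p->destroyed = true;
}

static void pool_release_workfn(struct work *w)
{
	/* Recover the enclosing pool from its embedded work item. */
	struct pool *p = (struct pool *)((char *)w -
					 offsetof(struct pool, release_work));

	pool_destroy(p);
}

/* Problematic shape: the sleeping teardown runs inside the callback. */
static void rcu_cb_direct(struct pool *p)
{
	in_softirq_ctx = true;
	pool_destroy(p);
	in_softirq_ctx = false;
}

/* Deferred shape: the callback only queues the teardown. */
static void rcu_cb_deferred(struct pool *p)
{
	in_softirq_ctx = true;
	p->release_work.fn = pool_release_workfn;
	schedule_work_model(&p->release_work);
	in_softirq_ctx = false;
}
```

In this model, rcu_cb_direct() trips the sleep check, while rcu_cb_deferred() leaves the pool intact until run_workqueue_model() destroys it from the simulated process context. Real kernel code would additionally have to deal with locking, reference counting and pool lifetime, none of which is modelled here.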