Message-ID: <CALZtONCDqBjL9TFmUEwuHaNU3n55k0VwbYWqW-9dODuNWyzkLQ@mail.gmail.com>
Date: Thu, 31 Mar 2016 18:05:35 -0400
From: Dan Streetman <ddstreet@...e.org>
To: Yu Zhao <yuzhao@...gle.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
Minchan Kim <minchan@...nel.org>,
Nitin Gupta <ngupta@...are.org>, Linux-MM <linux-mm@...ck.org>,
Seth Jennings <sjenning@...hat.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] zsmalloc: use workqueue to destroy pool in zpool callback
On Thu, Mar 31, 2016 at 5:46 PM, Yu Zhao <yuzhao@...gle.com> wrote:
> On Thu, Mar 31, 2016 at 05:46:39PM +0900, Sergey Senozhatsky wrote:
>> On (03/30/16 08:59), Minchan Kim wrote:
>> > On Tue, Mar 29, 2016 at 03:02:57PM -0700, Yu Zhao wrote:
>> > > zs_destroy_pool() might sleep so it shouldn't be used in zpool
>> > > destroy callback which can be invoked in softirq context when
>> > > zsmalloc is configured to work with zswap.
>> >
>> > I think it's a limitation of zswap design, not zsmalloc.
>> > Could you handle it in zswap?
>>
>> agree. hm, looking at this backtrace
>>
>> > [<ffffffffaea0224b>] mutex_lock+0x1b/0x2f
>> > [<ffffffffaebca4f0>] kmem_cache_destroy+0x50/0x130
>> > [<ffffffffaec10405>] zs_destroy_pool+0x85/0xe0
>> > [<ffffffffaec1046e>] zs_zpool_destroy+0xe/0x10
>> > [<ffffffffaec101a4>] zpool_destroy_pool+0x54/0x70
>> > [<ffffffffaebedac2>] __zswap_pool_release+0x62/0x90
>> > [<ffffffffaeb1037e>] rcu_process_callbacks+0x22e/0x640
>> > [<ffffffffaeb15a3e>] ? run_timer_softirq+0x3e/0x280
>> > [<ffffffffaeabe13b>] __do_softirq+0xcb/0x250
>> > [<ffffffffaeabe4dc>] irq_exit+0x9c/0xb0
>> > [<ffffffffaea03e7a>] smp_apic_timer_interrupt+0x6a/0x80
>> > [<ffffffffaf0a394f>] apic_timer_interrupt+0x7f/0x90
>>
>> it also can hit the following path
>>
>> rcu_process_callbacks()
>>   __zswap_pool_release()
>>     zswap_pool_destroy()
>>       zswap_cpu_comp_destroy()
>>         cpu_notifier_register_begin()
>>           mutex_lock(&cpu_add_remove_lock); <<<
>>
>> can't it?
>>
>> -ss
>
> Thanks, Sergey. Now I'm convinced the problem should be fixed in
> zswap. Since the rcu callback is already executed asynchronously,
> using a workqueue to defer the callback further doesn't seem to
> introduce any additional race conditions, at least.

It certainly seems appropriate to fix this in zswap. I'll work on a patch
unless Seth or anyone else is already working on it.
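
To illustrate the deferral idea discussed above, here is a minimal userspace
sketch in plain C. It is only a model, not kernel code and not the eventual
zswap patch: every name in it (in_atomic_ctx, struct pool, pool_release_cb,
run_deferred_work, and so on) is hypothetical. In the kernel, the roles would
roughly be played by a call_rcu() callback, which runs in softirq context and
must not sleep, and by a work item queued with schedule_work(), which runs in
process context and may sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel APIs. */

/* Stands in for "currently running in softirq (atomic) context". */
static bool in_atomic_ctx;

struct pool {
    bool destroyed;
    struct pool *next_deferred;   /* link in the deferred-work list */
};

/* Stands in for a workqueue: a simple list of pools awaiting destruction. */
static struct pool *deferred_head;

/*
 * Models a destroy routine that may sleep (for example, one that takes a
 * mutex). Returns -1 when called from a context where sleeping is not
 * allowed, which is the situation shown in the quoted backtrace.
 */
static int pool_destroy_may_sleep(struct pool *p)
{
    if (in_atomic_ctx)
        return -1;
    p->destroyed = true;
    return 0;
}

/*
 * Models the release callback invoked from atomic context. It never
 * sleeps: it only queues the pool for later destruction.
 */
static void pool_release_cb(struct pool *p)
{
    p->next_deferred = deferred_head;
    deferred_head = p;
}

/*
 * Models the worker running later in a context that may sleep. It drains
 * the queue and returns the number of pools actually destroyed.
 */
static int run_deferred_work(void)
{
    int destroyed = 0;

    assert(!in_atomic_ctx);
    while (deferred_head) {
        struct pool *p = deferred_head;

        deferred_head = p->next_deferred;
        p->next_deferred = NULL;
        if (pool_destroy_may_sleep(p) == 0)
            destroyed++;
    }
    return destroyed;
}
```

In this model, calling pool_destroy_may_sleep() while in_atomic_ctx is set
fails, whereas calling pool_release_cb() in the same state and then
run_deferred_work() once in_atomic_ctx is cleared destroys the pool. The
real code would still need to make sure the pool cannot be freed or reused
before the deferred work has run.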