[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3ce1940c-f79e-4cb9-8f5f-d457ee0daa03@I-love.SAKURA.ne.jp>
Date: Sat, 28 Jun 2025 20:03:11 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: Jens Axboe <axboe@...nel.dk>, Yu Kuai <yukuai3@...wei.com>,
Christoph Hellwig <hch@....de>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] brd: fix sleeping memory allocation in brd_insert_page()
On 2025/06/28 18:39, Tetsuo Handa wrote:
> On 2025/06/28 17:36, Tetsuo Handa wrote:
>> syzbot is reporting that brd_insert_page() is calling
>> __xa_cmpxchg(__GFP_DIRECT_RECLAIM) with spinlock and RCU lock held.
>
> Hmm. Holding spinlock itself is OK because xa_lock() is a requirement.
>
>> Change __xa_cmpxchg() to use GFP_NOWAIT | __GFP_NOWARN, for it is likely
>> that __xa_cmpxchg() succeeds because of preceding alloc_page().
>
> Since this gfp flag is for allocating index array, it should use
> __GFP_DIRECT_RECLAIM if possible. Then, deferring RCU lock if possible
> makes sense. Then, I wonder what this RCU lock is protecting...
>
OK. I assume that the "concurrent discard" in
https://lkml.kernel.org/20250628011459.832760-1-yukuai1@huaweicloud.com means
brd_do_discard().
Calling rcu_read_lock() from brd_insert_page() before xa_unlock() is called prevents
__free_page() from brd_free_one_page() from call_rcu() from brd_do_discard(), even if
the page allocated by alloc_page() and stored into brd->brd_pages by __xa_cmpxchg() is
removed by __xa_erase() before brd_rw_bvec() calls memcpy_{to,from}_page()/memset();
allowing brd_rw_bvec() to continue using the page returned by brd_insert_page().
I came to worry one possibility about the above expectation, for I don't know
details of xarray.
__xa_cmpxchg() calls __xa_cmpxchg_raw() with xa_lock already held.
__xa_cmpxchg_raw() always calls __xas_nomem() with xa_lock already held.
__xas_nomem() might temporarily release xa_lock for allocating memory if
__GFP_DIRECT_RECLAIM is specified.
__xa_cmpxchg_raw() might store "entry" at xas_store() before calling __xas_nomem().
Then, is there a possibility that __xas_nomem() temporarily releases xa_lock for
allocating memory after__xa_cmpxchg_raw() already called xas_store() ?
Unless there is a guarantee that __xas_nomem() never releases xa_lock if
__xa_cmpxchg_raw() called xa_store(), there will be a race window that
the page allocated by alloc_page() and stored into brd->brd_pages by __xa_cmpxchg() is
removed by __xa_erase() from brd_do_discard() and __free_page() from brd_free_one_page()
from call_rcu() from brd_do_discard() is fired before brd_insert_page() calls rcu_lock()
immediately after returning from __xa_cmpxchg().
Also, what serializes concurrent brd_insert_page(), for when __xas_nomem() temporarily
released xa_lock for allocating memory, two threads might concurrently call
kmem_cache_alloc_lru() from __xas_nomem() ?
Powered by blists - more mailing lists