linux-kernel - Re: [PATCH] bcachefs: WARNING in bch2_trans_srcu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <e3ef2cea-4c39-b28f-d349-04e0ce8bb16f@macbook-pro-de-camila.local>
Date: Tue, 6 Aug 2024 23:04:39 -0400 (-04)
From: Camila Alvarez Inostroza <cam.alvarez.i@...il.com>
To: Kent Overstreet <kent.overstreet@...ux.dev>
cc: Camila Alvarez <cam.alvarez.i@...il.com>, 
    Brian Foster <bfoster@...hat.com>, linux-bcachefs@...r.kernel.org, 
    linux-kernel@...r.kernel.org, 
    syzbot+1e515cab343dbe5aa38a@...kaller.appspotmail.com
Subject: Re: [PATCH] bcachefs: WARNING in bch2_trans_srcu_unlock



On Fri, 2 Aug 2024, Kent Overstreet wrote:

> On Tue, Jul 23, 2024 at 09:08:00PM GMT, Camila Alvarez wrote:
>> bch2_extent_fallocate should not return -BCH_ERR_transaction_restart_nested when there are no buckets available.
>> Returning -BCH_ERR_transaction_restart_nested causes the sector allocation to retry even if there's no space available.
>
> It seems like you're saying this fixes a livelock, but I don't see that:
> if we return operation_blocked, the closure_sync we do before we return
> will still wait.
>
> More explanation?
>
Yes, sorry for the delay. The closure_sync will still wait, but even after 
waiting there's no space left.
I think the issue is that we're trying to allocate a bucket when none are 
available:
closure_wait(&c->freelist_wait, cl);

journal_write_done is called at some point:
bch2_do_discard(c)
closure_wake_up(&c->freelist_wait)
...

That apparently doesn't free any buckets (should we add some sort of 
assertion to notify us when this happens?), so there is still no space 
left, so bch2_bucket_alloc_trans keeps returning -BCH_ERR_freelist_empty.
With this bch2_alloc_sectors_start_trans return 
-BCH_err_bucket_alloc_blocked, which bch2_extent_fallocate turns into 
-BCH_ERR_transaction_restart_nested.

Finally __bchfs_fallocate sees the error as BCH_ERR_transaction_restart, 
so it tries again.
The issue is that the journal never discards any buckets so it just keeps 
retrying indefinitely.

I've been thinking, and maybe we don't want to just keep __bchfs_fallocate 
from retrying, but would prefer to control how many times it does.

Does all of this make any sense to you?


>>
>> Reported-by: syzbot+1e515cab343dbe5aa38a@...kaller.appspotmail.com
>> Signed-off-by: Camila Alvarez <cam.alvarez.i@...il.com>
>> ---
>>  fs/bcachefs/io_misc.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/fs/bcachefs/io_misc.c b/fs/bcachefs/io_misc.c
>> index 82f9170dab3f..b5b92ef030df 100644
>> --- a/fs/bcachefs/io_misc.c
>> +++ b/fs/bcachefs/io_misc.c
>> @@ -90,8 +90,6 @@ int bch2_extent_fallocate(struct btree_trans *trans,
>>  				opts.data_replicas,
>>  				opts.data_replicas,
>>  				BCH_WATERMARK_normal, 0, &cl, &wp);
>> -		if (bch2_err_matches(ret, BCH_ERR_operation_blocked))
>> -			ret = -BCH_ERR_transaction_restart_nested;
>>  		if (ret)
>>  			goto err;
>>
>> --
>> 2.34.1
>>
>