linux-kernel - Re: [GIT PULL] bcachefs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1e2134f1-f48b-1459-a38e-eac9597cd64a@kernel.dk>
Date:   Wed, 28 Jun 2023 17:14:09 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Kent Overstreet <kent.overstreet@...ux.dev>
Cc:     torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-bcachefs@...r.kernel.org,
        Christoph Hellwig <hch@....de>,
        Christian Brauner <brauner@...nel.org>,
        Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [GIT PULL] bcachefs

On 6/28/23 4:55?PM, Kent Overstreet wrote:
>> But it's not aio (or io_uring or whatever), it's simply the fact that
>> doing an fput() from an exiting task (for example) will end up being
>> done async. And hence waiting for task exits is NOT enough to ensure
>> that all file references have been released.
>>
>> Since there are a variety of other reasons why a mount may be pinned and
>> fail to umount, perhaps it's worth considering that changing this
>> behavior won't buy us that much. Especially since it's been around for
>> more than 10 years:
> 
> Because it seems that before io_uring the race was quite a bit harder to
> hit - I only started seeing it when things started switching over to
> io_uring. generic/388 used to pass reliably for me (pre backpointers),
> now it doesn't.

I literally just pasted a script that hits it in one second with aio. So
maybe generic/388 doesn't hit it as easily, but it's surely TRIVIAL to
hit with aio. As demonstrated. The io_uring is not hard to bring into
parity on that front, here's one I posted earlier today for 6.5:

https://lore.kernel.org/io-uring/20230628170953.952923-4-axboe@kernel.dk/

Doesn't change the fact that you can easily hit this with io_uring or
aio, and probably more things too (didn't look any further). Is it a
realistic thing outside of funky tests? Probably not really, or at least
if those guys hit it they'd probably have the work-around hack in place
in their script already.

But the fact is that it's been around for a decade. It's somehow a lot
easier to hit with bcachefs than XFS, which may just be because the
former has a bunch of workers and this may be deferring the delayed fput
work more. Just hand waving.

>> then we'd probably want to move that deferred fput list to the
>> task_struct and ensure that it gets run if the task exits rather than
>> have a global deferred list. Currently we have:
>>
>>
>> 1) If kthread or in interrupt
>> 	1a) add to global fput list
>> 2) task_work_add if not. If that fails, goto 1a.
>>
>> which would then become:
>>
>> 1) If kthread or in interrupt
>> 	1a) add to global fput list
>> 2) task_work_add if not. If that fails, we know task is existing. add to
>>    per-task defer list to be run at a convenient time before task has
>>    exited.
> 
> no, it becomes:
>  if we're running in a user task, or if we're doing an operation on
>  behalf of a user task, add to the user task's deferred list: otherwise
>  add to global deferred list.

And how would the "on behalf of a user task" work in terms of being
in_interrupt()?

-- 
Jens Axboe