Message-ID: <20230627201524.ool73bps2lre2tsz@moria.home.lan>
Date: Tue, 27 Jun 2023 16:15:24 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Jens Axboe <axboe@...nel.dk>
Cc: torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-bcachefs@...r.kernel.org,
Christoph Hellwig <hch@....de>
Subject: Re: [GIT PULL] bcachefs
On Tue, Jun 27, 2023 at 11:16:01AM -0600, Jens Axboe wrote:
> > On 6/26/23 8:59 PM, Jens Axboe wrote:
> > > On 6/26/23 8:05 PM, Kent Overstreet wrote:
> >> On Mon, Jun 26, 2023 at 07:13:54PM -0600, Jens Axboe wrote:
> >>> Doesn't reproduce for me with XFS. The above ktest doesn't work for me
> >>> either:
> >>
> >> It just popped for me on xfs, but it took half an hour or so of looping
> >> vs. 30 seconds on bcachefs.
> >
> > OK, I'll try and leave it running overnight and see if I can get it to
> > trigger.
>
> I did manage to reproduce it, and also managed to get bcachefs to run
> the test. But I had to add:
>
> diff --git a/check b/check
> index 5f9f1a6bec88..6d74bd4933bd 100755
> --- a/check
> +++ b/check
> @@ -283,7 +283,7 @@ while [ $# -gt 0 ]; do
> case "$1" in
> -\? | -h | --help) usage ;;
>
> - -nfs|-afs|-glusterfs|-cifs|-9p|-fuse|-virtiofs|-pvfs2|-tmpfs|-ubifs)
> + -nfs|-afs|-glusterfs|-cifs|-9p|-fuse|-virtiofs|-pvfs2|-tmpfs|-ubifs|-bcachefs)
> FSTYP="${1:1}"
> ;;
> -overlay)
>
> to ktest/tests/xfstests/ and run it with -bcachefs, otherwise it kept
> failing because it assumed it was XFS.

I wonder if this is due to an upstream fstests change I haven't seen
yet; I'll have a look.

> I suspected this was just a timing issue, and it looks like that's
> exactly what it is. Looking at the test case, it'll randomly kill -9
> fsstress, and if that happens while we have io_uring IO pending, then we
> process completions inline (for a PF_EXITING current). This means they
> get pushed to fallback work, which runs out of line. If we hit that case
> AND the timing is such that it hasn't been processed yet, we'll still be
> holding a file reference under the mount point and umount will -EBUSY
> fail.
>
> As far as I can tell, this can happen with aio as well, it's just harder
> to hit. If the fput happens while the task is exiting, then fput will
> end up being delayed through a workqueue as well. The test case assumes
> that once it's reaped the exit of the killed task that all files are
> released, which isn't necessarily true if they are done out-of-line.
Yeah, I traced it through to the delayed fput code as well.
I'm not sure delayed fput is responsible here; what I learned when I was
tracking this down has mostly fallen out of my brain, so take anything I
say with a large grain of salt. But I believe I tested with delayed_fput
completely disabled, and found another thing in io_uring with the same
effect as delayed_fput that wasn't being flushed.
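For reference, the delayed_fput path we're both talking about is roughly
this - paraphrased from fs/file_table.c from memory, so the details and
field names are probably off:

/* fs/file_table.c, heavily simplified */
static void delayed_fput(struct work_struct *unused);

static LLIST_HEAD(delayed_fput_list);
static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);

void fput(struct file *file)
{
        if (atomic_long_dec_and_test(&file->f_count)) {
                struct task_struct *task = current;

                if (!in_interrupt() && !(task->flags & PF_KTHREAD)) {
                        init_task_work(&file->f_rcuhead, ____fput);
                        if (!task_work_add(task, &file->f_rcuhead, TWA_RESUME))
                                return;
                        /*
                         * task_work_add() fails once the task has run
                         * exit_task_work() - i.e. the kill -9 case the test
                         * hits - so fall through to the global delayed work,
                         * and the final fput lands some time after the
                         * parent has already reaped the task.
                         */
                }

                if (llist_add(&file->f_llist, &delayed_fput_list))
                        schedule_delayed_work(&delayed_fput_work, 1);
        }
}

If the file is still sitting on that list when the test's wait() for the
killed fsstress returns, umount sees the elevated reference and fails
with -EBUSY, which matches what you're describing.
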
> For io_uring specifically, it may make sense to wait on the fallback
> work. The below patch does this, and should fix the issue. But I'm not
> fully convinced that this is really needed, as I do think this can
> happen without io_uring as well. It just doesn't right now as the test
> does buffered IO, and aio will be fully sync with buffered IO. That
> means there's either no gap where aio will hit it without O_DIRECT, or
> it's just small enough that it hasn't been hit.
I just tried your patch and I still have generic/388 failing - it
might've taken a bit longer to pop this time.
I wonder if there might be a better way of solving this though? For aio,
when a process is exiting we just synchronously tear down the ioctx,
including waiting for outstanding iocbs.
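i.e. exit_aio() kills each ioctx and then blocks until the outstanding
requests have completed - roughly this, paraphrasing fs/aio.c from
memory, so details may be off:

/* fs/aio.c, simplified: called during the exiting task's mm teardown */
void exit_aio(struct mm_struct *mm)
{
        struct kioctx_table *table = rcu_dereference_raw(mm->ioctx_table);
        struct ctx_rq_wait wait;
        int i, skipped = 0;

        if (!table)
                return;

        atomic_set(&wait.count, table->nr);
        init_completion(&wait.comp);

        for (i = 0; i < table->nr; i++) {
                struct kioctx *ctx =
                        rcu_dereference_protected(table->table[i], true);

                if (!ctx) {
                        skipped++;
                        continue;
                }
                /* tells the ioctx to complete &wait once its kiocbs are done */
                kill_ioctx(mm, ctx, &wait);
        }

        /* block until every outstanding kiocb has completed */
        if (!atomic_sub_and_test(skipped, &wait.count))
                wait_for_completion(&wait.comp);
}
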
delayed_fput, even though I don't believe it's responsible here, seems sketchy
to me because there doesn't seem to be a straightforward way to flush
delayed fputs for a given _process_ - there's a single global work item,
and we can only flush globally.
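All we have there is roughly this (again from memory, so treat it as a
sketch):

/* fs/file_table.c: one global list and one work item, shared by every task */
static LLIST_HEAD(delayed_fput_list);

static void delayed_fput(struct work_struct *unused)
{
        struct llist_node *node = llist_del_all(&delayed_fput_list);
        struct file *f, *t;

        llist_for_each_entry_safe(f, t, node, f_llist)
                __fput(f);
}

void flush_delayed_fput(void)
{
        delayed_fput(NULL);
}

Nothing in there knows which task queued which file, so a caller can
only drain everybody's delayed fputs, and only if it knows to call
flush_delayed_fput() at the right point.
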
Would what aio does work here?
(disclaimer: I haven't studied the io_uring code so I haven't figured
out the approach your patch is taking yet)